Hi, I am working with a blast xml as the one below:
<Hit_num>2</Hit_num>
<Hit_id>gnl|BL_ORD_ID|79</Hit_id>
<Hit_def>NODE_82_length_56137_cov_22.281775</Hit_def>
<Hit_accession>79</Hit_accession>
<Hit_len>56195</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>80.5295</Hsp_bit-score>
<Hsp_score>46</Hsp_score>
<Hsp_evalue>3.65762e-15</Hsp_evalue>
<Hsp_query-from>274</Hsp_query-from>
<Hsp_query-to>515</Hsp_query-to>
<Hsp_hit-from>17046</Hsp_hit-from>
<Hsp_hit-to>17290</Hsp_hit-to>
<Hsp_query-frame>1</Hsp_query-frame>
<Hsp_hit-frame>1</Hsp_hit-frame>
<Hsp_identity>182</Hsp_identity>
<Hsp_positive>182</Hsp_positive>
<Hsp_gaps>13</Hsp_gaps>
<Hsp_align-len>250</Hsp_align-len>
<Hsp_qseq>ATGCGCGAAAAAGACGAGAAAATTCCGCTGCGTCACGGTGTGCTGGGTCAGGAAACCTGCGGCCCGGGTGGGATCTCTTACGGCATGCGCTCCAT-CGGCGGCGTACTGG-A-A-CTGGTGGATTATATGGAGCAGTACTC-CCCG--AACGCGTGGATGCTGAACTACTCCAACCCGGCGGCGATTGTGGCGGAAG-CGACGCGCCGCCTGCGTCCGCATGCCAAAATTCTCAACATCTGCGATATGCC</Hsp_qseq>
<Hsp_hseq>ATGCGCGAGCAGGATGAAAAGATCCCGCTGCGCCACGGGGTGGTCGGCCAGGAAACCTGCGGACCCGGCGGGCTGGCATATGGCCTGCGTACTATCCTGC--CG-A-TGGTAGAGCTGATCGATCGGGTGGAACGCTACGCGCACGAAAAGGCGTGGATCGTGAACTACTCCAACCCGGCGGCGATTGTCGCTGAGGGCGT-GCGTCGCCTGCGCCCCAACGCACGCGTGCTGAACATCTGCGATATGCC</Hsp_hseq>
<Hsp_midline>|||||||| | || || || || |||||||| ||||| ||| | || |||||||||||||| || || ||| | | || ||| |||| | || | || || | ||| | | ||| | ||| |||| | ||| | | || || |||||||| |||||||||||||||||||||||||||| || || | || ||| |||||||| || | || | || |||||||||||||||||</Hsp_midline>
</Hsp>
I am using a script based on the one provide by biopython Cook's book (below) but this is what I get;
(None, None)
from Bio.Blast import NCBIXML
blast_record = NCBIXML.read(result_handle)
for alignment in blast_record.alignments:
... for hsp in alignment.hsps:
print hsp.strand
Can anyone help me to extract in which strand the subj.sequence is, please? I would like to do this so I can work with the sequence. What I would like to do is to get to the contigs and add the missing bases but if it is in the strand -1 I need to reversed_complement() the sequence. Thanks
Thanks, Unluckily, I cannot used grep as I wanted it as part of a python programme I was working on. By now I had finished it. I worked out the strand by the start and end positions.