How to get organism description from BLAST output using biopython
5.1 years ago

Hello, I would like to ask a question. I have a file of about 1000 sequences from BLAST results, each with a classic header like this:

">PLN78092.1 putative endo-1,3(4)-beta-glucanase [Aspergillus taichungensis]"

I would like to change the header of every sequence so that it contains only the organism name, using Biopython:

">Aspergillus taichungensis"

When I download the results in FASTA format and parse them with Biopython, I can find the organism name only in the description, but the description contains the whole header:

from Bio import SeqIO

records = list(SeqIO.parse("sequence.fasta", "fasta"))

# print the full description (header without ">") of every record
for record in records:
    print(record.description)

PLN78092.1 putative endo-1,3(4)-beta-glucanase [Aspergillus taichungensis] ...

Of course I could just extract the text in brackets "[ ]", but is there a way to get only the organism name, for example by parsing the XML format of the results? Something like this:

from Bio.Blast import NCBIXML

result_handle = open("sequences.xml")
blast_records = list(NCBIXML.parse(result_handle))

print(blast_records[0].organism)  # this is not working

biopython BLAST python BLAST output parsing • 2.2k views

Please use the formatting bar (especially the code option) to present your post better.

Thank you!

5.1 years ago

Hi, this is done as follows:

from Bio.Blast import NCBIXML

result_handle = open("sequences.xml", 'r')
blast_records = list(NCBIXML.parse(result_handle))
one_query = blast_records[0]            # one blast query
one_hit = one_query.alignments[0]       # this is one hit
print(one_hit.hit_def)                  # this is probably what you want
print(one_hit.hit_id)                   # you might also want this
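
The snippet above looks only at the first hit of the first query. To walk through every hit of every query, something along these lines may help. It is a minimal sketch: `iter_hits` and `print_all_hits` are hypothetical helper names, not part of Biopython, and the code assumes each parsed record exposes `query` and `alignments`, and each alignment exposes `hit_id` and `hit_def`, as the `NCBIXML` records above do.

```python
def iter_hits(blast_records):
    """Yield (query, hit_id, hit_def) for every hit of every query."""
    for record in blast_records:
        for alignment in record.alignments:
            yield record.query, alignment.hit_id, alignment.hit_def


def print_all_hits(xml_path):
    """Print one tab-separated line per hit in a BLAST XML file."""
    # Imported here so that iter_hits can be used without Biopython installed.
    from Bio.Blast import NCBIXML

    with open(xml_path) as handle:
        for query, hit_id, hit_def in iter_hits(NCBIXML.parse(handle)):
            print(query, hit_id, hit_def, sep="\t")
```

Calling `print_all_hits("sequences.xml")` would then list all hits without loading the whole result set into a list first.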

Thank you! Actually this didn't solve my problem, but now I see where the problem is: the BLAST output has no attribute describing only the organism. hit_def again contains the whole description, so I'll just extract the text in brackets.
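
For anyone landing here later, the bracket extraction could look roughly like the sketch below. Both function names are hypothetical, and it assumes the organism is always the last bracketed group at the end of the description. It takes the last group rather than the first because protein names can contain brackets themselves (for example "[NiFe] hydrogenase"), and it matches brackets by depth so that nested names such as "[[Clostridium] scindens]" survive.

```python
def organism_from_description(description):
    """Return the text inside the last top-level [...] group, or None."""
    text = description.strip()
    if not text.endswith("]"):
        return None
    depth = 0
    # Walk backwards to find the "[" that matches the final "]".
    for i in range(len(text) - 1, -1, -1):
        if text[i] == "]":
            depth += 1
        elif text[i] == "[":
            depth -= 1
            if depth == 0:
                return text[i + 1:-1]
    return None  # unbalanced brackets


def rename_fasta_headers(in_path, out_path):
    """Copy a FASTA file, keeping only the organism name in each header."""
    # Imported here so the helper above also works without Biopython installed.
    from Bio import SeqIO

    def renamed(records):
        for record in records:
            organism = organism_from_description(record.description)
            if organism is not None:
                record.id = organism
                record.description = ""  # avoid writing the old header text
            yield record  # records without a bracketed name are left unchanged

    return SeqIO.write(renamed(SeqIO.parse(in_path, "fasta")), out_path, "fasta")
```

The same helper works on `hit_def` strings from the XML output. Note that several hits from the same organism will end up with identical headers, which some downstream tools may not accept.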
