How Can I Save A Blast Record (.Xml) Into A Suitable Format For Excel (.Xls)?
3
1
Entering edit mode
12.2 years ago

I have made a BLAST search through the NCBIWWW.qblast command in Biopython. After having my results saved in a .xml format (by result_handle way, following Biopython tutorial chapter 7) I have parsed them, also following chapter 7.3, but when I have them at this point, I don't know how I can see results outside Biopython, for instance, in any Microsoft Office program (Excel, Access) to see what my compared sequences are similar to.

Thanks.

blast biopython output • 8.2k views
ADD COMMENT
1
Entering edit mode

You could ask BLAST to give you tabular output - which is easy to work with in Excel and in Python.

ADD REPLY
4
Entering edit mode
12.2 years ago
norlingjr ▴ 90

I'd think a bit about what you want in your output and parse it directly from BioPython.

in your script you probably have a line something like

result_handle = NCBIWWW.qblast("blastn", "nt", ...)

this row fetches the xml result from NCBI that you're writing to file. Instead of just writing it to an xml file though, we can parse it!

start by importing NCBI's XML-parser

from Bio.Blast import NCBIXML

now we can make a parser object!

try somthing like:

blast_records = NCBIXML.parse(result_handle)

for record in blast_records:
print record.query
for alignment in record.alignments:
    print alignment.title
    print alignment.hit_id
    print alignment.hit_def
    for hsp in alignment.hsps:
        print hsp.score
        print hsp.bits
        print hsp.expect
        print hsp.query
        print hsp.match
        print hsp.sbjct

to get a feeling of what information you can get out of the blast and decide what you'd like in your excel sheet. Then save it as csv (comma-separated-values, a file-format you can open in excel).

something like this:

separator = ","
header = ['title', 'hit_id', 'score', 'e-value', ...]
with open("my-output.csv", 'w') as f:
    f.write( "%s\n" % (separator.join(header))  )
    for record in blast_records:
        for hsp in alignment.hsps:
            f.write( "%s%s" % (print alignment.title, separator) )
            f.write( "%s%s" % (print alignment.hit_id, separator) )
            f.write( "%s%s" % (print hsp.score, separator) )
            f.write( "%s%s" % (print hsp.expect, separator) )
            :
            f.write( "\n" )

I haven't tested this code, but the principle should be fairly sound.

ADD COMMENT
3
Entering edit mode
12.2 years ago

You can transform your XML results to TSV and import it into excel using XSLT. See:

tools parsing NCBI blast -m 7 xml output format?

PS: But you should learn to use some basic command-line oriented tools instead of using M$ excel.

ADD COMMENT
2
Entering edit mode

...or just use -m 8 output in the first place.

ADD REPLY
0
Entering edit mode
12.2 years ago
Ketil 4.1k

XML is a text format, so if you really want to look at it, you should be able to load it into Word or similar. Possibly renaming it to something.txt. In Biopython, you should be able to query whatever objects you get from parsing the files from the REPL (Python interpreter command line). If you just want to look at alignments, you'd be much better off just using the Web based BLAST service from NCBI, though.

ADD COMMENT

Login before adding your answer.

Traffic: 1561 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6