Is it possible to get local blast output directly in Biopython, without making and reading an XML file?
9.6 years ago
atapee ▴ 10

Hi

I am currently filtering a large number of reads from a 454 machine. I would like to BLAST every read against a local BLAST database and check whether the sequence really is from the targeted genus/species and not a contaminant. For that I would like to get the output directly in Python, so I can read it and discard the read if the top matches aren't from the same genus or if the score is too low.

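from Bio import SeqIO
from Bio.Blast.Applications import NcbiblastnCommandline

counter = 0
for seq in SeqIO.parse(fasta_file, "fasta"):
    counter += 1
    # The query is still the whole file here, not the current record (seq)
    blast_cline = NcbiblastnCommandline(query="G:\\454 dataset\\new2.fasta", db="database", evalue=0.001, outfmt=5, out="G:\\454 dataset\\blast2.xml")
    stdout, stderr = blast_cline()
    print(stdout, stderr)
    if counter == 1:
        break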

I would also like to pass in the query sequence obtained from parsing the file; as it stands, the query comes from the "new2.fasta" file. The counter and the break are only there temporarily, until I find a way to get it to accept the seq variable as its query input.

How can I get the local BLAST output on stdout? And is it possible to pass the query as a variable rather than a file?

blast biopython • 4.0k views
9.6 years ago
Peter 6.0k

You've asked multiple questions rather than one.

Question: Is it possible to get local blast output directly in Biopython, without making and reading an XML file?

Yes, you could ask for a tab-separated table with the taxonomy information included; see http://blastedbio.blogspot.co.uk/2012/05/blast-tabular-missing-descriptions.html

The Bio.SearchIO module can parse this kind of output (as well as BLAST XML).
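
To give a rough idea of what the tabular route looks like, here is a minimal sketch that parses the text produced by something like -outfmt "6 std staxids sscinames" using only the Python standard library (Bio.SearchIO would be the Biopython way to do this). The column list assumes that exact format string, the helper name parse_tabular is hypothetical, and the two sample rows are made up for illustration. Note that sscinames generally needs the NCBI taxonomy files available to BLAST+, so treat that column as an assumption about your setup.

```python
import csv
import io

# The 12 standard columns of BLAST+ tabular output ("std")
STD_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
# Assumes BLAST was run with -outfmt "6 std staxids sscinames"
COLUMNS = STD_COLUMNS + ["staxids", "sscinames"]


def parse_tabular(text, columns=COLUMNS):
    """Turn tab-separated BLAST output into a list of dicts (hypothetical helper)."""
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        # Skip blank lines and the comment lines that -outfmt 7 would add
        if not fields or fields[0].startswith("#"):
            continue
        row = dict(zip(columns, fields))
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        rows.append(row)
    return rows


# Made-up rows, only to show the shape of the data
sample = (
    "read1\tsubjA\t98.5\t200\t3\t0\t1\t200\t10\t209\t1e-95\t350\t1234\tExamplea prima\n"
    "read2\tsubjB\t91.0\t180\t16\t0\t1\t180\t5\t184\t2e-60\t240\t5678\tContaminus vulgaris\n"
)
hits = parse_tabular(sample)
print(hits[0]["qseqid"], hits[0]["sscinames"], hits[0]["bitscore"])
```

Since the input is just a string, the same function works on stdout captured from BLAST, with no intermediate file.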

Question: How could I get a stdout output from the local blast? Also is it possible to give the query input a variable and not a file?

Have a look at the MUSCLE example in the Biopython Tutorial, which illustrates how to use stdin and stdout with the Biopython command line wrappers. By default BLAST+ reads the query from stdin and writes the output to stdout; you can also specify this explicitly using query="-" and out="-" in the wrapper.
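
As a sketch of the stdin/stdout idea, the following uses the standard library's subprocess module rather than the Biopython wrapper, so it does not depend on which Biopython version you have. The function names (fasta_record, blastn_command, blast_one) are hypothetical, and it assumes blastn is on your PATH and that "database" is a valid local database name. The query is passed as a FASTA-formatted string on stdin, and the tabular result comes back as a string.

```python
import subprocess


def fasta_record(seq_id, sequence):
    """Format one sequence as FASTA text (hypothetical helper)."""
    return f">{seq_id}\n{sequence}\n"


def blastn_command(db, evalue=0.001):
    """Build a blastn command that reads the query from stdin and writes to stdout."""
    return [
        "blastn",
        "-query", "-",      # "-" means stdin (also the BLAST+ default)
        "-db", db,
        "-evalue", str(evalue),
        "-outfmt", "6",     # tabular output
        "-out", "-",        # "-" means stdout (also the BLAST+ default)
    ]


def blast_one(seq_id, sequence, db):
    """Run blastn on a single sequence and return its stdout as a string."""
    result = subprocess.run(
        blastn_command(db),
        input=fasta_record(seq_id, sequence),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if blastn exits with an error
    )
    return result.stdout
```

Inside a SeqIO.parse loop you would call something like blast_one(seq.id, str(seq.seq), "database") and get the hits for that read back as text.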

However, I think it would be better to avoid the loop where you seem to call BLAST with a single query, and instead call BLAST once with the original multiple-query FASTA file.
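
If you go with a single BLAST run over the whole FASTA file, the filtering step could look roughly like the sketch below. It assumes the hits have already been parsed into dicts with qseqid, sscinames and bitscore keys (for example from tabular output with taxonomy columns), and that checking only the best-scoring hit per read is good enough. The function names, the genus name and the threshold are made up for illustration. Reads with no hits at all do not appear in BLAST's output, which is why the list of all read IDs is kept separately.

```python
def best_hit_per_query(rows):
    """Return {qseqid: row} keeping the highest-bitscore hit for each query."""
    best = {}
    for row in rows:
        current = best.get(row["qseqid"])
        if current is None or row["bitscore"] > current["bitscore"]:
            best[row["qseqid"]] = row
    return best


def is_target(hit, genus, min_bitscore):
    """True if the hit exists, is from the wanted genus, and scores high enough."""
    if hit is None:
        return False
    # Assumes sscinames holds a single "Genus species" name
    hit_genus = hit["sscinames"].split(" ", 1)[0]
    return hit_genus == genus and hit["bitscore"] >= min_bitscore


# Made-up hits standing in for parsed BLAST output
rows = [
    {"qseqid": "read1", "sscinames": "Examplea prima", "bitscore": 350.0},
    {"qseqid": "read1", "sscinames": "Examplea secunda", "bitscore": 300.0},
    {"qseqid": "read2", "sscinames": "Contaminus vulgaris", "bitscore": 400.0},
    {"qseqid": "read3", "sscinames": "Examplea prima", "bitscore": 40.0},
]
# IDs of every read in the FASTA file, including ones with no hits (read4)
all_read_ids = ["read1", "read2", "read3", "read4"]

best = best_hit_per_query(rows)
kept = [r for r in all_read_ids if is_target(best.get(r), "Examplea", 100.0)]
print(kept)
```

Here only read1 passes: read2 is from the wrong genus, read3 scores too low, and read4 has no hit.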
