Hi,
I want to get the top 10 sequences from BLAST results (just the sequences, no alignments, scores, e-values, etc.). My input is a text file containing 5 FASTA records, so my output should be the top 10 BLAST hits for each record, i.e. 50 sequences in the output file.
I am reading each input FASTA record with Bio.SeqIO, writing it to temp.faa, and then passing it to command-line BLAST through subprocess as
blastp -db nr -query temp.faa -out out.faa -evalue 0.001 -gapopen 11 -gapextend 1 -matrix BLOSUM62 -remote -outfmt 2
The output has lots of other information. Should I parse this output now, or is there a better way?
Thanks
P.S. XML might be the way, but I didn't find any relevant NCBIXML parser syntax.
This also makes parsing much easier and quicker when you need to grab the full sequences from the relevant sequence file.
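To illustrate the kind of parsing meant here, below is a minimal sketch that assumes the search was rerun with tabular output (`-outfmt 6`), which is my assumption and not something stated in the thread. In that format the first two columns are the query ID and the subject ID, and hits are listed best first for each query. The function name `top_hits` and the sample rows are made up for the example.

```python
import csv
import io


def top_hits(tabular_text, n=10):
    """Map each query ID to its first n distinct subject IDs.

    Assumes BLAST tabular output (-outfmt 6): tab-separated rows where
    column 1 is the query ID and column 2 is the subject ID.
    """
    hits = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines, comment lines (as in -outfmt 7), and short rows.
        if len(row) < 2 or row[0].startswith("#"):
            continue
        query_id, subject_id = row[0], row[1]
        subjects = hits.setdefault(query_id, [])
        # A subject can appear on several rows (one per HSP), so deduplicate.
        if subject_id not in subjects and len(subjects) < n:
            subjects.append(subject_id)
    return hits


# Invented sample rows in the default 12-column layout, for illustration only.
example = "\n".join([
    "q1\ts1\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-50\t200",
    "q1\ts1\t80.0\t50\t10\t0\t1\t50\t200\t250\t1e-10\t60",
    "q1\ts2\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-45\t180",
    "q1\ts3\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-40\t160",
    "q2\ts9\t97.0\t120\t3\t0\t1\t120\t1\t120\t1e-60\t230",
])

top_two = top_hits(example, n=2)
```

The collected subject IDs could then be used to pull the full sequences from a sequence file or database. Adding `-max_target_seqs 10` to the blastp command should also keep the output down to 10 subjects per query, so less filtering is needed afterwards.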