I used the following command line for a local BLAST search and it worked wonderfully. However, I have a few questions about this command:
blastall -p blastn -i $QUERY -d $SUBJECT -b 10 -m 8 -a 8 > blast.out
- The result does not show any headings for the columns. The e-value and score columns are easily identifiable, but I don't understand the other columns. How can I get the headings?
- It doesn't give the positives as a percentage. How can I get that?
Thank you, Michael. I'll try it and let you know.
It worked, thank you so much :)
One more thing: say I am aiming only for the top hit for every query sequence (-b 1). How can I get the protein sequence in the output at the same time by modifying this command line?
As far as I'm aware you cannot do so simultaneously. However, you can easily set up a simple loop to first run the BLAST search, parse out the top hit's accession ID, and extract the sequence using blastdbcmd (which I assume exists for legacy BLAST).
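The loop described above could be sketched roughly as follows. This is only an illustration under some assumptions: the file names are made up, the sample rows stand in for a real `blast.out` so the parsing step can be tried without BLAST installed, and the lookup is skipped unless `blastdbcmd` is on the PATH and `$SUBJECT` is set. Note that `blastdbcmd` ships with BLAST+; as far as I know the legacy toolkit's counterpart is `fastacmd`, and either tool needs a database that was indexed by sequence ID.

```shell
#!/bin/sh
# Hypothetical stand-in for blast.out (legacy -m 8 tabular format).
printf 'q1\tsubjA\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-100\t360\n' >  example_blast.out
printf 'q1\tsubjB\t90.00\t180\t18\t0\t1\t180\t9\t188\t1e-60\t220\n' >> example_blast.out
printf 'q2\tsubjC\t91.00\t150\t12\t1\t1\t150\t30\t178\t2e-50\t190\n' >> example_blast.out
printf 'q2\tsubjA\t85.00\t120\t18\t0\t5\t124\t40\t159\t3e-20\t95\n' >> example_blast.out

# Hits are listed best-first per query, so the first line seen for each
# query ID (column 1) is its top hit. Keep query ID and subject ID only.
awk -F'\t' '!seen[$1]++ { print $1 "\t" $2 }' example_blast.out > top_hits.tsv

# Look up each top hit. Skipped unless blastdbcmd exists and SUBJECT is set.
if command -v blastdbcmd >/dev/null 2>&1 && [ -n "${SUBJECT:-}" ]; then
    while IFS="$(printf '\t')" read -r query subject; do
        blastdbcmd -db "$SUBJECT" -entry "$subject" >> top_hit_seqs.fasta
    done < top_hits.tsv
fi
```

With the sample rows, `top_hits.tsv` ends up with one line per query (q1 with subjA, q2 with subjC).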
That should be the way, but I am not sure how to do that :( This may sound lame, but I am a molecular biologist, not a hard-core bioinformatics or computer science person, so can you help me here?
Is it possible to modify the first command line I gave so that the headers themselves appear in the tabular output?
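For the legacy tabular format, one option is to prepend a header line yourself. The `-m 8` output has twelve columns in a fixed order, and the third one is the percent identity. The sketch below uses made-up file names and two sample rows in place of a real `blast.out`; the header labels are my own wording, not names printed by BLAST. Alternatively, `-m 9` gives the same table with commented lines that include a `# Fields:` description.

```shell
#!/bin/sh
# Hypothetical stand-in for blast.out (legacy -m 8 tabular format).
printf 'q1\tsubjA\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-100\t360\n' >  example_blast.out
printf 'q2\tsubjC\t91.00\t150\t12\t1\t1\t150\t30\t178\t2e-50\t190\n' >> example_blast.out

# The twelve -m 8 columns, in order (labels chosen for readability).
header='query_id subject_id pct_identity aln_length mismatches gap_opens q_start q_end s_start s_end evalue bit_score'

# Write the header as a tab-separated line, followed by the BLAST rows.
{
    printf '%s\n' "$header" | tr ' ' '\t'
    cat example_blast.out
} > blast_with_header.tsv
```

The resulting file opens cleanly in a spreadsheet, with the headings in the first row.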
You can build a custom output format that includes sequences using BLAST itself, without any scripting. Istvan has already answered this question: A: Blast - Formatting Output
That answer is for BLAST+. I'm not sure whether legacy BLAST has a similar option, but this is another motivating factor to move on to the new BLAST.
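As a rough illustration of such a custom format in BLAST+, assuming `blastn` is installed and `$QUERY` and `$SUBJECT` are set as in the original command: the field list below is just one possible selection (`ppos` is the percentage of positive-scoring matches, `sseq` the aligned part of the subject), and the output file name is made up. The search itself is skipped when the tool or the variables are missing, so only the header line is written in that case.

```shell
#!/bin/sh
# Field list for BLAST+ tabular output (format 6).
fields='qseqid sseqid pident ppos length evalue bitscore sseq'

# Build the header from the same list so names and columns stay in sync.
printf '%s\n' "$fields" | tr ' ' '\t' > blast_plus.tsv

# Run the search only if blastn is available and both variables are set.
if command -v blastn >/dev/null 2>&1 && [ -n "${QUERY:-}" ] && [ -n "${SUBJECT:-}" ]; then
    blastn -query "$QUERY" -db "$SUBJECT" -outfmt "6 $fields" \
        -max_target_seqs 1 -num_threads 8 >> blast_plus.tsv
fi
```

Here `-max_target_seqs 1` asks for a single subject per query, though several alignments to that subject may still be listed.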
These options allow you to retrieve only the aligned portions of the subject sequence. If you instead need the full sequences, export only the subject GIs, then use
to retrieve the entries in FASTA format from the database.
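A minimal sketch of that retrieval step, assuming BLAST+ `blastdbcmd` with its `-entry_batch` option and a database indexed by sequence ID. This is an assumption on my part and not necessarily the command meant above. The GI values and file names are invented for illustration, and the lookup is skipped unless `blastdbcmd` is on the PATH and `$SUBJECT` is set.

```shell
#!/bin/sh
# Hypothetical list of subject GIs, one per line (e.g. from -outfmt "6 sgi").
printf '%s\n' 12345 67890 12345 > example_subject_gis.txt

# The same subject can be hit by several queries, so de-duplicate first.
sort -u example_subject_gis.txt > unique_gis.txt

# Fetch all entries in one call and write them out in FASTA format.
if command -v blastdbcmd >/dev/null 2>&1 && [ -n "${SUBJECT:-}" ]; then
    blastdbcmd -db "$SUBJECT" -entry_batch unique_gis.txt -outfmt %f -out subjects.fasta
fi
```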
Thank you so much Michael :)
Hi Michael,
Can you please help me with InterProScan 5 as well? I suppose you have used the tool. I asked this question:
C: How to calculate the standalone interproscan output functionwise or domainwise?