Blast Output With Species Name
11.1 years ago

Dear all,

My question is very simple. I have reads from a human RNA-Seq experiment and I would like to check for contamination from a specific virus family. My pipeline looks like this:

1) Align on the human genome (tophat2)
2) Keep only unaligned reads
3) Align on the virus genomes
4) Keep only aligned reads
5) BLAST

However, I would like to know the species name for each of my BLAST hits. Right now I'm launching it like this:

blastx -query $query -db nr -out blast.txt -num_threads 4 -outfmt 6 -evalue 10e-3 -word_size 2

But the output contains only the GI number. Is there a way to get the species name from a simple BLAST run? Or an alternative way to launch BLAST that gives this information in a text format?

Thanks a lot :-)

Federico

blast metagenomics • 12k views

I would say this is a duplicate of Scientific names in blast output and databases. Please take a look; I asked and answered this same question myself.

11.1 years ago

You may want to use NCBI BLAST 2.2.28 (the latest release at the time of writing), which specifically addresses this issue. To get the species ID, change your -outfmt option to:

-outfmt  '6 qseqid sseqid pident evalue staxids sscinames scomnames sskingdoms stitle'
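
To tie this to the command in the question, here is a minimal shell sketch. It wraps the original blastx call with the extended -outfmt string and adds a small helper that tallies hits per scientific name. The function names run_blast and count_species and the default query file reads.fasta are made up for illustration, and the tally assumes sscinames is the sixth tab-separated column, as in the format string above.

```shell
# Output columns, copied from the format string suggested above.
outfmt="6 qseqid sseqid pident evalue staxids sscinames scomnames sskingdoms stitle"

# Hypothetical default; set $query to your own FASTA file of viral-aligned reads.
query="${query:-reads.fasta}"

run_blast() {
    # Same options as in the question, with only -outfmt changed.
    blastx -query "$query" -db nr -out blast.txt -num_threads 4 \
        -outfmt "$outfmt" -evalue 10e-3 -word_size 2
}

count_species() {
    # Count hits per scientific name (column 6), most frequent first.
    # $1 is the tabular BLAST output file, e.g. blast.txt.
    awk -F'\t' '{ n[$6]++ } END { for (s in n) printf "%d\t%s\n", n[s], s }' "$1" |
        sort -rn
}
```

Typical use would be run_blast followed by count_species blast.txt. Note that a hit can carry several names in the sscinames column, so the counts are per distinct column value, not strictly per species.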

This answer is technically correct, but you won't be able to use fields such as 'sscinames' without the 'taxdb' database. As I commented above, please check Scientific names in blast output and databases.
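
For reference, a rough sketch of how taxdb could be put in place. As far as I know, NCBI distributes it as taxdb.tar.gz alongside the other BLAST databases, and BLAST looks for the unpacked taxdb.btd and taxdb.bti files in the directories listed in the BLASTDB environment variable. The directory $HOME/blastdb, the URL variable, and the helper names fetch_taxdb and have_taxdb are assumptions for illustration; check them against your own installation.

```shell
# Hypothetical database directory; point it at wherever your nr files live.
export BLASTDB="${BLASTDB:-$HOME/blastdb}"

# Assumed download location of the taxonomy files on the NCBI FTP site.
TAXDB_URL="https://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz"

fetch_taxdb() {
    # Download taxdb.tar.gz and unpack taxdb.btd / taxdb.bti into $BLASTDB.
    mkdir -p "$BLASTDB" &&
        curl -fsSL "$TAXDB_URL" -o "$BLASTDB/taxdb.tar.gz" &&
        tar -xzf "$BLASTDB/taxdb.tar.gz" -C "$BLASTDB"
}

have_taxdb() {
    # True when both taxonomy files are present in $BLASTDB.
    [ -f "$BLASTDB/taxdb.btd" ] && [ -f "$BLASTDB/taxdb.bti" ]
}

# Usage (not run automatically, since it needs network access):
#   have_taxdb || fetch_taxdb
```

Once the files are in place, rerunning blastx with the sscinames field should fill in the names instead of N/A.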
