Taxid will not function with other databases (including custom)
6.0 years ago
emilyc ▴ 30

Hello/Bonjour

I cannot get taxids to work with nr or with my custom database (the custom database itself does work): the output never includes results for staxids.

Error: Warning: [blastx] Taxonomy name lookup from taxid requires installation of taxdb database with ftp://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz

The taxdb.btd and taxdb.bti files are both in my BLASTDB directory.

Code:

  blastx -query SPAdes/contigs.fasta -db ../../BLASTDB/nr -outfmt "6 qseqid sseqid pident qlen length mismatch gapopen evalue bitscore staxids sscinames" -num_threads 24 -out D446_S2_viral_fraction_nr_taxadb_test.blastx -max_target_seqs 20

Any help is appreciated!

blast blast+ linux • 4.5k views

Did you set the BLASTDB environment variable?

Scientific Names In Blast Output And Databases


It was set when I originally installed BLAST. Do I need to set it again, or in a different way, now that I have added the two taxdb files to the same directory?


What is the result of:

echo $BLASTDB

and:

ls -lh $BLASTDB

echo $BLASTDB prints:

:/home/emily/blast/bin/

ls -lh $BLASTDB lists all the files in my home directory.


Does echo $BLASTDB really have : at the beginning?

The result of ls -lh $BLASTDB should be the contents of the folder where your blast databases are located, not your home folder.
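
A minimal sketch of that check, assuming the databases live in a folder such as $HOME/BLASTDB (a placeholder path, not taken from this thread). The has_taxdb function is a hypothetical helper, not part of BLAST+. It only tests that taxdb.btd and taxdb.bti sit in the folder BLASTDB points to, after removing a stray leading colon:

```shell
# Hypothetical helper (not part of BLAST+): succeed only if both taxdb files are in the directory.
has_taxdb() {
    [ -f "$1/taxdb.btd" ] && [ -f "$1/taxdb.bti" ]
}

# Assumed default location; replace it with the folder that really holds nr and taxdb.
BLASTDB="${BLASTDB:-$HOME/BLASTDB}"
# Drop a stray leading colon such as the one in ":/home/emily/blast/bin/".
BLASTDB="${BLASTDB#:}"
export BLASTDB

if has_taxdb "$BLASTDB"; then
    echo "taxdb found in $BLASTDB"
else
    echo "taxdb.btd or taxdb.bti is missing from $BLASTDB"
fi
```

To make the setting permanent, the export line would normally go in a shell startup file such as ~/.bashrc.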


Fixed both of those, thank you. It still won't return taxids with the custom db, though.


By custom database, I assume you mean one built with the makeblastdb command. Did you use sequences from GenBank for that, or from another source? And did you add taxids when you made the database?


Yes, it was made with makeblastdb, and the contents are a fraction of nr, so yes, GenBank. I did not add taxids, but the taxdb is in the same dir as the custom db.

6.0 years ago
gb ★ 2.2k

You need to add the taxonIDs when you make the database.

I think you first need to download this file: ftp://ftp.ncbi.nih.gov/pub/taxonomy/accession2taxid/prot.accession2taxid.gz

After that, decompress it (for example with gunzip) and extract two columns, the accession.version and the taxid:

sed '1d' prot.accession2taxid | awk '{print $2" "$3}' > accession_taxonid

Then you make the database like this:

sudo makeblastdb -in yourseqs.fa -dbtype prot -taxid_map accession_taxonid -parse_seqids

I have never done it with protein data, but I think it works the same way as for nt.
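
One way to check whether the taxids actually ended up in the database is to dump accession and taxid pairs with blastdbcmd and count the entries that have none. This is a rough sketch: count_missing_taxids is a hypothetical helper, and it assumes a missing taxid is reported as 0 or N/A, which may differ between BLAST+ versions.

```shell
# Hypothetical helper: read "accession taxid" rows on stdin and
# print how many of them carry no usable taxid.
# Assumption: a missing taxid shows up as 0 or N/A.
count_missing_taxids() {
    awk '$2 == "0" || $2 == "N/A" { n++ } END { print n + 0 }'
}

# Intended use (needs BLAST+ and a built database, so it is left commented out):
# blastdbcmd -db yourseqs.fa -entry all -outfmt "%a %T" | count_missing_taxids
```

If the count equals the number of sequences, the taxid map was most likely not applied.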

EDIT: I think adding the taxonIDs consumes a lot of memory. If it fails, BLAST will not give an error, so keep that in mind. If memory is a problem, first reduce accession_taxonid to only the accessions that are present in your sequences, then try again.
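
A sketch of how the mapping file could be reduced to the accessions in your own FASTA file. It assumes each header starts with the accession.version (for example ">WP_000000000.1 some description", a made-up ID) and that the map has the accession in its first column. subset_taxid_map is a hypothetical helper name.

```shell
# Hypothetical helper: keep only the map rows whose accession occurs in the FASTA headers.
# Usage: subset_taxid_map <fasta> <accession_taxid_map>
subset_taxid_map() {
    fasta=$1
    map=$2
    # First file (FASTA): remember each sequence ID, i.e. the first word of a header without ">".
    # Second file (map): print the rows whose first column was remembered.
    awk 'NR == FNR { if (/^>/) keep[substr($1, 2)] = 1; next }
         ($1 in keep)' "$fasta" "$map"
}
```

For example, subset_taxid_map yourseqs.fa accession_taxonid > accession_taxonid.subset writes the smaller map, which can then be passed to makeblastdb with -taxid_map.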


This makes sense, thank you!
