4.0 years ago
langziv
Hello.
I'm trying to get taxonomic data, such as scientific and common names, and keep getting "N/A" for each taxonomic parameter. I updated the taxdb. It's in the same directory as the nucleotides database. I also tried specifying the path, as suggested in previous posts on similar issues.
The script is
export BLASTDB="/.../biodb/BLAST/nucleotide3/taxdb"
module load blast/blast-2.10.0
cd /.../output/blast
for file in ./*.fa; do
    # Derive the output name: ./scaffold_11.fa -> 11_blast.txt
    output=${file#"./scaffold_"}
    output=${output%.fa}
    output=${output}_blast.txt
    # Quote the variables so file names with spaces do not break the call
    blastn -query "$file" \
        -db /bioseq/biodb/BLAST/nucleotide3/nt \
        -max_hsps 1 -max_target_seqs 1 -num_threads 20 \
        -out "$output" \
        -outfmt "6 qseqid sseqid pident staxids sscinames scomnames qstart qend length sstart send slen evalue mismatch gapopen bitscore"
done
Here's an example of one line from an output file:
scaffold_11 gi|1530013355|ref|XM_010738424.3| 97.059 215358 N/A N/A 68936 68969 34 1372 1339 2896 0.11 1 0 58.4
As can be seen, every field is retrieved except the taxonomic names (sscinames and scomnames), which come back as "N/A".
Thanks!
NCBI taxonomy is notoriously inaccurate and incomplete. Are you sure this data exists for this entry? Have you verified any other way?
Thank you for the reply. The "N/A" is in every line of the results. There's not a single line that contains taxonomic data other than the taxid. It's weird because the database is on the computer I work with, not an NCBI server, at the same path as the nucleotides database. I'm guessing it's some bug in the BLAST software.
I thought I could try to find software that retrieves such data when given the taxid.
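One way to do that kind of lookup outside BLAST, as a rough sketch: the NCBI taxonomy dump (taxdump.tar.gz) includes a names.dmp file in which fields are separated by a tab, a pipe and a tab, with the taxid in the first field, the name in the second and the name class in the fourth. The function name taxid_to_name and the location of names.dmp are assumptions made for illustration, not part of any existing tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the scientific and common names stored for one
# taxid in names.dmp. Assumes the NCBI taxdump layout, where fields are
# separated by "<tab>|<tab>" and every line ends with "<tab>|".
taxid_to_name() {
    local taxid="$1"
    local names_dmp="${2:-names.dmp}"   # assumed location of the dump file
    awk -F '\t[|]\t' -v id="$taxid" '
        $1 == id {
            cls = $4
            sub(/\t[|]$/, "", cls)       # strip the trailing "<tab>|"
            if (cls == "scientific name" || cls == "genbank common name")
                print cls "\t" $2
        }
    ' "$names_dmp"
}
```

It could then be called as `taxid_to_name 215358 /path/to/names.dmp`, using the taxid from the staxids column of the BLAST output.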
Is it this line?:
export BLASTDB="/.../biodb/BLAST/nucleotide3/taxdb"
That line/filepath looks malformed to me? Starts with
/.../
That's not the whole line. I replaced part of it with "...".
Hey, I am having a similar issue. Was there ever a resolution to this?
Most likely the DB was built without the correct flags. makeblastdb should be given a taxid map through its -taxid_map option, pointing at a file such as the_database_taxids.txt.
the_database_taxids.txt contains the NCBI taxonomy ID for each sequence ID in the database, one pair per line.
If you build the database without the taxid_map, you will get N/A for the taxonomy ID and details.
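A minimal sketch of what such a build might look like. The file names the_database.fasta and the_database, the sequence IDs seq1 and seq2, the sequences and the taxids are all illustrative assumptions; only -taxid_map (which requires -parse_seqids) and the "sequence ID, taxid" layout of the map file come from makeblastdb itself.

```shell
#!/usr/bin/env bash
# Illustrative two-sequence FASTA file (hypothetical IDs and sequences).
cat > the_database.fasta <<'EOF'
>seq1
ACGTACGTACGTACGTACGT
>seq2
TTGACCATGGCATTGACCAT
EOF

# Matching taxid map: one "<sequence ID> <NCBI taxid>" pair per line.
# 9606 (Homo sapiens) and 10090 (Mus musculus) are used purely as examples.
cat > the_database_taxids.txt <<'EOF'
seq1 9606
seq2 10090
EOF

# Build the database only if makeblastdb is available on PATH.
if command -v makeblastdb >/dev/null 2>&1; then
    makeblastdb -in the_database.fasta -dbtype nucl -parse_seqids \
        -taxid_map the_database_taxids.txt -out the_database
fi
```

Every sequence ID in the FASTA headers should have a line in the map file; IDs missing from the map have no taxid attached in the resulting database.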
Thank you for your reply. I built the database with the taxid_map file attached, as shown below. I am also running BLAST in the same directory that holds both the BLAST indexes and the taxid map file. The FASTA file is the NCBI DB from 2020 with its corresponding taxid map file. I have also tried using more recent pre-made BLAST DBs: https://ftp.ncbi.nlm.nih.gov/blast/db/, but similar errors occur.
Is there some curation I have to do with the Taxid Map file in order to use this?
I am also attaching my BLAST command. As you can imagine, "N/A" results are showing in place of the taxid. During the run I also see: "Warning: [blastn] Taxonomy name lookup from taxid requires installation of taxdb database with ftp://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz". So it's not locating the taxid map file, but why?
I really would appreciate any feedback!
Aaah OK, it can't find the taxdb in your case. So the taxonomy IDs have been added correctly, but BLAST has nowhere to look up the names that go with them.
There are several places BLAST looks for the taxdb, including your current working directory. Have you tried downloading taxdb.tar.gz and extracting its contents into your working directory? That might fix it.
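As a rough sketch of that suggestion: the taxdb archive is expected to unpack into taxdb.btd and taxdb.bti, and the helper below (has_taxdb, a hypothetical name) only checks whether a directory holds both files. The download steps are left as comments because they need network access; the URL is the one printed in the BLAST warning.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given directory (default: the current
# one) contains both files that taxdb.tar.gz is expected to unpack to.
has_taxdb() {
    local dir="${1:-.}"
    [ -f "$dir/taxdb.btd" ] && [ -f "$dir/taxdb.bti" ]
}

# Typical steps, commented out because they download from NCBI:
#   wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/taxdb.tar.gz
#   tar -xzf taxdb.tar.gz
#   export BLASTDB="$PWD"   # assumption: BLASTDB names a directory, not a file

if has_taxdb .; then
    echo "taxdb files found in $PWD"
else
    echo "taxdb.btd / taxdb.bti not found in $PWD" >&2
fi
```

Running the check from the directory where blastn is launched shows quickly whether the name lookup has anything to read.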
This worked, thanks!