I am trying to restrict a BLAST search taxonomically to a certain group while excluding all species that were already found.
I do so by creating a taxid list with the get_species_taxids.sh script provided by the BLAST suite. But when I remove certain taxids from the list, I get this error:
BLAST Database error: Taxonomy ID(s) not found. This could be because the ID(s) provided are not at or below the species level. Please use get_species_taxids.sh to get taxids for nodes higher than species (see https://www.ncbi.nlm.nih.gov/books/NBK546209/).
Yet it works when I use the unmodified file created by get_species_taxids.sh.
This seems odd, as I did not add anything and only removed certain entries. While trying to get my head around the problem, it got even more confusing. For instance, when I run the script with 7214 (Drosophilidae), the first entry in the list is 7214 itself, which is not at species level. Yet the resulting file can be used with blastn without any issues.
And when I run blastn with -taxids 40372 (Drosophila americana texana, a subspecies), I also get the above error, even though this taxid is clearly below species level.
Interestingly, this problem only occurs when I use the ref_euk_rep_genomes DB. I do not seem to have it with the nt DB.
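One check that might help narrow this down is to compare the taxid list with the taxids that actually occur in the database. The following is only a sketch: the function names and the db_taxids file are made up for illustration, and it assumes that blastdbcmd's %T output field gives the taxid of each sequence. Dumping a large database this way can take a while.

```shell
# Hypothetical helper: print the sorted, unique taxids of all sequences in a BLAST DB.
# $1 = database name. Assumes blastdbcmd (BLAST+) is on the PATH.
dump_db_taxids() {
    blastdbcmd -db "$1" -entry all -outfmt '%T' | sort -u
}

# Hypothetical helper: print the taxids from a list that also occur in the DB dump.
# $1 = taxid list (one per line, any order)
# $2 = DB taxids, already sorted (as written by dump_db_taxids)
taxids_in_db() {
    # comm -12 keeps only the lines common to both sorted inputs
    sort -u "$1" | comm -12 - "$2"
}
```

For example, `dump_db_taxids ref_euk_rep_genomes > db_taxids` followed by `taxids_in_db taxidlist_redu db_taxids` would list the entries of the reduced list that are present in the database.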
Here are the commands I use (everything was up to date at the time of writing):
$ get_species_taxids.sh -t 7214 > taxidlist
$ head -n 1 taxidlist
7214
$ sort taxidlist > taxidlist_sort
$ blastn -db ref_euk_rep_genomes -query input.fasta -word_size 30 -taxidlist taxidlist_sort > /dev/null
# Only to show that there are no entries in taxidlist_redu that are not in taxidlist_sort
$ comm -23 taxidlist_redu taxidlist_sort
$ blastn -db ref_euk_rep_genomes -query input.fasta -word_size 30 -taxidlist taxidlist_redu > /dev/null
BLAST Database error: Taxonomy ID(s) not found. This could be because the ID(s) provided are not at or below the species level. Please use get_species_taxids.sh to get taxids for nodes higher than species (see https://www.ncbi.nlm.nih.gov/books/NBK546209/).
$ blastn -db ref_euk_rep_genomes -query input.fasta -word_size 30 -taxids 40372 > /dev/null
BLAST Database error: Taxonomy ID(s) not found. This could be because the ID(s) provided are not at or below the species level. Please use get_species_taxids.sh to get taxids for nodes higher than species (see https://www.ncbi.nlm.nih.gov/books/NBK546209/).
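For completeness, a reduced list like taxidlist_redu can be produced along these lines. This is a sketch rather than the exact command used above: reduce_taxids is a hypothetical helper, and found_taxids is an assumed file name for the taxids of the species that were already found.

```shell
# Hypothetical helper: print every taxid from the full list that is not in the exclude list.
# $1 = full taxid list, $2 = taxids to remove (one taxid per line in both files)
reduce_taxids() {
    # -v invert the match, -x match whole lines only,
    # -F treat patterns as fixed strings, -f read the patterns from a file
    grep -vxFf "$2" "$1" | sort -u
}
```

With this, `reduce_taxids taxidlist found_taxids > taxidlist_redu` writes a sorted list that contains only entries from the original file, which is what the comm -23 check above confirms.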