Hi,
I am investigating whether my query sequence is present in any fish (Teleostei) or Mollusca genomes. I am unsure which of two search strategies to use:
- One way is to download the eukaryotic "nt" and "RefSeq" databases, BLAST my query sequence against the whole database, and then select and download the genomes with BLAST hits.
- I can also search for the fish (Teleostei) (2,131 genomes) or Mollusca (335 genomes) genomes in the Datasets portal (https://www.ncbi.nlm.nih.gov/datasets/genome/), download all available genomes, and BLAST my query sequence against them, but that will need a lot of computing resources. My question is: are these genomes already included in the "nt" and "RefSeq" databases? If so, I won't need the second strategy and can just search against the databases, which is simpler.
Would you please help me with that?
Cheers,
Mani
So you are not going to get the whole genome shotgun sequences if you use "nt". That said, you should still start with "nt" and limit your searches using the taxIDs for Gnathostomata (7776) and Mollusca (6447).

Thanks for your comments
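
For anyone landing here later, the taxID-restricted search suggested above could be sketched roughly as follows. This is only a sketch under assumptions: it assumes a local BLAST+ installation with a version 5 "nt" database (the `-taxids` option depends on that), and the file names `query.fasta` and `hits.tsv` are hypothetical placeholders. Older BLAST+ releases may only accept species-level taxIDs for `-taxids`, in which case the higher-level IDs would first need expanding (the `get_species_taxids.sh` script shipped with BLAST+ is meant for that, as far as I know).

```python
import subprocess

# TaxIDs quoted in the answer above.
TAXIDS = {"Gnathostomata": 7776, "Mollusca": 6447}


def build_blastn_command(query_fasta, db="nt", taxids=tuple(TAXIDS.values()),
                         out="hits.tsv"):
    """Build the argument list for a blastn search restricted to given taxIDs.

    query_fasta and out are hypothetical file names chosen for illustration.
    """
    return [
        "blastn",
        "-query", query_fasta,
        "-db", db,
        # Comma-separated list of taxIDs used to restrict the database search.
        "-taxids", ",".join(str(t) for t in taxids),
        # Tabular output, one line per hit.
        "-outfmt", "6",
        "-out", out,
    ]


def run_search(query_fasta):
    """Run the search; assumes blastn is on PATH and a local nt database exists."""
    subprocess.run(build_blastn_command(query_fasta), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_search() once BLAST+ and nt are set up.
    print(" ".join(build_blastn_command("query.fasta")))
```

Gnathostomata is much broader than fish, so for a narrower search it may be worth passing the Teleostei taxID instead (32443, to my knowledge; worth confirming in NCBI Taxonomy). Since "nt" will not cover the whole genome shotgun assemblies, hits from this search are a starting point and not a complete answer.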