5.8 years ago
anasofiamoreira94
Hello, I'm trying to filter out some sequences from the NCBI nt database. This is what I did:
- 1-Download the prebuilt nt database
- 2-Search the Entrez Nucleotide database with the query: "taxid3708[ORGN]"
- 3-Select "Send to File" and choose the format "GI list"
- 4-Use the list of GIs from the previous step with: blastdb_aliastool -gilist sequence.txt -db nt_v5 -out nt_allergen -dbtype nucl
However, when I use this command line I get this error: BLAST Database error: No GIs were found in BLAST database
Here are some IDs from my GI list: 1376310040 1179788179 1464315148 1551319539 1534512279
Am I retrieving the GI list the right way? Thanks for the help.
If you are looking to restrict BLAST results to that taxID, then why not use the new BLAST+ option (available with v.2.8.1):
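The option itself did not survive in this reply. One BLAST+ option that fits the description is `-taxids`, which restricts a search against a v5 database to the given taxonomy IDs; treating it as the option meant here is an assumption. A minimal sketch, where the query and output file names are hypothetical placeholders and `nt_v5` and taxid 3708 are taken from the question:

```shell
#!/bin/sh
# Sketch only: restrict a blastn search to one taxid at search time,
# instead of building a GI-filtered alias database first.
TAXID=3708                     # taxid from the question
DB=nt_v5                       # v5 database name from the question
QUERY=query.fasta              # hypothetical query file
OUT=hits_txid${TAXID}.tsv      # hypothetical output file

# Assumption: -taxids is the option the reply refers to.
CMD="blastn -db $DB -query $QUERY -taxids $TAXID -outfmt 6 -out $OUT"
echo "$CMD"

# Run the search only if BLAST+ is installed and the query file exists.
if command -v blastn >/dev/null 2>&1 && [ -f "$QUERY" ]; then
    $CMD
fi
```

Filtering at search time like this avoids the GI list altogether, so the alias database from step 4 would not be needed.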
According to ftp://ftp.ncbi.nlm.nih.gov/blast/db/README, nt is:
Your first GI corresponds to MF401153.1, which is identical to MF401152.1, except that at position 146 MF401153.1 has N and MF401152.1 has T. The latter is thus objectively of higher quality and probably present in nt.
BTW, I thought NCBI had phased out GIs already.
Yes, they have phased out GIs for most purposes, so using accession numbers is definitely the preferred way. I will quote the following from NCBI.