Extract subset of Nr database
7.6 years ago
sbchua.1990 ▴ 50

I downloaded the nr database from NCBI and formatted it with makeblastdb as my own local database (February 2017). I want to extract a subset of it.

I have tried the following method:

  1. Download a GI list from NCBI.
  2. Use blastdb_aliastool to extract the subset.

This method worked well for an older database (2013), but with my recently downloaded database it fails with the error "BLAST Database error: GI list specified but no ISAM file found for GI". My understanding is that NCBI no longer supports GI numbers.

I have also tried downloading the target sequences as a FASTA file directly from NCBI, as shown in http://www.ionsource.com/tutorial/db/tips_for_creating_species_specif.htm, but the download fails every time before it completes.

Any other suggestions? I need the subset of the nr database for txid5204[ORGN].
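
Since a query like txid5204[ORGN] covers the taxon and everything below it, a GI-free route would first need the full list of descendant taxids. Below is a minimal sketch of that step, assuming the `nodes.dmp` file from the NCBI taxonomy dump is available locally (first field is the taxid, second is the parent taxid, fields separated by `|`). The function name `descendant_taxids` and the tiny sample tree are made up for illustration; this is not the method from any answer in this thread.

```python
from collections import defaultdict


def descendant_taxids(nodes_lines, root):
    """Return root plus every taxid below it, given lines in nodes.dmp format."""
    children = defaultdict(list)
    for line in nodes_lines:
        fields = line.split("|")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        child = int(fields[0].strip())
        parent = int(fields[1].strip())
        if child != parent:  # the top node (taxid 1) lists itself as parent
            children[parent].append(child)

    # Iterative depth-first walk from the requested root
    found, stack = set(), [root]
    while stack:
        taxid = stack.pop()
        if taxid in found:
            continue
        found.add(taxid)
        stack.extend(children[taxid])
    return found


# Made-up miniature tree, only to show the shape of the input
sample = [
    "1\t|\t1\t|\tno rank\t|\n",
    "10\t|\t1\t|\tphylum\t|\n",
    "11\t|\t10\t|\tclass\t|\n",
    "12\t|\t11\t|\tspecies\t|\n",
    "20\t|\t1\t|\tphylum\t|\n",
]
print(sorted(descendant_taxids(sample, 10)))  # → [10, 11, 12]
```

With the real file, the same function could be called as `descendant_taxids(open("nodes.dmp"), 5204)`.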

blast gene • 4.8k views

Thanks for the reply. Regarding the tutorial at http://bioinf.shenwei.me/taxonkit/tutorial/: in step 2, is 'prot.accession2taxid.gz' just sample data for the tutorial? If so, what is the equivalent of 'prot.accession2taxid.gz' for my data? I am a bit confused by that.
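
For context, `prot.accession2taxid.gz` as distributed in NCBI's taxonomy FTP area is a tab-separated table with the columns accession, accession.version, taxid and gi, one row per protein accession. The sketch below shows how such a table could be filtered down to the accessions belonging to a set of taxids. The function names and the sample rows are hypothetical, and the code is an illustration of the idea rather than what the tutorial itself runs.

```python
import gzip


def accessions_for_taxids(lines, taxids):
    """Yield accession.version for rows whose taxid is in `taxids`."""
    wanted = {str(t) for t in taxids}
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        # Columns: accession, accession.version, taxid, gi
        # The header row is skipped naturally because "taxid" is not a wanted id.
        if len(cols) >= 3 and cols[2] in wanted:
            yield cols[1]


def accessions_from_file(path, taxids):
    """Same filter, reading a gzip-compressed accession2taxid file from disk."""
    with gzip.open(path, "rt") as handle:
        yield from accessions_for_taxids(handle, taxids)


# Made-up rows, only to show the expected layout
sample = [
    "accession\taccession.version\ttaxid\tgi\n",
    "AAA00001\tAAA00001.1\t12\t0\n",
    "AAA00002\tAAA00002.1\t20\t0\n",
    "AAA00003\tAAA00003.2\t11\t0\n",
]
print(list(accessions_for_taxids(sample, {10, 11, 12})))
# → ['AAA00001.1', 'AAA00003.2']
```

The resulting accession list, written one per line to a file, could then be passed to `blastdbcmd -db nr -entry_batch` to pull the matching sequences out of the local database.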


Thanks for your help. Good work on both 'taxonkit' and 'seqkit'. Both helped me a lot.


Glad it helps. Could you give this answer an upvote or accept it?
