Retrieve multiple RefSeq genomes in separate FASTA files
4.4 years ago
Biogeek ▴ 470

I've searched and can't find any related questions.

My task is as follows:

  1. I have a list of RefSeq accessions in a .txt file.
  2. I want to download all the associated genomes to separate .fasta files in a local directory.

I note that I can use Entrez or the NCBI assembly downloader, but this puts all genomes into one .fasta file, which isn't ideal.

Can anyone help?

Thanks

ncbi Refseq • 781 views
4.4 years ago
GenoMax 147k

I note that I can use Entrez or the NCBI assembly downloader, but this puts all genomes into one .fasta file, which isn't ideal.

How about using a loop that calls the program once per accession? That should give you separate files.
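
A rough sketch of such a loop, in Python driving the Entrez Direct `efetch` command. It assumes the list holds nucleotide accessions (for example `NC_000913.3`) that can be fetched from the `nuccore` database; assembly accessions would need a different route. The function names and the `genomes` output directory are made up for illustration.

```python
import subprocess
from pathlib import Path


def read_accessions(list_path):
    """Return the accessions in the list, skipping blank lines and '#' comments."""
    lines = Path(list_path).read_text().splitlines()
    stripped = (ln.strip() for ln in lines)
    return [s for s in stripped if s and not s.startswith("#")]


def efetch_command(accession):
    """Build one efetch call (assumption: the accession lives in nuccore)."""
    return ["efetch", "-db", "nuccore", "-id", accession, "-format", "fasta"]


def fetch_all(list_path, out_dir="genomes"):
    """Run efetch once per accession so every record lands in its own file."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for acc in read_accessions(list_path):
        target = out / f"{acc}.fasta"
        with open(target, "w") as handle:
            # check=True raises CalledProcessError if efetch exits non-zero
            subprocess.run(efetch_command(acc), stdout=handle, check=True)
```

Calling `fetch_all("list.txt")` should then write one `<accession>.fasta` file per line of the list, provided `efetch` is on your PATH.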


Good call, GenoMax, thanks! I've now installed the Entrez utilities and can retrieve a record with efetch. I'll write a loop over my 'list.txt' file, which contains the accession numbers.

One more question, and apologies for my ignorance: is there a way to also get the TaxId into the headers of my .fasta files?

Thanks!


If you use Entrez Direct, then use the epost method instead of a loop. It will do the same thing. You will need to post-process the files to add the TaxId to the headers. I don't think there is a way to do this automatically.
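
The post-processing step could look something like the sketch below. It assumes you already know the TaxId for each file (looked up separately), and the `|taxid=...` header layout is just one possible convention, not an NCBI standard. The function names are hypothetical.

```python
def tag_headers(fasta_lines, taxid):
    """Yield FASTA lines with 'taxid=<taxid>' attached to each sequence ID."""
    for line in fasta_lines:
        if line.startswith(">"):
            # Split the header into the ID (first token) and the free-text description
            seq_id, _, desc = line[1:].rstrip("\n").partition(" ")
            yield f">{seq_id}|taxid={taxid} {desc}".rstrip() + "\n"
        else:
            # Sequence lines pass through untouched
            yield line


def tag_file(src, dst, taxid):
    """Write a copy of the FASTA file at src to dst with tagged headers."""
    with open(src) as fin, open(dst, "w") as fout:
        fout.writelines(tag_headers(fin, taxid))
```

For example, with TaxId 511145 a header such as `>NC_000913.3 Escherichia coli` would become `>NC_000913.3|taxid=511145 Escherichia coli`, while the sequence lines stay as they are.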
