The fastest way to download a list of SRR accessions from Sequence Read Archive with sratoolkit
4.9 years ago
Denis ▴ 310

I've installed sratoolkit.2.10.0 in my home directory on a cluster. I need to download a large number of SRR accessions to my home directory. As far as I understand, there are several options:

  1. fastq-dump
  2. run the prefetch utility, then convert the resulting sra files to fastq with fastq-dump
  3. fasterq-dump (supports multi-threading, but if I'm correct it cannot take a list of SRR accessions as input)

Which option is the fastest? Could you please provide a command line example which will be suitable for my purposes?

I found this post very useful: download from SRA. However, it seems too old.

sequence genome • 9.6k views
4.9 years ago
GenoMax 147k

Your best bet is to use this instead: Fast download of FASTQ files from the European Nucleotide Archive (ENA).
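
For the ENA route, a minimal sketch of the URL-gathering step could look like the block below. It is not taken from the linked post. It assumes the ENA Portal API `filereport` endpoint and its `fastq_ftp` column (recalled from ENA's documentation, so please verify both before relying on them), and `ena_urls_from_report` is a hypothetical helper name.

```shell
# Sketch: turn ENA "filereport" output into one FASTQ URL per line.
# Assumptions: the report is tab-separated, has a header row containing
# a "fastq_ftp" column, and that column holds ';'-separated paths
# without a scheme. Several reports may be concatenated on stdin.
ena_urls_from_report() {
    awk -F '\t' '
    {
        is_header = 0
        # A header row tells us which column carries the FASTQ paths.
        for (i = 1; i <= NF; i++)
            if ($i == "fastq_ftp") { col = i; is_header = 1 }
        # Skip headers, rows seen before any header, and runs with no FASTQ.
        if (is_header || col == 0 || $col == "") next
        n = split($col, parts, ";")
        for (j = 1; j <= n; j++) print "ftp://" parts[j]
    }'
}

# Hypothetical usage (needs network access; endpoint is an assumption):
#   while read -r acc; do
#       curl -s "https://www.ebi.ac.uk/ena/portal/api/filereport?accession=${acc}&result=read_run&fields=fastq_ftp"
#   done < SraAccList.txt | ena_urls_from_report > fastq_urls.txt
#   xargs -n 1 wget -c < fastq_urls.txt
```

Keeping the URL list in a file makes it easy to resume an interrupted batch, since `wget -c` continues partial downloads.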


This link covers what are, in my opinion, the two fastest options. The first is to download directly in fastq format from ENA, and the second is prefetch followed by parallel-fastq-dump. See the thread for details, including code examples. Don't use any of the "dump" commands to download data directly; in my experience they are too slow and too unstable.
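
A rough sketch of the second option, prefetch first and convert afterwards, is below. Note the swap: it uses fasterq-dump from the sratoolkit for the conversion step rather than parallel-fastq-dump, which is a separate tool. The function name `fetch_and_convert`, the default `fastq` output directory, and the `PREFETCH` / `FASTERQ_DUMP` override variables are made up for illustration. It also assumes fasterq-dump finds the prefetched run by accession in the default sratoolkit cache.

```shell
# Sketch: download each run with prefetch, then convert the local copy.
# PREFETCH and FASTERQ_DUMP are hypothetical overrides for the tool paths.
fetch_and_convert() {
    local list outdir threads acc failed
    list=$1
    outdir=${2:-fastq}     # assumed default output directory
    threads=${3:-4}        # assumed default thread count
    failed=0
    mkdir -p "$outdir" || return 1
    while IFS= read -r acc || [ -n "$acc" ]; do
        # Strip whitespace and Windows line endings; skip blank lines.
        acc=$(printf '%s' "$acc" | tr -d '[:space:]')
        [ -z "$acc" ] && continue
        # </dev/null keeps the tools from consuming the accession list.
        if "${PREFETCH:-prefetch}" "$acc" </dev/null &&
           "${FASTERQ_DUMP:-fasterq-dump}" --outdir "$outdir" \
               --threads "$threads" "$acc" </dev/null; then
            echo "done: $acc"
        else
            echo "failed: $acc" >&2
            failed=1
        fi
    done < "$list"
    return "$failed"
}

# Hypothetical usage:
#   fetch_and_convert SraAccList.txt fastq 8
```

Separating the download from the conversion means a flaky transfer only costs a re-run of prefetch for that accession, and failed accessions are reported on stderr so they can be retried.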

3.7 years ago

Assuming you are using bash, you can feed an accession list (or any text file with one accession per line) to fasterq-dump as follows: cat SraAccList.txt | xargs fasterq-dump. This also accepts parameters, e.g. --outdir.
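
A slightly expanded sketch of the same xargs idea is below, running fasterq-dump once per accession and passing `--outdir` and `--threads` explicitly. The function name `dump_list`, the default `fastq` directory, and the `FASTERQ_DUMP` override variable are hypothetical.

```shell
# Sketch: run fasterq-dump once per accession from a list file.
# FASTERQ_DUMP is a hypothetical override for the tool path.
dump_list() {
    local list outdir threads
    list=$1
    outdir=${2:-fastq}     # assumed default output directory
    threads=${3:-4}        # assumed default thread count
    mkdir -p "$outdir" || return 1
    # tr removes Windows line endings; xargs splits on whitespace,
    # so blank lines are ignored. -n 1 passes one accession per call.
    tr -d '\r' < "$list" |
        xargs -n 1 "${FASTERQ_DUMP:-fasterq-dump}" \
            --outdir "$outdir" --threads "$threads"
}

# Hypothetical usage:
#   dump_list SraAccList.txt fastq 8
```

Using `-n 1` keeps each run in its own fasterq-dump invocation, so one bad accession does not take the rest of the batch down with it.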
