Will fastq-dump download the SRA file repeatedly?
7.1 years ago
Wenhu_Cao ▴ 100

I wonder whether fastq-dump downloads the SRA cache file every time, even if I already have it in my /sra directory. If it does not, I don't understand why the command still takes a relatively long time to run.

Does anyone know? Thanks in advance!

sequencing • 2.3k views

Are you fastq-dump'ing the same data multiple times? See the answer on this page for more info.

Wherever possible you should bypass SRA altogether and get the data directly from EBI-ENA as fastq.
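
As a rough illustration of getting FASTQ files directly from ENA, the sketch below builds a query for the ENA Portal API file report and parses the tab-separated reply into download links. The endpoint URL, the field names `fastq_ftp` and `fastq_bytes`, and the `ftp://` prefix are assumptions based on how that API is commonly used, so check them against the current ENA documentation. `SRR000001` in the usage note is only a placeholder accession.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint: ENA Portal API "filereport"; verify against the ENA docs.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(run_accession):
    """Build the query URL that asks ENA for the FASTQ locations of one run."""
    query = urlencode({
        "accession": run_accession,
        "result": "read_run",
        "fields": "fastq_ftp,fastq_bytes",
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_fastq_links(tsv_text):
    """Return (url, size_in_bytes) pairs from a filereport TSV response."""
    links = []
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        # Paired-end runs list several files separated by semicolons.
        paths = [p for p in row["fastq_ftp"].split(";") if p]
        sizes = [s for s in row["fastq_bytes"].split(";") if s]
        for path, size in zip(paths, sizes):
            # The report gives host/path without a scheme; ftp:// is assumed here.
            links.append((f"ftp://{path}", int(size)))
    return links
```

To use it, fetch `filereport_url("SRR000001")` with any HTTP client, pass the response text to `parse_fastq_links`, and download the returned links with the tool of your choice.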


I checked ENA for the same SRR ID. I don't understand why the FASTQ file on ENA is much smaller than the one I get by running fastq-dump directly.


Is your file gzipped?


Oh, it is. Thanks a lot!
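
The size difference discussed here comes from gzip compression. A minimal sketch of the effect, using synthetic reads rather than real sequencing data (the read count, read length, and constant quality string are arbitrary assumptions for illustration):

```python
import gzip
import random

random.seed(0)  # fixed seed so the synthetic reads are reproducible


def fake_fastq(n_reads, read_len=100):
    """Generate synthetic FASTQ text; the reads are random, not real data."""
    records = []
    for i in range(n_reads):
        seq = "".join(random.choice("ACGT") for _ in range(read_len))
        qual = "I" * read_len  # constant quality string, for simplicity
        records.append(f"@read{i}\n{seq}\n+\n{qual}\n")
    return "".join(records).encode("ascii")


raw = fake_fastq(1000)
packed = gzip.compress(raw)

print(f"uncompressed: {len(raw)} bytes")
print(f"gzipped:      {len(packed)} bytes")
print(f"ratio:        {len(raw) / len(packed):.1f}x")
```

The ratio on real data will differ, since real quality strings are far less uniform than the constant one used here, but a gzipped FASTQ file is typically several times smaller than the uncompressed file.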


I recommend not using fastq-dump to retrieve FASTQ data directly from NCBI. It is slow, and the connection is sometimes unstable. Instead, use prefetch from the SRA Toolkit in combination with Aspera Connect to download the SRA files to disk. Check the NCBI documentation for how to configure prefetch. With prefetch and Aspera you can achieve download rates of about 100Mb/s. Once the download has finished, use fastq-dump to produce the FASTQ files (unless, as genomax said, you can get them directly from the ENA).
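
The two-step workflow described here could be scripted roughly as follows. This is a sketch, not a tested pipeline: it assumes `prefetch` and `fastq-dump` are on `PATH`, leaves Aspera configuration out entirely, and relies on the toolkit resolving the accession to the prefetched copy. The helper names, the `fastq` output directory, and the choice of `--split-files` and `--gzip` are illustrative assumptions.

```python
import shutil
import subprocess
from pathlib import Path


def build_commands(accession, outdir="fastq", local_sra=None):
    """Return the command lines to run, in order, as argument lists."""
    dump = ["fastq-dump", "--split-files", "--gzip", "--outdir", str(outdir)]
    if local_sra is not None and Path(local_sra).is_file():
        # A local .sra file already exists: skip the download, convert it directly.
        return [dump + [str(local_sra)]]
    # Otherwise download first, then convert by accession.
    return [["prefetch", accession], dump + [accession]]


def run(accession, outdir="fastq", local_sra=None):
    """Run the steps in order, stopping at the first failure."""
    for tool in ("prefetch", "fastq-dump"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH; install the SRA Toolkit")
    for cmd in build_commands(accession, outdir, local_sra):
        subprocess.run(cmd, check=True)
```

Calling `run("SRR000001")` (a placeholder accession) would download the run and then write gzipped FASTQ files to `fastq/`. Passing `local_sra` with the path of an existing `.sra` file skips the download step, so the same data is not fetched twice.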


Thanks! I will try that!
