7.1 years ago
Wenhu_Cao
I wonder whether fastq-dump downloads the SRA cache file every time, even if I already have it in my /sra directory. If it does not, I don't understand why the command still takes a relatively long time to run.
Does anyone know? Thanks in advance!
Are you fastq-dump'ing the same data multiple times? See the answer on this page for more info. Wherever possible you should bypass SRA altogether and get the data directly from EBI-ENA as fastq.
I checked ENA for the same SRR ID. I don't understand why the fastq file on ENA is much smaller than the one I get by downloading with fastq-dump directly.
Is your file gzipped?
Oh, it is. Thanks a lot!
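For anyone hitting the same size mismatch, here is a minimal sketch of how to check whether a downloaded file is gzip-compressed and how large it is once decompressed. The function names `is_gzipped` and `uncompressed_size` are made up for illustration; the only fact relied on is that gzip files start with the two bytes `1f 8b`.

```python
import gzip

# Every gzip stream starts with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def uncompressed_size(path):
    """Return the decompressed size in bytes of a gzip file.

    The file is streamed in 1 MiB chunks so that large fastq files
    do not have to fit in memory.
    """
    total = 0
    with gzip.open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            total += len(chunk)
    return total
```

Comparing `uncompressed_size()` of the ENA file with the size of the plain fastq written by fastq-dump should give a fairer comparison than the raw file sizes on disk.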
I recommend not using fastq-dump to retrieve the fastq data directly from NCBI. It is slow and the connection is sometimes unstable. Instead, use prefetch from the SRA toolkit in combination with Aspera Connect to download the SRA files to your disk. Check the NCBI documentation for more information on how to configure prefetch. With prefetch/aspera you can achieve download rates of about 100Mb/s. Once finished, use fastq-dump to get the fastq files (unless, as genomax said, you can get them directly from the ENA).
Thanks! I will try that!
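The prefetch-then-convert workflow described above could be scripted roughly as follows. This is only a sketch under a few assumptions: `prefetch` and `fastq-dump` are on your `PATH`, prefetch stores the download in the toolkit's configured default location so that fastq-dump can find it by accession, and the helper names `build_commands` and `run_pipeline` as well as the output directory are made up for illustration. Aspera configuration is not covered here; see the NCBI documentation for that.

```python
import shutil
import subprocess


def build_commands(accession, outdir="fastq"):
    """Return the two command lines: download first, then convert."""
    # Step 1: fetch the .sra file into the local SRA cache.
    prefetch_cmd = ["prefetch", accession]
    # Step 2: convert to fastq; --split-files writes paired reads to
    # separate files and --gzip compresses the output.
    dump_cmd = [
        "fastq-dump", "--split-files", "--gzip",
        "--outdir", outdir, accession,
    ]
    return [prefetch_cmd, dump_cmd]


def run_pipeline(accession, outdir="fastq"):
    """Run prefetch followed by fastq-dump, stopping on the first failure."""
    for tool in ("prefetch", "fastq-dump"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    for cmd in build_commands(accession, outdir):
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("SRR000001")` (a placeholder accession) would download once and convert from the local copy, so a second conversion should not need to fetch the data again.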