I am having issues downloading single-cell RNA-seq FASTQ files from SRA Explorer. Some files are corrupted and some are only half the expected size. I have 200+ files, so it is hard to check which ones need to be redownloaded. How can I use md5sum to check that the files I downloaded are correct? I am using a virtual machine to download this massive amount of data.
Thank you
You may need to use sratoolkit prefetch and then check using vdb-validate before dumping the data out. Ideally you would get the original BAM files (if submitted by submitters) and then use the 10x util (bamtofastq) locally. Single cell data is all over the place in SRA and unfortunately sra-explorer does not help with that.
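If you still want the md5 check from the question for files you have already downloaded, here is a minimal sketch in Python. It assumes you have a checksum manifest in the usual md5sum layout (one `<md5>  <filename>` pair per line); where the expected checksums come from is up to you (I believe the ENA file report can include a fastq_md5 column, but treat that as an assumption). The names md5_of, find_bad_files and checksums.md5 are made up for illustration.

```python
import hashlib
import sys
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_bad_files(manifest, download_dir):
    """Compare files in download_dir against an md5sum-style manifest.

    Assumption: each non-empty manifest line looks like '<md5>  <filename>'.
    Returns (missing, mismatched): two lists of filenames to download again.
    """
    missing, mismatched = [], []
    for line in Path(manifest).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # md5sum prefixes binary-mode names with '*'
        target = Path(download_dir) / name
        if not target.is_file():
            missing.append(name)
        elif md5_of(target) != expected.lower():
            mismatched.append(name)
    return missing, mismatched


# Usage: python check_md5.py checksums.md5 /path/to/fastq_dir
if __name__ == "__main__" and len(sys.argv) == 3:
    missing, mismatched = find_bad_files(sys.argv[1], sys.argv[2])
    for name in missing:
        print(f"MISSING\t{name}")
    for name in mismatched:
        print(f"MISMATCH\t{name}")
```

Anything printed as MISSING or MISMATCH is a candidate for redownloading. With the same manifest, running md5sum -c checksums.md5 inside the download directory should give an equivalent report without any Python.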