11 months ago
Jeremy Leipzig
22k
How long would 10TB of RNA-Seq data take to submit to NCBI SRA if it's already in AWS S3?
Have you already contacted the SRA help desk, if you are truly planning to upload that much data? They can probably suggest an efficient or non-standard way to do it.
They should post here then. Isn't the point of biostars to disseminate this type of siloed knowledge?
Needing to upload several TB of data is uncommon, so while someone who has done this in the past may post, you would save time by proactively contacting the SRA help desk. While I see some NCBI folks answer questions on Biostars, I don't think I have ever seen anyone from the SRA team here.
If you do get generally usable info, please post it. My guess is that solutions at this scale may be tailored to specific situations.
My primary concern here is how to upload the metadata and processed files so that they match the actual fastq files. SRA can use Aspera, so for the fastq files it's probably just go and wait. This terrible metadata spreadsheet from GEO is already a pita for a few dozen samples, let alone hundreds or thousands.
I'm not sure Aspera is necessary for an S3-to-S3 transfer.
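On the original "how long" question, nobody in the thread reports a measured rate, so the best anyone can do without asking SRA is back-of-the-envelope arithmetic. Here is a minimal sketch of that arithmetic. The throughput values are purely illustrative assumptions, not measured S3-to-SRA rates, and `transfer_hours` and its `efficiency` parameter are hypothetical names made up for this example. It also ignores any processing time on the SRA side after the bytes arrive.

```python
def transfer_hours(size_tb: float, throughput_mbit_s: float, efficiency: float = 1.0) -> float:
    """Estimate hours needed to move `size_tb` terabytes (decimal, 1 TB = 1e12 bytes)
    at a sustained rate of `throughput_mbit_s` megabits per second.

    `efficiency` is an assumed fraction (0-1] of the nominal rate actually achieved.
    """
    if size_tb < 0:
        raise ValueError("size_tb must be non-negative")
    if throughput_mbit_s <= 0:
        raise ValueError("throughput_mbit_s must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")

    size_bits = size_tb * 1e12 * 8                       # terabytes -> bits
    effective_bits_per_s = throughput_mbit_s * 1e6 * efficiency
    return size_bits / effective_bits_per_s / 3600       # seconds -> hours


if __name__ == "__main__":
    # Assumed sustained rates in Mbit/s; replace with whatever you actually measure.
    for rate in (100, 1_000, 10_000):
        hours = transfer_hours(10, rate)
        print(f"10 TB at {rate:>6} Mbit/s: {hours:7.1f} h ({hours / 24:.1f} days)")
```

Under these assumptions the answer ranges from a couple of hours to more than a week, so the sustained rate you actually get is the number that matters. Timing a small test transfer first would give you something real to plug in.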