How can I submit paired-end SRA data to NCBI?
6.2 years ago
MAPK ★ 2.1k

Hi all, I tried to submit paired-end FASTQ files (R1 and R2) for one sample to the NCBI SRA database. These are the steps I followed:

1) Created a BioProject by following the link https://submit.ncbi.nlm.nih.gov/subs/bioproject/SUB4422178/submitter. I filled in everything and submitted it to get the SAMN and PRJNA IDs, then selected FTP upload.

2) Went to https://submit.ncbi.nlm.nih.gov/subs/sra/. First I moved both the R1 and R2 files into a separate directory and changed into that directory. Then, in the terminal, I typed:

ftp -i
open ftp-private.ncbi.nlm.nih.gov

Then, at the prompt, I entered the username and password shown on the page from step 2:

Username: subftp
Password: w*******

Then I changed to the account folder shown on the page from step 2:

cd uploads/amyname@gmail.com_00YmVxw2

3) Created a new directory as shown below:

mkdir rhizophagus
cd rhizophagus

4) Then I transferred both FASTQ files to the NCBI directory by typing: mput *

After the transfer was complete, I typed ls to list the files that had been transferred.

5) Once the files were available, I submitted them using the upload folder option at https://submit.ncbi.nlm.nih.gov/subs/sra/: I selected the folder and submitted it.
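
The interactive FTP steps above could also be scripted. The following is only a minimal sketch: the host, user name and directory names are taken from the steps above, while the password and account folder are placeholders (assumptions) that must be replaced with the values shown on your own SRA submission page. It only prints the command list; the upload itself happens only if you pipe the output into an ftp client.

```shell
#!/bin/sh
# Sketch: build the command list for a non-interactive FTP upload session.
# FTP_PASS and ACCOUNT_DIR are placeholders, not real values.
FTP_HOST="ftp-private.ncbi.nlm.nih.gov"
FTP_USER="subftp"
FTP_PASS="${FTP_PASS:-CHANGE_ME}"             # password from the SRA submission page
ACCOUNT_DIR="uploads/your_account_folder"     # hypothetical; copy yours from the SRA page
SUBMISSION_DIR="rhizophagus"

make_ftp_commands() {
    # Emit one ftp command per line, mirroring the interactive steps.
    printf 'user %s %s\n' "$FTP_USER" "$FTP_PASS"
    printf 'binary\n'
    printf 'cd %s\n' "$ACCOUNT_DIR"
    printf 'mkdir %s\n' "$SUBMISSION_DIR"
    printf 'cd %s\n' "$SUBMISSION_DIR"
    printf 'mput *\n'
    printf 'ls\n'
    printf 'bye\n'
}

# Preview the session (this prints the password placeholder too).
# Nothing is sent over the network by this line.
make_ftp_commands

# To actually upload (requires an ftp client and network access),
# run this from the directory that holds the files:
#   make_ftp_commands | ftp -inv "$FTP_HOST"
```

Here -i turns off the per-file prompt for mput, -n disables auto-login so that the user command can supply the credentials, and -v echoes the server replies.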

Both of these files are now online at https://trace.ncbi.nlm.nih.gov/Traces/sra/?study=SRP158305. However, when I downloaded them with prefetch --option-file sratest.txt and extracted them with fastq-dump --split-files SRP158305, I got two FASTQ files of very different sizes: one is 12.2 GB and the other is only 303.3 MB. Each of the original FASTQ files (R1 and R2) is 10.9 GB. I am not sure how the data should have been submitted, and I would really appreciate it if somebody could help me figure out where it went wrong. Thanks in advance for your help.

sra ncbi • 6.7k views

If you go to the link above, the two files appear to be similar in size.

Did you upload them uncompressed? Perhaps SRA has already converted them (to .sra) and/or compressed them further.

@genomax Yes, I uploaded them uncompressed: just the two FASTQ files as they were. Should I have compressed them into a single file before uploading?

And what's the procedure to remove/update already submitted files? Is it possible to remove them and resubmit?

Running fastq-dump --split-files SRR7716298 also produces two files of very unequal size, as you posted above. I suggest that you email SRA support to let them know what is happening and ask them to reset your submission so you can re-upload the data.
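
File size alone can mislead, so it may help to compare the number of reads in the two split files. The sketch below assumes plain four-line FASTQ records and uses two tiny made-up files as stand-ins for the real fastq-dump output; the file names and the helper count_reads are hypothetical.

```shell
#!/bin/sh
# Sketch: count FASTQ records in each file of a pair and compare them.
count_reads() {
    # Assumes 4 lines per record (no wrapped sequence or quality lines).
    echo $(( $(wc -l < "$1") / 4 ))
}

# Toy stand-ins for the two files written by fastq-dump --split-files.
tmpdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > "$tmpdir/toy_1.fastq"
printf '@r1\nGGCA\n+\nIIII\n' > "$tmpdir/toy_2.fastq"

n1=$(count_reads "$tmpdir/toy_1.fastq")
n2=$(count_reads "$tmpdir/toy_2.fastq")
echo "file 1: $n1 reads, file 2: $n2 reads"

if [ "$n1" -ne "$n2" ]; then
    echo "read counts differ: the pair is not in sync"
fi
```

For properly paired data the two counts should be equal; a large difference points at the same problem as the unequal file sizes.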

Thanks. So when I submit again, should I compress both R1 and R2 into one compressed file and submit that?

To further clarify, this is what I see when I run fastq-dump --split-files SRR7716298:

Rejected 41571327 READS because READLEN < 1
Read 42671006 spots for SRR7716298
Written 42671006 spots for SRR7716298

What does that mean? Could you please clarify?

Either a file or the SRA record must have become corrupt. I assume this is original raw data?
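
For a sense of scale, the numbers in the fastq-dump messages quoted above can be subtracted directly. This is only arithmetic on the reported counts; reading the rejected reads as one mate of each spot is an assumption that the thread does not confirm.

```shell
#!/bin/sh
# Numbers copied from the fastq-dump messages quoted in this thread.
spots=42671006
rejected=41571327

# Reads that were not rejected for having length < 1.
kept=$((spots - rejected))

# awk handles the percentage; sh arithmetic is integer-only.
pct=$(awk -v k="$kept" -v s="$spots" 'BEGIN { printf "%.1f", 100 * k / s }')
echo "not rejected: $kept of $spots ($pct%)"
```

The difference is 1099679, a few percent of the spot count, which is roughly the same proportion as the two file sizes reported in the question (303.3 MB against 12.2 GB).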

That's right, these are raw data. I have emailed SRA support to reset it. So when I re-upload the files, should I just compress both files and submit as one compressed file?

Compress and submit them as a pair.

Sorry, I am still confused: should I submit one compressed file (containing both R1 and R2) or two individually compressed files?

gzip each file separately and submit them as two files.
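
As a minimal sketch of that advice, the snippet below compresses R1 and R2 into two separate .gz files and checks each archive before upload. The file names are hypothetical placeholders and the FASTQ content is toy data created only so the example runs on its own.

```shell
#!/bin/sh
# Sketch: gzip R1 and R2 as two separate files, then verify each archive.
workdir=$(mktemp -d)
printf '@r1/1\nACGT\n+\nIIII\n' > "$workdir/sample_R1.fastq"
printf '@r1/2\nTGCA\n+\nIIII\n' > "$workdir/sample_R2.fastq"

for fq in "$workdir/sample_R1.fastq" "$workdir/sample_R2.fastq"; do
    gzip "$fq"          # replaces the .fastq file with a .fastq.gz file
    gzip -t "$fq.gz"    # exits non-zero if the archive is damaged
done

# Two separate archives, one per mate, ready to upload as a pair.
ls "$workdir"
```

With real data you would run gzip on your own R1 and R2 files and then upload the two resulting .fastq.gz files together.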

Thanks, I will give it a try.
