Same GEO Accession, different SRR number, how to download this RNA-seq paired-end data?
17 months ago
ev97 ▴ 40

I am trying to download some public paired-end RNA-seq data, and I have found that some samples share the same GEO accession but have different SRR numbers (and different sizes). As a result, when I download them with sra-toolkit and fastq-dump --split-3, I get several files for the same sample.

As you can see in the following screenshot, most samples have their own SRR number and their own GEO_Accession. However, as I said, there are also some (highlighted) that have the same GEO_Accession but a different size and a different SRR number.

image

When I use fastq-dump --split-3 for these samples (for example):

a) SRR7774397, I get:

  • SRR7774397_1.fastq

  • SRR7774397_2.fastq

b) SRR7774398, I get:

  • SRR7774398_1.fastq

  • SRR7774398_2.fastq

If you go to the NCBI Run Browser, each run appears as two fastq files (_1 and _2):

SRR7774397 (Data access)

image 2

SRR7774398 (Data access)

image3

However, theoretically they belong to the same sample...

How do you usually download this type of data? It seems that the data for some samples is split across several runs, but I do not know how to merge them or, more generally, how to download them.

SRA Run Selector where all the samples appear can be found here (PRJNA488803)

Any help is really appreciated.

Thanks very much in advance

Regards

sra-toolkit fastq RNA-seq SRA • 1.2k views

My guess is that some samples were resequenced. I'd recommend just merging the respective R1 and R2s and treat them as one sample.
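Before merging, it helps to know which runs belong to which sample. A rough sketch, assuming you have saved a two-column, tab-separated mapping of run to sample (for example, copied from the Run Selector table); the function name `group_runs`, the file name `run_to_sample.tsv`, and the sample label `sampleA` are hypothetical placeholders, not part of sra-toolkit:

```shell
# Hypothetical helper: group_runs <mapping.tsv>
# Input:  tab-separated lines of "<run accession><TAB><sample name>".
# Output: one line per sample, the sample name followed by its runs,
#         in the order they first appear in the file.
group_runs() {
    awk -F'\t' '
        # First time this sample is seen: remember its position and first run.
        !($2 in runs) { order[++n] = $2; runs[$2] = $1; next }
        # Sample already seen: append the additional run.
        { runs[$2] = runs[$2] " " $1 }
        END { for (i = 1; i <= n; i++) print order[i], runs[order[i]] }
    ' "$1"
}

# Example usage (file name is a placeholder):
#   group_runs run_to_sample.tsv
```

Samples that come back with more than one run are the ones whose fastq files need to be combined.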


Thanks for your quick reply! How would you merge them?


You can cat the respective R1/R2 files, keeping the runs in the same order for both mates, e.g. cat Run1_R1.fq.gz Run2_R1.fq.gz > R1.fq.gz, and likewise for R2.
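For the uncompressed files that fastq-dump --split-3 writes, the same idea could look like the sketch below. It assumes the `<run>_1.fastq` / `<run>_2.fastq` naming shown in the question; the function name `merge_runs` and the output label `sampleA` are hypothetical, so pick an output name that differs from every run accession.

```shell
# Hypothetical helper: merge_runs <sample_name> <run1> [<run2> ...]
# Concatenates <run>_1.fastq files into <sample_name>_1.fastq and
# <run>_2.fastq files into <sample_name>_2.fastq, using the SAME run
# order for both mates so that read pairs stay in sync.
merge_runs() {
    sample="$1"
    shift
    for mate in 1 2; do
        out="${sample}_${mate}.fastq"
        : > "$out"                      # create or truncate the output file
        for run in "$@"; do
            src="${run}_${mate}.fastq"
            if [ ! -f "$src" ]; then
                echo "missing input: $src" >&2
                return 1
            fi
            cat "$src" >> "$out"        # append this run's reads
        done
    done
}

# Example usage with the two runs from the question:
#   merge_runs sampleA SRR7774397 SRR7774398
```

Plain cat also works on gzipped files, as in the command above, because concatenated gzip files form a valid gzip stream.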
