Question

Having two SRR records and accordingly four fastq files per sample, how should I deal with this?

0


23 months ago

Ngrin • 0

Hello, I am trying to analyze the GSE104131 dataset from GEO. I have written all the scripts, from downloading with fasterq-dump to counting reads with featureCounts. However, when I opened the SRA Run Selector page for this dataset, I found that each sample has two SRR records.

fasterq-dump --outdir /data/fastq/ --split-files SRR6059552
fasterq-dump --outdir /data/fastq/ --split-files SRR6059553

After getting the fastq files with fasterq-dump, I have four fastq files for one sample (SRR6059552_1, SRR6059552_2, SRR6059553_1, SRR6059553_2).

I couldn't find any information on how to handle these four files for trimming or aligning. I know how to perform these steps with two files per sample, but I did not find how to proceed with four fastq files that all belong to the same sample.

fastq sra-toolkit rna-seq • 1.0k views

updated 23 months ago by ATpoint 86k • written 23 months ago by Ngrin • 0

score 2 · Accepted Answer · 2023-01-22

2


23 months ago

ATpoint 86k

You can use cat to merge the fastq files per sample prior to alignment: the R1 files with each other and the R2 files with each other. That is a normal step when you have sequencing replicates of the same sample.


0


Thanks @ATpoint. Which set of commands is correct in this case?

cat SRR6059552_1.fastq SRR6059552_2.fastq > SRR6059552.fastq
cat SRR6059553_1.fastq SRR6059553_2.fastq > SRR6059553.fastq

or

cat SRR6059552_1.fastq SRR6059553_1.fastq > f1.fastq
cat SRR6059552_2.fastq SRR6059553_2.fastq > f2.fastq
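
Going by the answer above (merge the R1 files with each other and the R2 files with each other), the second set is the one that matches. The first set would put forward and reverse reads of one run into a single file, so the pairing would be lost. Below is a minimal sketch of the second approach as a reusable bash function. The function name `merge_runs` and the sample name `sample1` are made up for illustration, and it assumes uncompressed files named `<run>_1.fastq` and `<run>_2.fastq` in the current directory. The runs must be listed in the same order for both mates so that read pairs stay in sync.

```shell
# merge_runs SAMPLE RUN [RUN ...]   (hypothetical helper, not part of sra-toolkit)
# Concatenates all <RUN>_1.fastq files into SAMPLE_1.fastq and all
# <RUN>_2.fastq files into SAMPLE_2.fastq, using the same run order for both.
merge_runs() {
    local sample=$1
    shift
    local r1=() r2=()
    local run
    for run in "$@"; do
        r1+=("${run}_1.fastq")
        r2+=("${run}_2.fastq")
    done

    cat "${r1[@]}" > "${sample}_1.fastq"
    cat "${r2[@]}" > "${sample}_2.fastq"

    # Sanity check: both merged mates should have the same number of lines,
    # otherwise the pairs are out of sync.
    local n1 n2
    n1=$(( $(wc -l < "${sample}_1.fastq") ))
    n2=$(( $(wc -l < "${sample}_2.fastq") ))
    if [ "$n1" -ne "$n2" ]; then
        echo "line count mismatch: ${n1} vs ${n2}" >&2
        return 1
    fi
}
```

For the sample in this thread, the call would be `merge_runs sample1 SRR6059552 SRR6059553`, which writes `sample1_1.fastq` and `sample1_2.fastq`. These two files can then go through trimming and alignment like any ordinary paired-end sample.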