merging fastq, sam or bam?
6.8 years ago
ceboral • 0

Hi all! I have some single-read RNA-seq data split across two different SRA accessions, one with ~30 million reads and the other with ~15 million reads. I have read that I could merge the FASTQ, SAM or BAM files, and I would like to know whether the stage at which I merge makes any difference to the quality of the final dataset. Thanks!!

sam fastq bam • 2.6k views

There should not be any difference, as long as you process them identically before merging the BAM files.

6.8 years ago
ATpoint 86k

I recommend quality-trimming and aligning the files independently, with the aligner piped directly into SAMtools sort (that avoids writing unnecessary intermediate SAM files). Then check the alignment rate for every file and keep only those you feel comfortable with. I have seen technical replicates (the same library sequenced over multiple lanes over several years, as part of a large published study) with strikingly different quality: the first replicate had an alignment rate of about 95%, the last one about 40% with a lot of trash reads (maybe the sample degraded over time in the freezer, I don't know). In any case, do not merge too early, as you may lose the ability to discard bad samples if necessary. Do not assume that published data are always of good quality; there are a lot of junk datasets in the SRA.
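A rough sketch of that workflow in Python, in case it helps. Everything specific here is an assumption rather than something stated in the answer: HISAT2 as the aligner, the helper names (`align_command`, `mapped_percent`, `select_runs`, `merge_command`), the 80% cutoff, and that trimming has already been done upstream. The helpers only build command strings and parse `samtools flagstat` text, so you would still run the commands yourself.

```python
import re
import shlex


def align_command(fastq, index, out_bam, threads=4):
    """Build a pipeline that pipes the aligner straight into samtools sort.

    HISAT2 is an assumption; any aligner that writes SAM to stdout works.
    """
    return (
        f"hisat2 -p {threads} -x {shlex.quote(index)} -U {shlex.quote(fastq)} "
        f"| samtools sort -@ {threads} -o {shlex.quote(out_bam)} -"
    )


def mapped_percent(flagstat_text):
    """Extract the overall mapped percentage from `samtools flagstat` output."""
    # Matches e.g. "28500000 + 0 mapped (95.00% : N/A)" but not the
    # "primary mapped" line printed by newer samtools versions.
    match = re.search(
        r"^\d+ \+ \d+ mapped \(([\d.]+)%", flagstat_text, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'mapped' line found in flagstat output")
    return float(match.group(1))


def select_runs(rates, min_percent=80.0):
    """Keep only BAM files whose alignment rate meets the cutoff.

    The 80% default is an arbitrary illustration, not a recommendation.
    """
    return [bam for bam, pct in rates.items() if pct >= min_percent]


def merge_command(bams, out_bam):
    """Build the samtools merge command for the runs that passed the check."""
    return "samtools merge " + " ".join(
        shlex.quote(path) for path in [out_bam, *bams]
    )
```

The idea is to run `align_command` once per run, run `samtools flagstat` on each resulting BAM, feed the text to `mapped_percent`, and only then pass the surviving files to `merge_command`. That keeps the decision to drop a bad run open until the last step.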
