Question

Biobambam vs Picard in RNA Sequencing Analysis for BAM to FASTQ

15 months ago

kcarey • 0

Hello,

I have downloaded TCGA raw counts (gene quantification), which were generated with the STAR 2-pass method in their mRNA sequencing pipeline, and I am comparing this data to a second dataset. For the second dataset I had BAM files, from which I generated my own R1 and R2 FASTQ files using Picard. I am using the same STAR 2-pass method for alignment. Is it acceptable to compare datasets with slightly different pre-alignment QC? My second dataset had some PCR duplication, so I will be using Picard to mark duplicates, along with Trimmomatic. As far as I can tell, the TCGA pre-alignment steps did not do this (I assume QC did not show any issues, but there is no paper associated with the Illumina work). Everything post-alignment will be the same. Is this okay?
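For readers following along, the BAM-to-FASTQ step described above might look roughly like the sketch below. It only builds the Picard `SamToFastq` command line (legacy `KEY=value` argument style) and prints it. The file names, the `picard.jar` location, and the helper name `sam_to_fastq_cmd` are placeholders for illustration, not the poster's actual setup.

```python
import shlex

def sam_to_fastq_cmd(bam, r1, r2, picard_jar="picard.jar"):
    """Build a Picard SamToFastq command for paired-end reads.

    All paths are hypothetical; adjust them to your own files.
    """
    return [
        "java", "-jar", picard_jar, "SamToFastq",
        f"I={bam}",                # input BAM
        f"FASTQ={r1}",             # first-of-pair reads (R1)
        f"SECOND_END_FASTQ={r2}",  # second-of-pair reads (R2)
    ]

cmd = sam_to_fastq_cmd("sample.bam", "sample_R1.fastq", "sample_R2.fastq")
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Keeping the command as a list (rather than one string) avoids shell-quoting problems if a path contains spaces.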

I did some reading on Biobambam (which I was unfamiliar with) and have read that its output is slightly different from Picard's, but similar. The paper seems to focus on run time more than on differences in results.

Kaylin

RNA-seq Biobambam Picard • 670 views


score 0 · Answer 1 · 2024-03-19

15 months ago

Zhenyu Zhang ★ 1.3k

Let me say that even if you had used exactly the same bioinformatics steps, you would still have batch effects from sample collection, storage, and library preparation. So it's impossible to judge whether it's OK or not. I can only say it is better than using completely different pipelines.

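Btw, I understand why you want to mark duplicates to deal with PCR duplication, but in general it's a bad idea for RNA-Seq quantification, since reads from highly expressed genes often look like duplicates even when they come from distinct fragments.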


Hmmm... okay. I am a graduate student and still learning. I will be using DESeq2 for normalization, and it will account for some of the things mentioned. I am including batch information. I have been racking my brain over this, and I want to make sure the scientific rigor is there.

Regarding your comment on PCR duplicates: yes, I have seen that in my reading. Is there an RNA-seq-specific alternative that addresses this QC issue, or could it be addressed in the normalization? Based on what I have read, DESeq2 doesn't directly account for this. :(
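To make the normalization point concrete, here is a minimal pure-Python sketch of the median-of-ratios idea behind DESeq2's size factors. It is an illustration under simplifying assumptions, not DESeq2's actual implementation, and the toy count matrix and the function name `size_factors` are made up. It shows that this step rescales samples for sequencing depth; it does not look at duplicate reads at all.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one per sample).
    """
    n_samples = len(counts[0])
    # Use only genes with a non-zero count in every sample, so logs are defined.
    usable = [g for g in counts if all(c > 0 for c in g)]
    # Log geometric mean of each usable gene across samples (pseudo-reference).
    log_ref = [sum(math.log(c) for c in g) / n_samples for g in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of this sample's count to the pseudo-reference, per gene.
        log_ratios = [math.log(g[j]) - ref for g, ref in zip(usable, log_ref)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Toy data (hypothetical): sample 2 is sequenced at twice the depth of sample 1.
counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 5],  # dropped from the size-factor estimate because of the zero
]
factors = size_factors(counts)
normalized = [[c / f for c, f in zip(g, factors)] for g in counts]
print(factors)
print(normalized[:3])
```

In this toy case the second sample's size factor comes out twice the first, and the normalized counts for the first three genes match across samples. Batch is a separate matter: in DESeq2 it is usually handled as a term in the design formula rather than by the size factors.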

Reply by kcarey • 14 months ago