Normalization of RNA-seq data
7.2 years ago
Javad ▴ 150

Hello,

I have a few libraries that are not rRNA depleted, and in each of them a high percentage of reads align to rRNA genes. I want to normalize the data so that I can perform multiple comparisons to find differentially expressed genes.

My question is: if I normalize the data with a method like "estimateSizeFactorsForMatrix" from DESeq, will the high percentage of rRNA reads distort the normalization?

Do I have to remove reads aligned to rRNA?

What is the best approach to tackle this problem?

Thanks,

RNA-Seq • 2.0k views
7.2 years ago
Satyajeet Khare ★ 1.6k

If your libraries are not rRNA depleted, they will be of little use for differential expression, irrespective of which normalisation method you use. As we know, a large chunk (>95%) of the RNA inside a cell is rRNA. If you do not deplete it, or if removal is inefficient, most of the sequencing reads will be wasted on rRNA loci. As a result, coverage of protein-coding genes will be minimal. I have seen some samples with rRNA contamination where the number of reads was not sufficient to tell a knockout from wild type by looking at reads aligned to the deleted locus, let alone to run a differential expression analysis.


Thanks for your answer. My libraries are sequenced deeply enough that I have a sufficient number of reads aligned to mRNAs. Of course, around 90 percent of my reads align to rRNA, but I still have around 2 or 3 million reads that are uniquely aligned to mRNAs, and I think this amount is sufficient for downstream analysis. In your opinion, which approach should I take for data normalization? The library sizes in my data differ, and the percentage of reads aligned to rRNA also differs (between 60 and 90 percent), so I cannot just remove the rRNA content and then normalize with regular methods. What is your suggestion? Thanks again.
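To make the concern about differing library sizes and rRNA fractions concrete, here is a small sketch with made-up numbers. The sample names, read totals, rRNA read counts and gene counts are all hypothetical and are not taken from the libraries described above. It compares counts per million (CPM) for one mRNA gene when scaled by all mapped reads versus by non-rRNA reads only.

```python
# Hypothetical samples: total mapped reads and reads mapped to rRNA genes.
samples = {
    "A": {"total_reads": 30_000_000, "rrna_reads": 27_000_000},  # 90% rRNA
    "B": {"total_reads": 10_000_000, "rrna_reads": 6_000_000},   # 60% rRNA
}

# Hypothetical raw counts for a single mRNA gene in each sample.
gene_counts = {"A": 300, "B": 400}


def cpm(count, library_size):
    """Counts per million for a given library size."""
    return count * 1_000_000 / library_size


# Scale by every mapped read, rRNA included.
cpm_total = {
    name: cpm(gene_counts[name], s["total_reads"])
    for name, s in samples.items()
}

# Scale by the reads that did not map to rRNA (an "effective" library size).
cpm_non_rrna = {
    name: cpm(gene_counts[name], s["total_reads"] - s["rrna_reads"])
    for name, s in samples.items()
}

print("CPM, total library size:   ", cpm_total)
print("CPM, non-rRNA library size:", cpm_non_rrna)
```

With these invented numbers the gene looks fourfold different between samples when scaled by total reads, but identical when scaled by non-rRNA reads, which is why the rRNA fraction matters for any method that uses the total library size as its scaling factor.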


Yeah, I was going to add "unless your RNA-Seq is deeply sequenced" to my answer :)

I have not faced such a situation, but if you mask the rRNA regions and run DESeq, it might work.
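A minimal sketch of the masking idea, assuming the median-of-ratios size-factor calculation that DESeq describes. This is a plain-Python re-implementation for illustration only, not the DESeq code itself, and the gene names, the `rrna_genes` set and all counts are made up.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    Returns a list with one size factor per sample.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if any(c == 0 for c in row):
            continue  # a zero count gives no finite log geometric mean
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]


# Hypothetical count table: two samples, five mRNA genes, one rRNA gene.
counts = {
    "gene1": [100, 200],
    "gene2": [50, 100],
    "gene3": [400, 800],
    "gene4": [30, 60],
    "gene5": [250, 500],
    "rRNA_1": [900_000, 600_000],
}
rrna_genes = {"rRNA_1"}  # hypothetical list of rows to mask

# Drop the rRNA rows before estimating size factors.
masked = {g: row for g, row in counts.items() if g not in rrna_genes}

sf_all = size_factors(counts)
sf_masked = size_factors(masked)
print("size factors, all rows:   ", sf_all)
print("size factors, rRNA masked:", sf_masked)
```

In this toy table the two results agree, because the median is not moved by a single outlying rRNA row among several mRNA rows. That will not necessarily hold for real data, so it may be worth computing the factors both ways and comparing them.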

Best,
