Hello,
I have a few libraries that were not rRNA-depleted, and in each of them a high percentage of reads align to rRNA genes. I want to normalize the data so that I can perform multiple comparisons to find differentially expressed genes.
My question is: if I normalize the data with a method like "estimateSizeFactorsForMatrix" from DESeq, will this high percentage of rRNA reads distort the normalization?
Do I have to remove reads aligned to rRNA?
What is the best approach to tackle this problem?
Thanks,
Thanks for your answer. My libraries are sequenced deeply enough, and I have a sufficient number of reads aligned to mRNAs. Around 90 percent of my reads align to rRNA, but I still have around 2 or 3 million reads that are uniquely aligned to mRNAs, and I think this is enough for downstream analysis. In your opinion, which approach should I take for data normalization? The library sizes in my data differ, and the percentage of reads aligned to rRNA also differs (between 60 and 90 percent), so I cannot just remove the rRNA content and then normalize with the regular methods. What is your suggestion? Thanks again
Yeah, I was going to add "unless your RNA-Seq is deeply sequenced" to my answer :)
I have not faced such a situation, but if you mask the rRNA regions and run DESeq, it might work.
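To make the masking idea concrete, here is a rough sketch (not DESeq's actual code) of the median-of-ratios calculation that DESeq's size factor estimation is based on, applied after dropping the rRNA rows from the count table. It is plain Python for illustration only; the gene names, the toy counts, and the helper names drop_rrna and size_factors are all made up, and in practice you would filter the rows of your real count matrix and pass it to DESeq itself.

```python
import math
from statistics import median


def drop_rrna(counts, rrna_genes):
    """Return a copy of the count table without the listed rRNA genes."""
    return {gene: row for gene, row in counts.items() if gene not in rrna_genes}


def size_factors(counts):
    """Median-of-ratios size factors, one per sample.

    counts maps gene name -> list of raw counts (one entry per sample).
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if any(c <= 0 for c in row):
            continue
        # log of the geometric mean of this gene across samples
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    # size factor = median ratio of a sample's counts to the per-gene geometric mean
    return [math.exp(median(r)) for r in log_ratios]


# Toy, made-up counts for two samples (hypothetical gene names).
counts = {
    "rRNA_18S": [900000, 600000],
    "geneA": [100, 200],
    "geneB": [50, 100],
    "geneC": [400, 800],
}

mrna_only = drop_rrna(counts, {"rRNA_18S"})
factors = size_factors(mrna_only)
print(factors)
```

In this toy table every mRNA gene has twice as many counts in the second sample, so the second size factor comes out twice the first, whatever the rRNA row contains. Whether this behaves well on your real data is something you would need to check.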
Best,