I have RNA immunoprecipitation data that includes experimental (WT) and knockout (KO) samples.
It is my understanding that the normalization/pooling at the end of the library prep helps to keep the reads evenly distributed across samples. I am wondering if this can inflate counts for certain samples. In my case, I noticed that the knockouts had a concentration of ~1-2 ng/uL prior to library prep, while the experimental samples had ~10-15 ng/uL. However, looking at the read count distributions after sequencing, I see that the WT and KO distributions are almost identical (though the samples do fall into noticeably different groups via PCA and hierarchical clustering after rlog transformation). Many of the RNAs with high read counts in the experimental samples also have correspondingly high read counts in the knockouts.
My question: Could the normalization at the library prep stage be responsible for this and, if so, what can I do about it?
What normalization happens during the library prep stage? Pooling after library prep to an equimolar concentration to achieve even-ish sequencing depth is standard.
I was just going off of the terminology from the TruSeq protocol: "Indexed DNA libraries are normalized to 10 nM in the DCT plate and then pooled in equal volumes in the PDP plate."
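To make concrete what that step does to sequencing depth, here is a minimal sketch in Python. The library concentrations and total read count are made-up numbers for illustration, not values from my run, and the model simply assumes that reads are split in proportion to the molarity each library contributes to the pool.

```python
# Hypothetical illustration: all concentrations and read totals are invented.
TARGET_NM = 10.0  # normalization target quoted in the TruSeq protocol text


def dilution_factor(library_nm, target_nm=TARGET_NM):
    """Fold-dilution needed to bring a library down to the target molarity."""
    if library_nm < target_nm:
        raise ValueError("library is already below the target concentration")
    return library_nm / target_nm


def expected_reads(library_nm_by_sample, total_reads, normalize=True):
    """Expected reads per sample when equal volumes are pooled.

    Assumes reads are shared in proportion to each library's molarity
    in the pool (a simplification).
    """
    if normalize:
        # Every library is diluted to the same molarity before pooling.
        for conc in library_nm_by_sample.values():
            dilution_factor(conc)  # raises if a library cannot reach the target
        molarity = {s: TARGET_NM for s in library_nm_by_sample}
    else:
        molarity = dict(library_nm_by_sample)
    total = sum(molarity.values())
    return {s: total_reads * m / total for s, m in molarity.items()}


# Hypothetical final library concentrations (nM) for one WT and one KO IP.
libraries = {"IP_WT": 120.0, "IP_KO": 15.0}
total_reads = 40_000_000

with_norm = expected_reads(libraries, total_reads, normalize=True)
without_norm = expected_reads(libraries, total_reads, normalize=False)

print("normalized pool:  ", with_norm)
print("unnormalized pool:", without_norm)
```

Under these assumptions, the normalized pool gives both samples the same expected depth even though the KO library started out 8-fold less concentrated, so the difference in starting material is no longer visible in the total read counts.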