Does normalization/pooling in library prep affect read counts after sequencing?
1
0
8.2 years ago
snamjoshi87 ▴ 40

I have RNA immunoprecipitation data that includes an experimental and knockout sample.

It is my understanding that the normalization/pooling at the end of library prep helps keep the reads evenly distributed across samples. I am wondering whether this can inflate counts for certain samples. In my case, the knockouts had a concentration of ~1-2 ng/uL prior to library prep, while the experimental (WT) samples had ~10-15 ng/uL. However, after sequencing, the WT and KO read count distributions are almost identical (though the samples do fall into noticeably different groups in PCA and hierarchical clustering after rlog transformation). Many RNAs with high read counts in the experimental samples also have correspondingly high read counts in the knockout.

My question: Could the normalization at the library prep stage be responsible for this and, if so, what can I do about it?

library-prep RIP-seq ChIP-Seq normalization • 3.0k views
0

What normalization happens during the library prep stage? Pooling after library prep to an equimolar concentration to achieve even-ish sequencing depth is standard.

0

I was just going off of the terminology from the TruSeq protocol: "Indexed DNA libraries are normalized to 10 nM in the DCT plate and then pooled in equal volumes in the PDP plate."
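The arithmetic behind "normalized to 10 nM, then pooled in equal volumes" can be sketched in Python. This is only an illustration of C1*V1 = C2*V2: the library concentrations, the 20 uL final volume, the 5 uL pooled volume, and the function names are assumptions, not values from the TruSeq protocol or from this thread.

```python
def dilution_volumes(stock_nM, target_nM=10.0, final_uL=20.0):
    """Volumes of library and diluent that give target_nM in final_uL (C1*V1 = C2*V2)."""
    if stock_nM < target_nM:
        raise ValueError("library is already more dilute than the target")
    library_uL = target_nM * final_uL / stock_nM
    return library_uL, final_uL - library_uL


def fmol_contributed(target_nM, pooled_uL):
    """Amount of library each sample adds to the pool; nM * uL = fmol."""
    return target_nM * pooled_uL


# Hypothetical library concentrations in nM (not measurements from this thread).
libraries = {"WT_1": 120.0, "KO_1": 40.0}

for name, stock_nM in libraries.items():
    library_uL, diluent_uL = dilution_volumes(stock_nM)
    print(f"{name}: {library_uL:.2f} uL library + {diluent_uL:.2f} uL diluent, "
          f"{fmol_contributed(10.0, 5.0):.0f} fmol in the pool")
```

Whatever the starting concentration, every library ends up contributing the same molar amount to the pool, which is why each sample gets roughly the same number of reads.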

4
8.2 years ago

What you're seeing is likely to be the reality in your samples. The "normalization" could also be written, "equal amounts of each sample were pooled prior to sequencing." It's unlikely that the dilution process had much of an effect on read distribution.

0

Sorry, I'm very new to this and trying to learn. If there was a difference in concentration prior to library prep, would I not expect my knockout samples to have a different raw count distribution than my experimental samples? If it was not due to the pooling, what else could explain why the raw count distributions look so similar? Also, the KO and experimental samples fall into different groups in PCA (after rlog transformation), which makes me question whether looking at the raw count distributions directly is the best approach.

2

The point of adjusting the concentration is to get rid of that difference in concentration. Anytime you extract RNA or DNA from a sample it'll have a different concentration due to things like the number of cells and the efficiency of the extraction. No one cares about those things, so you adjust to get rid of them.

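There's no point in looking at raw count distributions. Once the libraries are pooled at equal concentration, the overall distribution mostly reflects sequencing depth, not how much RNA each sample started with.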
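A toy Python sketch of why two samples can have identical raw count distributions and still separate in PCA or clustering. The transcript names and counts are made up for illustration and are not data from this thread.

```python
import math

# Toy counts for four hypothetical transcripts (A-D); not data from this thread.
wt = {"A": 500, "B": 300, "C": 150, "D": 50}
ko = {"A": 50, "B": 150, "C": 300, "D": 500}

# Same total depth and the same set of count values in both samples,
# so a histogram or boxplot of raw counts would look identical.
same_depth = sum(wt.values()) == sum(ko.values())
same_distribution = sorted(wt.values()) == sorted(ko.values())

# Per-transcript comparison: this is the signal that PCA and clustering pick up.
log2_fold_change = {t: math.log2(wt[t] / ko[t]) for t in wt}

print(same_depth, same_distribution)
for transcript, lfc in log2_fold_change.items():
    print(transcript, round(lfc, 2))
```

The sample-wide distribution only shows how many reads landed in total and how they are spread; which transcripts carry those reads is a per-transcript question, and that is what the downstream comparison addresses.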

0

Alright, thanks for your help.

