Hello all :)
I'm trying to check/correct my ChIP-Seq reads for GC% bias, and I got the following two plots for my Input and for my H3K27ac pulldown. I have excluded repetitive regions, but I have not filtered on mappability or excluded peaks:
Input:
H3K27ac:
So based on the Input plot, I obviously have a little bias which I should correct for going forwards (unless there's a chance I should have called peaks for the input too?).
However, on the plot for the ChIP pulldown the result is wacky because I haven't excluded my peaks from the analysis. This makes sense, since you can't correct for an overall GC bias if your ChIP is specifically pulling down GC-rich regions. I understand what to do going forwards, but it seems like a lot of work: call peaks, exclude those regions, normalize the BAM, then call peaks again. Is it therefore acceptable to just normalize the ChIP-Seq data to the bias of the Input (since both were sequenced on the same sequencer using the same kit manufacturer)?
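To make the question concrete, here is a minimal Python sketch of what "the bias of the Input" means numerically: the ratio of observed to expected read fractions per GC bin. It is only an illustration under stated assumptions, not how deepTools computes it. The function names (`gc_fraction`, `gc_bias_ratios`), the bin count, and the idea of passing in precomputed GC fractions for reads and for background genome windows are all hypothetical.

```python
from collections import Counter


def gc_fraction(seq):
    """Fraction of G/C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def gc_bias_ratios(read_gc, genome_gc, n_bins=10):
    """Observed/expected read fraction per GC bin.

    read_gc   -- GC fractions of sequenced fragments (e.g. from the Input)
    genome_gc -- GC fractions of background genome windows (the expectation)
    Returns one ratio per bin; None where the background has no windows.
    A ratio above 1 means that GC range is over-represented in the reads.
    """
    def histogram(values):
        # Assign each value to a bin; a GC fraction of 1.0 goes in the last bin.
        counts = Counter(min(int(v * n_bins), n_bins - 1) for v in values)
        return [counts.get(i, 0) / len(values) for i in range(n_bins)]

    observed = histogram(read_gc)
    expected = histogram(genome_gc)
    return [o / e if e > 0 else None for o, e in zip(observed, expected)]


# Toy example with two GC bins (low GC, high GC).
ratios = gc_bias_ratios([0.25, 0.25, 0.25, 0.75], [0.25, 0.25, 0.75, 0.75], n_bins=2)
print(ratios)
```

Applying the Input-derived ratios to the ChIP would then mean down-weighting ChIP reads in bins where the Input ratio is above 1 and up-weighting where it is below 1. Whether that is valid is exactly the question.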
All the best :)
Yeah I remember Fidel mentioning that modern PCR kits have a fairer GC specificity, but I have to check these things and this time it looked a little off. This was sequenced on 1st August 2013, so that's quite a while ago. I am currently trying computeGCBias again after excluding all regions with >20 reads, which is a poor-man's-peak-caller, and if that makes my GC plot look flatter, I probably won't bother normalizing. I remember when Fidel first made this tool, some of the PCR kits were really bad -- makes me wonder what would have been the outcome of all the papers published post 2013 if they were re-analyzed these days, hehe.
It's a public holiday in Germany today, so double thank-you for taking the time to help me move forwards on this Devon :)
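For reference, the ">20 reads" filter can be sketched in a few lines of Python. This is a rough illustration only: it assumes read counts have already been tallied in fixed-size bins per chromosome, and the function name, the `bin_counts` layout, and the 100 bp bin size in the example are hypothetical.

```python
def high_coverage_regions(bin_counts, bin_size, max_reads=20):
    """Return (chrom, start, end) intervals where bins exceed max_reads.

    bin_counts -- dict mapping chromosome name to a list of read counts,
                  one per consecutive fixed-size bin (assumed precomputed)
    bin_size   -- width of each bin in bp
    Adjacent bins above the threshold are merged into one interval.
    Coordinates are 0-based, half-open, as in BED.
    """
    regions = []
    for chrom, counts in bin_counts.items():
        start = None
        for i, count in enumerate(counts):
            if count > max_reads:
                if start is None:
                    start = i * bin_size  # open a new high-coverage run
            elif start is not None:
                regions.append((chrom, start, i * bin_size))  # close the run
                start = None
        if start is not None:
            # Run extends to the end of the chromosome's bins.
            regions.append((chrom, start, len(counts) * bin_size))
    return regions


# Toy example: 100 bp bins on one chromosome.
example = {"chr1": [5, 25, 30, 3, 21]}
for chrom, start, end in high_coverage_regions(example, bin_size=100):
    print(f"{chrom}\t{start}\t{end}")
```

The tab-separated output is BED-like, so it could be saved to a file and supplied as the regions to exclude when re-running the GC bias calculation.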