Hello,
I was trying to make sense of the RiP% value that is returned by ChIPQC. I understand that this is basically just a percentage of how many reads are located within the called peaks.
The vignettes indicate that a RiP% of 5% or greater is typically indicative of good enrichment. For some of the samples I processed I am seeing values as great as 22%. For the 30 samples I am analyzing, the RiP% values range from 13.4 to 22.9. My first instinct was, "Great! Higher RiP% value = greater enrichment!" However, as I think about this a little more, I have become slightly concerned that the high RiP% values I am observing could be the result of insufficient stringency in peak calling. Less stringent peak calling parameters might lead to a greater number of reads occurring in peaks simply because a greater number of peaks are being called. Is this a reasonable train of thought? Are these high values something to be concerned about? My thought is that this would definitely have a substantial impact on the identification of differential peaks down the line.
Any information on how to best go about thinking about this or if this is even a problem is appreciated!
Thanks
Hi Jared,
Thank you for the thoughtful response! This is the second time you have provided a clear and helpful answer to a question I have posed. It is greatly appreciated!
The ChIP performed was for H3K27ac, so it sounds like these RiP% values are well within reason, which is reassuring! I have done a fair bit of IGV visualization to look at the called peaks and identified differential peaks. There is definitely major variation in the height / strength of called peaks, but they do appear to align with the bam files well enough.
I was curious if you know what the average peak size is for H3K27ac peaks that tend to validate. I have never been able to find a clear enough answer regarding expected peak sizes (i.e. sufficient read pile up for a called peak). I see such variation among the peaks that have been called in the data set that I have struggled to reach any sort of meaningful conclusion. I would imagine that there is a lot of variability out there depending on a variety of factors like cell type, organism, xyz.... but was curious if you had any input.
I have never looked at csaw before but will aim to give it a look and see if it validates or invalidates the differential peaks I have identified up to this point.
Thanks again! The help and input are greatly appreciated!
As you said, it's going to vary, but I'd say the average H3K27ac peak is probably 1-5kb in size. If you're getting lots of peaks that are 10kb+ wide, you may want to use PeakSplitter or something similar to break them up. MACS2 is generally pretty good at avoiding huge peaks like that unless you're using the broad peaks setting though. I would be rather surprised if it was a significant issue in your data if you used MACS2.
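If you want a quick check of how many peaks fall into that 10kb+ category, something like the rough Python sketch below would do it. It assumes a tab-delimited BED-like peak file (for example a MACS2 narrowPeak file) with chromosome, start and end in the first three columns. The function names and the example lines are made up for illustration; only the 10kb cutoff comes from the discussion above.

```python
from statistics import median


def peak_widths(bed_lines):
    """Return peak widths (end - start) from BED-like, tab-delimited lines."""
    widths = []
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and common BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        start, end = int(fields[1]), int(fields[2])
        widths.append(end - start)
    return widths


def summarize_widths(widths, wide_cutoff=10_000):
    """Summarize peak widths; wide_cutoff (bp) marks 'suspiciously wide' peaks."""
    n_wide = sum(w >= wide_cutoff for w in widths)
    return {
        "n_peaks": len(widths),
        "median_width": median(widths),
        "n_wide": n_wide,
        "frac_wide": n_wide / len(widths),
    }


# Hypothetical example lines, not real data.
example = [
    "chr1\t1000\t3500\tpeak1",
    "chr1\t20000\t32000\tpeak2",
    "chr2\t500\t1700\tpeak3",
]
summary = summarize_widths(peak_widths(example))
print(summary)
```

For a real file you would pass the open file handle instead of the example list, e.g. `peak_widths(open("peaks.narrowPeak"))`, where the file name is a placeholder.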
TFs can well have FRiPs in the 20% or 30% range; it all depends on the antibody, the expression level of the protein, and the quality of the chromatin. I have seen both published data and our own reach this range, while for some TFs you can be happy to get around 1%. As Jared said, just look at the browser tracks by eye; that is the best diagnostic.
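To make the original concern concrete, here is a minimal Python sketch of a FRiP-style calculation: the fraction of reads whose position falls inside any called peak. This is not how ChIPQC computes RiP% internally (its exact counting rules may differ). It simply treats each read as a single position, treats peaks as half-open intervals, and merges overlapping peaks so no read is counted twice. All names and the toy reads and peaks are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_peaks(peaks):
    """Merge overlapping (chrom, start, end) peaks; returns {chrom: [[start, end], ...]}."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                # Overlaps or touches the previous interval: extend it.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = out
    return merged


def frip(read_positions, peaks):
    """Fraction of reads (chrom, pos) falling inside any peak [start, end)."""
    merged = merge_peaks(peaks)
    starts = {chrom: [iv[0] for iv in ivs] for chrom, ivs in merged.items()}
    total = 0
    in_peaks = 0
    for chrom, pos in read_positions:
        total += 1
        intervals = merged.get(chrom)
        if not intervals:
            continue
        # Find the last peak starting at or before this position.
        i = bisect_right(starts[chrom], pos) - 1
        if i >= 0 and pos < intervals[i][1]:
            in_peaks += 1
    return in_peaks / total if total else 0.0


# Toy data: the same reads scored against a strict and a looser peak set.
reads = [("chr1", 150), ("chr1", 900), ("chr1", 260), ("chr2", 50), ("chr2", 500)]
strict_peaks = [("chr1", 100, 200)]
loose_peaks = [("chr1", 100, 200), ("chr1", 180, 300), ("chr2", 0, 100)]

frip_strict = frip(reads, strict_peaks)
frip_loose = frip(reads, loose_peaks)
print(frip_strict, frip_loose)
```

With the same reads, the looser peak set gives the higher fraction, which is the effect raised in the original question: the number depends on the peak set as well as on the enrichment, so it is best compared across samples called with the same parameters and checked against the browser tracks.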