My problem is, when I run this, the resulting plot is a heat map, where almost everything has a value between 0.9 and 1.
I'm hoping the error is in my code and not my data. Does anyone have any suggestions?
Why do you think there is an error? Correlation values between 0.9 and 1 mean that your samples are all highly correlated, which is not necessarily wrong.
I think it is wrong because even my inputs are very highly correlated with my actual targets. When I do peak calling, I still get peaks in the target compared to the input. Additionally, I am comparing many different targets that do not call the same peaks, yet they still show a very high correlation.
Ok, I understand why you are concerned now. Still, I don't think there is necessarily an error. In the code above, you are computing Pearson correlation between conditions on 10000 bp windows. It is possible that the signal (the peaks you called) gets buried in such big windows, so almost everything is evened out between IP and input. So why do you get such high correlations? A few outlier regions (subtelomeres, repeated regions, centromeres, ...) could have either very high or very low coverage in every condition. Since Pearson correlation is sensitive to outliers, that would lead to high correlation for all samples. You can test this hypothesis by plotting a scatterplot instead of a heatmap (--whatToPlot scatterplot) and checking whether there are outlier windows.
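To give a rough feel for how a peak can get buried in a large window, here is a back-of-the-envelope sketch. The numbers are purely hypothetical (one 200 bp peak, 10-fold enriched over a flat background) and bin_fold_enrichment is a made-up helper, not part of deepTools.

```python
def bin_fold_enrichment(bin_size, peak_width=200, peak_fold=10.0, background=1.0):
    """IP/input coverage ratio for a bin containing exactly one peak.

    All defaults are illustrative assumptions, not measured values.
    """
    peak_bp = min(peak_width, bin_size)
    # IP bin: background everywhere, plus enriched coverage over the peak.
    ip_signal = (bin_size - peak_bp) * background + peak_bp * background * peak_fold
    # Input bin: background only.
    input_signal = bin_size * background
    return ip_signal / input_signal


for size in (10000, 1000):
    print(f"{size} bp bin: {bin_fold_enrichment(size):.2f}x over input")
```

Under these assumptions the same peak is a far smaller relative difference in a 10000 bp bin than in a 1000 bp bin, and most bins contain no peak at all, so IP and input can look nearly identical bin by bin.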
My suggestion would be to try Spearman correlation instead (--corMethod spearman), which is much more robust to outliers. If this still does not work, lowering the window size to 1000 bp (--binSize 1000) might increase sensitivity, at the cost of being slower to compute.
PS: there is a typo in your command: --corMethod person should be --corMethod pearson.
Ok, I understand why you are concerned now. See my answer below.