Calculate ChIP-seq correlation in promoter regions
5.8 years ago
dcheng1 • 0

I read a few papers that calculate the Pearson correlation coefficient of ChIP-seq replicates in promoter regions. However, even for active marks like H3K4me3, not all peaks are located within promoter regions.

Would it be more accurate to calculate the correlation over the peak regions in the BED file from MACS?

When calculating the correlation, what is best to use: reads across the whole genome, reads in promoter regions, reads in called peak regions, or something else? Any comments are highly appreciated!

ChIP-Seq sequencing • 1.5k views
5.8 years ago

I would suggest just using random bins in the genome rather than only those in promoters. That will produce less bias and give you a better overall view of how correlated your samples actually are.
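
As a rough illustration of what "random bins" could mean in practice, here is a minimal sketch that samples fixed-width bins uniformly across a genome. The function name `random_bins`, the bin size, and the toy chromosome sizes are all assumptions made up for this example, not part of any particular tool.

```python
import random

def random_bins(chrom_sizes, bin_size=10_000, n_bins=1000, seed=0):
    """Sample n_bins fixed-width bins uniformly across the genome.

    chrom_sizes maps chromosome name -> length in bp (hypothetical input).
    Sampled bins may overlap each other.
    """
    rng = random.Random(seed)
    # Only chromosomes long enough to hold one full bin can be sampled.
    chroms = [c for c, size in chrom_sizes.items() if size >= bin_size]
    # Weight each chromosome by its number of possible bin start positions,
    # so every genomic start position is equally likely.
    weights = [chrom_sizes[c] - bin_size + 1 for c in chroms]
    bins = []
    for chrom in rng.choices(chroms, weights=weights, k=n_bins):
        start = rng.randrange(chrom_sizes[chrom] - bin_size + 1)
        bins.append((chrom, start, start + bin_size))
    return sorted(bins)

# Toy chromosome sizes, invented for illustration only.
chrom_sizes = {"chr1": 1_000_000, "chr2": 500_000}
bins = random_bins(chrom_sizes, bin_size=10_000, n_bins=5)
for chrom, start, end in bins:
    print(chrom, start, end)
```

You would then count reads from each replicate in these bins (zero-based, half-open intervals, as in BED) and correlate the two count vectors.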

Hi Devon, thanks for your help! I have a follow-up question. Suppose a bin contains no reads, or only a few background reads, in both replicates. The two replicates then agree almost perfectly in that bin, but this is not the kind of agreement we care about, and it will bias the correlation coefficient upward. For a histone mark or TF that binds only a small fraction of the genome, won't random bins include many such empty or background-only bins? Does this concern make sense?

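This is why we typically look at both Spearman's and Pearson's correlations. Pearson's works on the raw values and can be dominated by a few extreme bins, whereas Spearman's uses only the ranks.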
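
To make the comparison concrete, here is a small self-contained sketch that computes both coefficients on per-bin read counts. The counts in `rep1` and `rep2` are toy numbers invented for illustration (mostly background bins plus two peak bins), and the helper names are hypothetical; in real work you would more likely use an existing statistics library.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy per-bin read counts: eight background bins and two peak bins.
rep1 = [0, 1, 0, 2, 1, 0, 3, 150, 210, 1]
rep2 = [1, 0, 0, 1, 2, 1, 0, 140, 230, 3]

print("Pearson: ", round(pearson(rep1, rep2), 3))
print("Spearman:", round(spearman(rep1, rep2), 3))
```

On this toy data the two peak bins dominate Pearson's coefficient, while Spearman's is pulled down by the noisy ordering of the background bins, so a large gap between the two is a hint that a few bins are driving the result.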

I tried the Spearman correlation, and it indeed gives a slightly lower value than Pearson did. Thanks a lot!
