Does anyone know of any command-line tools, methods, or scripts to calculate a correlation coefficient from two BED files of two ChIP-seq replicates?
Thanks.
A very similar question was asked previously. I suggested a recent publication by Zhao and Sandelin (2012). They provide an R package called 'GMD' for calculating the "similarity between spatial distributions of read-based sequencing data such as ChIP-seq and RNA-seq". They also provide a detailed set of case studies (vignettes) on CRAN, illustrated with some ChIP-seq data. The authors seem responsive and showed up here on Biostars with some help for the previous poster.
Not quite what you've asked for, but htSeqTools (Bioconductor) can produce a PCA plot of sequence coverage similarity between samples.
A quick and dirty (though space-consuming) way is to create per-base counts for the whole genome and then compute the correlation after excluding all the places in the genome with a read count of 0. This method is outlined in the paper "A computational pipeline for comparative ChIP-seq analyses (2011)", and the shell script to do the correlation can be found here under 'correlation.awk'.
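To make the per-base idea concrete, here is a minimal Python sketch. It is not the 'correlation.awk' script from the paper, just an illustration under a few assumptions of mine: each BED line is one read or interval, "read count of 0" is taken to mean positions with no coverage in either replicate, and the function names `per_base_counts` and `pearson_nonzero` are made up for this example. It keeps every covered base in memory, so it only suits small inputs.

```python
import math
from collections import defaultdict


def per_base_counts(bed_lines):
    """Count how many BED intervals cover each (chrom, position)."""
    counts = defaultdict(int)
    for line in bed_lines:
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        # BED coordinates are 0-based and half-open: [start, end).
        for pos in range(int(start), int(end)):
            counts[(chrom, pos)] += 1
    return counts


def pearson_nonzero(counts_a, counts_b):
    """Pearson r over positions covered in at least one replicate.

    Assumption: positions with zero coverage in BOTH replicates are
    the ones excluded; they never appear in either dictionary.
    """
    positions = sorted(set(counts_a) | set(counts_b))
    xs = [counts_a.get(p, 0) for p in positions]
    ys = [counts_b.get(p, 0) for p in positions]
    n = len(positions)
    if n < 2:
        raise ValueError("need at least two covered positions")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("coverage is constant; correlation is undefined")
    return sxy / math.sqrt(sxx * syy)


# Toy intervals standing in for two replicates (hypothetical data).
rep_a = ["chr1\t0\t4", "chr1\t2\t6"]
rep_b = ["chr1\t1\t5", "chr1\t2\t4"]
r = pearson_nonzero(per_base_counts(rep_a), per_base_counts(rep_b))
print(f"Pearson r = {r:.4f}")
```

For real data you would read the lines from your two files (for example `open("rep1.bed")`, a placeholder name) and would probably want binned counts rather than single bases to keep memory manageable.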
The statistical module of BiSA (http://bisa.sourceforge.net) does this; the PLoS One manuscript is in press. In BiSA it is called the overlap correlation value (OCV).
About PCC:
Bailey T, Krajewski P, Ladunga I, Lefebvre C, Li Q, Liu T, Madrigal P, Taslim C, Zhang J (2013) Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data. PLoS Comput Biol 9(11): e1003326. doi:10.1371/journal.pcbi.1003326
http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003326