Hi everyone,
Suppose I want to compare ChIP-seq data from different sequencing projects, say Epigenome Roadmap vs ENCODE.
How do you normalize across samples? Is it similar to RNA-seq data, where one needs to correct for batch effects (see http://simplystatistics.org/2015/05/20/is-it-species-or-is-it-batch-they-are-confounded-so-we-cant-know/)?
I know that MAnorm (http://bcb.dfci.harvard.edu/~gcyuan/MAnorm/MAnorm.htm) and other tools can do this for samples from the same project.
I just want to know how you deal with it for ChIP-seq data.
Thanks,
Ming
How about comparing the same histone mark among samples from different sequencing centers?
That's pretty tricky. I would only be comfortable doing that for very robust histone modifications like H3K4me3. Otherwise, the variability will be quite high. There are also a few other points to keep in mind.
For a robust histone modification, I would normalise for total read count (randomly subsample reads from each file to match the smallest one), call peaks on each sample individually, and then overlap the peaks.
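To make the downsample-and-overlap idea concrete, here is a minimal Python sketch on toy data. The function names (`downsample`, `overlapping_peaks`) and the in-memory read and peak lists are my own assumptions for illustration; peak calling itself is not shown, and on real data you would more likely subsample the alignment files and intersect the peak files with tools such as samtools and bedtools.

```python
import random


def downsample(reads, target, seed=0):
    """Randomly keep `target` reads, sampled without replacement.

    If the sample already has `target` reads or fewer, it is returned unchanged.
    """
    if target >= len(reads):
        return list(reads)
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return rng.sample(reads, target)


def overlapping_peaks(peaks_a, peaks_b):
    """Return the peaks in peaks_a that overlap at least one peak in peaks_b.

    Peaks are (chrom, start, end) tuples with half-open coordinates, as in BED.
    """
    by_chrom = {}
    for chrom, start, end in peaks_b:
        by_chrom.setdefault(chrom, []).append((start, end))

    shared = []
    for chrom, start, end in peaks_a:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(start < e and s < end for s, e in by_chrom.get(chrom, [])):
            shared.append((chrom, start, end))
    return shared


# Toy example: two samples with different sequencing depths.
reads_center1 = list(range(1000))  # stand-ins for aligned reads
reads_center2 = list(range(400))
depth = min(len(reads_center1), len(reads_center2))
sub1 = downsample(reads_center1, depth)
sub2 = downsample(reads_center2, depth)

# Hypothetical peak calls made separately on each downsampled sample.
peaks1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
peaks2 = [("chr1", 150, 250), ("chr2", 300, 400)]
shared = overlapping_peaks(peaks1, peaks2)
```

The linear scan per chromosome is fine for a sketch; for genome-wide peak sets, a sorted or interval-tree based intersection would scale better.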
Thanks for sharing your tips!