I have paired-end ChIP-seq data (101 bp reads) with 2 biological replicates each. I have done peak calling with MACS2, but I have some questions about it.
I also got a warning:
WARNING @ Thu, 07 Jun 2018 17:06:05: #2 Since the d (197) calculated from paired-peaks are smaller than 2*tag length, it may be influenced by unknown sequencing problem!
WARNING @ Thu, 07 Jun 2018 17:06:05: #2 You may need to consider one of the other alternative d(s): 197
WARNING @ Thu, 07 Jun 2018 17:06:05: #2 You can restart the process with --nomodel --extsize XXX with your choice or an arbitrary number. Nontheless, MACS will continute computing.
I added --nomodel --extsize 197, --nomodel --extsize 147, and --nomodel --extsize 202 (each separately to the macs2 command) and got results without any warning. Which one is more correct?
Are broad peaks extended narrow peaks? If I intersect them, should I expect to find 100% overlap between the narrow peaks and the broad ones?
Which kind of peak (narrow/broad) is appropriate for studying H3K27ac, H3K4me1, H3K4me3, and H3K27me3?
If there is no control sample to use as background, can I use the default parameters?
Thanks in advance for any suggestions!
(1) In my experience, I received this warning when my ChIP signal was not very strong. MACS2 called very few peaks, and it made sense when I checked out the bigwig files for my samples. There was very little difference between my input sample and IP sample.
(2) I believe broad peak calling just combines nearby peaks into larger peaks.
(3) I'm not familiar with those factors, but narrow peaks are appropriate if they bind at very specific places on the chromatin. Broad peaks are good if the factor binds to many locations with less specificity.
(4) I have never tried to call peaks without a control input sample. It's pretty important in ChIP-seq to provide background signal in some way.
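To make the options discussed in this thread concrete, here is a minimal sketch that only assembles macs2 callpeak command lines (it does not run them). The helper name `macs2_callpeak_cmd` and the file names `ip.bam` and `input.bam` are hypothetical placeholders; the flags (-t, -c, -n, --broad, --nomodel, --extsize) are the ones mentioned in the question or standard macs2 callpeak options.

```python
def macs2_callpeak_cmd(treatment, name, control=None, broad=False, extsize=None):
    """Build an argument list for `macs2 callpeak` (hypothetical helper)."""
    cmd = ["macs2", "callpeak", "-t", treatment, "-n", name]
    if control is not None:
        # Input/control sample used as background.
        cmd += ["-c", control]
    if broad:
        # Ask for broad peak calling in addition to the default narrow peaks.
        cmd.append("--broad")
    if extsize is not None:
        # Skip model building and extend reads to a fixed fragment size.
        cmd += ["--nomodel", "--extsize", str(extsize)]
    return cmd


# One command per fragment size tried in the question (file names are made up).
for size in (147, 197, 202):
    cmd = macs2_callpeak_cmd("ip.bam", f"sample_ext{size}",
                             control="input.bam", extsize=size)
    print(" ".join(cmd))
```

Each printed line could then be run in a shell, giving one set of peak files per --extsize value to compare.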
Hope it helps
Let me add my two cents:
(1) There is no consistent answer as to which value is correct. Just try calling peaks with different --extsize values and see how many peaks overlap with each other. A high percentage of peaks should overlap. (I did something like this last week when I saw the warning with my ChIP data and found a large overlap, so I ignored the warning.)
(2) Most likely, but I have never tested it. (@goodez is correct.)
(3) Narrow peaks will work fine. If the signal spans a large region, those regions are also included in the narrow peak file. Unless you are trying to associate peak length with some observations (e.g. like this), narrow peaks will be good enough.
(4) I tested this a couple of times. Most of the time, I found that all peaks called with a background overlap with peaks called without a background, but the other way around is not true. Without a background, MACS will call too many peaks, many of which are not real peaks. A simple workaround would be to use the background from another sample if they belong to the same condition.
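The overlap checks suggested above (between runs with different --extsize values, between narrow and broad peaks, or between calls with and without a control) can be sketched roughly as follows. This is only an illustration: the function names and the toy coordinates are made up, and real input would be the first three columns (chromosome, start, end) of the MACS2 peak files. Intervals are treated as half-open, as in BED.

```python
from bisect import bisect_left
from collections import defaultdict


def build_index(peaks):
    """Per chromosome: sorted starts plus the running maximum of ends."""
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))
    index = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts = [s for s, _ in intervals]
        max_end, running = [], 0
        for _, e in intervals:
            running = max(running, e)
            max_end.append(running)
        index[chrom] = (starts, max_end)
    return index


def fraction_overlapping(query, reference):
    """Fraction of query peaks that overlap at least one reference peak."""
    index = build_index(reference)
    hits = 0
    for chrom, start, end in query:
        if chrom not in index:
            continue
        starts, max_end = index[chrom]
        # Reference peaks starting before the query ends are the candidates.
        i = bisect_left(starts, end)
        # One of them overlaps if its end lies beyond the query start.
        if i > 0 and max_end[i - 1] > start:
            hits += 1
    return hits / len(query) if query else 0.0


# Toy peak sets (made-up coordinates).
narrow = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
broad = [("chr1", 50, 650), ("chr2", 200, 300)]

print(fraction_overlapping(narrow, broad))  # share of narrow peaks inside broad ones
print(fraction_overlapping(broad, narrow))  # share of broad peaks containing a narrow one
```

The measure is asymmetric, which matches the observation in point (4): set A can be almost fully covered by set B while B still contains many peaks absent from A, so it is worth computing it in both directions.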
Hello star!
It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/4448
This is typically not recommended as it runs the risk of annoying people in both communities.