Question

[MACS2] False negative peak calling problem

0

Entering edit mode

7.0 years ago

dude.dude.law • 0

Hi, all. Now I am using MACS2 to analyse my ChIP-seq data. I found that some obvious peaks were not called from MACS2, while some less obvious peaks were recognized by MACS2. In addition, no matter how I loosen the parameters, those obvious peaks were not included. I am wondering the reason behind this. The parameter used to generate the following example: macs2 callpeak -t treatmen.bam -c input.bam -n output_name -g mm --bdg --broad -p 0.01 -m 3 99999 Peaks defined by MACS2:

Peak missed by MACS2:

ChIP-Seq • 3.9k views

ADD COMMENT • link updated 6.8 years ago by ivar.grytten ▴ 40 • written 7.0 years ago by dude.dude.law • 0

0

Entering edit mode

It's difficult to judge things without seeing a couple kb around a false-negative peak (maybe you show that, but since there's no scale in the IGV screenshots...).

ADD REPLY • link 7.0 years ago by Devon Ryan 105k

score 0 · Answer 1 · 2018-03-12

When I have been debugging MACS2 peaks before, I have done something like this:

Check the output treatment bdg file that macs give when you specify --bdg. Locate the given area where you expect a peak
Check the backgrount bdg file for the same area
Check that the computed p-value using that sample and control is low enough (macs uses the background as lambda). The computed p-value should be far below the q value threshold you specify (default is 0.05).

Even though the p-value is significant, remember that macs2 removes peaks smaller than fragment length (at least when not using --broad, I am not sure how it is when using the --broad option). Also, if there are holes inside the peak that are bigger than the read length, the peak will be split, and the two parts might be too small. This might also be a reason why that peak is not detected (visually, it seems there might be a hole in the middle of your peak.

A last thing you should be aware of, is that macs filters out reads mapping to the same position (allowing maximum one read starting at every base pair). Macs claims that it will accept more reads if this is natural (it computes what is natural), but from my experience, this estimation is buggy and macs will never accept more than one read starting at any base pair. Thus, the pileup you are visualising might be very different if removing duplicate mapping (not sure if you have done that before visualising). You could try to specifiy --keep-dup to Macs, and see if that results in the peak being detected.