The MACS2 results do not look right and I would like to understand why. First, a brief description of the ChIP experiment: the sequencing files are paired-end. Because of poor sequencing depth, I merged two input samples and two IP samples into one input and one IP sample. After processing the raw FASTQ files, I mapped the reads with bowtie2 and then called peaks with MACS2:
macs2 callpeak -t IP_mm10.bam -c Input.bam --outdir . -n IP -g 2652783500 -q 0.1 -f BAMPE --keep-dup all -B
Because I am interested in repeat regions, I retained all duplicates. I took the effective genome size for mm10 from deepTools. I think the process is OK, but the result is weird: the input sample also has peaks, and their scores are high.
The two files are bigWig files generated with bamCoverage. Is my code wrong?
I don't understand what the problem is. Your workflow sounds okay, but what exactly is your question? Also, you can use -g mm instead of -g 2652783500. MACS2 has predefined sizes for common genomes such as mouse (mm) and human (hs).
I don't think anyone can help much when we don't know what you mean by "weird". We also don't know what kind of ChIP this is, so I have no idea what to expect.
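For reference, the -g suggestion applied to the original command might look like the sketch below. The file names and other options are copied from the command in the question; the variable names genome_size, qvalue and cmd are only illustrative, and the call is executed only if macs2 and both BAM files are actually present.

```shell
# Sketch only: same options as the original call, with the built-in
# mouse genome size shortcut instead of a hard-coded number.
genome_size=mm   # MACS2 shortcut for mouse
qvalue=0.1       # value used in the question; the MACS2 default is 0.05

cmd="macs2 callpeak -t IP_mm10.bam -c Input.bam -f BAMPE -g $genome_size -q $qvalue --keep-dup all -B -n IP --outdir ."

# Show the command so it can be checked before anything runs.
echo "$cmd"

# Run it only when macs2 is installed and both BAM files exist.
if command -v macs2 >/dev/null 2>&1 && [ -f IP_mm10.bam ] && [ -f Input.bam ]; then
  $cmd
fi
```

Keeping the genome size and q-value in variables makes it easy to rerun with a different cutoff and compare the peak counts.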
By a good peak I mean one where the IP sample shows enrichment and the input does not. In the two examples I showed, the input sample is also enriched at the peaks called by MACS2, which I find strange. I am sorry for the ambiguity.
Can you see the images I uploaded? They show peaks where both IP and input are enriched. This is my first time uploading images.
Yes, I do see the images, although they are very small. I agree it does look like there is not much enrichment of IP over the input. However, I noticed that the scales of your IGV tracks are different. You should select both tracks, then right-click and choose Group Autoscale. This way you can actually compare them on the same scale.
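Besides looking at the tracks, the enrichment can also be checked in the peak table itself. The sketch below assumes the standard MACS2 narrowPeak layout (column 7 is fold enrichment, column 9 is -log10 q-value) and the output name IP_peaks.narrowPeak that follows from -n IP. The function name filter_peaks_by_fold and the cutoff of 4 are made up for illustration, not a recommended threshold.

```shell
# Print chrom, start, end, fold enrichment and -log10(q-value) for peaks
# whose fold enrichment (column 7) is at least min_fold, strongest first.
filter_peaks_by_fold() {
  peaks_file=$1
  min_fold=$2
  awk -v min="$min_fold" 'BEGIN { FS = OFS = "\t" } $7 >= min { print $1, $2, $3, $7, $9 }' "$peaks_file" |
    sort -t "$(printf '\t')" -k4,4nr
}

# Example call, run only if the MACS2 output file is present.
if [ -f IP_peaks.narrowPeak ]; then
  filter_peaks_by_fold IP_peaks.narrowPeak 4
fi
```

If the top-ranked peaks by q-value have only modest fold enrichment, that would match what the tracks show.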
You did use -q 0.1, which allows more insignificant peaks than the default setting of -q 0.05. Even then, I don't think you should get peaks for the regions you showed. Can you also switch to -g mm, because your genome size seems too big?
I ordered the table by -log10(q-value), and these examples are the second and third entries. The first one is also strange. I would not worry about an occasional exception. I changed the parameters as you suggested, and the result remains the same, only with fewer peaks. I think peak calling depends more on peak shape than on read count. This is the fourth one:
So I really think there is something wrong with my code, but I do not know what.