difference between macs2 callpeak using bam and bed file
4 weeks ago
QX ▴ 60

Hi all,

I am using macs2 callpeak, which accepts both BAM and BED files as input, where the BED file can be generated from the BAM file. Do you know what differences to expect between the results of the two settings?

macs2 atac-seq • 429 views
4 weeks ago

BAM files are essentially BED files with extra information, so I would not expect better peak calling with BED files than with BAM files. If anything, it is the other way around: macs2 seems to use the FLAG information to consider only paired reads.

As for BAM file, only FLAG field is checked so MACS won’t check mapping quality

It has been asked here before.

But nothing gives you a better answer than your own experiment! Let us know what you find.


Hi, thanks for your response. My data is paired-end sequencing. I tested 4 modes of macs2 callpeak: BAM, BAMPE, BED, BEDPE. While the macs2 documentation does not describe these modes, the macs3 documentation does.
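For reference, the four runs compared here differ only in the value passed to `-f`. The sketch below just builds the command lines; the file names and output prefixes are hypothetical, and it assumes `macs2` is on the PATH if you actually run them.

```python
from typing import List

# Format names tested in this thread, passed to `macs2 callpeak -f`.
FORMATS = ("BAM", "BAMPE", "BED", "BEDPE")


def build_callpeak_cmd(treatment: str, fmt: str, name: str) -> List[str]:
    """Return the argv list for one macs2 callpeak run (nothing is executed)."""
    if fmt not in FORMATS:
        raise ValueError(f"unexpected format: {fmt}")
    # -t: treatment file, -f: input format, -n: prefix for output files
    return ["macs2", "callpeak", "-t", treatment, "-f", fmt, "-n", name]


# Hypothetical input files, one per mode.
inputs = {
    "BAM": "sample.bam",
    "BAMPE": "sample.bam",
    "BED": "sample.bed",
    "BEDPE": "sample.bedpe",
}

for fmt, path in inputs.items():
    print(" ".join(build_callpeak_cmd(path, fmt, f"sample_{fmt}")))
```

Each list can be handed to `subprocess.run(cmd, check=True)` to launch the run. All other options are left at their defaults so that the input format is the only thing that changes between runs.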

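The numbers of peaks returned already differ: BAM: 401512, BAMPE: 304071, BED: 442285, and BEDPE gives an error. I checked the BAM and BED files and they contain the same coordinates (BED starts are 0-based while the BAM positions shown are 1-based, hence the offset of 1):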

`.bed`

1       792526  792591  +
1       792885  792950  -
1       793201  793266  +
1       793573  793637  -

`.bam`

1       792527  
1       792886  
1       793202 
1       793574  
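The offset of 1 between the two listings is the usual coordinate-convention difference: BED starts are 0-based, while the POS field shown when a BAM is viewed as text is 1-based. A minimal check using the four reads above (the helper names are made up for illustration):

```python
def bed_start_to_sam_pos(bed_start: int) -> int:
    """Convert a 0-based BED start to a 1-based SAM/BAM POS."""
    return bed_start + 1


def sam_pos_to_bed_start(sam_pos: int) -> int:
    """Convert a 1-based SAM/BAM POS to a 0-based BED start."""
    return sam_pos - 1


# Start coordinates copied from the .bed and .bam listings above.
bed_starts = [792526, 792885, 793201, 793573]
sam_positions = [792527, 792886, 793202, 793574]

converted = [bed_start_to_sam_pos(s) for s in bed_starts]
print(converted == sam_positions)  # prints True
```

So the two files describe the same alignments, and the difference in peak counts has to come from how each mode treats the reads, not from the coordinates themselves.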

I think that in BAM mode, macs2 uses additional information from the FLAG field when deciding which coordinates to use.

Then I visualized the results of these 3 modes. I see that BED mode seems to return the most correct regions visibly, while BAMPE mode tends to over-extend the peak regions.

I am not sure if this is the correct way to proceed, but I will go with BED mode as it looks visibly better.


It all depends on how confident you are about including reads that are not properly paired in your peak calling.

  • BED will include reads whether they are properly paired or not.
  • BAM will include only paired reads, but only registers the 5' end of the fragment.
  • BAMPE will include only paired reads, and registers the 5' end of the fragment plus the template length (hence the "over-extension" you were talking about).
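To make the list above concrete, here is a small sketch of the two pieces of BAM information involved: the FLAG bits that say whether a read is paired, and the template length (TLEN) that BAMPE adds to the 5' end. The bit values come from the SAM specification. The helper names and the TLEN of 200 are made up for illustration, and this is not MACS's own code.

```python
# FLAG bits as defined in the SAM specification.
PAIRED = 0x1
PROPER_PAIR = 0x2
UNMAPPED = 0x4
REVERSE = 0x10
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def decode_flag(flag: int) -> dict:
    """Break a SAM FLAG into the bits relevant to paired-end filtering."""
    return {
        "paired": bool(flag & PAIRED),
        "proper_pair": bool(flag & PROPER_PAIR),
        "unmapped": bool(flag & UNMAPPED),
        "reverse": bool(flag & REVERSE),
        "first_in_pair": bool(flag & FIRST_IN_PAIR),
        "second_in_pair": bool(flag & SECOND_IN_PAIR),
    }


def is_properly_paired(flag: int) -> bool:
    """True for a mapped read whose FLAG marks it as part of a proper pair."""
    bits = decode_flag(flag)
    return bits["paired"] and bits["proper_pair"] and not bits["unmapped"]


def fragment_span(pos: int, tlen: int) -> tuple:
    """0-based, half-open fragment interval from the left mate's POS and TLEN."""
    if tlen <= 0:
        raise ValueError("expected the left mate, which has a positive TLEN")
    start = pos - 1  # SAM POS is 1-based
    return (start, start + tlen)


print(is_properly_paired(99))   # 99 = paired + proper pair + mate reverse + first in pair
print(is_properly_paired(16))   # 16 = single-end read on the reverse strand
print(fragment_span(792527, 200))  # TLEN of 200 is a made-up value
```

A BED file produced from the BAM keeps one interval per read and carries no FLAG column, which is why that mode cannot tell paired and unpaired reads apart.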

There is a discussion of BAM vs. BAMPE here.

I see the BED mode seem to return the most correct regions visibly

Even if the data "look" better, that does not mean the peaks are more robust. Calling peaks with your BED files will bring non-paired reads into your peak calling.

As it is a paired-end library, I would advise you to use the BAM files to call your peaks, but the final call is up to you.


That makes more sense to me; I will go with the BAM file. Thank you!
