Hello everyone,
I'm seeking clarification regarding the use of MACS2/3 for paired-end (PE) ChIP-seq data analysis. In lesson 6 of the Harvard Chan Bioinformatics Core ChIP-seq "flipped" tutorial, "Peak calling with MACS2", I see the following text:
When PE datasets are analyzed in single-end mode, MACS2 eliminates the second read of each pair (the "R2" read) and then treats the remaining "R1" reads as if they were single-ended. It models the fragment lengths from the "single-end" R1 reads and then extends the read lengths to the average value from the model. Using this mode with paired-end data enables the use of actual fragment lengths, for a more accurate end result.
My question is about the best practice for analyzing PE data with MACS2/3. Is it generally preferable to run MACS2/3 in -f BAM mode (effectively single-end mode) rather than -f BAMPE mode? My understanding from the above snippet is that the fragment length prediction by MACS2/3 (i.e., the determination of d by MACS2/3 run in -f BAM mode) is more accurate than calculating fragment lengths from the ends of the aligned read pairs (i.e., from running MACS2/3 in -f BAMPE mode). Could anyone provide insights or experiences on whether this approach leads to more accurate results in PE ChIP-seq data analysis?
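To make my reading of the two modes concrete, here is a toy Python sketch of the difference as I understand it. It is not MACS2/3 code: the coordinates, the function names, and the value of d are all made up for illustration, and it only considers forward-strand fragments.

```python
# Toy illustration only: hypothetical coordinates, not MACS2/3's actual code.
from statistics import mean

# Each pair: (R1 5' start, fragment end implied by the aligned R2 mate).
pairs = [(100, 280), (150, 390), (200, 350)]

def fragments_single_end(pairs, d):
    """-f BAM-like view: ignore R2 and extend every R1 start by one fixed length d."""
    return [(start, start + d) for start, _ in pairs]

def fragments_paired_end(pairs):
    """-f BAMPE-like view: use the span actually covered by each read pair."""
    return [(start, end) for start, end in pairs]

d = 200  # assumed stand-in for the model's estimate; MACS would derive d from the data
se = fragments_single_end(pairs, d)
pe = fragments_paired_end(pairs)
pe_lengths = [end - start for start, end in pe]

print("single-end style fragments:", se)
print("paired-end style fragments:", pe)
print("mean observed fragment length:", mean(pe_lengths))
```

In this sketch the single-end view gives every fragment the same length d, while the paired-end view keeps a separate observed length for each pair. My question is which of the two leads to better peak calls in practice.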
Thank you in advance for your guidance.
Best,
Kris