Question

Comparing Peaks Coming From Different Peak Calling Programs

12.4 years ago

KCC ★ 4.1k

I have a few concerns about comparing peaks called by different programs, or called by the same program on different replicates:

1. Peak width: how do I handle the case where one program calls a region as a single peak and another calls the same region as two peaks? This makes raw counts hard to interpret: saying program A called 10,000 peaks and program B called 9,000 means little if the two programs split regions differently.
2. How do I define two peaks as being the same peak (so that I can say, for instance, that two replicates called the same peak)? How much should they overlap? Or should they count as the same if they lie within a certain distance of each other, and if so, how do I choose that distance?
3. What is the most typical and accepted way to measure how significantly two sets of peaks overlap? I know I could use GSC, the Bioconductor package Co-occur, or some version of the hypergeometric test. I was hoping for a sense from the community of how commonly each of these approaches is used. What do most people do?
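
To make the first two concerns concrete, here is a minimal sketch of one possible convention, not an established standard: collapse both peak sets into union regions, so that a region called as one peak by one program and as two peaks by the other is counted once. It assumes BED-style half-open coordinates. The function name `union_region_support` and the `max_gap` parameter are hypothetical, chosen for illustration and not taken from any package.

```python
from typing import List, Set, Tuple

Peak = Tuple[str, int, int]               # (chrom, start, end), half-open
Region = Tuple[str, int, int, Set[str]]   # union region plus contributing sets


def union_region_support(set_a: List[Peak], set_b: List[Peak],
                         max_gap: int = 0) -> List[Region]:
    """Merge peaks from both sets into union regions.

    Peaks on the same chromosome are merged when they overlap, touch, or
    are separated by at most `max_gap` bp (hypothetical tolerance).
    Each region records which input set(s) contributed a peak to it.
    """
    tagged = [(c, s, e, "A") for c, s, e in set_a]
    tagged += [(c, s, e, "B") for c, s, e in set_b]

    regions: List[Region] = []
    for chrom, start, end, label in sorted(tagged):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2] + max_gap:
            # Extend the current region and note the contributing set.
            c, s, e, labels = regions[-1]
            regions[-1] = (c, s, max(e, end), labels | {label})
        else:
            regions.append((chrom, start, end, {label}))
    return regions


# Program A calls one wide peak where program B calls two narrow ones.
peaks_a = [("chr1", 100, 500), ("chr1", 1000, 1200)]
peaks_b = [("chr1", 120, 250), ("chr1", 300, 480), ("chr1", 5000, 5200)]

regions = union_region_support(peaks_a, peaks_b)
shared = sum(1 for r in regions if r[3] == {"A", "B"})
a_only = sum(1 for r in regions if r[3] == {"A"})
b_only = sum(1 for r in regions if r[3] == {"B"})
print(shared, a_only, b_only)
```

Counting union regions rather than raw peaks puts both programs on the same footing, but the result still depends on the choice of `max_gap`, which is exactly the distance question raised in the second concern.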

peak-calling • 3.1k views

updated 12.4 years ago by Stephen 2.8k • written 12.4 years ago by KCC ★ 4.1k

score 2 · Answer 1 · 2013-04-15

12.4 years ago

Stephen 2.8k
