Question regarding MACS2 approach of peak calling
15 months ago
rkc5 • 0

I understand that MACS2 uses a Poisson distribution to compute the p-value for the enrichment of a peak signal (observed) relative to the local control signal (expected). My question is: can we compare p-values across experiments with different sequencing depths?

An experiment with higher sequencing depth will have not only a higher peak signal but also higher noise, so the p-value will differ, as the two cases below demonstrate.

Case 1) Observed = 40, expected = 30, p-value = 0.03230957

Now increase the sequencing depth 10x for case 2. The observed and expected signals also increase roughly 10x, so the p-value becomes much more significant even though the observed/expected ratio is unchanged.

Case 2) Observed = 400, expected = 300, p-value = 0.00000001639443
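For anyone who wants to reproduce the two cases, here is a minimal Python sketch using only the standard library. It assumes the p-value is the strict upper Poisson tail P(X > observed), which appears to match the numbers quoted above; `poisson_upper_tail` is a hypothetical helper written for this illustration, not a MACS2 function. It also prints the observed/expected ratio, which stays the same in both cases while the p-value changes.

```python
import math

def poisson_upper_tail(observed, expected):
    """P(X > observed) for X ~ Poisson(expected), summed directly in the upper tail."""
    total = 0.0
    k = observed + 1
    while True:
        # Work in log space so large counts do not overflow.
        log_term = -expected + k * math.log(expected) - math.lgamma(k + 1)
        term = math.exp(log_term)
        total += term
        # Past the mean the terms shrink steadily; stop once they no longer matter.
        if k > expected and term <= total * 1e-17:
            break
        k += 1
    return total

# The two cases from the question: (observed, expected)
cases = [(40, 30), (400, 300)]

for observed, expected in cases:
    p = poisson_upper_tail(observed, expected)
    fold = observed / expected
    print(f"observed={observed} expected={expected} fold={fold:.3f} p={p:.3e}")
```

The ratio is about 1.33 in both cases, so an effect-size measure like this does not move with depth the way the p-value does.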

In my research I need to compare different ChIP-seq experiments with different sequencing depths (varying from 100 million to 10 billion). How can I compare the signal enrichment between two ChIP-seq experiments with different sequencing depths?

MACS2 sequencing-depth chipseq • 746 views

The p-values depend on both the depth and the local signal-to-noise ratio, so I generally would not compare them. If you really want to make statements about differences between peaks, use a dedicated framework such as limma for a differential analysis, for example to compare treatment groups.

Can you elaborate on what exactly you need? Please give details.


I am interested in a meta-analysis of two proteins, protein A and protein B, and want to find their binding patterns at sites across the genome. I have collected their ChIP-seq data from all types of cell lines and merged them, so it's a meta-analysis. The total sequencing depth differs significantly between the two: protein A has a depth of 600 million, whereas protein B has a depth of 2000 million.

I have noticed that once the sequencing depth reaches around 500 million, even a slightly elevated noise peak gets a significant enrichment p-value, for the same reason I described in my question.

Contrary to what you suggested, I am not interested in treatment groups or any particular context. I am interested in a general, collective overview of all contexts, which is why I merge ChIP-seq data from all of them.


Merging data is not a meta-analysis. A meta-analysis means that you, for example, take the statistics from many groups and then apply a method that checks which genomic sites consistently rank highly in the individual analyses, RobustRankAggregation for example. Alternatively, it combines p-values (not discussing whether this makes sense here) into a consensus p-value with methods such as Stouffer's or Fisher's.
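To make the p-value combination idea concrete, here is a rough Python sketch of Fisher's method and Stouffer's unweighted Z method using only the standard library. The function names and the example p-values are made up for illustration; they do not come from any real experiment or package, and whether combining peak p-values is appropriate here is a separate question.

```python
import math
from statistics import NormalDist

def fisher_combined(p_values):
    """Fisher's method: -2 * sum(ln p) follows a chi-square with 2k degrees of freedom."""
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # half of the chi-square statistic
    # Chi-square survival function for even df = 2k has a closed form:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

def stouffer_combined(p_values):
    """Stouffer's unweighted Z method for one-sided p-values (each strictly between 0 and 1)."""
    nd = NormalDist()
    z = sum(nd.inv_cdf(1.0 - p) for p in p_values) / math.sqrt(len(p_values))
    return 1.0 - nd.cdf(z)

# Hypothetical per-dataset p-values for one genomic site
site_p_values = [0.04, 0.20, 0.01, 0.07]

print("Fisher:  ", fisher_combined(site_p_values))
print("Stouffer:", stouffer_combined(site_p_values))
```

Both functions take the per-dataset p-values for a single site, so each dataset is analysed at its own depth first and only the resulting statistics are pooled.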
