What Is The Meaning Of The Score In Diffbind'S Occupancy/Overlap Analysis?
1
1
Entering edit mode
10.9 years ago
Nick ▴ 290

I have 3 chip-seq biological replicates, each with its own input control. I was interested in using diffBind to performs some IDR-style analysis - e.g. take only the peaks that come up in more than one sample.

experiment <- dba(sampleSheet="exp_samples.csv")
pdf('overlap_venn.pdf')
dba.plotVenn(experiment, experiment$masks$macs)
dev.off()

I got a nice Venn diagram:

enter image description here

I wanted to perform a sanity check and had a look at the peaks occurring only in A (sample1):

experiment.OL = dba.overlap(experiment, experiment$masks$macs)
a_peaks <- as.data.frame(experiment.OL$onlyA, row.names = NULL, optional = FALSE)
> dim(a_peaks)
[1] 500   6
> head(a_peaks)
seqnames   start     end width strand      score
1 GL172637.1  196748  197840  1093      * 0.09639324
..

I decided to check the first peak and, sure enough I found it in the xls file produced for this sample by macs14:

chr    start    end    length    summit    tags    =-10*LOG10(pvalue)    fold_enrichment    FDR(%)
GL172637.1    196748    197840    1093    299    38    62.55    48.57    100

However, what confuses me is the score computed by diffBind (0.09639324). In the diffBind's documentation it says "The scores associated with each site are derived from the peak caller con dence score, and are a measure of con dence in the peak call (occupancy), not a measure of how strong or distinct the peak is." I don't know how to interpret this. Also, it is not clear to me how is this score related to its FDR (100%). Can anyone help?

chipseq macs • 6.4k views
ADD COMMENT
0
Entering edit mode

I have 4 samples each with individual scores in the range of 100s and also a consenus peakset of all 4 samples where i tried to to convert the scores as per the above into decimal range, but it still does not fit. Am i missing something?

ADD REPLY
2
Entering edit mode
10.9 years ago
Nick ▴ 290

Rory (the author of diffbind) kindly replied to a personal message I sent. I'm pasting his reply here:

The default peak score for macs peaks is the "=-10*LOG10(pvalue)" value. It is normalised to a 0..1 scale by dividing the scores by the maximum score (so the max score gets a value of 1).

ADD COMMENT

Login before adding your answer.

Traffic: 2342 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6