Confirmation required on how DiffBind generates union of regions (min. 1bp overlap or gap)?
6 months ago
Ian 6.1k

Firstly, I admit this is a repeat of a question I asked on Bioconductor (https://support.bioconductor.org/p/9158062/), or more precisely of my response to the answer I was given. I am asking again here because my follow-up has remained unanswered for nearly a month. My apologies to the author if there are reasons preventing him from answering.

I would like to know if anyone can either confirm or refute my observation about the default mode that DiffBind uses to generate the initial union of peaks. My understanding has always been that a minimum overlap of 1bp is required for peaks to be merged. I wanted to change config$mergeOverlap to increase the stringency. However, my tests seem to show that the default is a minimum of a 1bp space between regions. My observation is that changing config$mergeOverlap to 1 gives the same answer as the default/unchanged setting, i.e. a gap, not an overlap. My working is shown below. I would appreciate any feedback on this.

EDIT: I would really like to get to the bottom of this. If anyone uses DiffBind please could they check whether my observation is correct or not. Thank you!

This is the default, where config$mergeOverlap = NULL (i.e. without me changing anything):
9 Samples, 134545 sites in matrix (183389 total)

samples_qval <- dba(sampleSheet="sample_sheet_qval.csv", minOverlap=2)  

This is where config$mergeOverlap = 1.
9 Samples, 134545 sites in matrix (183389 total) = same as above.

samples_qval_1bp <- dba(sampleSheet="sample_sheet_qval.csv", minOverlap=2)   
samples_qval_1bp$config$mergeOverlap <- 1  
samples_qval_1bp <- dba(samples_qval_1bp) 

This is where config$mergeOverlap = -1.
9 Samples, 134406 sites in matrix (182919 total):

samples_qval_m1bp <- dba(sampleSheet="sample_sheet_qval.csv", minOverlap=2)  
samples_qval_m1bp$config$mergeOverlap <- -1  
samples_qval_m1bp <- dba(samples_qval_m1bp)  

This is where config$mergeOverlap = -60.
9 Samples, 130414 sites in matrix (171033 total)

samples_qval_30pc <- dba(sampleSheet="sample_sheet_qval.csv", minOverlap=2)  
samples_qval_30pc$config$mergeOverlap <- -60  
samples_qval_30pc <- dba(samples_qval_30pc) 

The DiffBind manual entry for DBA$config$mergeOverlap is:

"The overlap (in basepairs) between peaks to merge when generating a consensus peakset. A positive value controls how many basepairs peaks must overlap to be merged, while a negative value will result in non-overlapping peaks to be merged, If absent, the default value of 1 will result in any peaks overlapping by at least one basepair to be merged into a single interval."

I believe I am seeing the opposite to this.
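To make clear how I read the manual wording, here is a minimal sketch in base R. It is a toy model only, not DiffBind's code: the function merge_intervals, its min_overlap argument and the six test peaks are all made up for illustration, and I am assuming closed, 1-based coordinates.

```r
# Toy interval merge that follows my reading of the manual text.
# This is NOT DiffBind's implementation; all names here are hypothetical.
# Assumes at least one interval and closed, 1-based coordinates.
merge_intervals <- function(start, end, min_overlap = 1) {
  ord <- order(start, end)
  start <- start[ord]
  end <- end[ord]
  out_start <- start[1]
  out_end <- end[1]
  for (i in seq_along(start)[-1]) {
    n <- length(out_end)
    # Bases shared with the current merged interval.
    # 0 means the two abut; -k means k uncovered bases lie between them.
    shared <- min(out_end[n], end[i]) - start[i] + 1
    if (shared >= min_overlap) {
      # Close enough under the threshold: extend the current interval.
      out_end[n] <- max(out_end[n], end[i])
    } else {
      # Otherwise start a new merged interval.
      out_start <- c(out_start, start[i])
      out_end <- c(out_end, end[i])
    }
  }
  data.frame(start = out_start, end = out_end)
}

# Three pairs of made-up peaks: a 1bp overlap, abutting peaks, and a 1bp gap.
peaks <- data.frame(
  start = c(100, 200, 400, 501, 700, 802),
  end   = c(200, 300, 500, 600, 800, 900)
)

# Number of merged regions for a range of thresholds.
sapply(c(2, 1, 0, -1), function(k) {
  nrow(merge_intervals(peaks$start, peaks$end, min_overlap = k))
})
```

In this toy model a value of 1 merges only the pair that shares a base, 0 also merges the abutting pair, and -1 also merges the pair separated by a single base, so the number of regions falls as the value becomes more negative. That is the same direction as the site counts above, but on its own it does not show whether DiffBind treats a value of 1 as an overlap or as a gap. Running a small peak set with known overlaps and gaps through DiffBind itself would be the real test.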


Tagging: Rory Stark


If anyone can corroborate my observations I would be grateful.


Did you get an answer? I am also interested in this point, thanks.


Unfortunately not. Are you also seeing the same?
