Question

HOMER findPeaks: what does -fdr actually do?

5 weeks ago

mcsimenc ▴ 20

I refer to the HOMER online documentation for findPeaks here: http://homer.ucsd.edu/homer/ngs/peaks.html

Under "Peak Filtering options", the -fdr flag is listed as:

-fdr <#> (False discovery rate, default = 0.001)

Further information about this setting appears in the section "Identification of Putative Peaks":

HOMER now assumes the local density of tags follows a Poisson distribution, and uses this to estimate the expected peak numbers given the input parameters much more quickly. Using the expected distribution of peaks, HOMER calculates the expected number of false positives in the data set for each tag threshold, setting the threshold that beats the desired False Discovery Rate specified by the user (default: 0.001, "-fdr <#>").

and,

It is important to note that this false discovery rate controls for the random distribution of tags along the genome, and not any other sources of experimental variation. Alternatively, users can specify the threshold using "-poisson <#>" to calculate the tag threshold that yields a cumulative poisson p-value less than provided or "-tagThreshold <#>" to specify a specific number tags to use as the threshold.

From this, my guess is that the program adjusts the actual tag depth before calling peaks, but it is not clear to me that this is what happens. Does anybody know how this FDR procedure actually affects the data, and whether there is additional documentation explaining it?
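To make the question concrete, here is one possible reading of the quoted passage as a minimal Python sketch. It is not HOMER's actual code. It assumes the genome is split into genome_size / peak_size independent windows, that the tags per window follow a Poisson distribution with mean total_tags * peak_size / genome_size, and that the FDR at a given tag threshold is the expected number of chance peaks divided by the observed number of putative peaks. The function and parameter names (poisson_sf, fdr_tag_threshold, peak_tag_counts) are hypothetical and chosen only for illustration.

```python
import math


def poisson_sf(k, lam):
    """Upper tail P(X >= k) for X ~ Poisson(lam), summed term by term."""
    if k <= 0:
        return 1.0
    # Start from P(X = k), computed in log space for stability.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    i = k
    # Add successive terms until they no longer change the sum.
    while term > 0.0 and term > total * 1e-16:
        total += term
        i += 1
        term *= lam / i
    return min(total, 1.0)


def fdr_tag_threshold(peak_tag_counts, total_tags, genome_size, peak_size, fdr=0.001):
    """Smallest tag threshold whose estimated FDR falls below `fdr`.

    Assumed model (not taken from HOMER's source): the estimated FDR at
    threshold t is (expected chance peaks with >= t tags) divided by
    (observed putative peaks with >= t tags).
    """
    lam = total_tags * peak_size / genome_size  # expected tags per window
    n_windows = genome_size / peak_size         # number of windows tested
    for t in range(1, max(peak_tag_counts) + 1):
        expected_false = n_windows * poisson_sf(t, lam)
        observed = sum(1 for c in peak_tag_counts if c >= t)
        if observed > 0 and expected_false / observed < fdr:
            return t
    return None  # no threshold reaches the requested FDR


# Toy data: 1000 strong putative peaks (20 tags) and 5000 weak ones (3 tags).
counts = [20] * 1000 + [3] * 5000
threshold = fdr_tag_threshold(counts, total_tags=1_000_000,
                              genome_size=2_000_000_000, peak_size=200)
print(threshold)
```

Under this reading, the tag counts themselves would be left untouched: -fdr would only determine the minimum number of tags a putative peak needs in order to be kept. Whether that matches what findPeaks really does is exactly what I am asking.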

homer atac-seq calling peak dap-seq chip-seq • 105 views
