Question

Bias in FDR distribution

0

Entering edit mode

5 months ago

Daniel ▴ 10

I am using the limma package to model gene expression based on a continuous variable while controlling for technical variables. When applying traditional p-value cutoffs (FDR < 0.05), I do not identify any differentially expressed genes (DEGs). However, using a more relaxed FDR threshold of <0.1, I find over 1,700 DEGs.

This prompted me to investigate the p-value distribution, which reveals a substantial deviation from the null hypothesis, indicated by a peak near zero.

enter image description here

Upon further examining the FDR distribution, I observe the following pattern:

enter image description here

FDR < 0.05: 0 DEGs
FDR < 0.06: 196 DEGs
FDR < 0.07: 585 DEGs
FDR < 0.08: 884 DEGs

Has anyone encountered this pronounced increase in the number of DEGs within such a narrow FDR interval? What could be causing this spike?

Here is the structure of the code I am using:

design <- model.matrix(~1 + ContinuousVariable + TechnicalVariables, data = sample.info)
    vwts <- voomWithQualityWeights(dge, design = design, plot = FALSE)
    vfit2 <- lmFit(vwts)
    vfit2 <- eBayes(vfit2)
    res.p <- topTable(vfit2, coef = 2, sort.by = "logFC", number = "inf", p.value = 0.1)

Thank you very much!!

DEG distribution p-value limma • 332 views

ADD COMMENT • link 5 months ago by Daniel ▴ 10

0

Entering edit mode

Is there a typo in your first paragraph? Your plot doesn't match up with their being 17,000 DEGs.