Bias in FDR distribution
0
0
Entering edit mode
5 months ago
Daniel ▴ 10

I am using the limma package to model gene expression based on a continuous variable while controlling for technical variables. When applying traditional p-value cutoffs (FDR < 0.05), I do not identify any differentially expressed genes (DEGs). However, using a more relaxed FDR threshold of <0.1, I find over 1,700 DEGs.

This prompted me to investigate the p-value distribution, which reveals a substantial deviation from the null hypothesis, indicated by a peak near zero.

enter image description here

Upon further examining the FDR distribution, I observe the following pattern:

enter image description here

FDR < 0.05: 0 DEGs
FDR < 0.06: 196 DEGs
FDR < 0.07: 585 DEGs
FDR < 0.08: 884 DEGs

Has anyone encountered this pronounced increase in the number of DEGs within such a narrow FDR interval? What could be causing this spike?

Here is the structure of the code I am using:

design <- model.matrix(~1 + ContinuousVariable + TechnicalVariables, data = sample.info)
    vwts <- voomWithQualityWeights(dge, design = design, plot = FALSE)
    vfit2 <- lmFit(vwts)
    vfit2 <- eBayes(vfit2)
    res.p <- topTable(vfit2, coef = 2, sort.by = "logFC", number = "inf", p.value = 0.1)

Thank you very much!!

DEG distribution p-value limma • 332 views
ADD COMMENT
0
Entering edit mode

Is there a typo in your first paragraph? Your plot doesn't match up with their being 17,000 DEGs.

ADD REPLY
0
Entering edit mode

Yes, thanks. It was supposed to be 1,700 and has been fixed.

ADD REPLY

Login before adding your answer.

Traffic: 1506 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6