Question

DESeq2 results vs. shrinkage results

4 months ago

hagl ▴ 20

Dear community,

I am working with DESeq2 for RNA sequencing analysis. I recently discovered a large disparity between the DGE results and the results after shrinkage. The disparity only appears when I apply a filter for LFC and adjusted p-values. Is this due to a difference in the algorithm, given that I filtered the shrunken results manually?

I would be grateful for any explanation, and for advice on which results I should use for my downstream analysis.

res_padj0.05_LFC_0.5 <- results(dds2, alpha = 0.05, lfcThreshold = 0.5,
                   contrast = c("disease", "Ulcerative colitis", 
                                "non-IBD"))

out of 17808 with nonzero total read count
adjusted p-value < 0.05
LFC > 0.50 (up)    : 1, 0.0056%
LFC < -0.50 (down) : 2, 0.011%
outliers [1]       : 0, 0%
low counts [2]     : 5, 0.028%
(mean count < 0)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

res_shrink <- lfcShrink(dds2, 
                        contrast = c("disease", "Ulcerative colitis", 
                                     "non-IBD"), 
                        type = "normal")

LFC_threshold <- 0.5
padj_threshold <- 0.05 

# Keep genes passing both cutoffs; which() drops rows where padj is NA
keep <- which(res_shrink$padj <= padj_threshold &
              abs(res_shrink$log2FoldChange) >= LFC_threshold)
res_shrink_LFC0.5_padj0.05 <- res_shrink[keep, ]
res_sign <- res_shrink_LFC0.5_padj0.05

summary(res_sign)

out of 319 with nonzero total read count
adjusted p-value < 0.1
LFC > 0 (up)       : 141, 44%
LFC < 0 (down)     : 178, 56%
outliers [1]       : 0, 0%
low counts [2]     : 0, 0%
(mean count < 2)
[1] see 'cooksCutoff' argument of ?results
[2] see 'independentFiltering' argument of ?results

Shrinkage DESeq2 Sequencing RNA • 602 views

updated 4 months ago by i.sudbery 20k • written 4 months ago by hagl ▴ 20

Also note that your alpha values for each method are different.

4 months ago by jared.andrews07 ★ 18k

Although they look different in the resulting output, they should be the same. You are probably referring to the "adjusted p-value < 0.1" in the last section. But I pre-filtered the results with the threshold "padj_threshold <- 0.05", which I thought was equivalent to the approach above.

4 months ago by hagl ▴ 20

No, he refers to alpha, which has a default of 0.1. Since you're not giving an alpha in lfcShrink, the internal results() call uses 0.1, whereas in the first call you use 0.05. Not a big difference, though. The critical part is, as noted by others, that your first test is much more stringent, maybe too stringent if you don't have sufficient power. Note also that normal shrinkage is no longer recommended by the developers; see the DESeq2 vignette.
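
One way to keep alpha consistent is to build the results table yourself and hand it to lfcShrink through its res argument, so that no table is generated internally with the default. The sketch below is only an illustration: it uses a simulated dataset from makeExampleDESeqDataSet as a stand-in for dds2, the coefficient name "condition_B_vs_A" belongs to that simulated design, and type = "normal" is kept only to mirror the question.

```r
library(DESeq2)

# Simulated stand-in for dds2 (assumption: not the poster's data).
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)
dds <- DESeq(dds)

# Build the results table once, with the alpha you actually intend to use.
res05 <- results(dds, name = "condition_B_vs_A", alpha = 0.05)

# Pass that table in so lfcShrink does not create its own with the default alpha.
res_shrunk <- lfcShrink(dds, coef = "condition_B_vs_A",
                        res = res05, type = "normal")

# Give alpha explicitly so the summary header reports the cutoff you intend.
summary(res_shrunk, alpha = 0.05)
```

With the real data, dds2 and the contrast from the question would take the place of the simulated object and coefficient name.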

4 months ago by ATpoint 85k

score 2 · Answer 1 · 2024-07-22

(Not a conclusive answer in any way...) Off the top of my head, I would say that the two procedures are fairly different and it is not surprising that there is some discrepancy. In the first case you test the null hypothesis that |lfc| <= 0.5, so a gene is only called significant if its lfc is demonstrably larger than 0.5 in absolute value. In the second, you test the null that lfc = 0 and afterwards set a threshold on the shrunken lfc. Consider a gene with unshrunken lfc = 1 that is significantly different from 0 but not from 0.5. Say that after shrinkage this lfc becomes 0.6. This gene will not pass the first method, but it will pass the second.
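
The gene in that example can be worked through in base R. This is a rough sketch, not DESeq2's exact code path: the standard error of 0.3 is an assumed value picked for illustration, the shrunken lfc of 0.6 is taken from the text rather than computed, and raw Wald p-values stand in for adjusted ones.

```r
# Hypothetical gene: unshrunken lfc of 1 with an assumed standard error of 0.3.
lfc <- 1.0
lfc_se <- 0.3
threshold <- 0.5

# Wald test of H0: lfc = 0 (the default test behind the shrunken table).
p_zero <- 2 * pnorm(abs(lfc) / lfc_se, lower.tail = FALSE)

# Wald test of H0: |lfc| <= threshold, using a threshold-shifted statistic.
p_thresh <- min(1, 2 * pnorm((abs(lfc) - threshold) / lfc_se,
                             lower.tail = FALSE))

# Shrunken lfc as given in the text (not derived here).
lfc_shrunk <- 0.6

# Method 1: significance is judged against the threshold itself.
passes_method_1 <- p_thresh <= 0.05

# Method 2: significance against zero, then a cutoff on the shrunken lfc.
passes_method_2 <- p_zero <= 0.05 && abs(lfc_shrunk) >= threshold

print(c(method_1 = passes_method_1, method_2 = passes_method_2))
```

Under these assumed numbers the gene is clearly different from zero but cannot be distinguished from 0.5, so only the second filter keeps it.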

I guess the faulty intuition is that a shrunken lfc of 0.6 ought to correspond to an unshrunken lfc significantly different from 0.5, but shrinkage and statistical significance are quite different beasts. In my understanding, the shrunken lfc is the fold change you should believe the most given the data and assumptions. But this doesn't mean that a shrunken (most believable) lfc of 0.6 is incompatible with the hypothesis of the true lfc being < 0.5. (Note that a p-value can be interpreted as a measure of incompatibility with the null, so that a small p-value means "very incompatible" [citation needed]).