Question

Huge difference between P and adj P value

4.3 years ago

Dr SKY ▴ 30

Hello friends, I recently analysed an RNA-Seq dataset and obtained DEGs using DESeq2. Everything was fine until I noticed that with a P value cutoff of 0.05 I am left with 1,700 DEGs, whereas with a padj cutoff of 0.05 I am left with only 127 DEGs. Isn't this surprising? How can there be such a huge difference in the number of significant genes just from using the adjusted P value?

Thank you to anyone who can spare a little time and energy to guide me. I am stuck!

rna-seq • 1.7k views

updated 4.3 years ago by i.sudbery 21k • written 4.3 years ago by Dr SKY ▴ 30

Check the expression levels of your DEGs. Genes with robust expression give more robust statistics (padj). Does this mean that the more weakly expressed genes aren't differentially expressed? No, but we can't establish it statistically. One way of dealing with this is to add more replicates (look for more experiments in public databases such as GEO).

4.3 years ago by jordi.planells ▴ 480

Thank you for your reply

4.3 years ago by Dr SKY ▴ 30

score 8 · Accepted Answer · 2021-02-01

4.3 years ago

i.sudbery 21k

It's completely expected. Under classic multiple testing correction (Bonferroni), with around 20,000 genes tested, you'd expect your adjusted P-values to be 20,000 times larger than your non-adjusted values (capped at 1). Obviously it's not as bad as this because we don't use Bonferroni in RNA-seq, but the multiple testing burden is still high.
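To make the difference between the two kinds of correction concrete, here is a minimal Python sketch. It is not DESeq2's code: the function names and the ten P-values are made up for illustration, and the second function is a plain implementation of the Benjamini-Hochberg procedure, which is the adjustment DESeq2 applies by default.

```python
def bonferroni(pvals):
    """Multiply every P-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    # Indices of the P-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, keeping the adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P-values for ten genes (illustration only).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

n_raw = sum(p < 0.05 for p in pvals)
n_bh = sum(p < 0.05 for p in benjamini_hochberg(pvals))
n_bonf = sum(p < 0.05 for p in bonferroni(pvals))
print(f"raw: {n_raw}, BH: {n_bh}, Bonferroni: {n_bonf}")
```

Even with only ten tests, fewer genes pass after Benjamini-Hochberg than on the raw P-values, and fewer still after Bonferroni. With tens of thousands of genes the gap grows accordingly.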

Another way of looking at it is that under a standard hypothesis test, we would expect 5% of the tests where nothing is really changing to give a false positive. With around 20,000 genes tested, that's around 1,000 genes, so we would expect the number of genes passing a 5% threshold to be at least 1,000 fewer for an adjusted P-value.
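The arithmetic can be checked with a small simulation. This is only a sketch under simplifying assumptions: it takes 20,000 genes, treats every one of them as truly unchanged (so the P-values are uniform between 0 and 1), and uses a hand-written Benjamini-Hochberg adjustment rather than DESeq2 itself. All names are hypothetical.

```python
import random


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, keeping the adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


N_GENES = 20_000  # assumed number of genes tested
ALPHA = 0.05

# Expected false positives at an unadjusted cutoff if no gene truly changes.
expected_false_positives = ALPHA * N_GENES

# Simulate the all-null case: P-values are uniform on [0, 1).
rng = random.Random(0)  # fixed seed so the run is repeatable
null_pvals = [rng.random() for _ in range(N_GENES)]

n_raw = sum(p < ALPHA for p in null_pvals)
n_adj = sum(p < ALPHA for p in benjamini_hochberg(null_pvals))

print(f"expected false positives at p < {ALPHA}: {expected_false_positives:.0f}")
print(f"simulated genes with raw p < {ALPHA}: {n_raw}")
print(f"simulated genes with padj < {ALPHA}: {n_adj}")
```

The raw count should land close to 1,000 even though no gene is differentially expressed in this simulation, while almost none survive the adjustment. Real data contains some true differences as well, which is why a result like 1,700 versus 127 is plausible.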