DESeq2: Multiple groups & Cook's distance cutoff
4.8 years ago

Hello everyone,

I've looked through many threads, but none clearly answers my question. In a classical multi-group RNA-seq analysis, does the outlier flagging based on Cook's distance take into account which groups you are comparing?

Here's my experiment: 4 different groups (3 replicates each, no batch effect, etc.), run through the DESeq2 pipeline with results(dds, contrast=c('condition','group1','group2')). I saw that a gene of interest was getting NA for both pvalue and padj. To understand why, I looked at the normalized counts, raw counts and Cook's distances. On the last metric, one of the group 3 samples is clearly an outlier. Is that why I can't get a p-value for this gene, even though I'm currently comparing groups 1 and 2, not group 3?
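For reference, the kind of check described above can be sketched roughly as follows. This is only a minimal illustration on simulated counts, not my real data: the gene and sample names, the planted outlier in sample s7 (group 3) and the simulation settings are all made up for the example.

```r
library(DESeq2)

set.seed(1)
n_genes <- 200

# Hypothetical toy counts: 4 groups x 3 replicates, gene-specific means
gene_mu <- 2^runif(n_genes, 5, 10)
counts <- matrix(rnbinom(n_genes * 12, mu = rep(gene_mu, 12), size = 10),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)),
                                 paste0("s", 1:12)))
coldata <- data.frame(
  condition = factor(rep(paste0("group", 1:4), each = 3)),
  row.names = colnames(counts))

# Plant one extreme count for gene1 in a group3 sample (s7)
counts["gene1", "s7"] <- 50000L

dds <- DESeqDataSetFromMatrix(counts, coldata, design = ~ condition)
dds <- DESeq(dds)

# Per-sample Cook's distances for the gene of interest
cooks_gene1 <- assays(dds)[["cooks"]]["gene1", ]
print(round(cooks_gene1, 2))

# The contrast is group1 vs group2, but the model was fit on all 12 samples
res <- results(dds, contrast = c("condition", "group1", "group2"))
print(res["gene1", c("pvalue", "padj")])
```

The point of the sketch is that the Cook's distances come from the model fitted on all samples, so they are the same whichever contrast is extracted afterwards.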

Does anyone have a clue how to avoid this effect?

Thanks!

RNA-Seq Deseq2 • 1.6k views
4.8 years ago

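From my understanding[*], DESeq2 manages the "multiple testing problem" by performing an initial filtering of the gene list using (what should be) a distinct statistical criterion, i.e. the gene list is "independently filtered". The idea is that genes that are unlikely to produce a low p-value are removed from the multiple-testing adjustment beforehand. Genes that have been independently filtered are given an adjusted p-value (padj) of NA, while their raw p-value is kept. NA in both pvalue and padj, as in your case, points instead to the Cook's distance outlier flag (or to a gene with zero counts in all samples).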

You can turn off independent filtering in the call to the results() function, or increase the alpha threshold.

I naively expect you should lose power by doing either of those things.
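As a rough sketch of what switching these filters off looks like, here is a small example on the simulated data set that ships with DESeq2 (two conditions, A and B, rather than the four groups in the question). The independentFiltering argument is the one referred to above; cooksCutoff is an addition of mine, it is the results() argument that controls the Cook's distance flag. The helper count_na is hypothetical and only there to tabulate the NAs.

```r
library(DESeq2)

set.seed(42)
# Simulated example data from DESeq2: 500 genes, 12 samples, conditions A and B
dds <- makeExampleDESeqDataSet(n = 500, m = 12)
dds <- DESeq(dds)

contrast <- c("condition", "B", "A")

# Default: independent filtering and Cook's distance flagging both active
res_default <- results(dds, contrast = contrast)

# Independent filtering off: padj is NA only where pvalue is NA
res_nofilter <- results(dds, contrast = contrast,
                        independentFiltering = FALSE)

# Cook's distance flagging off: outlier genes keep their p-value
res_nocooks <- results(dds, contrast = contrast, cooksCutoff = FALSE)

# Hypothetical helper: number of NA values per column
count_na <- function(res) {
  c(pvalue = sum(is.na(res$pvalue)), padj = sum(is.na(res$padj)))
}

print(rbind(default  = count_na(res_default),
            nofilter = count_na(res_nofilter),
            nocooks  = count_na(res_nocooks)))
```

Comparing the NA counts between the three calls shows which filter is responsible for a missing value. A p-value recovered with cooksCutoff = FALSE should be checked against the raw counts (e.g. with plotCounts) before it is trusted.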

[*] My understanding of DESeq2 is, in its entirety: independent filtering and negative binomial Wald test with sample/condition-blind estimates of genewise dispersion
