DESeq2 compare all levels
1
7
Entering edit mode
6.5 years ago
firestar ★ 1.6k

For RNA-Seq DGE analysis using DESeq2, if I have a factor with let's say 5 levels:

condition
A
A
B
B
C
C
D
D
E
E

The standard workflow

d <- DESeqDataSetFromMatrix(countData=counts,colData=meta,design=as.formula(~condition))
d <- DESeq2::estimateSizeFactors(d,type="ratio")
d <- DESeq2::estimateDispersions(d)
d1 <- nbinomWaldTest(d)
resultsNames(d1)

gives the following contrast combinations:

condition_A_vs_B
condition_A_vs_C
condition_A_vs_D
condition_A_vs_E

and then:

results(d1,name="condition_A_vs_B")

But, how do I get all pairwise combinations of all levels? i.e., the first combinations as well as these below?

B - C
B - D
B - E
C - D
C - E
D - E
differential-gene-expression RNA-Seq DESeq2 • 18k views
ADD COMMENT
0
Entering edit mode

Hi Kevin,

I am using your following code but getting error. I don't no why it is showing. Any help will be appreciated.

log2cutoff <- 2
qvaluecutoff <- 0.05

sigGenes <- unique(c(
  rownames(subset(res1, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
  rownames(subset(res2, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
  rownames(subset(res3, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
))

heat <- assay(rld)[sigGenes,]

Error:
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 'unique': argument 5 is empty
ADD REPLY
0
Entering edit mode

Hi friend,

Can you run either of these commands on their own to see what is produced?

rownames(subset(res1, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff))
rownames(subset(res2, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff))
rownames(subset(res3, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff))

or, even:

subset(res1, padj<=qvaluecutoff & abs(log2FoldChange))
subset(res2, padj<=qvaluecutoff & abs(log2FoldChange))
subset(res3, padj<=qvaluecutoff & abs(log2FoldChange))

What is the output of head(res1), head(res2), and head(res3)?

ADD REPLY
0
Entering edit mode

Hi Kevin,

The output of head (res1), res2, and res3 are the comparison that I have taken for analysis, but when I put this all together and assigned to one variable like same in your code i.e sigGenes and it's giving an error.

This is my comparison:

res1 <- results(dds, contrast=c("condition", "TUB08", "TUB00"))
res2 <- results(dds, contrast=c("condition", "TUB10", "TUB00"))
res3 <- results(dds, contrast=c("condition", "TUB12", "TUB00"))

output example of head(res1):

log2 fold change (MLE): condition TUB08 vs TUB00 
Wald test p-value: condition TUB08 vs TUB00 
DataFrame with 6 rows and 6 columns
baseMean log2FoldChange     lfcSE       stat      pvalue        padj
thrL 5547.2000     0.00245135  0.242725  0.0100993 9.91942e-01 9.94555e-01

I want to generate heatmap of all significantly expressed genes from all the above comparisons but can not proceed because of the error.

ADD REPLY
1
Entering edit mode

Can you please use ADD REPLY to keep this thread organized rather than the answer field which is for, well, answers.

ADD REPLY
31
Entering edit mode
6.5 years ago

To get all pairwise comparisons, it should be as easy as:

BC <- results(dds, contrast=c("condition", "B", "C"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
BC <- lfcShrink(dds, contrast=c("condition", "B", "C"), res=BC)

BD <- results(dds, contrast=c("condition", "B", "D"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
BD <- lfcShrink(dds, contrast=c("condition", "B", "D"), res=BD)

BE <- results(dds, contrast=c("condition", "B", "E"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
BE <- lfcShrink(dds, contrast=c("condition", "B", "E"), res=BE)

CD <- results(dds, contrast=c("condition", "C", "D"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
CD <- lfcShrink(dds, contrast=c("condition", "C", "D"), res=CD)

CE <- results(dds, contrast=c("condition", "C", "E"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
CE <- lfcShrink(dds, contrast=c("condition", "C", "E"), res=CE)

DE <- results(dds, contrast=c("condition", "D", "E"),
  independentFiltering=TRUE, alpha=0.05, pAdjustMethod="BH", parallel=TRUE)
DE <- lfcShrink(dds, contrast=c("condition", "D", "E"), res=DE)

Note that I use lfcShrink, which you should also use if you had previously used the function DESeq() with betaPrior=FALSE

You can also do a LRT (likelihood ratio test if you want to compare all levels simultaneously).

What you can do after this is then concatenate all of the statistically significant genes into a unique list and cluster your samples using these, which should bring out good separation in a sample dendrogram (using the regularised log or variance-stabilised counts).

Kevin

ADD COMMENT
0
Entering edit mode

What you can do after this is then concatenate all of the statistically significant genes into a unique list and cluster your samples >using these, which should bring out good separation in a sample dendrogram (using the regularised log or variance-stabilised counts)

I am very much interested in doing that, but I have no idea how I would go about doing that step-by-step. Could you give me any pointers as to which tools to use for the concatenating and clustering? I would appreciate this very much.

ADD REPLY
5
Entering edit mode

Hey paul, you can do something like this. If you have conducted 3 different differential expression comparisons, you can subset the results objects (res1, res2, res3) to include the statistically significant genes. You then subset your original data matrix for these genes.

log2cutoff <- 2
qvaluecutoff <- 0.05

sigGenes <- unique(c(
  rownames(subset(res1, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
  rownames(subset(res2, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
  rownames(subset(res3, padj<=qvaluecutoff & abs(log2FoldChange)>=log2cutoff)),
))

heat <- assay(rld)[sigGenes,]

Makes sense?

ADD REPLY
1
Entering edit mode

Oh, yes! Absolutely. I did not think it could be that easy. Thank you very much!

ADD REPLY
0
Entering edit mode

in case of LRT what i get is we do get genes that vary across condition. Is it possible to know if a gene is deferentially expressed which condition is it coming from ?

ADD REPLY
1
Entering edit mode

The pairwise comparisons should reveal that, no? - or, could simply plot a box-and-whiskers plot.

ADD REPLY
0
Entering edit mode

okay that sounds simple and straightforward

ADD REPLY
0
Entering edit mode

Hi Kevin

Thanks for the solution you offered for pairwise comparison. During lfcshrinkage, when I try using apeglm method it throws the following error

AL_exp1_vs_SE_exp1 <- lfcShrink(dds, contrast=c("Comparison", "AL_exp1", "SE_exp1"), res=AL_exp1_vs_SE_exp1, type="apeglm")

Error in lfcShrink(dds, contrast = c("Comparison", "AL_exp1", "SE_exp1"),  : 
  type='apeglm' shrinkage only for use with 'coef'

I am unable to understand why I can't use it. I am trying to follow DESeq2 manual where I learned it's better than other shrinkage methods.

ADD REPLY
0
Entering edit mode

Hello, Did you solve this? It happens to me as well! Thanks!

ADD REPLY
2
Entering edit mode

See A: type='apeglm' shrinkage only for use with 'coef'

apeglm needs coefs, explanation in my answer and the DESeq2/apeglm vignettes.

ADD REPLY
0
Entering edit mode

Thank you for the help!

ADD REPLY

Login before adding your answer.

Traffic: 1918 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6