why my resultsNames(deseq) is not the same as model.matrix ?
1
0
Entering edit mode
8.4 years ago
unique379 ▴ 120

Dear all, I have three question about RNASeq analysis of my data using DESeq2. Lets say i have 4 groups and wanna see DE for each combinations of the groups (By specifying the contrast in results function of DESeq2). So i used the default setting of DESeq considering intersect/base level (while betaPrior=TRUE; default). But i am wondering why my model.matrix is not the same as condition in dds object. ?

question 1:

dds <- DESeqDataSetFromMatrix(countData= filtered, colData=DataFrame(group), design = ~ group)
deseq <- DESeq(object = dds, fitType = "local", betaPrior=TRUE)
## get contrast
resultsNames(deseq)

[1] "Intercept"     "groupA"   "groupB"   "groupC"
[5] "groupD"

> model.matrix(~group)
   (Intercept)    groupB    groupC   groupD
1            1           0             0           0
2            1           0             0           0
3            1           0             0           0
4            1           1             0           0
5            1           1             0           0
6            1           1             0           0
7            1           0             1           0
8            1           0             1           0
9            1           0             1           0
10           1           0             0           1
11           1           0             0           1
12           1           0             0           1
attr(,"assign")
[1] 0 1 1 1
attr(,"contrasts")
attr(,"contrasts")$group
[1] "contr.treatment"

I was hoping group A must be my base level but after seeing resultsNames(deseq) output, got surprised !!! Why ??? Please someone explain ??

question 2:

can i use design = ~0+group in dds object with betaPrior = FALSE (without using base level) in order to use different different contrast in results function ?? This way i will get all 6 combination of contrast of my 4 groups (A;B;C;D).

question 3:

If Answer is YES for question 1, so I also want to remove batch from my dataset using SVA package. Therefore i need design.matrix as the same as fro deseq design, i tried as follow but got error:

> dds <- DESeqDataSetFromMatrix(countData= filtered, colData=DataFrame(group), design = ~ 0+group)
converting counts to integer mode
> dds <- estimateSizeFactors(dds)
> dat <- counts(dds, normalized=TRUE)
> idx <- rowMeans(dat) > 1
> dat <- dat[idx,]
> ## log2 transform in order to use sva instead svaseq
> dat <- log2(dat)
> mod <- model.matrix(~0+group, colData(dds))
> #mod0 <- cbind(mod[,1])
> sva.dat <- sva(dat, mod, mod0=NULL, n.sv=3)
Number of significant surrogate variables is:  3 
Iteration (out of 5 ):1  
Error in solve.default(t(mod) %*% mod) : 
  system is computationally singular: reciprocal condition number = 1.38778e-17

Any idea, how to use sva having without base model/NULL model ?? I assume this error happened due to model.matrix i.e. mod.

Thanks

RNA-Seq DESeq2 R sva • 4.5k views
ADD COMMENT
0
Entering edit mode

This was cross posted on Bioconductor support:

https://support.bioconductor.org/p/84725/

ADD REPLY
0
Entering edit mode
8.4 years ago
  1. DESeq2 uses an extended model matrix, which has the benefit of making shrinkage symmetric when you're doing contrasts. So the base level in the comparisons will be whatever you want.
  2. I don't think the prior prevents you from using contrasts, nor does the extended matrix intercept (in fact, that's a convenient part of the extended model matrix).
  3. You might try a full model of ~group and a reduced model of ~1, though presumably the SVA documentation has more information on that.
ADD COMMENT
0
Entering edit mode

Thank you Ryan for your kind and efficient answer. I tried with full model then ran deseq wald and LRT test to see the differneces but got very surprising results for the same contrast. How it possible that, Wald Test gave me tags 0 at padj <0.05 while LRT test gave 345 tags at the same level of cut-off. This is really i dont understand why it happened, could you please explain me why or where i am making mistake ?? Here is my code:

dds <- DESeqDataSetFromMatrix(countData= filtered, colData=DataFrame(group), design = ~group)
dds <- estimateSizeFactors(dds)
dat <- counts(dds, normalized=TRUE)
idx <- rowMeans(dat) > 1
dat <- dat[idx,]
## log2 transform in order to use sva instead svaseq
dat <- log2(dat)
## full model
mod <- model.matrix(~group, colData(dds))
## taking first group as intersept
mod0 <- model.matrix(~1, colData(dds))
sva.dat <- sva(dat, mod, mod0, n.sv=3)
## batch corrected dds object
ddssva <- dds
ddssva$SV1 <- sva.dat$sv[,1]
ddssva$SV2 <- sva.dat$sv[,2]
ddssva$SV3 <- sva.dat$sv[,3]
design(ddssva) <- as.formula(~ SV1 + SV2 + SV3 + group)

## Using Wald test (default)
deseq.wald <- DESeq(object = ddssva)
## get contrast
resultsNames(deseq.wald)
**1] "Intercept"     "SV1"           "SV2"           "SV3"          
[5] "groupA"   "groupB"   "groupC" "groupD"**  
res.wald <- results(deseq.wald,contrast = c("group", "B", "A"))
## order
res.wald <- as.data.frame(res.wald)[order(as.data.frame(res.wald)[6]),]
## how many significant tags?
sum(res.wald$padj < 0.05, na.rm=TRUE)
**0**

**## Using LRT test**
deseq.lrt <- DESeq(object = ddssva, test = "LRT", reduced = ~1)
## get contrast
resultsNames(deseq.lrt)
**[1] "Intercept"                "SV1"                     
[3] "SV2"                      "SV3"                     
[5] "group_B_vs_A"   "group_C_vs_A"
[7] "group_D_vs_A"**  
res.lrt <- results(deseq.lrt,contrast = c("group", "B", "A"))
## order
res.lrt <- as.data.frame(res.lrt)[order(as.data.frame(res.lrt)[6]),]
## how many significant tags?
sum(res.lrt$padj < 0.05, na.rm=TRUE)
**345**

I understand that there is problem in contrast of resultsNames(deseq.wald) and resultsNames(deseq.lrt), but why ?? using two different tests made different different contrasts ??

Thank once again for your kind reply.

ADD REPLY
0
Entering edit mode

I'm very much not sure about this, but I suspect that because you used an LRT with a reduced model of ~1 there were no appropriate coefficients for the contrast. Perhaps that's just my naivete, since I only ever think of contrasts in the context of Wald tests (but I'm not a statistician, so perhaps that's why). You might ask Mike Love over on the bioconductor site. He's much more active there.

ADD REPLY

Login before adding your answer.

Traffic: 2209 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6