Problem is to design model matrix for edgeR
1
1
Entering edit mode
10.1 years ago

Dear all,

I am trying edgeR to detect DE gene for my time course data. I have one experimental condition with 4 time point, But while using design matrix its give me a error.

My code is: (in short:)

> label

 [1] WT WT WT WT WT WT WT WT WT WT WT WT WT WT WT WT WT
Levels: WT
> Time
 [1] T1 T1 T1 T1 T1 T1 T2 T2 T2 T2 T3 T3 T3 T3 T4 T4 T4
Levels: T1 T2 T3 T4

# Before we fit GMLs, we need to define our design matrix
design <- model.matrix (~Time + label)

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels

Any advice? Thanks in advance guys

edgeR Differential-expression deseq R RNA-Seq • 7.3k views
ADD COMMENT
0
Entering edit mode

As Devon say below label serves no purpose, when trying to generate the design.matrix you need more than one level, other wise R cannot create the model representation.

ADD REPLY
4
Entering edit mode
10.1 years ago

Since label serves no purpose, try design <- model.matrix(~Time)and let us know if that doesn't fix things.

ADD COMMENT
0
Entering edit mode

Thank you very much Ryan....you are fabulous. its work :)))

ADD REPLY
0
Entering edit mode

Dear Ryan:

I am wondering where I have Time 1 as I have 4 time in my series ??

> design
        (Intercept) TimeT2 TimeT3 TimeT4
WT.T1.1           1      0      0      0
WT.T1.2           1      0      0      0
WT.T1.3           1      0      0      0
WT.T1.4           1      0      0      0
WT.T1.5           1      0      0      0
WT.T1.6           1      0      0      0
WT.T2.1           1      1      0      0
WT.T2.2           1      1      0      0
WT.T2.3           1      1      0      0
WT.T2.4           1      1      0      0
WT.T3.1           1      0      1      0
WT.T3.2           1      0      1      0
WT.T3.3           1      0      1      0
WT.T3.4           1      0      1      0
WT.T4.1           1      0      0      1
WT.T4.2           1      0      0      1
WT.T4.3           1      0      0      1
attr(,"assign")
[1] 0 1 1 1
attr(,"contrasts")
attr(,"contrasts")$Time
[1] "contr.treatment"
ADD REPLY
1
Entering edit mode

You specified to keep the intercept, so it's not an explicit coefficient (everything is being compared to it). If you want to use contrasts instead, then use model.matrix(~0+Time).

ADD REPLY
0
Entering edit mode

Dear Devon,

Sorry for late comments for the same question: Advised model.matrix(~0+Time) may I use coef =2:4?

lrt <- glmLRT(fit,coef=2:4) is it ok?

If yes, then I got about 700 genes (miRNAs) with hugest logFC out of ~1200. is it right? I have doubt? Something is wrong!

#Estimate Dispersion
design <- model.matrix (~0 +Time)
rownames(design) <- colnames(y)
design
y <- estimateGLMCommonDisp(y, design, verbose =T)
y <- estimateGLMTrendedDisp(y,design)
y <- estimateGLMTagwiseDisp(y,design)
fit <- glmFit(y,design)
lrt <- glmLRT(fit,coef=2:4)
topTags(lrt)

Coefficient:  TimeT2 TimeT3 TimeT4
                 logFC.TimeT2 logFC.TimeT3 logFC.TimeT4   logCPM       LR          PValue      FDR
mmu-miR-5619-5p    ** -20.45565**    -20.22821    -19.54368 1.159783 43579141     0              0
mmu-miR-28c        ** -20.45565 **   -19.69322    -19.92847 1.161204 43576050        0              0
mmu-miR-6395        **-20.45565**    -19.93606    -20.08351 1.163118 43334784       0              0
.....
ADD REPLY
0
Entering edit mode

If you're going to use model.matrix(~0+Time) then you really need to use contrasts. Otherwise, you end up testing whether you have signal from an miRNA at T2, T3 and/or T4...which is probably not what you wanted to test. If you want to use glmLRT(fit,coef=2:4), then use model.matrix(~Time) instead. This would then ask the question, "Is there an effect of time on a given miRNA." This is likely what you actually want to know.

ADD REPLY
0
Entering edit mode

Thank Devon for you generous advice. I tried the same and got 0 significant miRNA of FDR < 0.05 while having about 51 significant raw pvalue, could you please tell me how it happened.
# noemalization
y <- DGEList(counts=wt,group=label)
#Estimate Dispersion
design <- model.matrix (~Time)
rownames(design) <- colnames(y)
design
y <- estimateGLMCommonDisp(y, design, verbose =T)
y <- estimateGLMTrendedDisp(y,design)
y <- estimateGLMTagwiseDisp(y,design)
fit <- glmFit(y,design)
lrt <- glmLRT(fit,coef=2:4)
topTags(lrt)
> sum(lrt$table$PValue < 0.05)
[1] 51
> sum(lrt$table$FDR < 0.05)
[1] 0

ADD REPLY
0
Entering edit mode

If there were only 51 genes with raw p-values <0.05, then it's entirely expected that you'd have no significant findings. It seems that time has no effect in your dataset.

ADD REPLY
0
Entering edit mode

Do you think, gene filtering step can affect to find the significant genes ??? May i try DESeq2 packages as he has gene filtering steps ??? Do you think its worth to try ??? what edgeR does not have it.

ADD REPLY
0
Entering edit mode

Sure, give it a try. In your case I doubt it'll make a difference, but it won't hurt to try.

ADD REPLY
0
Entering edit mode

Dear Dev,

Could you help me once again?

I tried but got error while using DESeqDataSeq

library(DESeq2)
x= read.table("counts_filt.txt", header=T)
rownames(x) <- x$miRNA
x <- x [, -1]
design=read.table("design_deseq.txt", header=T)
>design
strain time.m replicates        id
WT.T1.1     wt      1         r1        wk
WT.T1.2     wt      1         r1        wk
WT.T1.3     wt      1         r1        wk
WT.T2.1     wt     12         r2      year
WT.T2.2     wt     12         r2      year
WT.T2.3     wt     12         r2      year
WT.T2.4     wt     12         r2      year
WT.T3.1     wt     21         r3   two_years
.......
.................
WT.T4.6     wt     36         r4       old
ddsTc <- DESeqDataSet(x, design= ~design$strain + design$time.m + design$strain:design$time.m)

Error in (function (classes, fdef, mtable)  :
  unable to find an inherited method for function 'assays' for signature '"data.frame"'
ADD REPLY
0
Entering edit mode

Please read help(DESeqDataSet), noting in particular the DESeqDataSetFromMatrix() function.

ADD REPLY
0
Entering edit mode
##countData; I called edgeR's DGEList bcz got error in counts function of DESEQ2

countData <- DGEList(counts=x,genes=x[,1],)

dds <- DESeqDataSetFromMatrix(countData= countData, colData = colData, design = ~ Time + label + Time:label)

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more level

However, I used only Time

dds <- DESeqDataSetFromMatrix(countData= countData, colData = colData, design = ~ Time)

ddsLRT= DESeq (dds, test="LRT", reduced = ~Time)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
**NOTE: fitType='parametric', but the dispersion trend was not well captured by the
  function: y = a/x + b, and a local regression fit was automatically substituted.
  specify fitType='local' or 'mean' to avoid this message next time.
final dispersion estimates
fitting model and testing
Error in nbinomLRT(object, full = full, reduced = reduced, betaPrior = betaPrior,  :
  less than one degree of freedom, perhaps full and reduced models are not in the correct order

Why it happened?

ADD REPLY
1
Entering edit mode

It appears you're trying to get through an analysis without reading any of the documentation for the software. When you encounter an error there is a bit of troubleshooting you should do on your part. For one thing read the help for the function that gave an error, here that means type ?DESeq into the R console and read the meaning of the full and reduced formula.

The help files tell you what kind of object each argument is asking for. By typing class(x) you can see if 'x' is the appropriate object to be giving for each argument.

ADD REPLY
0
Entering edit mode

I seriously doubt that using a DGElist as the input matrix will work properly. Also, you need to show what colData is.

ADD REPLY
0
Entering edit mode

wait I will update it soon, I got it. Thanks for advice.

ADD REPLY

Login before adding your answer.

Traffic: 2175 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6