Entering edit mode
8.4 years ago
biolab
★
1.4k
Dear all,
Our RNA-seq data has two samples, each of which has three biological replicates. I have two edgeR scripts as follows. I found that they produced different results. My question is: which model (classical linear or GLM) to choose in edgeR analysis?
I wish biostars experts on R analysis could provide me some comments or suggestions. Thank you very much .
- R script 1
d <- read.delim("input", sep = ' ', header=F, row.names=1)
colnames(d) <- c("w1", "w2", "w3", "m1", "m2", "m3")
g <- factor(c(1,1,1,2,2,2))
dge <- DGEList(counts=d,group=g)
dge <- calcNormFactors(dge)
design <- model.matrix(~g)
dge <- estimateDisp(dge, design)
fit <- glmQLFit(dge, design)
qlf <- glmQLFTest(fit, coef=2)
topTags(qlf, n=100000)
R script 2
d <- read.delim("input", sep = ' ', header=F, row.names=1)
colnames(d) <- c("w1", "w2", "w3", "m1", "m2", "m3")
g <- factor(c(1,1,1,2,2,2))
dge <- DGEList(counts=d,group=g)
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)
et <- exactTest(dge)
topTags(et, n=100000)
And the difference between glmFit and glmQLFit is described here:
https://support.bioconductor.org/p/76790/
Hi, b.nota,
Thank you very much for your reply.
I just performed a variety of tests, and found that if changing the code
qlf <- glmQLFTest(fit, coef=2)
in the first R script toqlf <- glmQLFTest(fit)
, then the two scripts produce similar results. Would you please indicate to me the difference of usingcoef=2
or not?I really appreciate your help.
Hi,
The
coef=2
argument means that your second group will be subtracted from the intercept of your design. Why you get similar results without, I don't know. If you want an explanation from the developers of this package you should ask this question on the bioconductor support site.https://support.bioconductor.org/t/Latest/
Hi, b.nota, Thank you very much for your reply. I missed your reply these days.
Maybe you misunderstood me. I mean with
coef=2
and withoutcoef=2
produce very different results. I still cannot figure out "the intercept of your design" in your comment. Doescoef=2
mean group2 vs group1? What doesqlf <- glmQLFTest(fit)
mean if omitting coef=2?I really appreciate your answer. Thanks!
You're question is a pure statistical question, concerning linear models and intercept, etc.
I can try to explain it to you, but I am a biologist with programming skills...
Like I suggested before, try the bioconductor support site. Since edgeR is a bioconductor package, support is given on that site. Usually one of the developers or maintainers can explain you how to use the package correctly, and they respond within a day.
Good luck!
Thank you very much, b.nota. Your comments are very helpful!