Please read Section 2.12 "What to do if you have no replicates" of the edgeR User's Guide. You are trying to implement the third suggestion of that section.
Please note the caveats of that section: using edgeR without replicates cannot be considered to be reliable, although it is nevertheless much better than the common practice of using Poisson regression in that circumstance.
The point of the suggestion is that you have to simplify the design matrix temporarily in order to estimate the dispersion, but then you go back to the full design matrix to find DE genes. So you need
group <- factor(c("control","KD"))
design <- model.matrix(~group)
design.reduced <- matrix(1,2,1)
dgList <- estimateGLMCommonDisp(dgList, design.reduced,
method="deviance", robust = TRUE, subset=NULL)
fit <- glmFit(dgList, design)
lrt <- glmLRT(fit)
edgeR_result <- topTags(lrt, n=Inf)
Here is a complete example.
In this simulation, only the first gene is differentially expressed.
> ngenes <- 10000
> mu <- matrix(50, ngenes, 2)
> mu[1,] <- c(10,1000)
> phi <- 0.1
> counts <- matrix(rnbinom(ngenes*2, mu=mu, size=1/phi),ngenes,2)
> dgList <- DGEList(counts)
Then the analysis. Only the first gene is found to be significantly DE:
> dgList <- calcNormFactors(dgList)
> group <- factor(c("control","KD"))
> design <- model.matrix(~group)
> design.reduced <- matrix(1,2,1)
> dgList <- estimateGLMCommonDisp(dgList, design.reduced,
+ method="deviance", robust = TRUE, subset=NULL)
> fit <- glmFit(dgList, design)
> lrt <- glmLRT(fit)
> topTags(lrt)
Coefficient: groupKD
logFC logCPM LR PValue FDR
1 6.82 10.03 56.9 4.58e-14 4.58e-10
2219 3.68 5.93 15.8 7.02e-05 3.19e-01
5944 3.04 6.97 14.6 1.36e-04 3.19e-01
4489 3.15 6.55 14.4 1.44e-04 3.19e-01
4253 -3.06 6.75 14.3 1.60e-04 3.19e-01
2301 -2.92 6.63 13.1 2.99e-04 4.98e-01
5643 2.81 6.77 12.5 4.10e-04 5.29e-01
6091 -2.70 7.20 12.4 4.23e-04 5.29e-01
1834 -2.80 6.25 11.4 7.36e-04 7.04e-01
5112 -2.79 6.09 11.0 8.99e-04 7.04e-01
Dear Gordon - thank you for your complete and comprehensive answer, it helped a lot.
I agree with you that having more replicates s much better than trying edgeR with only two samples - but I have no choice rather than trying this approach.
Thanks again.