I ran DGE on my dataset using DESeq2 and Limma. I used the same dataset, similar settings and same GLM model (~family+condition
). The number of DEGs I get are as below (based on unadjusted p-values <0.05 and BH adjusted q-values <0.05 shown separately):
Tool num_degs_pval num_degs_qval
DESeq2 3155 623
Limma 2853 32
I plotted qqplots using unadjusted pvalues from both DESeq2 and Limma results.
library(snpStats)
qq.chisq(-2*log(deseq2_results$pvalue),df=2,pvals=T,overdisp=T)
qq.chisq(-2*log(limma_results$pvalue),df=2,pvals=T,overdisp=T)
This is what I get:
Now questions;
- Is the qqplot useful in RNA-Seq to estimate DGE quality?
- In this example, does the better fit of Limma mean that the fewer set of DEGs returned by Limma is more reliable?
- Similarly, does the inflated curve for DESeq2 mean that the huge number of DEGs returned by DESeq2 contains more false positives?
Couple of caveats: The function I used qq.chisq
is for chisq tests and I am not sure if that is an issue since my Deseq2 script uses Wald test and Limma uses bayesian eBayes
test. The df
argument in qq.chisq
is set to 2 arbitrarily. I am not sure what value I should plug in here.
Thanks for the input. The sample size is same as before: 24.