Hi all,
I'm running GSEA on RNA-seq differential expression results and I’m wondering what the best way is to rank genes when using the edgeR LRT.
I know that:
- for DESeq2, it's common to rank by log2FoldChange after shrinkage;
- for limma-voom, the t-statistic is often used for ranking.
But edgeR-LRT does not provide a t-stat, so I’m not sure what’s most appropriate. Would using logFC alone be enough?
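You don't necessarily need to run topTags. Even with the glmLRT output, lrt say, the signed square root of the LRT statistic, sign(lrt$table$logFC) * sqrt(lrt$table$LR), is approximately a standard normal z-statistic under the null hypothesis, and would be a good choice for GSEA ranking analyses.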
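As a rough sketch of how such a ranking could be built, the snippet below uses a small made-up data frame as a stand-in for lrt$table (the gene names and values are hypothetical, not real edgeR output). It assumes a single-coefficient or single-contrast test, so that LR has 1 degree of freedom and logFC gives the direction.

```r
# Hypothetical stand-in for the `table` component of a glmLRT() result;
# the gene names and numbers are made up for illustration.
lrt <- list(table = data.frame(
  logFC = c(2.1, -1.4, 0.3, -0.05),
  LR    = c(16.0, 9.0, 1.0, 0.04),
  row.names = c("geneA", "geneB", "geneC", "geneD")
))

# Signed square root of the LR statistic: direction from logFC,
# magnitude from the 1-df likelihood ratio statistic.
z <- sign(lrt$table$logFC) * sqrt(lrt$table$LR)
names(z) <- rownames(lrt$table)

# Sort in decreasing order, the layout that ranked-list GSEA tools usually expect.
ranked <- sort(z, decreasing = TRUE)
print(ranked)
```

With a real analysis, the same two lines computing z should work directly on the object returned by glmLRT, provided the test involves only one coefficient or contrast.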
Just checking that my understanding is correct: GSEA, at least as implemented in clusterProfiler, only uses the rank of the genes, not the quantitative score used for ranking. This should mean that ranking by p-value is equivalent to ranking by t-statistic (or LRT, in the GLM case), and that sqrt(lrt$table$LR) gives the same ranking as lrt$table$LR. However, occasionally I've heard people discussing whether you should use the p-value, -log10(p-value) or the t-statistic... Am I missing something? (It's different for logFC, shrunken logFC and p-value, where the ranking may indeed change.)
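A quick way to check the rank-equivalence part of this is a toy comparison in base R. The LR and logFC values below are made up for illustration. The sketch only compares orderings; it says nothing about how any particular GSEA implementation uses the scores.

```r
# Made-up LR statistics and logFC values, for illustration only.
LR    <- c(16.0, 9.0, 1.0, 0.04, 25.0)
logFC <- c(2.1, -1.4, 0.3, -0.05, -3.0)

# 1-df likelihood ratio test p-values.
p <- pchisq(LR, df = 1, lower.tail = FALSE)

# sqrt() is monotone increasing, so the ranks are identical.
same_rank_sqrt <- identical(rank(sqrt(LR)), rank(LR))

# A smaller p-value corresponds to a larger LR, so ranking by -p matches ranking by LR.
same_rank_p <- identical(rank(-p), rank(LR))

# Adding the sign changes the ordering, because it separates up from down.
z <- sign(logFC) * sqrt(LR)
same_rank_signed <- identical(rank(z), rank(LR))

print(c(sqrt = same_rank_sqrt, pvalue = same_rank_p, signed = same_rank_signed))
```

So the unsigned quantities (LR, sqrt(LR), p-value) order the genes identically, while the signed statistic gives a different ordering because down-regulated genes move to the other end of the list.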