GSEA preranking metric for RNA Seq
6.9 years ago
bipin ▴ 30

I came across multiple posts regarding the pre-ranking metric for GSEA when using RNA seq data. However, there doesn't seem to be a consensus.

Some of the metrics I came across are:

  • sign of log fold change * -log10(p-value[not adjusted p-val])
  • shrunken log fold change values from DESeq2 (lfcShrink)
  • The built-in Signal2Noise metric from GSEA. However, this cannot be used with fewer than 3 replicates.
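
For concreteness, here is a minimal sketch of the first metric in the list, sign(log fold change) * -log10(raw p-value). The gene names, the numbers, the helper name `signed_log_p` and the `min_p` clamp (added so that a p-value of 0 does not break the logarithm) are all illustrative assumptions, not part of any GSEA or DESeq2 API.

```python
import math

def signed_log_p(log2fc, pvalue, min_p=1e-300):
    """Return sign(log2FC) * -log10(p); p is clamped at min_p to avoid log10(0)."""
    sign = (log2fc > 0) - (log2fc < 0)  # +1, 0 or -1
    return sign * -math.log10(max(pvalue, min_p))

# Hypothetical differential expression results: gene -> (log2FC, raw p-value)
results = {
    "GENE_A": (2.0, 1e-6),
    "GENE_B": (-1.5, 1e-4),
    "GENE_C": (0.3, 0.5),
}

# Sort from most strongly up-regulated to most strongly down-regulated
ranked = sorted(
    ((gene, signed_log_p(lfc, p)) for gene, (lfc, p) in results.items()),
    key=lambda item: item[1],
    reverse=True,
)

for gene, score in ranked:
    print(f"{gene}\t{score:.3f}")
```

The two-column gene/score output is the general shape of a ranked list for a pre-ranked analysis. Check the exact file format your tool expects.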

What metric do you use for ranking the genes, or which one do you know to be widely used?

RNA-Seq gsea deseq2 • 11k views

How should one deal with data in which some genes have a log2 fold change of 0? Is it a good idea to filter these genes out before ranking, since they are essentially not modulated by the treatment?

6.9 years ago

Edit 31st July, 2019: I gave my original answer (below) assuming that you were referring to the general process of gene enrichment (or 'gene-set enrichment analysis'), and not to GSEA, the Broad Institute's PROGRAM that hijacked the term GSEA.

GSEA (the Broad Institute program) accepts a ranked list of genes, as do topGO (R), fgsea (see my comment below), and other enrichment programs; there are too many programs.

---------------------------------------------

It makes sense that there is no consensus, as there are countless ways to do this. My own recommendation would be to:

  1. Set an adjusted P value cut-off
  2. Rank genes based on absolute log (base 2) fold change
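
The two steps above could be sketched as follows. This is only an illustration under stated assumptions: the input is a hypothetical dict of gene -> (log2FC, adjusted p-value), the 0.05 cut-off is just a common default, and `filter_and_rank` is a made-up helper name. Genes whose adjusted p-value is missing (given here as `None`) are dropped.

```python
def filter_and_rank(results, padj_cutoff=0.05):
    """Keep genes passing the adjusted p-value cut-off, then rank by |log2FC|."""
    kept = [
        (gene, log2fc)
        for gene, (log2fc, padj) in results.items()
        if padj is not None and padj <= padj_cutoff
    ]
    # Largest absolute fold change first, regardless of direction
    return sorted(kept, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical results: gene -> (log2FC, adjusted p-value)
results = {
    "GENE_A": (1.2, 0.001),
    "GENE_B": (-3.4, 0.01),
    "GENE_C": (5.0, 0.30),   # fails the cut-off
    "GENE_D": (0.8, None),   # no adjusted p-value available
}

for gene, log2fc in filter_and_rank(results):
    print(f"{gene}\t{log2fc:+.2f}")
```

Note that ranking on the absolute value discards direction, so up- and down-regulated genes end up mixed in the same list.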

I believe the most widely used method is to simply set an adjusted P value and log (base 2) fold change cut-off, and then 'throw' the resulting gene list into GSEA without any ranking.

The lack of consensus on a proper filtering strategy may in part be due to the fact that a substantial proportion of researchers do not pay much attention to the results of GSEA. GSEA results would certainly never stand as the sole evidence in a clinical test, nor would they be sufficient evidence on which to draw conclusions in most reputable journals.

Kevin


What would your answer be if the question were about the Broad Institute's GSEA program?


Rank by fold-change.


How bad would it be to use the -log10(p-value) * log2(foldchange)? Note I mean log2(foldchange) and not its sign.


No issue - at least you maintain directionality, in that case.
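
To make the metric from the question above concrete, here is a small sketch of -log10(p-value) * log2(fold change). The helper name `lfc_weighted_log_p`, the `min_p` clamp and the example values are assumptions for illustration only.

```python
import math

def lfc_weighted_log_p(log2fc, pvalue, min_p=1e-300):
    """Return -log10(p) * log2FC; p is clamped at min_p to avoid log10(0)."""
    return -math.log10(max(pvalue, min_p)) * log2fc

# Hypothetical results: gene -> (log2FC, raw p-value)
results = {
    "GENE_A": (2.0, 1e-6),
    "GENE_B": (-1.5, 1e-4),
    "GENE_C": (0.0, 0.9),
}

scores = {gene: lfc_weighted_log_p(lfc, p) for gene, (lfc, p) in results.items()}

# The score keeps the sign of the fold change, so direction is preserved
for gene, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{gene}\t{score:.3f}")
```

Unlike the sign-only version, this score also scales with the size of the fold change, and a gene with a log2 fold change of exactly 0 scores 0 whatever its p-value.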


I have a query regarding the analysis of GSEA Results. I have used GSEA to obtain the dysregulated KEGG pathways. Now, I want to rank the dysregulated KEGG pathways. So, is it logical to use NES * (-log10 Nominal p-value) or NES * (-log10 FDR q-value) for ranking the KEGG pathways?
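
For reference, the second score proposed above, NES * (-log10 FDR q-value), could be computed as in the sketch below. It only shows what the score does and makes no claim about whether it is a sound way to rank pathways. The pathway names, the values, the helper name `pathway_score` and the `min_q` floor (an assumption, so that a q-value reported as 0 does not break the logarithm) are all hypothetical.

```python
import math

def pathway_score(nes, fdr_q, min_q=1e-4):
    """Return NES * -log10(q); q is floored at min_q (an arbitrary assumption)."""
    return nes * -math.log10(max(fdr_q, min_q))

# Hypothetical GSEA output: pathway -> (NES, FDR q-value)
pathways = {
    "PATHWAY_X": (2.1, 0.001),
    "PATHWAY_Y": (-1.8, 0.0),  # q-value of 0 is floored to min_q
    "PATHWAY_Z": (1.2, 0.2),
}

# Highest (most positive) score first
scored = sorted(
    ((name, pathway_score(nes, q)) for name, (nes, q) in pathways.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, score in scored:
    print(f"{name}\t{score:.3f}")
```

The chosen floor directly caps the score for pathways with a q-value of 0, so it affects the final ordering.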
