Question

Using GAGE to Analyze Pathway Enrichment Directly from Fold Change Data

7.2 years ago

JMallory • 0

I have been using the following tutorial by Stephen Turner and Will Bush to look at some RNA-seq data.

http://www.gettinggeneticsdone.com/2015/12/tutorial-rna-seq-differential.html

Looking into GAGE's documentation, it looks like they are using it in a somewhat non-standard way. Specifically, it looks like they are using it to conduct a GSEA-esque analysis, feeding it a vector of fold changes annotated by Entrez IDs and looking for enrichment within pathways contained in the kegg.sets.hs object.

Were this a standard GSEA analysis, I would order transcripts by log2 fold change prior to analysis. In this use case of GAGE, should transcripts also be rank ordered prior to analysis? Running it both ways appears to make a large difference, at least in the case of my data.

GAGE Gene Ontology Analysis RNA-Seq • 2.8k views

score 0 · Answer 1 · 2017-10-04

7.2 years ago

tarek.mohamed ▴ 370

Hi,

Could you please clarify in detail which two approaches you used for your analysis?

Tarek

score 0 · Answer 2 · 2017-10-06

7.1 years ago

h.mon 35k

"Running it both ways appears to make a large difference, at least in the case of my data."

By "running it both ways", do you mean you tried both ordered and unordered logFC vectors? I had the same question in the past; when I ran GAGE with ordered and unordered logFC vectors, the results were the same.
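To illustrate why input order should not matter when a tool matches genes to pathways by ID, here is a minimal Python sketch. It is not GAGE's code or its statistic: `set_shift`, the toy fold changes, and the `cell_cycle` set are all hypothetical, and the score is just the mean log2 fold change of the set minus the mean over all genes. The only point is that a score computed by ID lookup comes out the same for a rank-ordered and a shuffled vector.

```python
import random
from statistics import mean

def set_shift(fold_changes, gene_set):
    """Mean log2FC of the set members minus the mean over all genes.

    Genes are matched to the set by ID, never by their position in the input.
    """
    lookup = dict(fold_changes)  # gene ID -> log2 fold change
    members = [lookup[g] for g in gene_set if g in lookup]
    return mean(members) - mean(lookup.values())

# Toy data: (Entrez-style ID, log2 fold change); values are made up.
fold_changes = [
    ("1017", 1.8), ("1019", 1.2), ("595", 0.9),
    ("7157", -0.4), ("672", -1.1), ("5925", 0.1),
]
cell_cycle = {"1017", "1019", "595"}  # hypothetical gene set

# Same genes, two orderings: ranked by fold change vs. shuffled.
by_rank = sorted(fold_changes, key=lambda pair: pair[1], reverse=True)
shuffled = fold_changes[:]
random.Random(0).shuffle(shuffled)

score_ranked = set_shift(by_rank, cell_cycle)
score_shuffled = set_shift(shuffled, cell_cycle)
print(score_ranked, score_shuffled)
```

If the two orderings give different results in practice, the input itself is a likely suspect, for example gene IDs that appear more than once.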

Yes. My understanding now is that the difference may be caused by gene IDs in a given ontology list mapping to multiple transcripts of the same gene in my data. I believe GAGE expects a one-to-one mapping between a gene and a measure of DE, not several measures of DE (say, for different isoforms of a gene) that all map to the same gene ID in the ontology list.
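One way to get back to a one-to-one mapping is to collapse transcript-level values to a single value per gene ID before running the enrichment. The Python sketch below is only an illustration under assumptions: `collapse_to_genes` is a hypothetical helper, the data are made up, and keeping the fold change with the largest absolute value is just one possible rule (averaging, or keeping the most highly expressed isoform, are alternatives). It is not something GAGE does for you.

```python
def collapse_to_genes(records):
    """Keep, for each gene ID, the log2 fold change with the largest magnitude."""
    best = {}
    for gene_id, log2fc in records:
        if gene_id not in best or abs(log2fc) > abs(best[gene_id]):
            best[gene_id] = log2fc
    return best

# Toy transcript-level results: several isoforms share one gene ID.
transcript_level = [
    ("7157", 0.3), ("7157", -1.6),  # two isoforms of the same gene
    ("672", 1.1),
    ("1017", 0.8), ("1017", 0.2),
]

gene_level = collapse_to_genes(transcript_level)
print(gene_level)  # one entry per gene ID
```

After collapsing, every gene ID appears exactly once, so the order of the input can no longer decide which duplicate gets picked up.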

In this sense, it does not appear to be the best approach for RNA-seq data, and it certainly doesn't account for things like transcript length and expression-level biases. I have since started working with ontology enrichment tools such as GOseq that are tailored specifically to RNA-seq data. It is a shame, because Pathview looked very nice for generating easily understood figures.

ADD REPLY • link 7.1 years ago by JMallory • 0