interpreting results from pathway analysis
16 months ago
kng ▴ 40

I am performing pathway analysis on RNA-seq results, using clusterProfiler in R with KEGG pathways. I obtained two sets of results using

gseKEGG(geneList      = my_gene_list,
        organism      = kegg_organism,
        nPerm         = 50000,
        minGSSize     = 3,
        maxGSSize     = 800,
        pvalueCutoff  = 0.05,
        pAdjustMethod = "none",
        keyType       = "ncbi-geneid")

(1) my_gene_list was a filtered list of the top ~400 genes with the highest log2 fold change.

(2) my_gene_list was the entire gene list (~18K genes), sorted by log2 fold change.

I was expecting both methods to report the same set of pathways as the top activated/suppressed pathways. Instead, method 1 puts some pathways of interest at the top of the list, while the results from method 2 look mostly like garbage. How do I interpret the results from these two approaches?

RNA-seq kegg clusterprofiler GSEA pathway-analysis • 1.5k views

You need to use the full ranked gene list for gseKEGG/GSEA analysis. For over-representation analysis, use a subset of the genes with enrichKEGG() (or enrichMKEGG() if you want KEGG modules rather than pathways).

I have a video on it too, if you want to check it out.
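
To make the distinction concrete, here is a minimal sketch of the two input shapes. It assumes a hypothetical results table `res` with Entrez IDs, a ranking score, and adjusted p-values; the values, the column names, and the cutoffs are made up for illustration. The clusterProfiler calls are left as comments because they need the package and network access to KEGG, and `organism = "hsa"` is an assumption.

```r
# Toy differential-expression table (hypothetical values, illustration only)
res <- data.frame(
  entrez = c("1001", "1002", "1003", "1004", "1005"),
  score  = c(2.5, -1.8, 0.1, -0.05, 3.2),   # whatever ranking score you choose
  padj   = c(0.001, 0.01, 0.80, 0.95, 0.0005),
  stringsAsFactors = FALSE
)

# GSEA input: every tested gene, as a named numeric vector sorted decreasingly
gsea_input <- sort(setNames(res$score, res$entrez), decreasing = TRUE)

# ORA input: only the IDs of genes passing a cutoff (assumed thresholds),
# with all tested genes as the background ("universe")
ora_genes    <- res$entrez[res$padj < 0.05 & abs(res$score) > 1]
ora_universe <- res$entrez

# With clusterProfiler installed, the calls would look roughly like:
# library(clusterProfiler)
# gsea_res <- gseKEGG(geneList = gsea_input, organism = "hsa")
# ora_res  <- enrichKEGG(gene = ora_genes, universe = ora_universe,
#                        organism = "hsa")
```

Passing the ~400-gene subset to gseKEGG() mixes the two designs: GSEA expects the whole ranked distribution, while a filtered ID list belongs in the over-representation test.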

16 months ago

Here are some thoughts:

1- When using GSEA, the input has to be a ranked list of all the genes obtained from RNA-seq, so you should not pass a subset such as the top ~400 genes to the function.

2- To rank the genes, use a metric that reflects both direction and statistical support, e.g., the Wald statistic from DESeq2. Ranking by log2FC alone is best avoided because it ignores statistical uncertainty, so you may end up assigning higher ranks to genes with less statistical support (and if you rank by its absolute value, you also lose the direction of dysregulation). If you would like to use log2FC for ranking, you can define a new metric that combines its sign with the p-value, like: metric = -log10(p-value) * sign(log2FC)

So neither of the approaches you mentioned gives you the correct output.
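
As a rough illustration of point 2, the sketch below builds a signed significance score and the sorted named vector that GSEA functions expect. The table `res`, its column names, and its values are hypothetical. Multiplying by `sign(log2FC)` gives the same value as dividing by it for any nonzero fold change, and avoids a division by zero when log2FC is exactly 0.

```r
# Toy DESeq2-like results (hypothetical values, illustration only)
res <- data.frame(
  entrez = c("1001", "1002", "1003", "1004"),
  log2FC = c(4.0, -2.0, 0.5, -3.0),
  pvalue = c(0.20, 1e-8, 0.50, 1e-4),
  stringsAsFactors = FALSE
)

# Signed significance score: magnitude from the p-value, direction from log2FC
res$metric <- -log10(res$pvalue) * sign(res$log2FC)

# Drop genes with a missing score, then build the decreasingly sorted vector
res    <- res[!is.na(res$metric), ]
ranked <- sort(setNames(res$metric, res$entrez), decreasing = TRUE)
print(ranked)
```

In this toy table, sorting by log2FC alone would put gene 1004 (log2FC = -3.0) at the bottom of the list, whereas the combined metric puts gene 1002 there, since it has the strongest statistical support among the down-regulated genes.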


Hamid Ghaedi, can you lower "exponent" and "eps", or even set "exponent = 0" and "eps = 0", in the parameters? I have noticed that changing these values makes a difference in the results. Could you comment on the importance of these parameters?
