Question

clusterProfiler gseGO and GSEA

1

4.9 years ago

anon ▴ 10

Dear all,

I have been analysing RNA-seq data and I want to do gene set enrichment analysis with the clusterProfiler package. I used DESeq2 to identify differentially expressed genes, setting lfcThreshold = 1 in the call to results(). I then created a vector of log2 fold changes named by Entrez ID. I thought that the gseGO function from clusterProfiler was the same thing as the GSEA function from clusterProfiler. Am I wrong? I ran gseGO on my sorted list of log2 fold changes, and then I ran GSEA on the same list with TERM2GENE set to a gene set collection downloaded from the Broad Institute's website (c5: GO).

Basically, this is what I did:

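library(clusterProfiler)
library(org.Hs.eg.db)

# foldchanges: numeric vector of log2 fold changes, named by Entrez ID
# and sorted in decreasing order

# GSEA against the GO annotation shipped with org.Hs.eg.db
gseaGO1 <- gseGO(geneList = foldchanges, 
                OrgDb = org.Hs.eg.db, 
                ont = 'ALL', 
                nPerm = 1000, 
                minGSSize = 10, 
                pvalueCutoff = 0.05,
                verbose = FALSE) 

# GSEA against the MSigDB c5 (GO) collection read from a GMT file
c5 <- read.gmt("c5.all.v7.0.entrez.gmt")
gseaGO2 <- GSEA(foldchanges, 
             TERM2GENE = c5, 
             minGSSize = 10,
             nPerm = 1000, 
             pvalueCutoff = 0.05,
             verbose = FALSE)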
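Both calls expect the ranked input to be a named numeric vector sorted in decreasing order. As a minimal sketch of how such a vector could be built in base R: the `res` data frame below is a made-up stand-in for a DESeq2 results table, and its column names and Entrez IDs are assumptions for illustration only. Keeping the entry with the largest absolute fold change for duplicated IDs is one possible choice, not the only one.

```r
# Toy stand-in for a DESeq2 results table; IDs and values are made up.
res <- data.frame(
  entrez = c("7157", "1956", "672", "7157", NA),
  log2FoldChange = c(1.8, -2.3, 0.4, 1.2, 3.0),
  stringsAsFactors = FALSE
)

# Drop rows that lack an Entrez ID or a fold change.
res <- res[!is.na(res$entrez) & !is.na(res$log2FoldChange), ]

# Keep one row per gene: order by absolute fold change, then drop repeats.
res <- res[order(abs(res$log2FoldChange), decreasing = TRUE), ]
res <- res[!duplicated(res$entrez), ]

# Named numeric vector, sorted in decreasing order.
foldchanges <- res$log2FoldChange
names(foldchanges) <- res$entrez
foldchanges <- sort(foldchanges, decreasing = TRUE)

print(foldchanges)
```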

The results are very similar but not the same. Some GO sets appear in the results of both gseaGO1 and gseaGO2, and as far as I can see they have the same enrichment score but a different NES, p-value and adjusted p-value (the differences are VERY small, however).

So my questions are: are the gseGO and GSEA functions from the clusterProfiler package the same (in a mathematical sense)? Additionally, I set c5.all.v7.0.entrez.gmt as the gene set database for the GSEA function, but which gene set database does gseGO use?

Even though I am new to this analysis, I have read a lot about it, but this point still isn't clear to me. Thank you very much for your time and help.

RNA-Seq rna-seq R • 13k views

0

Thank you very much MatthewP. It is clear to me now.

ADD REPLY • link 4.9 years ago by anon ▴ 10

score 1 · Answer 1 · 2020-01-06

Hello. Yes, they are the same function; gseGO uses the GO gene sets from org.Hs.eg.db. I haven't compared the GO gene sets in org.Hs.eg.db with those in MSigDB, but the number of KEGG pathways differs between the two (org.Hs.eg.db has more KEGG pathway gene sets), so the GO gene sets may differ as well. I don't know how often the MSigDB gene sets are updated; org.Hs.eg.db is updated twice a year (from this post).
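To check whether two sources really define a given gene set differently, one could compare their memberships directly. The sketch below is a toy illustration in base R: `sets_a` and `sets_b` are hypothetical two-column term/gene tables with made-up IDs, standing in for the two gene set sources. With real data the term labels from the two sources may not match directly and would need to be mapped onto each other first.

```r
# Hypothetical term/gene tables for two sources; all IDs are made up.
sets_a <- data.frame(
  term = c("immune response", "immune response", "immune response",
           "cell cycle", "cell cycle"),
  gene = c("1", "2", "3", "10", "11"),
  stringsAsFactors = FALSE
)
sets_b <- data.frame(
  term = c("immune response", "immune response", "immune response",
           "cell cycle", "cell cycle"),
  gene = c("1", "2", "4", "10", "11"),
  stringsAsFactors = FALSE
)

# Jaccard index: shared genes divided by all genes in either set.
jaccard <- function(x, y) length(intersect(x, y)) / length(union(x, y))

# One character vector of genes per term.
genes_a <- split(sets_a$gene, sets_a$term)
genes_b <- split(sets_b$gene, sets_b$term)

# Compare only the terms present in both sources.
shared <- intersect(names(genes_a), names(genes_b))
overlap <- sapply(shared, function(t) jaccard(genes_a[[t]], genes_b[[t]]))

print(overlap)
```

A value of 1 means the two sources list identical genes for that term; lower values mean the memberships differ.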