GSEA and GO Ontology
19 months ago

I am working in R Markdown.

I ran GSEA with the clusterProfiler R package. I have 927 upregulated gene sets with NES values of +2 or greater, and 563 downregulated gene sets with NES values of -2 or less. The result table has 11 columns: ID, Description, setSize, enrichmentScore, NES, pvalue, p.adjust, qvalue, rank, leading_edge, and core_enrichment. For each gene set (one per row), the core_enrichment column contains multiple Entrez gene IDs separated by slashes (4321/7543/2415).

How can I check whether multiple gene sets, within either the upregulated or the downregulated group, are functionally similar? Finding such groups would give me higher confidence in the result. Can this be done with GO terms, or with another method?

Any guide or support will be appreciated. Thank you.

Ontology GO GSEA • 785 views
19 months ago
LauferVA 4.5k

There are a couple of different approaches.

1) You can compute similarity and distance metrics on the gene content of the pathways themselves. For instance, suppose you have three pathways:

pathwayA=c('A','B','C','D','E','F')
pathwayB=c('A','B','C','D','Q','F')
pathwayC=c('Z','Y','X','W','Q','J')

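pathwayA and pathwayB are more similar to each other than pathwayA and pathwayC are: A and B share five of their six genes, whereas A and C share none. You do not need to rely on ontologies to know this; you can quantify it directly with an overlap measure such as the Jaccard index.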
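As a minimal sketch of that mathematical description, the snippet below computes a pairwise Jaccard index (shared genes divided by all genes in either set) for the three toy pathways, using base R only. The helper name `jaccard` and the example ID string are illustrative assumptions; for a real clusterProfiler result you would first split each core_enrichment string on the slash, as shown in the first lines.

```r
# Splitting one core_enrichment entry into a vector of Entrez IDs
# (the string here is the example from the question, not real output).
ids <- strsplit("4321/7543/2415", "/", fixed = TRUE)[[1]]

# Toy pathways from the text above.
pathways <- list(
  pathwayA = c("A", "B", "C", "D", "E", "F"),
  pathwayB = c("A", "B", "C", "D", "Q", "F"),
  pathwayC = c("Z", "Y", "X", "W", "Q", "J")
)

# Jaccard index: size of the intersection over size of the union.
# `jaccard` is a hypothetical helper, not part of any package.
jaccard <- function(x, y) length(intersect(x, y)) / length(union(x, y))

# Pairwise similarity matrix (rows and columns are named after the pathways).
sim <- sapply(pathways, function(x) sapply(pathways, function(y) jaccard(x, y)))
print(round(sim, 2))

# 1 - similarity can be used as a distance to group similar pathways.
tree <- hclust(as.dist(1 - sim), method = "average")
groups <- cutree(tree, k = 2)
print(groups)
```

With hundreds of gene sets, the same matrix can be fed to hierarchical clustering in the same way, although the choice of cut height or number of groups is a judgment call.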

2) You can rely on curated ontologies. For example, look at GO:

######################    GO    ######################
# GO: curated relationships between terms; see https://yulab-smu.top/biomedical-knowledge-mining-book/GOSemSim.html
# MSigDB has GO gene sets in C5, but NOT the complex term-to-term relationships found in GO.db.
library(GO.db)
# Objects exported by GO.db:
# "GO"            "GO_dbconn"     "GO_dbfile"     "GO_dbInfo"     "GO_dbschema"   "GO.db"         "GOBPANCESTOR"  "GOBPCHILDREN"  "GOBPOFFSPRING" "GOBPPARENTS"
# "GOCCANCESTOR"  "GOCCCHILDREN"  "GOCCOFFSPRING" "GOCCPARENTS"   "GOMAPCOUNTS"   "GOMFANCESTOR"  "GOMFCHILDREN"  "GOMFOFFSPRING" "GOMFPARENTS"   "GOOBSOLETE"    "GOSYNONYM"     "GOTERM"
GOdbList <- as.list(GOTERM)     # Pull any of the mapping objects above out with as.list()

All of these GO terms have relationships to each other (parents, children, ancestors, offspring) that you can pull out, as I wrote above. You can find out more on the Bioconductor package page for GO.db.
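As a rough illustration of pulling those relationships out, the sketch below looks up the ancestors of two biological-process terms in GOBPANCESTOR and lists the ancestors they share. The two GO IDs are arbitrary examples chosen for illustration (they are not from the question's results), and the code assumes GO.db is installed; the exact ancestor set depends on the GO.db release.

```r
library(GO.db)

# Two example biological-process terms (illustrative choices only).
term_a <- "GO:0006915"  # apoptotic process
term_b <- "GO:0012501"  # programmed cell death

# All ancestors of each term within the BP ontology.
anc_a <- GOBPANCESTOR[[term_a]]
anc_b <- GOBPANCESTOR[[term_b]]

# Ancestors common to both terms; "all" is an artificial top node, so drop it.
shared <- setdiff(intersect(anc_a, anc_b), "all")

# Translate the shared GO IDs into readable term names.
shared_terms <- AnnotationDbi::select(
  GO.db, keys = shared, columns = "TERM", keytype = "GOID"
)
print(shared_terms)
```

Two enriched GO gene sets whose terms share many specific ancestors are likely describing related biology. Packages such as GOSemSim turn this idea into a numeric similarity score.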

3) You can set aside both published ontologies (answer 2) and the gene content of the pathways themselves (answer 1), and instead work from the coexpression of genes.

That is, whether or not the gene content of two pathways is partially shared (answer 1), you can find modules of genes that are coexpressed (or that are NOT expressed together, either one).
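To make the idea of a coexpression module concrete, here is a toy sketch in base R: it simulates six genes driven by two hidden signals, correlates the genes across samples, and clusters them on 1 minus the correlation. All data and names here are simulated assumptions for illustration; this is not how WGCNA or the other packages build modules internally, only the basic intuition behind them.

```r
set.seed(1)
n_samples <- 20

# Two hidden "driver" signals; each defines one simulated module.
driver1 <- rnorm(n_samples)
driver2 <- rnorm(n_samples)

# Simulated expression matrix: genes in rows, samples in columns.
expr <- rbind(
  g1 = driver1 + rnorm(n_samples, sd = 0.1),
  g2 = driver1 + rnorm(n_samples, sd = 0.1),
  g3 = driver1 + rnorm(n_samples, sd = 0.1),
  g4 = driver2 + rnorm(n_samples, sd = 0.1),
  g5 = driver2 + rnorm(n_samples, sd = 0.1),
  g6 = driver2 + rnorm(n_samples, sd = 0.1)
)

# Gene-by-gene Pearson correlation (cor() works on columns, hence t()).
gene_cor <- cor(t(expr))

# Highly correlated genes get a small distance and cluster together.
tree <- hclust(as.dist(1 - gene_cor), method = "average")
modules <- cutree(tree, k = 2)
print(modules)
```

To also group genes that are anti-correlated, one option is to cluster on 1 minus the absolute correlation instead.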

For this, consider such R packages as:

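libsNeeded <- c('WGCNA', 'GWENA', 'hCoCena', 'clusterProfiler', 'DOSE', 'GO.db', 'BioNERO', 'petal', 'CEMiTool', 'minet')
# Install or update all packages, then load them.
# Some of these may not be available from Bioconductor or CRAN (hCoCena, for example, may need to be installed from GitHub).
BiocManager::install(libsNeeded, update = TRUE, ask = FALSE)
invisible(lapply(libsNeeded, library, character.only = TRUE))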

Which you choose depends on your analytical goal more than anything else. Does that help?

VAL
