There are a few different approaches.
1) You can run similarity and difference metrics on the pathways themselves. For instance, suppose you have three pathways:
pathwayA=c('A','B','C','D','E','F')
pathwayB=c('A','B','C','D','Q','F')
pathwayC=c('Z','Y','X','W','Q','J')
pathwayA and pathwayB are more similar to each other than pathwayA and pathwayC are. You do not need to rely on ontologies to know this; you can describe it mathematically from the gene content alone.
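One way to make that mathematical description concrete is the Jaccard index (size of the intersection divided by size of the union). This is only a minimal base-R sketch; the helper name `jaccard` and the choice of metric are my own assumptions, and other set-overlap measures would work just as well.

```r
# The three example pathways from above
pathwayA <- c('A','B','C','D','E','F')
pathwayB <- c('A','B','C','D','Q','F')
pathwayC <- c('Z','Y','X','W','Q','J')

# Jaccard index: shared members / all distinct members (hypothetical helper)
jaccard <- function(x, y) length(intersect(x, y)) / length(union(x, y))

pathways <- list(pathwayA = pathwayA, pathwayB = pathwayB, pathwayC = pathwayC)

# Pairwise similarity matrix over all pathways
simMatrix <- sapply(pathways, function(x) sapply(pathways, function(y) jaccard(x, y)))
round(simMatrix, 2)
```

Here pathwayA and pathwayB share 5 of 7 distinct members (about 0.71), while pathwayA and pathwayC share none (0).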
2) You can rely on curated ontologies. For example, look at GO:
GO.db: curated relationships between GO terms
MSigDB has GO gene sets in its C5 collection, but NOT the complex relationships found in GO.db.
After loading GO.db, listing its contents (e.g. with ls("package:GO.db")) shows these objects:
"GO" "GO_dbconn" "GO_dbfile" "GO_dbInfo" "GO_dbschema" "GO.db" "GOBPANCESTOR" "GOBPCHILDREN" "GOBPOFFSPRING" "GOBPPARENTS"
"GOCCANCESTOR" "GOCCCHILDREN" "GOCCOFFSPRING" "GOCCPARENTS" "GOMAPCOUNTS" "GOMFANCESTOR" "GOMFCHILDREN" "GOMFOFFSPRING" "GOMFPARENTS" "GOOBSOLETE" "GOSYNONYM" "GOTERM"
library(GO.db)
GOdbList <- as.list(GOTERM)  # one entry per GO term
All of these GO terms have relationships to each other that you can pull out using the objects listed above. You can find out more on the Bioconductor package page for GO.db.
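As a rough sketch of pulling those relationships out, you could compare two GO terms by how many ancestors they share. The helper `ancestorOverlap` and the two example term IDs are my own illustrative choices, and the GO.db part is guarded so it only runs if the package is installed.

```r
# Jaccard overlap between two sets of GO ancestor IDs (hypothetical helper).
# The artificial root "all" is dropped so it does not inflate the overlap.
ancestorOverlap <- function(ancX, ancY) {
  ancX <- setdiff(ancX, "all")
  ancY <- setdiff(ancY, "all")
  allAnc <- union(ancX, ancY)
  if (length(allAnc) == 0) return(NA_real_)
  length(intersect(ancX, ancY)) / length(allAnc)
}

if (requireNamespace("GO.db", quietly = TRUE)) {
  library(GO.db)
  # Example biological-process term IDs, chosen only for illustration
  ids <- c("GO:0006915", "GO:0006914")
  # GOBPANCESTOR maps each BP term to all of its ancestor terms
  anc <- mget(ids, GOBPANCESTOR, ifnotfound = NA)
  print(ancestorOverlap(anc[[1]], anc[[2]]))
}
```

The same pattern should work with GOBPPARENTS, GOBPCHILDREN or GOBPOFFSPRING, and with the CC and MF equivalents.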
3) You can decide not to use published ontologies (approach 2) or the content of the pathways themselves (approach 1), and instead work from the coexpression of genes.
That is, whether or not the gene content is partially shared (approach 1), you can find modules of genes that are coexpressed (or that are NOT expressed together; either works).
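To show the idea on toy data, here is a minimal base-R sketch that finds coexpression modules by hierarchical clustering of a gene-gene correlation matrix. The expression matrix is simulated and every name in it is made up; this is a stand-in for the concept, not how any particular coexpression package does it.

```r
set.seed(1)
nSamples <- 30

# Two hidden "drivers"; each produces a group of 5 correlated genes (simulated)
driver1 <- rnorm(nSamples)
driver2 <- rnorm(nSamples)
expr <- cbind(
  sapply(1:5, function(i) driver1 + rnorm(nSamples, sd = 0.2)),
  sapply(1:5, function(i) driver2 + rnorm(nSamples, sd = 0.2))
)
colnames(expr) <- paste0("gene", 1:10)  # rows = samples, columns = genes

# Gene-gene correlation, turned into a distance (1 - r)
geneCor  <- cor(expr)
geneDist <- as.dist(1 - geneCor)

# Cluster genes and cut the tree into two modules
geneTree <- hclust(geneDist, method = "average")
modules  <- cutree(geneTree, k = 2)
split(names(modules), modules)
```

Using 1 - r keeps anti-correlated genes far apart; using 1 - abs(r) instead would group genes that are strongly related in either direction.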
For this, consider R packages such as:
libsNeeded<-c('WGCNA', 'GWENA', 'hCoCena', 'clusterProfiler', 'DOSE', 'GO.db', 'BioNERO', 'petal', 'CEMiTool', 'minet')
BiocManager::install(libsNeeded, update = TRUE, ask = FALSE)
lapply(libsNeeded, library, character.only = TRUE)
Which one you choose depends on your analytical goal more than anything else. Does that help?
VAL