Hi all,
I am attempting to use the clusterProfiler function gseGO() to perform gene set enrichment analysis on a very small subset of my DE gene data as a trial run.
#Establishing ranked geneList
gene_list <- c(8.486070, 7.696671, 7.453107) # log fold change data from DESeq2
names(gene_list) <- c("LOC131891903", "LOC131892170", "LOC131889773")
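For context, gseGO() expects geneList to be a named numeric vector sorted in decreasing order; the three values above already are. A minimal sketch of building such a vector from a results table, assuming a hypothetical data frame `res` with `gene` and `log2FoldChange` columns (these names are illustrative, not my actual object):

```r
# Hypothetical results table standing in for DESeq2 output (names are assumptions)
res <- data.frame(
  gene = c("LOC131892170", "LOC131889773", "LOC131891903"),
  log2FoldChange = c(7.696671, 7.453107, 8.486070)
)

# Named numeric vector: values are fold changes, names are gene identifiers
ranked <- setNames(res$log2FoldChange, res$gene)

# Sort in decreasing order, as required for GSEA-style input
ranked <- sort(ranked, decreasing = TRUE)
```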
My study organism (Tigriopus californicus) has an OrgDb object available, which I obtained through this code:
library(AnnotationHub)
ah <- AnnotationHub()
query(ah, "Tigriopus")
tig <- ah[["AH115870"]]
I was able to verify that the three genes in my test data exist in the OrgDb object by searching for ENTREZID matches in the database:
AnnotationDbi::select(tig, keys = c("LOC131891903", "LOC131892170", "LOC131889773"), columns = "ENTREZID",
keytype = "SYMBOL")
'select()' returned 1:1 mapping between keys and columns
SYMBOL ENTREZID
1 LOC131891903 131891903
2 LOC131892170 131892170
3 LOC131889773 131889773
However, when I run gseGO() on these data like so:
test.go <- gseGO(geneList = gene_list,
ont="BP",
OrgDb = tig,
keyType = "SYMBOL")
I get this error:
using 'fgsea' for GSEA analysis, please cite Korotkevich et al (2019).
preparing geneSet collections...
--> Expected input gene ID: COX3,ATP6,ATP6,ATP6,ND5,ATP6
Error in check_gene_id(geneList, geneSets) :
--> No gene can be mapped....
It's not clear to me why mapping is failing, given that the genes appear to be present in the OrgDb object. It also failed when I replaced the names of gene_list with the Entrez IDs returned by the select() call.
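As far as I can tell, the error is raised when none of the names in geneList appear among the members of the gene sets that gseGO() builds from the OrgDb. A minimal base R sketch of that kind of overlap check, using made-up gene sets (the set names and memberships below are hypothetical and are not taken from the Tigriopus database):

```r
# Hypothetical gene sets standing in for term-to-gene mappings (assumption)
gene_sets <- list(
  setA = c("COX3", "ATP6"),
  setB = c("ND5", "ATP6")
)

# Same test input as in this post
gene_list <- c(LOC131891903 = 8.486070,
               LOC131892170 = 7.696671,
               LOC131889773 = 7.453107)

# All identifiers that belong to at least one gene set
annotated <- unique(unlist(gene_sets))

# Input genes that overlap with the gene sets
mapped <- names(gene_list)[names(gene_list) %in% annotated]
n_mapped <- length(mapped)
```

With these made-up sets, n_mapped is 0, which is the situation the "No gene can be mapped" message describes: a gene can be present in the OrgDb under SYMBOL and ENTREZID and still belong to no gene set.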
Can anyone provide some insight into what mistake I've made?
Thanks!