Hi!
Apologies for the basic question, but I think I am doing something wrong and I do not understand what. I would like to run an ORA analysis on a bulk RNA-seq dataset, so I tried both clusterProfiler and genekitr. However, despite getting the same terms, I get different p-adjusted values and q-values (with clusterProfiler practically none of the terms have a p.adjust or q-value <= 0.01, whereas with genekitr I have a few). Why is that? Am I doing something wrong in my code?
For clusterProfiler:
# we want the log2 fold change
original_gene_list <- d$log2FC # on the unfiltered dataset
# name the vector
names(original_gene_list) <- d$ENSEMBL
# omit any NA values
gene_list<-na.omit(original_gene_list)
# sort the list in decreasing order (required for clusterProfiler)
gene_list = sort(gene_list, decreasing = TRUE)
# Extract significant results (p_value <= 0.05)
sig_genes_df = subset(d, p_value <= 0.05)
# From significant results, we want to filter on log2fold change
genes <- sig_genes_df$log2FC
# Name the vector
names(genes) <- sig_genes_df$ENSEMBL
# omit NA values
genes <- na.omit(genes)
# filter on min absolute log2 fold change (|log2FC| > 1.5)
genes <- names(genes)[abs(genes) > 1.5]
go_enrich <- enrichGO(gene = genes,
                      universe = names(gene_list),
                      OrgDb = org.Hs.eg.db,
                      keyType = "ENSEMBL",
                      readable = TRUE,
                      ont = "BP",
                      pvalueCutoff = 0.05,
                      qvalueCutoff = 0.01)
And for genekitr I have used this code (section 1.7):
# 1st step: get input IDs
id <- c(dpg6$Associated.Gene.Name) # DEGs
# 2nd step: get gene set
gs2 <- geneset::getGO(org = "human",ont = "bp") # biological process
# analysis
ego2 <- genORA(id,
               geneset = gs2,
               universe = names(d$ENSEMBL), # background, i.e. the unfiltered dataset
               p_cutoff = 0.05,
               q_cutoff = 0.01) # bp
What am I doing wrong?
Thank you very much for your help!
Camilla
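One thing that may be worth checking before comparing the two tools is whether each call really receives the background you intend. The sketch below uses a small invented table as a stand-in for `d` (the IDs and values are made up, and `universe_bad` / `universe_ok` are hypothetical names). It assumes `d` is a plain data frame: in that case `names(d$ENSEMBL)` returns NULL, because a column pulled out with `$` is an unnamed vector, so no explicit background is passed and the function presumably falls back to its own default.

```r
# Toy stand-in for the unfiltered results table `d` (all values invented).
d <- data.frame(
  ENSEMBL = c("ENSG00000141510", "ENSG00000012048", "ENSG00000139618", NA),
  log2FC  = c(2.1, -1.8, 0.3, 1.2),
  stringsAsFactors = FALSE
)

# names() on an unnamed column returns NULL, so no background is passed
universe_bad <- names(d$ENSEMBL)
is.null(universe_bad)

# Passing the IDs themselves gives an explicit background
universe_ok <- unique(as.character(na.omit(d$ENSEMBL)))
length(universe_ok)
```

The same check applies to the ID type: if the gene list holds gene symbols and the background holds ENSEMBL IDs, the two cannot be matched against each other.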
Thank you! The universe and the genes are the same; I just used different names because the scripts were written at different times. However, I re-ran both scripts using the same genes/names and the result is the same as before (different p- and q-values). How do I choose which method to use? I don't want to choose genekitr just because it gives me more statistically significant terms that match my theory if it is not the right approach!
Hi, I'm the author of genekitr. Thanks for your feedback. Regarding your question: first, both enrichGO and genORA are based on the enricher function for the statistical calculations. As @chaco001 said, the main difference lies in the term annotation used as input, which of course is not limited to GO. clusterProfiler mainly adopts the OrgDb method; for example, the function uses org.Hs.eg.db to obtain the gene sets, while genekitr integrates Panther db (v17.0) and OrgDb.
I love using these tools! They are both easy to use.
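To illustrate why the same test can give different numbers when only the annotation changes, here is a minimal base-R sketch of a one-sided hypergeometric test, which is the kind of over-representation statistic that enricher reports. The helper name `ora_pvalue` and all counts are invented for illustration; this is not code taken from either package.

```r
# One-sided hypergeometric test for a single term (toy numbers throughout).
ora_pvalue <- function(k, n, M, N) {
  # k: genes of interest annotated to the term
  # n: genes of interest that have any annotation
  # M: background genes annotated to the term
  # N: background genes that have any annotation
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)  # P(X >= k)
}

# Same overlap and term size, but a different number of annotated
# background genes, as could happen with two annotation sources.
p_a <- ora_pvalue(k = 12, n = 200, M = 150, N = 15000)
p_b <- ora_pvalue(k = 12, n = 200, M = 150, N = 20000)

# BH adjustment also depends on how many terms were tested in total.
raw_p    <- c(0.0005, 0.003, 0.02)
adj_few  <- p.adjust(raw_p, method = "BH")           # only these 3 terms
adj_many <- p.adjust(raw_p, method = "BH", n = 300)  # as if 300 terms were tested
```

In this toy setup the larger background gives the smaller raw p-value, and the larger number of tested terms gives larger adjusted p-values. So two tools sharing the same statistic can still disagree on p.adjust and q-values if their gene sets differ in size or number.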