I have a question. I want to perform over-representation analysis on the GO terms associated with my list of significant genes, and also get KEGG pathways. I saw there are two tools I can use in R: clusterProfiler and gProfileR. Are they the same? Will I get the same results from both? If not, what is the difference, and why?
The GO terms returned by gProfileR are generally quite similar to those
returned by clusterProfiler, but there are small differences because the
two tools use different algorithms.
The basic principle is similar, but each tool has its own way of defining the background genes, calculating enrichment statistics, and correcting for multiple testing. In my experience this entire world of gene enrichment analysis is a gigantic mess. It is not standardized, results can change drastically with slight changes to the input gene list, and there is no consensus on what (if anything) should be used as the background set of genes to calculate enrichment against. gprofiler2 lets you define a custom set of genes as the background, so that is a plus; I'm not sure whether the other tool can do that.
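To make the background and correction points concrete, here is a minimal base-R sketch of the hypergeometric test that over-representation analysis is typically built on. It is not the code either package runs: `ora_pvalue` is a hypothetical helper, and all gene counts and p-values are made up for illustration.

```r
# Hypothetical helper: one-sided hypergeometric (over-representation) p-value.
#   k = query genes annotated to the term
#   n = number of genes in the query list
#   K = background genes annotated to the term
#   N = number of genes in the background
ora_pvalue <- function(k, n, K, N) {
  # P(X >= k) when drawing n genes from N, of which K carry the annotation
  phyper(k - 1, K, N - K, n, lower.tail = FALSE)
}

# Made-up numbers: 15 of 200 significant genes hit the term.
# Only the background changes between the two calls.
p_genome    <- ora_pvalue(k = 15, n = 200, K = 300, N = 20000)  # all annotated genes
p_expressed <- ora_pvalue(k = 15, n = 200, K = 250, N = 12000)  # expressed genes only

print(c(genome = p_genome, expressed = p_expressed))

# The same raw p-values under two different multiple-testing corrections.
raw_p    <- c(0.0004, 0.003, 0.012, 0.04, 0.2)
adj_bh   <- p.adjust(raw_p, method = "BH")
adj_bonf <- p.adjust(raw_p, method = "bonferroni")

print(data.frame(raw_p, adj_bh, adj_bonf))
```

The same overlap gives a different p-value once the background changes, and the same p-values cross a 0.05 cutoff or not depending on the correction, so two tools with different defaults can disagree on borderline terms. For what it's worth, as far as I know the background is set with the `custom_bg` argument of `gprofiler2::gost()` and the `universe` argument of `clusterProfiler::enrichGO()`.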
https://hbctraining.github.io/DGE_workshop/lessons/functional_analysis_other_methods.html