Hi there,
I am using the enricher()
function from clusterProfiler
to perform a KEGG pathway enrichment analysis. I work on a non-model species, although with a sequenced and annotated genome.
I have a list of 179 unique genes differentially expressed annotated with KOs (e.g. K08448) and a background (all the genes in the genome) of 8873 unique KOs. I created, by retrieving the necessary information with KEGGREST
, a TERM2GENE
table where the KEGG pathways are mapped to the KOs in order to perform a "custom" enrichment. The TERM2GENE
table looks (hereafter named as "mapping_table") like this:
Map KO
map0001 K009567
map0001 K876540
map0003 K286021
Then, used the enricher()
function as follows:
enricher(gene = differentially_expressed_KOs, TERM2GENE= mapping_table)
the input (differentially_expressed_KOs) was the simple list of DEGs and looked like this:
K009567
K007213
K023784
....
Here reported the first line of the output
ID GeneRatio BgRatio pvalue
map01412 5/91 84/8873 0.001658081
However, a few points rose up from the output:
- The denominator of the gene ratio consists of 91 genes instead of 179, which according to the clusterProfiler manual should be instead the size of the list of genes of interest (defined as n).
- By trying of sorting this out and better understanding where the mistake was, I attempted to manually calculate the values using the
phyper
function in R as follows:phyper(5,84,8873-84,91, lower.tail = F)
= 0.0002108303 where 5 is the number of DEGs in that pathway, 84 is the number of genes annotated in that pathway, 8873-84 is the total number of annotated genes (universe) minus those in that pathway and 91 is the size of my genes of interest (DEGs). Instead, by using the actual number of DEGsphyper(5,84,8873-84,179, lower.tail = F)
= 0.006868312 the result is much closer to the output ofenricher
, though still different. - Lastly, by specifying the option
universe=
in the functionenricher
and using all the KOs from the genome, the background changes radically and consider only 3856 genes as background.
Am I missing something here?