Problem with universe argument in enrichKEGG
21 months ago
1215045934 ▴ 80

Hi all,

I am trying to run a KEGG enrichment analysis with enrichKEGG and was wondering whether I need to specify the universe argument. Could you please help me with this?

  1. The documentation says "universe: background genes. If missing, the all genes listed in the database (eg TERM2GENE table) will be used as background." However, enrichKEGG doesn't take a TERM2GENE argument. If the universe is not specified, what will be used as the universe?

     If I specify TERM2GENE, it returns: unused argument (TERM2GENE = Transcriptome)

  2. Should I use the KEGG column from the "KEGG Orthology to Genes mapping" as the universe?

    enr_results <- enrichKEGG(DEG$KEGG, organism = 'ko', universe = Transcriptome$KEGG,
                              pvalueCutoff = 0.05, pAdjustMethod = "BH",
                              qvalueCutoff = 0.05, minGSSize = 5)

Here is what my files look like:

  1. KEGG to DEG mappings

    > head(DEG)
     KEGG Gene
    1 K17277  FS_gene_3
    2 K14700 FS_gene_11
    3 K14701 FS_gene_11
    
  2. KEGG to whole transcriptome mappings

    > head(Transcriptome)
     KEGG Gene
    1 K02727  FS_gene_1
    2 K17277  FS_gene_3
    3 K17307 FS_gene_10
    

Thanks a lot!

enrichment KEGG clusterProfiler • 1.4k views
21 months ago

TERM2GENE is not an argument to enrichKEGG; the documentation uses it as shorthand for a particular table format that some other functions in the package use to specify the mapping between terms and genes. If you want the default universe, just leave the parameter unset, and the package will automatically fetch the full list of genes that exist in the KEGG database for the organism in question.
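To make the table format concrete, here is a minimal sketch of a TERM2GENE-style data frame: two columns, term ID first and gene ID second. The pathway IDs (path_A, path_B) and the toy gene lists are hypothetical placeholders, not real KEGG data. The call to enricher only runs if clusterProfiler is installed, and its cutoffs are relaxed purely so that the toy input is not filtered out.

```r
# Toy TERM2GENE table: column 1 = term ID, column 2 = gene ID.
# "path_A" / "path_B" are hypothetical placeholders, not real pathway IDs.
term2gene <- data.frame(
  term = c("path_A", "path_A", "path_A", "path_B", "path_B"),
  gene = c("FS_gene_1", "FS_gene_3", "FS_gene_10", "FS_gene_3", "FS_gene_11"),
  stringsAsFactors = FALSE
)

# Toy DE gene list and background (assumptions for illustration only).
deg_genes <- c("FS_gene_3", "FS_gene_11")
universe  <- unique(term2gene$gene)

# Sanity check: which genes are annotated to each term.
genes_per_term <- split(term2gene$gene, term2gene$term)

# Functions that accept TERM2GENE, such as enricher, take the table directly.
if (requireNamespace("clusterProfiler", quietly = TRUE)) {
  res <- clusterProfiler::enricher(
    gene         = deg_genes,
    universe     = universe,
    TERM2GENE    = term2gene,
    minGSSize    = 1,   # relaxed only because the toy sets are tiny
    pvalueCutoff = 1,
    qvalueCutoff = 1
  )
}
```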

However, it is unlikely that this is the correct thing to do. Enrichment results depend quite heavily on the universe supplied. You should supply a list of all the genes that could have been called differentially expressed. This is usually not all genes. For example, some genes may not have been expressed at a high enough level to be (meaningfully) tested for differential expression. Others might have been removed from the analysis for having outlier samples. If you are doing enrichment analysis on DEGs, then a good place to start is probably every gene with a non-NA p-value in the output table from your DEG tool of choice.
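As a rough sketch of that suggestion, the snippet below builds a universe from a toy DE results table by keeping genes whose p-value is not NA. The column names (gene, pvalue) are assumptions; the output of your DE tool will likely use different ones. The second half uses made-up counts and base R's phyper to illustrate why the background matters: the same overlap looks far more surprising against a large background than against the set of genes that were actually tested.

```r
# Toy DE results; column names are assumptions, adapt to your DE tool's output.
de_results <- data.frame(
  gene   = c("FS_gene_1", "FS_gene_3", "FS_gene_10", "FS_gene_11"),
  pvalue = c(0.20, 0.001, NA, 0.03),
  stringsAsFactors = FALSE
)

# Universe = every gene that was actually tested (non-NA p-value).
universe <- unique(de_results$gene[!is.na(de_results$pvalue)])

# Illustration with made-up counts: 100 DE genes, 8 of them in one pathway.
n_deg   <- 100
overlap <- 8

# Background A: 2000 tested genes, 40 of which belong to the pathway.
p_tested <- phyper(overlap - 1, m = 40, n = 2000 - 40, k = n_deg,
                   lower.tail = FALSE)

# Background B: 20000 database genes, 50 of which belong to the pathway.
p_all <- phyper(overlap - 1, m = 50, n = 20000 - 50, k = n_deg,
                lower.tail = FALSE)

# p_all is smaller than p_tested: the larger background inflates significance.
print(c(tested = p_tested, all_genes = p_all))
```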

By the way, it looks to me like you are supplying your parameters to enrichKEGG incorrectly. You seem to be supplying your list of DEGs in the form of gene-to-pathway mappings, but the function just wants a list of all the DE genes; it doesn't want the mappings. The way you are doing it here (using DEG$Gene), you are listing some genes multiple times. This may cause problems for enrichKEGG; I don't know.
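One cautious way to sidestep the duplication issue is to collapse the mapping table to a vector of unique identifiers before passing it on. The sketch below rebuilds the three DEG rows shown in the question and uses base R only. Whether enrichKEGG deduplicates its input internally is not assumed here.

```r
# The three rows of the DEG mapping table shown in the question.
DEG <- data.frame(
  KEGG = c("K17277", "K14700", "K14701"),
  Gene = c("FS_gene_3", "FS_gene_11", "FS_gene_11"),
  stringsAsFactors = FALSE
)

# FS_gene_11 appears twice because it maps to two KEGG Orthology IDs.
n_repeated <- sum(duplicated(DEG$Gene))

# Collapse to unique identifiers before using them as a gene list.
deg_genes <- unique(DEG$Gene)   # one entry per gene
deg_kos   <- unique(DEG$KEGG)   # one entry per KO ID
```

The same unique() step applies to whichever column ends up being used for the universe.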


Thank you so much for the detailed explanation!

My organism is a non-model organism and it doesn't exist in the KEGG organism list. Would you suggest using enricher instead of enrichKEGG? That way I could provide a custom universe.

I will definitely start by using the genes with a non-NA p-value as the universe! Thank you for the suggestion!
