How to perform KEGG enrichment analyses on a set of genes
8.1 years ago
miyakokodama ▴ 20

Possibly a silly question...

I work on a non-model plant species that has a sequenced genome. All genes on the map were aligned against KEGG proteins, so about 70% of them are annotated with a K-term (KEGG Orthology ID).

I performed differential expression analyses and obtained a set of interesting genes (230 genes), and I have an input that looks something like this:

Gene_1 hsa:9380 K00049
Gene_3 mdo:100015233 K00081
Gene_4 ath:AT1G30270 K00924
Gene_5 ath:AT1G30270 K00924
Gene_6 ath:AT3G59420 K00924
Gene_7 ath:AT3G59420 K00924

Each row has the gene name in my species, the locus tag ID of the best hit (prefixed with the species in which the best hit was identified), and the corresponding K-term.
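A minimal sketch of reading a table like this, assuming three whitespace-separated columns in the order shown (gene, best hit, K-term). The function name `parse_annotation` is made up for illustration.

```python
from collections import defaultdict

def parse_annotation(lines):
    """Map each K-term to the set of genes annotated with it."""
    ko_to_genes = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed rows
        gene, _best_hit, ko = fields
        ko_to_genes[ko].add(gene)
    return dict(ko_to_genes)

rows = [
    "Gene_1 hsa:9380 K00049",
    "",
    "Gene_4 ath:AT1G30270 K00924",
    "Gene_6 ath:AT3G59420 K00924",
]
ko_to_genes = parse_annotation(rows)
```

Since K-terms identify ortholog groups rather than genes of one species, the best-hit column is ignored here and only the K-term is kept.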

Now I want to run a pathway enrichment analysis and get p-values for enriched pathways, but having to pick a single species has been blocking me. Does anyone know how to deal with a situation like this? Any help would be greatly appreciated.

rna-seq kegg enrichment pathway R • 15k views

I'm facing the same problem here. Did you succeed in the end? Could you share your experience with me?


I am also facing the same problem. I have 10 different species, each with several genes. Could you tell me how to do the KEGG pathway enrichment analysis?

8.1 years ago
dago ★ 2.8k

There are many tools in Bioconductor, such as clusterProfiler. You can also look at this post for more details.


Is there an alternative to KEGG that has been updated more recently?


Hmm... MetaCyc maybe, but I am not aware of an enrichment tool for it.


With goseq you can use the Reactome db (in addition to KEGG and GO.db). But enrichment only works with one organism at a time, because you need the full genome as background. If you have your own gene2go or gene2pathway file, you can use that instead.
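To make the point about needing the full genome concrete, here is a rough sketch of a plain over-representation (hypergeometric) test with a custom gene-to-pathway mapping. It is not what goseq does internally (goseq also corrects for length bias); the function names and the toy pathway ID are hypothetical, and the background is assumed to be every annotated gene in the genome.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K belong to the pathway."""
    upper = min(n, K)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)

def enrich(selected, background, pathway_to_genes):
    """Over-representation p-value per pathway, restricted to background genes."""
    background = set(background)
    selected = set(selected) & background
    results = {}
    for pathway, members in pathway_to_genes.items():
        members = set(members) & background
        hits = len(selected & members)
        if hits == 0:
            continue  # nothing to test for this pathway
        results[pathway] = hypergeom_pvalue(
            hits, len(selected), len(members), len(background)
        )
    return results

# Toy data: 10 annotated genes, 3 of them in one made-up pathway.
background = [f"Gene_{i}" for i in range(1, 11)]
pathway_to_genes = {"pathway_A": ["Gene_1", "Gene_2", "Gene_3"]}
selected = ["Gene_1", "Gene_2"]

pvalues = enrich(selected, background, pathway_to_genes)
```

The choice of background changes the p-values, which is why a gene list from mixed best-hit species cannot simply be tested against one reference organism.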


Thanks for your reply! I looked into clusterProfiler, but it seems that enrichKEGG() only works when you specify a species, since you have to set the organism argument (organism=""). Do you agree?

8.1 years ago
EagleEye 7.6k

Gene Set Clustering based on Functional annotation (GeneSCF)

Works for known species/organisms.


Thanks! Unfortunately my species is not on the list so it seems like I cannot use it, as I need to set the -org parameter. If you have any other suggestions, please feel free to let me know and thanks!

6.5 years ago

Use enricher() in the clusterProfiler R package for this purpose. It works with custom annotation (a user-supplied TERM2GENE mapping), so it suits novel species like yours.
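enricher() expects a two-column term-to-gene table (term first, gene second). A rough sketch of preparing such a table from K-term annotations follows, written in Python for illustration. The helper name and the KO-to-pathway assignments are made up; the real assignments would have to come from KEGG.

```python
import csv
import io

def build_term2gene(gene_to_ko, ko_to_pathways):
    """Return sorted (pathway, gene) pairs for genes whose K-term is in a pathway."""
    pairs = set()
    for gene, ko in gene_to_ko.items():
        for pathway in ko_to_pathways.get(ko, ()):
            pairs.add((pathway, gene))
    return sorted(pairs)

gene_to_ko = {"Gene_4": "K00924", "Gene_6": "K00924", "Gene_1": "K00049"}
# Hypothetical KO-to-pathway assignments, for illustration only.
ko_to_pathways = {"K00924": ["pathway_A", "pathway_B"]}

term2gene = build_term2gene(gene_to_ko, ko_to_pathways)

# Write the pairs as a tab-separated table with a header row.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(["term", "gene"])
writer.writerows(term2gene)
tsv_text = buffer.getvalue()
```

The table should cover all annotated genes in the genome, not just the differentially expressed ones, so that it can also define the background.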

6.5 years ago
pablo61991 ▴ 90

I'm also very interested in this topic. I didn't find a really good approach, so if you do solve it, please share the method.

If you want to try something that partially works for me: kobas.cbi.pku.edu.cn/help.php

The only problem is that you need to repeat the annotation step, which you may not want to do. For real data you usually need to install the tools on a local machine instead of using the web tool.

6.5 years ago
bigmawen ▴ 440

You can use the gage package in R/Bioconductor:
http://bioconductor.org/packages/release/bioc/html/gage.html

The K-numbers are KEGG Orthology gene IDs. You can generate gene set data for KO using the kegg.gsets() function with species='ko'. BTW, KEGG data is constantly updated, and so is the gene set data generated using kegg.gsets.
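For anyone who wants to see what KO gene sets look like outside of gage, here is a small sketch that turns KO-to-pathway link lines into pathway gene sets. It assumes the two-column, tab-separated format that the KEGG REST `link/pathway/ko` operation returns (each K number appears once with a `path:map...` ID and once with a `path:ko...` ID); the sample lines and the function name are illustrative only, and nothing is downloaded here.

```python
from collections import defaultdict

def parse_ko_pathway_links(text):
    """Map each KO pathway ID (e.g. 'ko00010') to its set of K numbers."""
    pathway_to_kos = defaultdict(set)
    for line in text.splitlines():
        parts = line.strip().split("\t")
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        ko = parts[0].split(":", 1)[-1]       # 'ko:K00001' -> 'K00001'
        pathway = parts[1].split(":", 1)[-1]  # 'path:ko00010' -> 'ko00010'
        if not pathway.startswith("ko"):
            continue  # keep only the 'ko' pathway IDs, drop the 'map' duplicates
        pathway_to_kos[pathway].add(ko)
    return dict(pathway_to_kos)

# Illustrative sample in the assumed format.
sample = (
    "ko:K00001\tpath:map00010\n"
    "ko:K00001\tpath:ko00010\n"
    "ko:K00002\tpath:map00010\n"
    "ko:K00002\tpath:ko00010\n"
)
gene_sets = parse_ko_pathway_links(sample)
```

Gene sets keyed by K number can be tested directly against the K-terms of the differentially expressed genes, with no reference species involved.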

You may also use the pathview web server for pathway analysis if you are more used to GUI programs.
http://pathview.uncc.edu/
