How to get KEGG pathway names in KEGG enrichment analysis using clusterProfiler
4.6 years ago
tianshenbio ▴ 180

I am doing KEGG enrichment analysis for a non-model organism. First I annotated the genome using KAAS, which gave me the KO numbers associated with each gene (kegg2gene):

K11968  BANY.1.2.t00001
K04834  BANY.1.2.t00003
K13273  BANY.1.2.t00007
K04497  BANY.1.2.t00010
K09210  BANY.1.2.t00011
K10360  BANY.1.2.t00012
K10360  BANY.1.2.t00013

Then I took a list of DE genes (gene) and performed KEGG enrichment using clusterProfiler:

x <- enricher(gene, TERM2GENE = kegg2gene, pvalueCutoff = 0.05, pAdjustMethod = "BH", qvalueCutoff = 0.05)

This gave me a list of enriched KO numbers. How do I translate the KO numbers into pathway names/descriptions? I know I can add TERM2NAME = kegg2name, but how do I get a table of all KEGG IDs and their names?
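A possible approach, sketched under assumptions: enricher() treats the first column of TERM2GENE as the term, so a KO-to-gene table yields enriched KO numbers rather than pathways. To test pathways, the KO-to-gene table can be joined with a KO-to-pathway table. The KEGG REST URLs in the comments are my assumption about where such tables can be fetched (downloaded IDs may carry prefixes such as `ko:` or `path:` that need stripping). The pathway IDs and names below are placeholders, not real KEGG data.

```r
# KO-to-gene table as produced by KAAS (rows copied from the question)
kegg2gene <- data.frame(
  ko   = c("K11968", "K04834", "K13273", "K10360", "K10360"),
  gene = c("BANY.1.2.t00001", "BANY.1.2.t00003", "BANY.1.2.t00007",
           "BANY.1.2.t00012", "BANY.1.2.t00013"),
  stringsAsFactors = FALSE
)

# Placeholder KO-to-pathway links (hypothetical IDs, not real KEGG data).
# Assumption: the real table could be read with something like
#   read.delim("https://rest.kegg.jp/link/pathway/ko", header = FALSE)
ko2pathway <- data.frame(
  ko      = c("K11968", "K04834", "K10360", "K10360"),
  pathway = c("pathA", "pathA", "pathA", "pathB"),
  stringsAsFactors = FALSE
)

# Placeholder pathway names (hypothetical).
# Assumption: the real table could be read with something like
#   read.delim("https://rest.kegg.jp/list/pathway/ko", header = FALSE)
pathway2name <- data.frame(
  pathway = c("pathA", "pathB"),
  name    = c("Example pathway A", "Example pathway B"),
  stringsAsFactors = FALSE
)

# Join on KO so that each gene inherits the pathways of its KO;
# KOs without a pathway (here K13273) drop out of the inner join
joined <- merge(ko2pathway, kegg2gene, by = "ko")

# TERM2GENE expects the term in column 1 and the gene in column 2
term2gene <- unique(joined[, c("pathway", "gene")])
term2name <- pathway2name[pathway2name$pathway %in% term2gene$pathway, ]

print(term2gene)

# With clusterProfiler installed, the call would then look like:
# x <- enricher(gene, TERM2GENE = term2gene, TERM2NAME = term2name)
```

Because the terms are now pathways and the genes are still the original gene IDs, the DE gene list can be passed to enricher() unchanged.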

annotation enrichment KEGG clusterprofiler RNA-Seq • 6.4k views

https://www.genome.jp/kegg/ko.html

Paste them into the box on that page and translate to pathways.


Hi, since I want to do pathway enrichment analysis, I need a list of genes vs. pathways. Right now I have genes vs. K numbers, so yes, I need to translate K numbers to pathways. But I can't download a list of K numbers vs. pathways from the mapper page you shared.


You don't need to translate them to use clusterProfiler; you can use KOs directly. Set organism = "ko".

https://github.com/YuLab-SMU/clusterProfiler/issues/99


Hi NRC

I read that post on GitHub, but only enricherKEGG has the organism option, right? As I mentioned, I used the enricher function, not enrichKEGG, since I am using a customized dataset. You can also have a look at the last comment under the link you shared, which was written by me.


I am not sure about enricherKEGG; I am only familiar with enrichKEGG. You can use a list of KO IDs directly, e.g. K00011, K00678, etc. Set organism = "ko" and that's it. This definitely works for non-model organisms, because all you need is a list of KO values.


Hi NRC, thank you for your reply. Sorry, I meant enrichKEGG. First of all, I have the full KO-to-gene table for my species, kegg2gene:

K11968  BANY.1.2.t00001
K04834  BANY.1.2.t00003
K13273  BANY.1.2.t00007
K04497  BANY.1.2.t00010
K09210  BANY.1.2.t00011
K10360  BANY.1.2.t00012
K10360  BANY.1.2.t00013
K10360  BANY.1.2.t00015
K01530  BANY.1.2.t00016

Then I have a list of DE genes: gene

BANY.1.2.t20473
BANY.1.2.t12787
BANY.1.2.t10473
BANY.1.2.t10472
BANY.1.2.t08098

How do I perform the KEGG enrichment analysis?

I tried this:

x <- enrichKEGG(gene,organism="ko",keyType="kegg",pvalueCutoff = 0.05, pAdjustMethod = "BH",kegg2gene,qvalueCutoff = 0.05)

and I got an error message:

--> No gene can be mapped....
--> Expected input gene ID: K01213,K01623,K00846,K00875,K00496,K18106
--> return NULL...

Match your list of DE genes to their KO value, then try again.

x <- enrichKEGG(gene, organism='ko', keyType='kegg', universe = kegg2gene, pAdjustMethod="BH")
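The matching step could look roughly like this in base R. It is a minimal sketch: the KO-to-gene rows are copied from the thread, while the DE gene list and the names `de_genes`, `de_ko` and `universe_ko` are hypothetical, chosen only for illustration.

```r
# KO-to-gene table (rows copied from the thread)
kegg2gene <- data.frame(
  ko   = c("K11968", "K04834", "K13273", "K04497", "K09210",
           "K10360", "K10360", "K10360", "K01530"),
  gene = c("BANY.1.2.t00001", "BANY.1.2.t00003", "BANY.1.2.t00007",
           "BANY.1.2.t00010", "BANY.1.2.t00011", "BANY.1.2.t00012",
           "BANY.1.2.t00013", "BANY.1.2.t00015", "BANY.1.2.t00016"),
  stringsAsFactors = FALSE
)

# Hypothetical DE genes for illustration; the last one has no KO annotation
de_genes <- c("BANY.1.2.t00003", "BANY.1.2.t00012",
              "BANY.1.2.t00013", "BANY.1.2.t99999")

# KOs of the DE genes: unannotated genes drop out, duplicate KOs collapse
de_ko <- unique(kegg2gene$ko[kegg2gene$gene %in% de_genes])

# Background: every KO annotated in the genome, as a plain character vector
universe_ko <- unique(kegg2gene$ko)

print(de_ko)

# With clusterProfiler installed, the call would then look like:
# x <- enrichKEGG(de_ko, organism = "ko", keyType = "kegg",
#                 universe = universe_ko, pAdjustMethod = "BH")
```

One caveat of this design: several genes can share a KO, so collapsing to unique KOs means the test counts KOs rather than genes.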


Hi, I am not sure I understood correctly. I mapped my DE genes to K numbers and kept only the K numbers, so gene (a character vector) now looks like:

[1] K00901 K01016 K06483 K06585 K00799

Likewise, I kept only the K numbers from kegg2gene, which now looks like:

[1] K04834 K13273 K04497 K09210 K10360

x <- enrichKEGG(gene,organism="ko",keyType="kegg",universe="kegg2gene", pAdjustMethod = "BH")

This gave me

No gene set have size > 10 ...
--> return NULL...
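
One thing that may be worth checking in the call above, offered as a guess rather than a confirmed diagnosis: `universe="kegg2gene"` passes the literal string "kegg2gene" instead of the vector of K numbers, so the background would contain no valid KO ID. A quick base-R check, where the helper `looks_like_ko` is hypothetical and assumes K numbers are "K" followed by five digits:

```r
# K numbers as shown in the post above
kegg2gene <- c("K04834", "K13273", "K04497", "K09210", "K10360")

# Hypothetical helper: TRUE for strings shaped like a K number (K + 5 digits)
looks_like_ko <- function(x) grepl("^K[0-9]{5}$", x)

quoted_universe   <- "kegg2gene"  # what universe="kegg2gene" passes
unquoted_universe <- kegg2gene    # what universe=kegg2gene passes

# The quoted form is a single string that is not a K number
print(looks_like_ko(quoted_universe))

# The unquoted form is the vector of K numbers
print(all(looks_like_ko(unquoted_universe)))
```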

Hmm, what if you remove the adjusted p-value cutoff?


I tried adjusting the p-value cutoff and minGSSize, but it did not work. I have 300 terms in gene and 9000 terms in kegg2gene, which should be enough for the analysis. Anyway, thank you for your help, NRC!


Hi! How did you solve your problem using the enrichKEGG() function? I get the same error message:

ca_kegg <- enrichKEGG(ca_list, organism = 'ko', keyType = 'kegg', universe = BBRB_KEGG, pAdjustMethod = "BH")

---> No gene can be mapped....
---> Expected input gene ID: K00895,K01810,K21622,K16370,K15779,K01218
---> return NULL...

ca_list is my list of DE gene IDs, and BBRB_KEGG is a data frame with two columns, gene IDs and KEGG annotations, that I got from Trinotate.

If you can help me, I'd really appreciate it!

3.3 years ago
Guangchuang Yu ★ 2.6k

It works. See also https://doi.org/10.1016/j.xinn.2021.100141.

