Hi all,
I'm doing KEGG enrichment for a non-model plant species. I annotated the genes of this species using BlastKOALA, with the taxonomy group set to Viridiplantae (ID 33090) and the KEGG GENES database set to family_eukaryotes. Then I tested a specific gene set for enrichment against the background of all annotated genes, using the enrichKEGG function of clusterProfiler (based on this thread).
enriched <- enrichKEGG(gene, organism = "ko", keyType = "kegg", universe = hakea_kegg, pAdjustMethod = "BH", minGSSize = 10, pvalueCutoff = 0.05, qvalueCutoff = 0.05)
Among the enriched KEGG terms I see non-plant pathways such as 'Axon regeneration', 'Tuberculosis', and 'Neurotrophin signaling pathway'. I'm not sure why this happens, because I've filtered the genes to keep only those whose top hits in NCBI NR are other plant sequences. I'm only interested in plant-related terms, so is there a way to filter these out?
Hi, did you figure out this problem? I'm having the same issue with genes from a non-model fungus.
Hey, sorry for the late reply, I didn't see this until now!
I found a script that gives all the KEGG IDs associated with plants. I assume it would also work if you changed the mentions of plants in the org data frame to fungi, or whatever group you want to filter for.
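For anyone landing here without the script, the idea can be sketched roughly as below. This is not the original script, just a minimal illustration that assumes the Bioconductor package KEGGREST is installed and KEGG is reachable. The helper names (kegg_org_codes, to_ko_id, group_pathways, filter_pathways) are made up for this example, and it assumes that keggList("organism") returns "organism" and "phylogeny" columns and that enrichKEGG with organism = "ko" reports IDs like "ko04075".

```r
# Sketch only: needs the Bioconductor package KEGGREST and network access to KEGG.
# All helper names below are hypothetical, not part of any package.

# Organism codes whose KEGG phylogeny string mentions `group` ("Plants", "Fungi", ...).
# Assumes keggList("organism") has columns named "organism" and "phylogeny".
kegg_org_codes <- function(group = "Plants") {
  org <- as.data.frame(KEGGREST::keggList("organism"), stringsAsFactors = FALSE)
  org$organism[grepl(group, org$phylogeny, fixed = TRUE)]
}

# Turn organism-specific pathway IDs ("path:ath04075" or "ath04075")
# into reference KO pathway IDs ("ko04075"), dropping duplicates.
to_ko_id <- function(ids) {
  unique(sub("^.*([0-9]{5})$", "ko\\1", ids))
}

# Reference pathway IDs that exist for at least one of the given organisms.
group_pathways <- function(org_codes) {
  ids <- unlist(lapply(org_codes, function(code) {
    Sys.sleep(0.5)  # pause between requests to go easy on the KEGG server
    names(KEGGREST::keggList("pathway", code))
  }))
  to_ko_id(ids)
}

# Keep only the rows of an enrichment table whose ID is in keep_ids.
filter_pathways <- function(result, keep_ids) {
  result[result$ID %in% keep_ids, , drop = FALSE]
}

# Usage (not run here because it queries KEGG):
# plant_ids <- group_pathways(head(kegg_org_codes("Plants"), 20))
# enriched@result <- filter_pathways(enriched@result, plant_ids)
```

Note that filtering after the fact leaves the adjusted p-values as they were computed across all pathways, non-plant ones included. If that matters, one alternative would be to restrict the pathway-to-KO mapping to the plant pathways first and run the test with clusterProfiler's enricher() and a custom TERM2GENE table.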
Hi, can you tell me where this script comes from? Thanks
I'm not sure where the KEGGREST part of the script came from, as I don't have the original anymore, sorry. The bash formatting and filtering lines are mine, though. I vaguely remember copying it from a KEGGREST guide, but I'm not certain.