clusterProfiler: How to extract genes of a specific GO term/pathway
4.4 years ago
ccha97 ▴ 60

I was wondering how to get the list of genes grouped under a particular GO term or biological pathway. I have an enrichGO output (from the clusterProfiler package), and from this result I've made a bar plot (see below). For example, my top enriched pathway is T cell activation. How do I obtain the genes that have been grouped into this pathway?

My code is taken from the clusterProfiler book:

library(clusterProfiler)
library(org.Mm.eg.db)  # mouse annotation database used as OrgDb

ego2 <- enrichGO(gene          = gene.df$ENTREZID,
                 OrgDb         = org.Mm.eg.db,
                 keyType       = 'ENTREZID',
                 ont           = "BP",
                 pAdjustMethod = "BH",
                 pvalueCutoff  = 0.01,
                 qvalueCutoff  = 0.05)

barplot(ego2, showCategory=10)

Here are some screenshots of what my data looks like in R:

[Screenshots of the data in R; images not available]

clusterprofiler GO-term goterm go term analysis • 7.7k views
4.4 years ago

ego2@result$geneID should give you a character vector with one string per enriched term, each containing the Entrez IDs of the genes in that term, delimited by "/". You could then split each string into its individual IDs via strsplit, which returns a list.
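A minimal sketch of the splitting step described above. The real ego2 object isn't available here, so a small toy data frame called res (a hypothetical name, with made-up Entrez IDs and only two of the columns) stands in for ego2@result:

```r
# Toy stand-in for ego2@result (hypothetical values, not real enrichment output).
res <- data.frame(
  Description = c("T cell activation", "leukocyte migration"),
  geneID      = c("111/222/333", "222/444"),
  stringsAsFactors = FALSE
)

# Split each "/"-delimited string into a character vector of Entrez IDs.
# fixed = TRUE treats "/" as a literal string rather than a regular expression.
gene_lists <- strsplit(res$geneID, "/", fixed = TRUE)

# The result is a list with one element per row (i.e. per enriched term).
print(gene_lists[[1]])      # IDs for the first term
print(lengths(gene_lists))  # number of genes per term
```

With the real object, the same call would presumably be strsplit(ego2@result$geneID, "/", fixed = TRUE).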


Thanks so much! This seemed to work, as I'm able to see the number of genes as well as their IDs. Is it possible to also have the name of the pathway they are involved in (right now they are numbered, where I assume pathway 1 = T cell activation) or do I just assume they are in the same order as my bar plot?


They will be ranked by p-value by default, so yes, they should match up correctly. To make things a bit easier, though, you could create a named vector by assigning the term descriptions as the names of your vector, i.e.:

go_genes <- ego2@result$geneID
names(go_genes) <- ego2@result$Description
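
Combining the naming idea with strsplit lets you look a pathway up by its description rather than by position. This is only a sketch: toy_result is a hypothetical stand-in for ego2@result, and its Entrez IDs are invented for illustration.

```r
# Hypothetical stand-in for ego2@result; IDs are made up for illustration.
toy_result <- data.frame(
  Description = c("T cell activation", "leukocyte migration"),
  geneID      = c("111/222/333", "222/444"),
  stringsAsFactors = FALSE
)

# Split the "/"-delimited IDs, then name each list element after its term.
go_genes <- strsplit(toy_result$geneID, "/", fixed = TRUE)
names(go_genes) <- toy_result$Description

# Pull out the genes for one pathway by name instead of by row number.
t_cell_genes <- go_genes[["T cell activation"]]
print(t_cell_genes)
```

Looking terms up by name avoids relying on the row order matching the bar plot.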

Thank you very much!


I had the same problem. Thank you :)


Thanks. Just a short note for anyone checking this nowadays: the slot is result (singular) and the column is Description (capital D), and R is case-sensitive, so this is the form that works:

go_genes <- ego2@result$geneID
names(go_genes) <- ego2@result$Description
3.4 years ago
Guangchuang Yu ★ 2.6k

See the FAQ and the examples in the clusterProfiler 4.0 article.
