Question

Visualisation of GO Terms

6.0 years ago

dthorbur ★ 3.0k

I have just finished getting my GO analysis results, or at least the first set of them. I am now trying to interpret the results, and my question is essentially: which broad gene functions are overrepresented in the genome scan I have just done?

For this I would like to classify my GO terms under second-order (for lack of a better term) GO terms. For example, I want to classify "GO:0045087: innate immune response" as Immune System Process, or a similar term. I have 3 sets of analyses at the moment, each with around 150-260 GO terms that are significant at p < 0.05 using topGO and a universe of autosomal genes. Is there a way to automate this process? I have seen GO hierarchies (e.g. http://www.informatics.jax.org/vocab/gene_ontology/GO:0007187), but I don't know if this is a consensus or just what someone has built for their study.

EDIT: I've removed the question about visualisation software. I found that using ggplot with geom_point, laid out like a heatmap, was a good way of visualising the similarities and differences in significant GO terms between my populations. I did have to keep only terms supported by at least 5 significant genes to get a reasonable number.

EDIT2: I am still looking for a tool that can take a list of GO terms and report which higher-level terms they belong to; level 2 is my goal. For example, I want to know how many significant GO terms in my list belong to "GO:0009987 cellular process" or "GO:0008152 metabolic process 68". I have looked over the topGO documentation, and if this is available there, I've missed it.

Thanks.

GO Gene Ontology TopGO • 5.4k views

If you want a tool to find parent-child or ancestor-offspring relationships between GO terms, take a look at the R package GO.db.

One warning though: GO terms are not always related one-to-one, and can have very complex relationships.

Reply • 6.0 years ago by Benn 8.4k

Thanks, I have seen GOBPCHILDREN and the ancestor mappings in that package. But as you said, it's not a 1:1 relationship, so I fear I would not get the correct counts for each of the parent terms.
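One way around the 1:1 worry is to accept that a term can count towards more than one higher-level term, and to tally each significant term once under every level-2 ancestor it reaches. The following is a minimal sketch of that idea, not a real GO lookup: the `PARENTS` map is a small hand-written toy subset, "level 2" is assumed to mean direct children of the root, and the function names are hypothetical. In practice the parent links would come from GO.db or an ontology file.

```python
from collections import Counter

# Toy child -> parents map, hand-written for illustration only.
# It is NOT loaded from the Gene Ontology; a real analysis would
# take these links from GO.db or an OBO file.
PARENTS = {
    "biological_process": [],
    "immune system process": ["biological_process"],
    "response to stimulus": ["biological_process"],
    "metabolic process": ["biological_process"],
    "immune response": ["immune system process", "response to stimulus"],
    "innate immune response": ["immune response"],
    "lipid metabolic process": ["metabolic process"],
}
ROOT = "biological_process"

# Assumption: "level 2" means the direct children of the root term.
LEVEL2 = {term for term, parents in PARENTS.items() if ROOT in parents}


def ancestors(term):
    """Return all ancestors of `term`, following every parent link."""
    seen = set()
    stack = list(PARENTS[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def level2_counts(terms):
    """Count each term once under every level-2 term it belongs to."""
    counts = Counter()
    for term in terms:
        # Include the term itself in case it is already a level-2 term.
        counts.update((ancestors(term) | {term}) & LEVEL2)
    return counts


significant = ["innate immune response", "lipid metabolic process", "immune response"]
counts = level2_counts(significant)
for name, n in sorted(counts.items()):
    print(name, n)
```

Because one term can sit under several level-2 terms, the counts can add up to more than the number of input terms. That is expected with a many-to-many hierarchy and should be kept in mind when comparing categories.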

Reply • 6.0 years ago by dthorbur ★ 3.0k

score 1 · Answer 1 · 2019-04-24

6.0 years ago

Benn 8.4k

There are many options to visualize GO terms or their similarity. I have made a tool that can do that as well, gogadget.

gogadget takes results from goseq and clusters similar GO terms based on their overlapping genes. Two similarity measures are available: the Pearson correlation and an overlap index (the percentage of genes shared between two GO terms).
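To make the overlap idea concrete, here is a minimal sketch of one possible overlap index between two GO terms. It is an assumption, not gogadget's actual code: the shared genes are divided by the size of the smaller gene set, and gogadget may use a different denominator. The function name and the gene sets are hypothetical.

```python
def overlap_percent(genes_a, genes_b):
    """Percentage of shared genes, relative to the smaller gene set.

    Assumed definition for illustration; the tool's own formula may differ.
    """
    a, b = set(genes_a), set(genes_b)
    if not a or not b:
        return 0.0  # no genes, so no meaningful overlap
    return 100.0 * len(a & b) / min(len(a), len(b))


# Hypothetical gene sets annotated to three GO terms.
term_a = {"g1", "g2", "g3", "g4"}
term_b = {"g3", "g4"}
term_c = {"g4", "g5", "g6", "g7"}

print(overlap_percent(term_a, term_b))
print(overlap_percent(term_a, term_c))
```

Computing this for every pair of significant terms gives a similarity matrix, which can then be turned into distances for hierarchical clustering.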

Another thing gogadget can do is export the results for Cytoscape analysis with EnrichmentMap, which connects similar GO terms in a network.

For some examples of the use of gogadget, look at the papers that use this tool for their analysis.

Thanks. I'll have a look at the tool.

Reply • 6.0 years ago by dthorbur ★ 3.0k

I saw one of the papers, "Integrated proteomic analysis of tumor necrosis factor α and interleukin 1β-induced endothelial inflammation", and I have a query regarding this figure: "Fig. 5. TNFα affects immune- and cardiovascular-specific processes in endothelial cells. Graphic representation of enriched GO terms in either of the analyses clustered in gogadget. (A) Based on the overlap of gene sets of a GO-term, GO-terms were clustered. Keywords represent the most notable GO-terms in a group." I have seen plenty of papers where they cluster all the GO terms. Is that done with your tool? I haven't installed and tried it yet.

Reply • 6.0 years ago by 1769mkc ★ 1.3k

The paper you mention was indeed done with gogadget. You say you have seen plenty of papers; I don't think they were done with this tool. The papers in the Google Scholar link are the ones I am aware of (some are reviews).

Reply • 6.0 years ago by Benn 8.4k

score 0 · Answer 2 · 2019-04-26

6.0 years ago

jared.andrews07 ★ 18k

GOSemSim is easy to use and another good option. It doesn't have built-in visualizations, but it can be used in conjunction with clusterProfiler for some good visualizations if you don't want to deal with the output manually.

score 0 · Answer 3 · 2019-04-29

In the end, I got the GO profile using the Bioconductor package goProfiles. I converted the Ensembl stable gene IDs to Entrez IDs using Ensembl's BioMart tool. There is a small amount of information loss at this step, as with almost every step, because some of the Ensembl genes did not have an Entrez ID. You then have control over what "level" you want to constrain your analysis to, and the output isn't hard to manipulate.

As for the clustering, I will probably avoid this as it doesn't actually answer my question at all.