I have just finished getting my GO analysis results, or at least the first set of results. I am now trying to interpret them, and my question is essentially which broad gene functions are overrepresented in the genome scan I have just done.
For this I would like to classify my GO terms under second-level (for lack of a better term) GO terms. For example, I want to classify "GO:0045087: innate immune response" as Immune System Process, or a similar term. I have three sets of analyses at the moment, and around 150-260 GO terms that are significant at p < 0.05 using topGO and a universe of autosomal genes. Is there a way to automate this process? I have seen GO hierarchies (e.g. http://www.informatics.jax.org/vocab/gene_ontology/GO:0007187), but I don't know whether this is a consensus or just what someone has built for their study.
EDIT: I've removed the question about visualisation software. I found that using ggplot with geom_point, but laid out like a heatmap, was a good way of visualising the similarities and differences between my populations for significant GO terms. I did have to restrict it to terms supported by at least 5 significant genes to get a reasonable number.
EDIT2: I am still looking for a tool that can take a list of GO terms and report which higher-level terms they belong to; level 2 is my goal. For example, I want to know how many significant GO terms in my list belong to "GO:0009987 cellular process" or "GO:0008152 metabolic process 68". I have looked over the topGO documentation, and if it's available there, I've missed it.
Thanks.
If you want a tool to find parent-child or ancestor-offspring relationships between GO terms, take a look at the R package GO.db.
One warning though: GO terms are not always related one-to-one, and can have very complex relationships.
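To make the many-to-many point concrete, here is a minimal Python sketch of the counting itself. It is only an illustration, not a GO.db workflow: the `PARENTS` dictionary is a small hand-written and simplified set of "is_a" edges (some intermediate terms are skipped), and the function names are made up for this example. With real data you would fill the child-to-parent map from the ontology, for instance from the ancestor mappings in GO.db. "Level 2" is assumed here to mean the direct children of the root term GO:0008150 (biological_process).

```python
from collections import Counter

ROOT = "GO:0008150"  # biological_process

# Toy child -> parents map. Hand-written and simplified for illustration;
# it is NOT a complete or authoritative copy of the ontology.
PARENTS = {
    "GO:0045087": ["GO:0006955", "GO:0006952"],  # innate immune response
    "GO:0006955": ["GO:0002376"],                # immune response
    "GO:0006952": ["GO:0006950"],                # defense response
    "GO:0006950": ["GO:0050896"],                # response to stress
    "GO:0006096": ["GO:0008152"],                # glycolytic process (intermediates skipped)
    "GO:0002376": [ROOT],                        # immune system process
    "GO:0050896": [ROOT],                        # response to stimulus
    "GO:0008152": [ROOT],                        # metabolic process
}


def ancestors(term, parents):
    """Return every term reachable from `term` by following parent edges."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def level2_terms(root, parents):
    """Return the direct children of the root (the assumed 'level 2')."""
    return {child for child, ps in parents.items() if root in ps}


def count_by_level2(terms, root, parents):
    """Count how many input terms fall under each level-2 term.

    A term with several level-2 ancestors is counted once under each of
    them, so the counts can add up to more than len(terms).
    """
    top = level2_terms(root, parents)
    counts = Counter()
    for term in terms:
        for hit in (ancestors(term, parents) | {term}) & top:
            counts[hit] += 1
    return counts


significant = ["GO:0045087", "GO:0006955", "GO:0006096"]
counts = count_by_level2(significant, ROOT, PARENTS)
for go_id, n in sorted(counts.items()):
    print(go_id, n)
```

In this toy graph "innate immune response" ends up under both "immune system process" and "response to stimulus", so the per-category counts sum to more than the number of input terms. That is expected for a DAG; the counts are best read as "terms falling under X" rather than as an exclusive partition of your list.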
Thanks, I have seen the GOBPCHILDREN and ancestor tool in that package. But as you said, it's not a 1:1 relationship so I fear I would not get the values correct for each of the parent terms.