Dear all,
I have recently been working on Affymetrix analysis, and in particular on GO analysis. I am starting from zero and am learning how to analyse these data, and I have a few problems with GO analysis.
For training, I am using data from this GEO profile: http://www.ncbi.nlm.nih.gov/geoprofiles/110093622
The associated paper using these data is available and can be read on ResearchGate.
I don't have any problem (until the next time) with pre-processing the data (normalization/background/filtering). I have a set of 145 down-regulated genes and 316 up-regulated genes, which is, in my opinion, not so big for GO analysis.
The main problem is that in the related paper, they have just written (I don't have access to the supplement, so I can't double-check) "We have performed GO analysis, and we have found that these biological processes were affected, for example Acute inflammatory response, or ECM organization". Then they selected the genes in these pathways and made a heatmap.
I have tried to use GOstat but I can't get any results. Basically, what I am doing is clustering the genes in a tree, then cutting the tree into different groups and performing enrichment analysis on these groups, but none of my groups contains anything significant. I am probably not using GOstat correctly, but this is not my main point. I am now focusing on Cytoscape with BiNGO and Enrichment Map. I can get GO pathways now, but I have a lot of them (nothing unusual, I guess). It's possible to see, for example, 3-4 genes in a small pathway, such as ossification or differentiation, which might interest me. However, I am completely lost in front of these big networks, edges, nodes, etc.
I have never had any lectures about GO analysis and I don't really know how I should, for example, filter the different pathways within this list in Cytoscape (with something other than the p-value). And I don't think that the method "OK, I have knocked out this gene, so I should get these differences, so I am going to select only the pathways related to these differences" is a good way. I have also tried GATHER and other internet tools, but it's pretty much the same thing: I have, let's say, 100 different pathways (with, for example, a biological process containing 250 genes) and I don't know what to do now.
I know that this is a very broad query, but do you have any method, for example, for analyzing GO results, or any tool to recommend that might help me?
Thank you and have a nice day!
I think your question is: how to do GO enrichment analysis with DEGs?
The DAVID tool is the easiest for beginners. Check here.
Just enter the list of genes on the left side (step 1) --> select the appropriate identifier (whether it is Affy 3' probe ID or Entrez gene ID) --> for step 3 just choose gene list --> then submit.
You will get GO/KEGG enrichment results. For more details, check the DAVID paper.
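To give an idea of what an enrichment result means, here is a minimal sketch of a plain hypergeometric over-representation test for a single GO term. This is the general idea behind tools of this kind, not necessarily the exact statistic DAVID reports. The function name and all the numbers in the example (background size, term size, number of hits) are made up for illustration; only the list size of 316 comes from the question.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term_background, n_list, n_term_list):
    """One-sided p-value P(X >= n_term_list).

    X is the number of term-annotated genes seen when n_list genes are drawn
    without replacement from n_background genes, of which n_term_background
    carry the GO term.
    """
    max_hits = min(n_term_background, n_list)
    total = comb(n_background, n_list)
    # Sum the hypergeometric probabilities of seeing k hits, for all k at
    # least as large as the observed number of hits.
    tail = sum(
        comb(n_term_background, k)
        * comb(n_background - n_term_background, n_list - k)
        for k in range(n_term_list, max_hits + 1)
    )
    return tail / total


# Hypothetical example: 12000 genes on the array, 150 of them annotated with
# the term, 316 up-regulated genes, 12 of which carry the term.
p = hypergeom_enrichment_p(12000, 150, 316, 12)
expected_hits = 316 * 150 / 12000      # hits expected by chance
fold_enrichment = 12 / expected_hits   # observed / expected
print(f"p = {p:.3g}, fold enrichment = {fold_enrichment:.2f}")
```

The background should be the genes that could have ended up in your list (for example, the genes on the array that passed filtering), not the whole genome. The fold enrichment gives a second number to look at besides the p-value.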
Googling GO enrichment analysis/functional enrichment analysis will give you more details.
HTH
Diwan
First of all, thank you for your answer.
I will try DAVID as you advised; it looks user-friendly and might help me!
Thanks again!
The normalization technique you used (likely) assumed that the expression of most genes is invariant. You are fine with your numbers. You can try http://geneontology.org/page/go-enrichment-analysis as well.
The only thing to keep in mind is that you will certainly find GO terms with a significant p-value if you look at a lot of them: say you end up with 20 terms at P<0.05, then there is a good chance that some of them are false positives. The tool you use should, hopefully, correct for this (using a method like Bonferroni correction).
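To make the multiple-testing point concrete, here is a small sketch in plain Python. It assumes, for simplicity, that the tests are independent, which GO terms are not, since they overlap. The helper names are mine and the p-values are made up. Benjamini-Hochberg is not mentioned above; I include it only as a commonly used, less strict alternative to Bonferroni.

```python
def family_wise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive among n_tests independent
    tests when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(f"{family_wise_error(20):.2f}")  # chance of >= 1 false positive in 20 tests

raw = [0.01, 0.04, 0.03]  # made-up p-values for three GO terms
print(bonferroni(raw))
print(benjamini_hochberg(raw))
```

Whichever tool you use, check which column of its output is the corrected p-value and filter on that one rather than on the raw p-value.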