Hello. I would like to ask if you believe that DAVID is ok for GO enrichment analysis or should I pick another web based or R tool? Thank you in advance.
You have to choose what kind of analysis you want to do:
1- Over-Representation Analysis (ORA): a widely used method to test whether genes in biological pathways are over-represented (enriched) in your list. Your list may come from a differential expression analysis. There are a lot of web tools and R packages you can choose from for this kind of analysis. As far as I know, DAVID's last update was in 2016. I am still a fan of this tool, but other tools like the ones mentioned in this thread should work fine too.
2- Gene Set Enrichment Analysis (GSEA): developed by the Broad Institute. This is the preferred method when the genes come from an expression experiment such as microarray or RNA-seq. The original methodology was designed for microarray data, but later modifications made it suitable for RNA-seq as well. In this approach, you rank your genes by a statistic (like the one DESeq2 provides) and then perform enrichment analysis against different pathways (= gene sets). You have to download the gene set file. The point is that the algorithm uses all genes in the ranked list for the enrichment analysis [in contrast to ORA, where only genes that passed a specific threshold (like the DE ones) are used]. You can find more details about the methodology in the original PNAS paper; here is a summary of why one should use this approach instead of ORA:
1- After correcting for multiple hypotheses testing, no individual gene may meet the threshold for statistical significance.
2- On the other hand, one may be left with a long list of statistically significant genes without any unifying biological theme.
3- Cellular processes often affect sets of genes acting in concert, so using ORA may lead you to miss important effects on pathways.
The GSEA software can be found on its homepage. However, there are some Bioconductor packages that use a similar approach to do GSEA; I like to use this one: fgsea
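To make the difference between the two approaches concrete, here is a minimal Python sketch using only the standard library. It is not the code of DAVID, the Broad GSEA software, or fgsea; the function names `ora_p_value` and `enrichment_score` and the toy gene names are my own assumptions for illustration. The ORA part is a one-sided hypergeometric test against an explicit background, and the GSEA part is only the weighted running-sum enrichment score, without the permutation step that real tools use to get a p-value.

```python
from math import comb


def ora_p_value(gene_list, gene_set, background):
    """One-sided hypergeometric p-value for over-representation (ORA).

    Only genes present in the background (universe) are counted.
    """
    background = set(background)
    gene_list = set(gene_list) & background
    gene_set = set(gene_set) & background
    N = len(background)             # size of the universe
    K = len(gene_set)               # universe genes annotated to the term
    n = len(gene_list)              # genes in the list of interest (e.g. DE genes)
    k = len(gene_list & gene_set)   # observed overlap
    # P(X >= k) where X ~ Hypergeometric(N, K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


def enrichment_score(ranked_genes, gene_set, weight=1.0):
    """GSEA-style running-sum enrichment score (no permutation test).

    ranked_genes: list of (gene, statistic) pairs, sorted by statistic, descending.
    """
    gene_set = set(gene_set)
    hit_weights = [abs(s) ** weight if g in gene_set else 0.0 for g, s in ranked_genes]
    total_hit = sum(hit_weights)
    n_miss = sum(1 for g, _ in ranked_genes if g not in gene_set)
    if total_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for (gene, _), w in zip(ranked_genes, hit_weights):
        if gene in gene_set:
            running += w / total_hit    # step up at a gene-set member
        else:
            running -= 1.0 / n_miss     # step down otherwise
        if abs(running) > abs(best):    # keep the maximum deviation from zero
            best = running
    return best


# Toy data (hypothetical gene names)
background = [f"g{i}" for i in range(20)]
pathway = ["g0", "g1", "g2", "g3", "g4"]
de_genes = ["g0", "g1", "g2", "g10"]
print(ora_p_value(de_genes, pathway, background))

ranked = [("a", 3.0), ("b", 2.0), ("c", 1.0), ("d", -1.0), ("e", -2.0)]
print(enrichment_score(ranked, {"a", "b"}))
```

Note that the ORA function only sees the genes that passed your cutoff plus the background you pass in, while the enrichment score walks over every gene in the ranked list, which is exactly the contrast described above.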
One of my first choices is AgriGO; I like the presentation of results. The downsides are that it is a web-based tool that can go offline at any time (and has done so) for days or weeks.
http://systemsbiology.cau.edu.cn/agriGOv2/
Plus there is a lack of clarity about the implementation of the algorithms.
I personally prefer g:Profiler, as it allows you to set your own background. It also has an R package.
Gene Set Clustering based on Functional annotation (GeneSCF)
Here is an example of AgriGO output. I find it very informative, and it is the main reason I recommend the service.
Thank you a lot for your reply, I'll check it! So do you think DAVID is outdated and not as useful anymore?
Hmm, the truth is I am interested in Homo sapiens, so maybe AgriGO is not the perfect tool?
Does it not have human annotations? They started out with a plant (and agricultural) focus, but I believe they have added many species now. Go to species and select human.
As for DAVID, that tool is also widely used; it has a lot of critics but supporters as well. The interface is counterintuitive, and the data is never corrected for multiple comparisons, so you can find just about anything, but then it is also more subjective.
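On the multiple comparisons point: whichever tool you end up using, if it gives you one raw p-value per GO term you can adjust them yourself. Below is a minimal sketch of the Benjamini-Hochberg (FDR) adjustment in plain Python. The function name `benjamini_hochberg` and the example p-values are made up for illustration, and this is not taken from any of the tools above.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # indices of the p-values, from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four GO terms
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

You would then filter terms on the adjusted values (for example below 0.05) instead of on the raw ones.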