Hi!
I have a set of gene identifiers (RefSeq) and I want to know which pathways these genes are present in, along with some statistical score.
Any suggestions for tools?
Thank you
I assume that you are looking for tools to perform pathway analysis / pathway enrichment. AFAIK, that's the only way you could get a score/p-value while assigning genes to pathways.
Sean's suggestion "DAVID/EASE" will be a good start. But I noticed that pathway databases like KEGG, Panther and Reactome were imported into the DAVID system in 2009. The original databases have very likely been updated since then.
If you can do some ID conversions and use BioConductor, you could try other tools like SubpathwayMiner, SPIA, gene2pathway etc. See this post for a detailed list of tools that can perform pathway enrichment. Tools like SubpathwayMiner import pathway annotations directly from KEGG for your analysis, so you can ensure that you are using up-to-date reference databases.
Also read these articles for background on pathway analysis in various contexts:
http://www.ncbi.nlm.nih.gov/pubmed/18463709
http://www.ncbi.nlm.nih.gov/pubmed/18207385
http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002053
http://www.ncbi.nlm.nih.gov/pubmed/21496265
http://www.ncbi.nlm.nih.gov/pubmed/21085203
I did a KEGG pathway enrichment both manually and via DAVID recently, and found that DAVID was missing nearly half of the pathways. The last update of the DAVID databases was in 2009, according to the page http://david.abcc.ncifcrf.gov/content.jsp?file=update.html. I think DAVID becomes harder to recommend as time goes by.
I do not know of a "statistical score" associated with a gene being in a pathway. However, there are many pathway databases. This is a reasonable conglomeration of those available. The DAVID/EASE system is a nice one for mapping genes to various gene sets, including pathways.
The Reactome Pathway Analysis tool analyzes user-supplied lists of genes, proteins and small molecules and provides ID mapping, pathway assignment and overrepresentation analysis. In the Overrepresentation analysis mode, the Pathway Analysis tool takes a user-supplied set of gene or protein identifiers and performs a statistical test to determine whether any Reactome pathways are overrepresented (enriched) in the submitted data, i.e. it answers the question 'does the list represent the proteins within a specific pathway more than would be expected if the set were random?'. A one-tailed Fisher's exact test is used to calculate the probability.
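To make the statistic concrete, here is a minimal sketch of an over-representation test using a one-tailed Fisher's exact test (equivalently, the upper tail of the hypergeometric distribution). It is not Reactome's implementation: the function names, the toy gene IDs and the pathway dictionary are all made up for illustration, and it assumes you have already mapped your identifiers to the same ID space as the pathway annotation and chosen a sensible background set.

```python
from math import comb


def overrepresentation_p(hits, list_size, pathway_size, background_size):
    """Upper-tail hypergeometric probability P(X >= hits).

    X is the number of pathway genes found when drawing `list_size` genes
    at random from a background of `background_size` genes, of which
    `pathway_size` belong to the pathway. This equals the one-tailed
    Fisher's exact test p-value for enrichment.
    """
    max_hits = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


def pathway_enrichment(gene_list, pathways, background):
    """Return {pathway name: p-value} for a gene list (hypothetical helper).

    Genes outside the background are ignored, both in the submitted list
    and in the pathway memberships.
    """
    background = set(background)
    genes = set(gene_list) & background
    results = {}
    for name, members in pathways.items():
        members = set(members) & background
        hits = len(genes & members)
        results[name] = overrepresentation_p(
            hits, len(genes), len(members), len(background)
        )
    return results


# Toy data only: ten background genes, one four-gene pathway.
toy_background = {f"g{i}" for i in range(1, 11)}
toy_pathways = {"toy_pathway_A": {"g1", "g2", "g3", "g4"}}
toy_result = pathway_enrichment(["g1", "g2", "g5"], toy_pathways, toy_background)
print(toy_result)
```

In a real analysis you would test many pathways at once, so the raw p-values need a multiple-testing correction before you interpret them.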
Ha... Didn't you work in the same room as the guys developing PathVisio?
PathVisio can do statistical analysis on pathways: both straightforward over-representation analysis and the more advanced (but often not really relevant, see some other pathway questions) gene set enrichment analysis. PathVisio has a plugin interface which you could use to add other types of analysis, if you don't mind coding them in Java.
It can use the pathway set available at WikiPathways, including the Reactome pathways (which are quite recent). It can also still use the set of KEGG-converted pathways that are available from the PathVisio download site.
RefSeq IDs and most others are fine for this, since PathVisio uses BridgeDb to map the identifiers.
Of course it is a good idea to always check what approaches are out there, even if you developed something yourself. If you have already collected part of the answer to a question yourself, it is often a good idea to put that information into your question. Taking away the need to describe the well-known encourages other people to share the extra things they know.
Yes, I did work in the same room but on a different project :) Initially, I was also thinking of using PathVisio, but then I came across GSEA (http://www.broadinstitute.org/gsea/index.jsp) and DAVID (http://david.abcc.ncifcrf.gov/). So I posted this question to learn more about the available options. :)
You need to be more precise than "some statistical score". What statistical test did you have in mind? Maybe over-representation?
P-value :) .......
In my opinion GSEA is good enough, and it's up to the individual whether they opt for PathVisio or DAVID (depending on the biological question they want to tackle). The R package (gene2pathway) and the articles suggested by Shameer are also good enough.