After many years of pathway analysis with a range of public and proprietary tools (IPA, GeneGo, GSEA, KEGG, GO etc.), I have developed a suspicion that pathway enrichment statistics, and results in general, may be biased by the high representation of cancer-related genes and pathways in the pathway databases. My question is twofold: 1) Do others share the view that there is an inherent bias towards cancer in pathway databases? 2) Can (or should) this be corrected for, and if so, how?
I've reviewed the literature on this and found nothing, but it is something that I have heard anecdotally on many occasions.
Thanks in advance for your thoughts.
Thanks Devon, yes, I also recognise the points you make! Regarding conclusions, it's probably both scenarios, and yes, I completely agree with your assessment. I guess the issue that remains is that when, to take your example, you report enrichment of PD- and AD-related genes, many (e.g. reviewers) will take this at face value, but demonstrating that it represents metabolic stress requires enrichment in a well-annotated metabolic stress pathway to support the statement. Sometimes this may exist, sometimes not. I guess this underlines the importance of a strong gene ontology against which to compare disease results.