How can I find the shared molecular pathways between two diseases? What types of analysis (gene enrichment analysis, etc.) and tools (ToppGene, etc.) can I use for this? Thank you.
To go from diseases to shared molecular pathways, the best approach is probably to go via genes/proteins. However, there are many ways to do so, and it is not the only option.
The first step would thus be to go to a database of disease-gene associations and retrieve the genes associated with each of the two diseases. This will yield either a set of genes for each disease or possibly a ranked list of genes for each disease (which can obviously be converted to sets by applying a cutoff if one so chooses).
If one goes with the set-based approach, I would first look at the intersection of the sets, i.e. the genes in common for the two diseases. If there are many genes in common, one could simply do enrichment analysis of these genes to find which pathways they fall in; however, it is debatable if this adds anything, since the shared genes themselves provide a molecular association between the diseases.
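As a minimal sketch of the set-based route, the intersection and a simple over-representation test could look like the code below. Everything in it is an assumption for illustration: the gene identifiers, the two pathways, and the background size of 20000 are placeholders, and the one-sided hypergeometric test is just one common choice of enrichment statistic, not the method of any particular tool.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k pathway genes when drawing
    n genes from a background of N genes, K of which are in the pathway."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def enrich(genes, pathways, background_size):
    """Return {pathway name: over-representation p-value} for a gene set."""
    results = {}
    for name, members in pathways.items():
        overlap = len(genes & members)
        results[name] = hypergeom_pvalue(
            overlap, len(genes), len(members), background_size
        )
    return results


# Placeholder data (hypothetical identifiers, not real associations).
disease_a = {"G1", "G2", "G3", "G4", "G5"}
disease_b = {"G1", "G3", "G4", "G6", "G7"}
pathways = {
    "pathway_X": {"G1", "G2", "G3", "G4", "G8"},
    "pathway_Y": {"G7", "G9", "G10", "G11"},
}

shared = disease_a & disease_b  # genes associated with both diseases
pvals = enrich(shared, pathways, background_size=20000)
```

In practice the background should be the set of genes that could have been reported by the disease-gene database, and the p-values would need correction for multiple testing across pathways.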
If there is not much of an intersection, I would do an enrichment analysis of the two sets separately. This will give two p-values for each pathway, which then need to be somehow combined. I would probably just use the worse of the two p-values as the combined p-value, since this ensures that the combined p-value will only be good when a pathway has a good p-value for both diseases.
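The "worse of the two p-values" rule can be sketched in a few lines. The pathway names and p-values below are made-up placeholders, and `combine_worst` is a hypothetical helper name; the inputs are assumed to be dictionaries mapping pathway names to p-values from two separate enrichment runs.

```python
def combine_worst(pvals_a, pvals_b):
    """Combine per-pathway p-values by taking the larger (worse) of the two.

    Only pathways tested for both diseases are kept. The result is sorted
    so that the pathways best supported in BOTH diseases come first.
    """
    common = pvals_a.keys() & pvals_b.keys()
    combined = {name: max(pvals_a[name], pvals_b[name]) for name in common}
    return sorted(combined.items(), key=lambda item: item[1])


# Placeholder enrichment results for two diseases.
pvals_disease_a = {"P1": 0.001, "P2": 0.2, "P3": 0.01}
pvals_disease_b = {"P1": 0.03, "P2": 0.0001, "P4": 0.5}

ranking = combine_worst(pvals_disease_a, pvals_disease_b)
```

Here P2 has an excellent p-value for the second disease but ends up below P1, because it is poorly supported for the first disease.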
If the database of disease-gene associations gives a ranked list, I might use a rank-based method to obtain a p-value for each pathway for each disease, which avoids introducing an arbitrary cutoff to turn the ranked lists into gene sets. Afterwards, I would combine the two p-values for each pathway the same way as described above.
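To make "rank-based method" concrete, here is a minimal sketch using a one-sided Wilcoxon rank-sum test with a normal approximation. This is only one possible choice (GSEA-style running-sum statistics are another), and the function name, gene identifiers, and pathways are placeholders I made up. The approximation ignores ties and is rough for very small pathways; a library routine such as `scipy.stats.mannwhitneyu` would be the more careful option.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_pvalue(ranked_genes, pathway):
    """One-sided p-value that pathway genes sit near the TOP of the list.

    ranked_genes: list of gene identifiers, strongest association first.
    pathway: set of gene identifiers; members missing from the list are ignored.
    """
    total = len(ranked_genes)
    ranks = [i for i, gene in enumerate(ranked_genes, start=1) if gene in pathway]
    m = len(ranks)
    if m == 0 or m == total:
        return 1.0  # nothing to compare against
    # Mean and variance of the rank sum under the null (no ties).
    mean = m * (total + 1) / 2
    var = m * (total - m) * (total + 1) / 12
    z = (sum(ranks) - mean) / sqrt(var)
    # A small rank sum means pathway genes are ranked high, so use the lower tail.
    return NormalDist().cdf(z)


# Placeholder ranked list of 20 genes for one disease.
ranked = [f"G{i}" for i in range(1, 21)]
p_top = rank_sum_pvalue(ranked, {"G1", "G2", "G3"})
p_bottom = rank_sum_pvalue(ranked, {"G18", "G19", "G20"})
```

Running this once per disease gives two p-values per pathway, which can then be combined by taking the worse one.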
Instead of relying on global enrichment analyses, one could also opt to retrieve a combined protein-protein interaction network and identify modules within the network that connect the two diseases. Afterwards, enrichment analysis of the genes in each module could give a list of pathways if needed, although I would argue that the modules themselves are probably a better description of the molecular commonality of the two diseases.
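A very crude stand-in for the module idea is sketched below: restrict the network to genes associated with either disease, take connected components, and keep those containing genes from both diseases. Real module detection would normally use a proper clustering algorithm and interaction confidence scores; the edge list, gene identifiers, and the `connecting_modules` helper are all hypothetical.

```python
from collections import defaultdict, deque


def connecting_modules(edges, genes_a, genes_b):
    """Connected components of the disease-gene subnetwork that contain
    at least one gene from each disease."""
    relevant = genes_a | genes_b
    adjacency = defaultdict(set)
    for u, v in edges:
        if u in relevant and v in relevant:  # drop edges to non-disease genes
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen = set()
    modules = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Breadth-first search to collect one connected component.
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        if component & genes_a and component & genes_b:
            modules.append(component)
    return modules


# Placeholder interaction network and disease gene sets.
edges = [("A1", "B1"), ("B1", "A2"), ("A3", "A4"), ("B2", "B3"), ("A1", "Z")]
genes_a = {"A1", "A2", "A3", "A4"}
genes_b = {"B1", "B2", "B3"}

modules = connecting_modules(edges, genes_a, genes_b)
```

Each resulting module can be passed to an enrichment analysis if a list of pathways is needed.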
Lastly, there is the option to not go via genes or proteins at all. Text mining of the biomedical literature could provide a (ranked) list of pathways associated with each disease, which can then obviously be analyzed to identify common pathways.
Thank you so much Mr. Jensen for the detailed explanation. Very helpful! God bless