I am using SAM to find differentially expressed genes among three groups, each representing a variant of a disease. I then ran pathway overrepresentation analysis (a hypergeometric test with correction for multiple testing) on the top-ranking genes from SAM (q < 0.05) to see which biological pathways from a curated database are overrepresented in each of the three groups. I want to visualize the result as a network that shows the following:
- Node Size: number of genes in pathway
- Edge Weight: degree of overlap between pathways
- Node Color: not sure what I could use here. Here are some candidates:
  - average d-score of interclass differentially expressed genes in a given pathway
  - average expression of all genes (regardless of whether they are in the SAM result) in a given pathway
  - q-value from the overrepresentation analysis (the corrected hypergeometric test)
Are there any technical criticisms of the three candidates I've proposed, or other suggestions that might prove more useful (i.e. for showing that a given variant of the disease depends in some way on activity, or lack of activity, in a given pathway)?
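
To make the proposed mapping concrete, here is a minimal Python sketch of how the three visual attributes could be computed. It is an illustration only. The pathway names, gene IDs, and q-values are made-up toy data. Using the Jaccard index for "degree of overlap" and -log10(q) for node color are assumptions, not choices I have settled on.

```python
import math
from itertools import combinations

# Hypothetical toy data: pathway -> set of member gene IDs (not real results).
pathways = {
    "pathway_A": {"g1", "g2", "g3", "g4"},
    "pathway_B": {"g3", "g4", "g5"},
    "pathway_C": {"g6", "g7"},
}

# Hypothetical corrected q-values from the overrepresentation test.
q_values = {"pathway_A": 0.001, "pathway_B": 0.02, "pathway_C": 0.04}


def jaccard(a, b):
    """Overlap between two gene sets: |A & B| / |A | B| (assumed metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Node size: number of genes in each pathway.
node_size = {name: len(genes) for name, genes in pathways.items()}

# Node color (one candidate): -log10(q), so more significant pathways score higher.
node_color = {name: -math.log10(q) for name, q in q_values.items()}

# Edge weight: overlap between each pair of pathways; pairs sharing no genes get no edge.
edge_weight = {}
for p1, p2 in combinations(sorted(pathways), 2):
    w = jaccard(pathways[p1], pathways[p2])
    if w > 0:
        edge_weight[(p1, p2)] = w
```

The resulting dictionaries could then be passed as node and edge attributes to whatever network tool is used for drawing. Swapping `node_color` for the average d-score or average expression per pathway would only change that one dictionary.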