When doing enrichment analysis using public pathway databases (e.g. KEGG, Reactome) or ontologies (e.g. Gene Ontology, Human Phenotype Ontology), I often get categories that contain both up- and down-regulated genes according to the gene expression data. I interpret this as a potential indicator that some process is "disrupted" or that my data might resemble a particular phenotype. However, I have noticed that people are mostly interested in up- or down-regulated processes, and they tend to equate these with enriched categories containing only up-regulated or only down-regulated genes. I think this view might not reflect how biological processes work (they can have both repressed and activated genes at the same time), so I am looking for answers to the following questions:
- How do you interpret enriched categories that contain both up- and down-regulated genes? Do you perform enrichment analysis on the whole set of differentially expressed genes (I have seen people analyze up- and down-regulated genes separately)?
- Do you know any bioinformatics tools or algorithms that go deeper into functional/ontology enrichment analysis and try to tell whether a particular biological process is indeed (de)activated (by inferring the state of the final products of the pathway, for example)?
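To make the first question concrete, here is a minimal sketch of a one-sided hypergeometric over-representation test run three ways: on up-regulated genes only, on down-regulated genes only, and on the whole differentially expressed set. The function names (`hypergeom_pvalue`, `enrichment_by_direction`) and the toy gene identifiers are made up for illustration; real tools add multiple-testing correction and a carefully chosen background, which this sketch leaves out.

```python
from math import comb


def hypergeom_pvalue(universe_size, category_size, selected_size, overlap):
    """One-sided P(X >= overlap) when drawing `selected_size` genes from a
    universe of `universe_size` genes, `category_size` of which are annotated."""
    upper = min(category_size, selected_size)
    total = comb(universe_size, selected_size)
    tail = sum(
        comb(category_size, i) * comb(universe_size - category_size, selected_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def enrichment_by_direction(universe, category, up_genes, down_genes):
    """Test one category against up-only, down-only and combined gene sets."""
    universe = set(universe)
    category = set(category) & universe
    gene_sets = {
        "up": set(up_genes),
        "down": set(down_genes),
        "all": set(up_genes) | set(down_genes),
    }
    results = {}
    for label, genes in gene_sets.items():
        genes = genes & universe
        overlap = len(genes & category)
        results[label] = {
            "overlap": overlap,
            "p_value": hypergeom_pvalue(
                len(universe), len(category), len(genes), overlap
            ),
        }
    return results


# Toy example (hypothetical gene IDs): a 6-gene category in a 20-gene universe,
# with half of the category up-regulated and half down-regulated.
universe = [f"g{i}" for i in range(20)]
category = ["g0", "g1", "g2", "g3", "g4", "g5"]
up = ["g0", "g1", "g2", "g10"]
down = ["g3", "g4", "g5", "g11"]

for label, stats in enrichment_by_direction(universe, category, up, down).items():
    print(label, stats["overlap"], f"{stats['p_value']:.4g}")
```

In this toy case the category is hit much harder by the combined set than by either direction alone, which is the situation described above: a "mixed" category that a separate up/down analysis would rank lower or miss.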
Thanks in advance.
1.
I like this answer:D
That's exactly my point, and another problem here is a weak understanding of how ontologies are composed. For phenotype-based ontologies I can get a list of genes for each category and a list of references from which the information was inferred. In most cases I don't have information on up- and down-regulation unless I go through each reference. But even then, let's say I find gene A up-regulated in 50 references and down-regulated in the other 50. It could mean the gene is totally irrelevant for the given category and needs to be excluded from the phenotype, or it could make perfect sense if its expression depends on the experimental factors (as it's hard to imagine 100 perfectly identical experiments).
I think this is a nice explanation, but sometimes I struggle to explain that it's not antagonistic to cases with "mixed" genes in the enriched category. I wrote this post to hear all the different opinions, because it was hard for me to put this into words, and I want to have a decent discussion the next time such a topic pops up.
2.
Very interesting, thanks for the suggestion, I will look into it. I wonder if it's possible to do the same for non-pathway-based ontologies, e.g. by inferring up- and down-regulation from the literature (basically the thing I wrote in 1 when answering your comments).
Super nice idea, not easy to implement though, and hard to get data unless produced specifically. But sounds like a nice scientific challenge.
So I think if you are seeing up-regulation of, say, both positive regulators and negative regulators of a pathway, then you might be seeing cells that are more sensitive to both activators and repressors of that pathway (remember, high expression of a positive regulator means nothing unless that positive regulator is post-transcriptionally activated).
If you have both up-regulation of activators and down-regulation of repressors, then I guess you are making the pathway easier to activate; vice versa (down-regulation of activators and up-regulation of repressors), you are making it harder to activate.
If you are seeing both up- and down-regulation of positive regulators (or of negative regulators), then you could be seeing a change in which signal the pathway will respond to, or a change in the kinetics (how long the effect of a signal will last in the network, for example).
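The three cases above can be written down as a small lookup, purely as bookkeeping for the heuristic and not as a model of any pathway. This is a rough sketch: the function name `summarize_pathway_shift`, the role labels and the returned strings are all made up for illustration, and combinations not covered by the cases above are returned as "unclear".

```python
def summarize_pathway_shift(regulators):
    """Classify a list of (role, direction) pairs using the heuristic above.

    role is "activator" or "repressor"; direction is "up" or "down".
    Any other label raises a KeyError.
    """
    counts = {
        ("activator", "up"): 0,
        ("activator", "down"): 0,
        ("repressor", "up"): 0,
        ("repressor", "down"): 0,
    }
    for role, direction in regulators:
        counts[(role, direction)] += 1

    act_up = counts[("activator", "up")] > 0
    act_down = counts[("activator", "down")] > 0
    rep_up = counts[("repressor", "up")] > 0
    rep_down = counts[("repressor", "down")] > 0

    # Same kind of regulator moving in both directions.
    if (act_up and act_down) or (rep_up and rep_down):
        return "changed signal specificity or kinetics"
    # Activators up and/or repressors down, nothing pulling the other way.
    if (act_up or rep_down) and not (act_down or rep_up):
        return "easier to activate"
    # Activators down and/or repressors up, nothing pulling the other way.
    if (act_down or rep_up) and not (act_up or rep_down):
        return "harder to activate"
    # Both activators and repressors up.
    if act_up and rep_up:
        return "more sensitive to both activators and repressors"
    return "unclear"


print(summarize_pathway_shift([("activator", "up"), ("repressor", "down")]))
print(summarize_pathway_shift([("activator", "up"), ("activator", "down")]))
```

Note that this only looks at transcript-level direction, so the caveat about post-transcriptional activation still applies to every label it returns.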
It is unlikely that these sorts of effects can be reasoned about intuitively without modelling.
Yep. Like, at least a Masters project if the data exists. A PhD if it doesn't.
Good science is hard, slow and there are no easy wins.