4.9 years ago
CY
Most curated pathway gene sets are simple gene lists. If each gene in the list were weighted or ranked based on its relative expression level, would such a gene set be more robust when used for pathway enrichment analysis?
Hi CY,
I'd guess that would be true if there were a direct relationship between gene expression and enzyme activity (one that is more or less the same for all genes).
There is also the ambiguity between gene-centered pathway annotation and the expression of gene isoforms, which might or might not have different functions, target organelles, or secretion pathways.
I would estimate it will take many years before we understand the molecular functions and relationships well enough to get robust pathway enrichment analyses from expression levels.
Best,
Michael
I guess that a direct relationship between gene expression and enzyme activity is the underlying assumption even for current non-weighted pathway gene sets.
The more important question is whether an activated pathway defines a fixed expression ranking within the gene set. Maybe, for either biological or technical reasons, both A>B>C and B>A>C (where A, B and C are the expression levels of specific genes) indicate activation of the same pathway.
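To illustrate the ranking point, here is a minimal sketch of a set-level score that ignores the order of genes inside the set. It is not a published enrichment method; the function name `mean_rank_score`, the gene names, and the expression values are all made up for the example. Genes are ranked within each sample, and the score is the mean rank of the set members, so A>B>C and B>A>C give the same result.

```python
def mean_rank_score(expression, gene_set):
    """Mean within-sample rank of the gene set members (1 = lowest expressed).

    `expression` maps gene name -> expression value for one sample.
    Hypothetical helper for illustration only.
    """
    # Rank all genes in the sample from lowest to highest expression.
    ordered = sorted(expression, key=expression.get)
    rank = {gene: i + 1 for i, gene in enumerate(ordered)}
    members = [g for g in gene_set if g in rank]
    if not members:
        raise ValueError("no gene of the set is present in the sample")
    return sum(rank[g] for g in members) / len(members)


# Made-up values: A and B swap places between the two samples.
sample_1 = {"A": 9.0, "B": 7.0, "C": 5.0, "D": 1.0, "E": 2.0}  # A > B > C
sample_2 = {"A": 7.0, "B": 9.0, "C": 5.0, "D": 1.0, "E": 2.0}  # B > A > C
pathway = ["A", "B", "C"]

score_1 = mean_rank_score(sample_1, pathway)
score_2 = mean_rank_score(sample_2, pathway)
print(score_1, score_2)
```

A score like this treats both orderings as equally consistent with activation. A scheme that fixes a ranking inside the gene set would penalize one of them.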
To my understanding, having a fitting gene expression profile is a necessary condition for a pathway to be activated in your sample, but it is not a sufficient one. Some enzymes might be present but in an inactive state.
To make robust statements, you could correlate gene expression with the corresponding enzymatic activity. If gene A's enzyme is a rate-limiting step in pathway 1 but not in pathway 2, and gene A is lowly expressed, that would increase the probability of pathway 2 over pathway 1.
I once tried to integrate the Km values from BRENDA into a Petri net of a metabolic pathway (manually), and it was a pain in the neck.
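As a rough sketch of the correlation step, assuming you have paired measurements of expression and enzyme activity for the same samples, a plain Pearson coefficient could look like this. The numbers are invented for illustration, and `pearson` is a hand-written helper, not a call into any particular library.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Invented paired measurements for gene A across five samples.
expression_a = [10, 20, 30, 40, 50]      # e.g. normalized counts
activity_a = [1.0, 2.1, 2.9, 4.2, 5.0]   # e.g. measured enzyme activity

r = pearson(expression_a, activity_a)
print(round(r, 3))
```

A high coefficient would only say that expression tracks activity for that one enzyme under those conditions. It would not tell you whether the step is rate-limiting.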
Not quite. The underlying assumption is that an increase in expression is correlated with an increase in activity (and vice versa), but that doesn't imply that 100 copies of protein A means the same thing as 100 copies of protein B.
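A small sketch of that distinction, with made-up counts: each gene is compared against its own baseline (here a log2 fold change with a pseudocount, which is an assumption of this example), rather than against the absolute level of another gene.

```python
import math


def log2_fold_change(control, treated, pseudocount=1.0):
    """log2 ratio of treated over control, with a pseudocount to avoid log(0)."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Invented counts: B is far more abundant than A in both conditions,
# but only A changes appreciably relative to its own baseline.
control = {"A": 100, "B": 4000}
treated = {"A": 400, "B": 4100}

lfc = {gene: log2_fold_change(control[gene], treated[gene]) for gene in control}
for gene, value in lfc.items():
    print(gene, round(value, 2))
```

Here B has the higher absolute count after treatment, yet A shows the larger relative increase, which is the quantity the assumption actually speaks to.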