Hi guys,
I am confused about how to interpret the results of single-sample gene set enrichment analysis (ssGSEA).
To be specific, I have expression data for the genes in a specific pathway, and I show them in a heatmap after log-transforming and z-scoring the FPKM matrix for each gene. The heatmap is shown below:
I then compressed the signature into a single score per sample with the ssGSEA algorithm in the GSVA R package, using the following command:
enrichscore.meta <- gsva(as.matrix(FPKM.HUGO),meta.sig,method="gsva")
The enrichment scores were then z-scored, truncated to [-1, 1], and shown in a scoremap. The scoremap is shown below:
As you can see in the heatmap, under the first (red) subtype there are two blocks of genes, one upregulated (red rectangle) and one downregulated (blue rectangle), and the split between them is roughly 50-50.
However, the corresponding scoremap is entirely blue for the first subtype (black rectangle), which would mean that this subtype is generally downregulated in this pathway (signature) compared with the other subtypes, which have relatively high scores. This doesn't make sense to me.
Is there anything wrong? I would appreciate any advice.
Many thanks in advance!
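For reference, here is a minimal sketch of the z-score and truncation step described above, in base R on a made-up score matrix. The object names (es, es.z, es.clip) and the values are hypothetical and not taken from the original post. One thing it illustrates: z-scoring across samples makes each value relative to the other samples, so a fully blue block means "lower than the cohort average for this signature", which is not necessarily the same as "downregulated" in absolute terms.

```r
# Toy enrichment-score matrix: rows = signatures, columns = samples (made-up values).
set.seed(1)
es <- matrix(rnorm(2 * 6), nrow = 2,
             dimnames = list(c("pathway_A", "pathway_B"), paste0("S", 1:6)))

# Z-score each signature (row) across samples.
# Vectors of length nrow(es) recycle down the columns, so this works row-wise.
es.z <- (es - rowMeans(es)) / apply(es, 1, sd)

# Truncate to [-1, 1] for plotting; values beyond the limits are clipped.
es.clip <- pmin(pmax(es.z, -1), 1)

print(round(es.clip, 2))
```

After this step every row has mean 0 by construction, so some samples will always come out blue and others red, whatever the absolute level of the pathway.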
What are you enriching your data against? What is contained in the object meta.sig? This is important to understand.
You can 'fix' the order of your heatmap. How you do this will depend on which heatmap package / function you are using. Which heatmap package / function are you using?
Thanks. meta.sig is a list containing 23 metabolism-related pathways. Each element of the list is a vector of the genes in that pathway. I am not asking about the order, because the order is fixed in both the heatmap and the scoremap. I just wonder why the enrichment score is entirely blue when the corresponding gene expression is not generally downregulated: as you can see, roughly 50% of the genes are upregulated and 50% are downregulated.
Why do you use method = 'gsva' and not method = "ssgsea", as you write? Also, does the first heatmap just show the gene expression? And is the signature made from all of the genes?
Sorry, that code is from another test; the actual score was calculated with the ssgsea method. The heatmap shows the expression of all genes included in the first pathway, which is the one shown in the scoremap. I used the gene list for each pathway to make a signature. meta.sig contains 23 pathways, so I get 23 signatures, but I only show the one that confuses me.
Ahh okay - did you turn off the ssgsea post-normalization (via the ssgsea.norm argument - on by default)? Else that could explain it :-)
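A small side note on the post-normalization, as a sketch under an assumption: as far as I understand the GSVA documentation, ssgsea.norm divides all scores by one number (the difference between the maximum and the minimum score). If that is right, it should not change a scoremap that is z-scored per signature afterwards, because a z-score is unaffected by dividing by a positive constant. The base R snippet below only demonstrates that arithmetic on made-up numbers; it does not call GSVA, and es.raw, es.norm and row.z are hypothetical names.

```r
# Toy raw enrichment scores: rows = signatures, columns = samples (made-up values).
set.seed(7)
es.raw <- matrix(rnorm(3 * 5, mean = 2000, sd = 500), nrow = 3,
                 dimnames = list(paste0("pathway_", 1:3), paste0("S", 1:5)))

# ASSUMED form of the post-normalization: divide everything by the overall score range.
es.norm <- es.raw / diff(range(es.raw))

# Row-wise z-score helper (one signature at a time, across samples).
row.z <- function(m) (m - rowMeans(m)) / apply(m, 1, sd)

# Compare the z-scored matrices with and without the rescaling.
same <- isTRUE(all.equal(row.z(es.raw), row.z(es.norm)))
print(same)
```

So the normalization changes the raw numbers, but the colours in a per-signature z-scored scoremap would stay the same under this assumption.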
No, I left this parameter at its default. OMG, it is really confusing... Maybe I should just calculate the average value for each signature; at least that would be more interpretable.
I actually think that is a good idea, as when you test many signatures you are mostly interested in comparing across those signatures :-)
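The averaging idea could look roughly like the sketch below: log-transform and z-score each gene across samples, then take the mean z-score of the genes in each signature for every sample. This is base R on simulated data; expr, sig.list and mean.score are hypothetical stand-ins for FPKM.HUGO, meta.sig and the resulting score matrix, and the log2(x + 1) transform is an assumption about the preprocessing.

```r
# Simulated FPKM-like matrix: rows = genes, columns = samples (made-up values).
set.seed(42)
expr <- matrix(rexp(8 * 5, rate = 0.1), nrow = 8,
               dimnames = list(paste0("gene", 1:8), paste0("S", 1:5)))

# Hypothetical signature list; "gene9" is absent from expr on purpose.
sig.list <- list(pathway_A = c("gene1", "gene2", "gene3"),
                 pathway_B = c("gene4", "gene5", "gene6", "gene9"))

# Log-transform, then z-score each gene (row) across samples.
expr.log <- log2(expr + 1)
expr.z <- (expr.log - rowMeans(expr.log)) / apply(expr.log, 1, sd)

# Mean z-score per signature and sample, using only genes present in the matrix.
mean.score <- t(sapply(sig.list, function(genes) {
  genes <- intersect(genes, rownames(expr.z))
  colMeans(expr.z[genes, , drop = FALSE])
}))

print(round(mean.score, 2))
```

With this score, a signature in which half of the genes are up and half are down by similar amounts ends up near zero, which matches what the heatmap shows more directly than a rank-based score does.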