Hi all, I have RNA-seq data for 81 samples. I ran ssGSEA 2.0 (I cloned it from the Broad Institute's GitHub repository and sourced the code in R). After running the analysis a first time, I was curious whether the score generated for any given sample was affected by the other samples present. I re-ran the analysis using half of the samples and, to my surprise, obtained slightly different results. However, when I repeated the analysis with exactly the same input and parameters, I again obtained different results.
I conclude that the algorithm is somehow stochastic, which would explain the different results despite identical input parameters. However, I still do not know whether the presence of other samples influences the score of a given sample. Would anyone be able to enlighten me?
Thanks in advance.
That is interesting. I have never seen such an issue when running it. You could also use other packages for single-sample analysis; I would recommend trying GSVA and singscore.
Thank you - I have since switched to GSVA because I find it much simpler to use in R.
I would be cautious, as GSVA is heavily influenced by the other samples present in the analysis group: a CDF of each gene's expression is built from all samples, and the gene ranks are derived from it.
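To make that point concrete, here is a small toy sketch (plain Python, not the GSVA or ssGSEA code itself). It assumes a made-up 3-gene, 4-sample matrix, and it uses a plain empirical CDF as a stand-in for the kernel-based CDF estimate that GSVA actually uses, so treat it as an illustration of the idea only. The function names are hypothetical. It compares ranking genes within a single sample against scoring each gene relative to all samples, once on the full matrix and once on the first half of the samples.

```python
from bisect import bisect_right

def within_sample_ranks(sample_values):
    """Rank genes using one sample only (1 = lowest expression)."""
    order = sorted(range(len(sample_values)), key=lambda i: sample_values[i])
    ranks = [0] * len(sample_values)
    for rank, gene_index in enumerate(order, start=1):
        ranks[gene_index] = rank
    return ranks

def across_sample_ecdf(gene_values):
    """For one gene, the fraction of samples with expression <= each value.

    Simplification: a plain empirical CDF, assumed here in place of
    GSVA's kernel-based CDF estimate.
    """
    ordered = sorted(gene_values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in gene_values]

# Hypothetical toy data: rows are genes, columns are samples.
expr = {
    "geneA": [5.0, 1.0, 3.0, 9.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [7.0, 2.0, 1.0, 3.0],
}
genes = list(expr)
subset = {g: values[:2] for g, values in expr.items()}  # keep half the samples

def sample_column(matrix, j):
    """Expression of every gene in sample j."""
    return [matrix[g][j] for g in genes]

# Within-sample ranks for sample 0: identical with or without the other samples.
full_ranks = within_sample_ranks(sample_column(expr, 0))
subset_ranks = within_sample_ranks(sample_column(subset, 0))

# Across-sample CDF values for sample 0: these shift when samples are dropped.
full_ecdf = [across_sample_ecdf(expr[g])[0] for g in genes]
subset_ecdf = [across_sample_ecdf(subset[g])[0] for g in genes]

print("within-sample ranks :", full_ranks, subset_ranks)
print("across-sample CDF   :", full_ecdf, subset_ecdf)
```

For sample 0, the within-sample ranks are [2, 1, 3] in both runs, while the across-sample values change from [0.75, 0.25, 1.0] to [1.0, 0.5, 1.0] once half of the samples are removed. Any score built on the second kind of statistic will therefore depend on which samples are in the cohort.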
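You can run both ssGSEA and GSVA with the GSVA package. You just need to change one parameter: in versions where gsva() takes a method argument, set method = "ssgsea" instead of the default "gsva".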