I'm trying to understand how GSVA works behind the scenes, and I was wondering whether there is a more intuitive way to understand the whole process.
According to the paper, it starts by evaluating whether gene i is highly or lowly expressed in sample j in the context of the sample population distribution. To do this, they use a kernel estimate of each gene's cumulative distribution function to transform the initial expression values, so that the result is not affected by problematic intensities.
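To make my reading of this first step concrete, here is a minimal Python sketch. It is not the GSVA package's code: the function name `kernel_cdf_scores` is made up, the Gaussian kernel and the bandwidth of one quarter of each gene's standard deviation are my assumptions about the paper, and genes with constant expression are not handled.

```python
import math
import numpy as np

def kernel_cdf_scores(expr):
    """expr: genes x samples array. Returns an array of the same shape
    with values in (0, 1): how high gene i is in sample j relative to
    that gene's values across all samples."""
    expr = np.asarray(expr, dtype=float)
    # Per-gene bandwidth (assumption: sample standard deviation / 4).
    h = expr.std(axis=1, ddof=1, keepdims=True) / 4.0
    # diff[i, j, k] = (x_ij - x_ik) / h_i
    diff = (expr[:, :, None] - expr[:, None, :]) / h[:, :, None]
    # Standard normal CDF applied elementwise.
    phi = 0.5 * (1.0 + np.vectorize(math.erf)(diff / math.sqrt(2.0)))
    # Average the Gaussian CDFs centred on each sample's value.
    return phi.mean(axis=2)

# Toy data: 2 genes, 4 samples.
expr = np.array([[1.0, 2.0, 3.0, 10.0],
                 [5.0, 5.5, 6.0, 4.0]])
z = kernel_cdf_scores(expr)
```

If I understand correctly, the point is that each value is compared only with the same gene's values in the other samples, so every gene ends up on a common 0 to 1 scale regardless of its raw intensity.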
After this "transformation" and a subsequent normalization, GSVA calculates the enrichment scores using a Kolmogorov-Smirnov (KS)-like random walk statistic.
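Here is how I currently picture that random walk for a single sample and a single gene set, again as a rough Python sketch under my own assumptions rather than the package's implementation. The function names are hypothetical, the rank centring (rank minus p/2) and the weight exponent `tau` are my reading of the paper, ties are ignored, and the case where all genes in the set have zero weight is not handled.

```python
import numpy as np

def ks_random_walk(z, in_set, tau=1.0):
    """z: transformed values of all p genes in ONE sample.
    in_set: boolean mask marking the genes that belong to the gene set."""
    z = np.asarray(z, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)
    p = z.size
    # Rank genes within the sample (1 = lowest), then centre the ranks
    # so that both tails of the ranking get large absolute weights.
    ranks = np.argsort(np.argsort(z)) + 1
    centred = ranks - p / 2.0
    order = np.argsort(-centred, kind="stable")  # walk from top to bottom
    weights = np.abs(centred[order]) ** tau
    hit = in_set[order]
    # Step up (weighted) at genes in the set, step down (uniform) otherwise.
    up = np.where(hit, weights, 0.0)
    up = up / up.sum()
    down = np.where(hit, 0.0, 1.0) / (~hit).sum()
    return np.cumsum(up - down)

def enrichment_scores(walk):
    # Variant 1: the single largest deviation from zero (keeps its sign).
    max_dev = walk[np.argmax(np.abs(walk))]
    # Variant 2: largest positive minus largest negative deviation.
    diff = max(walk.max(), 0.0) - max(-walk.min(), 0.0)
    return max_dev, diff

# Toy example: 6 genes, the two top-ranked genes form the gene set.
z = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
in_set = [True, True, False, False, False, False]
walk = ks_random_walk(z, in_set)
max_dev, diff = enrichment_scores(walk)
```

In this toy case the walk climbs while it passes the two genes of the set and then drifts back down to zero, so both scores are at their maximum. What I would like to confirm is whether this picture is the right one.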
As far as I know, the Kolmogorov-Smirnov test checks for differences between distributions. Which distributions does it compare here? The genes in the gene set against all the other genes? And what is the role of the random walk?
So is there an intuitive way to understand this Kolmogorov-Smirnov (KS)-like random walk statistic? How does it actually work? What are the null and alternative hypotheses in this case?