Hello,
I am trying to implement an R function that performs GSEA.
I have read many papers on this method, and each of them tries to tear down the others and show that its own method performs better (that is what we do as scientists :-D )
Anyway, what I am working on now is understanding how the running sum is used to calculate the score.
The running sum is used to compute the Enrichment Score over a gene set.
- How is a gene set defined? For example, if I have over 20000 genes, can I say the first 200 are one set and the rest are another set?
- How is it calculated? This is what the papers say:
a Kolmogorov-Smirnov (K-S) running sum statistic is computed: beginning with the top-ranking gene, the running sum increases when a gene annotated to be a member of gene set S is encountered and decreases otherwise
Can someone explain how this technique works?
Can it be done for one sample? If not, why?
You can actually download and have a look at how the authors implemented the original GSEA in R: http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/R-GSEA_Readme
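
To give a rough idea of the running sum before you dig into that code, here is a minimal sketch in R. It is not the authors' implementation: the function name enrichment_score, its arguments and the toy data are all made up for illustration. With p = 0, each gene in the set adds 1/N_H and each gene outside it subtracts 1/(N - N_H), which matches the unweighted K-S-like statistic in your quote. As far as I understand the weighted variant, p > 0 scales each hit by |score|^p. Note that gene_set is assumed to be a predefined set (for example a pathway annotation), not an arbitrary slice of the ranked list.

```r
# Hypothetical helper: running-sum enrichment score for one gene set.
# ranked_genes: gene IDs sorted so that the top-ranking gene comes first
# scores:       ranking metric for each gene, in the same order
# gene_set:     gene IDs annotated as members of the set S
# p:            weight exponent; p = 0 gives the unweighted statistic
enrichment_score <- function(ranked_genes, scores, gene_set, p = 0) {
  in_set <- ranked_genes %in% gene_set
  n_hit  <- sum(in_set)
  n_miss <- length(ranked_genes) - n_hit
  if (n_hit == 0 || n_miss == 0) {
    stop("gene set must contain some, but not all, of the ranked genes")
  }

  # Hits are weighted by |score|^p; misses all get the same penalty.
  hit_weight <- ifelse(in_set, abs(scores)^p, 0)
  if (sum(hit_weight) == 0) {
    stop("all hits have zero weight; use p = 0 or a different metric")
  }
  step <- ifelse(in_set, hit_weight / sum(hit_weight), -1 / n_miss)

  # Walk down the ranked list and accumulate the steps.
  running_sum <- cumsum(step)

  # The enrichment score is the maximum deviation from zero (sign kept).
  es <- running_sum[which.max(abs(running_sum))]
  list(es = es, running_sum = running_sum)
}

# Toy data (assumed, not from any real experiment)
ranked_genes <- paste0("g", 1:10)
scores       <- seq(5, -4, by = -1)
gene_set     <- c("g1", "g2", "g4")

res <- enrichment_score(ranked_genes, scores, gene_set, p = 0)
print(res$es)          # peak of the running sum
print(res$running_sum) # ends at zero by construction
```

In this toy case the three set members sit near the top of the list, so the running sum climbs early and peaks at 6/7 at the fourth gene before drifting back to zero. Significance would still have to be assessed separately, for example by permutation, which this sketch does not cover.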