As part of a larger project I've implemented GSEA in Matlab. I want to test my code by comparing my output (p-values for KEGG pathways, say) with another implementation. My implementation of the GSEA algorithm starts from a ranked list of gene IDs with real-valued weights, so I want to upload the same list to the tool used for comparison. However, as I look around at tools for GSEA, most of them seem to start at an earlier point and require the original gene data. That would leave scope for differences in preprocessing to affect the output, whereas I want a pure comparison of the GSEA part.
What would you say is the simplest way to perform GSEA (not another enrichment algorithm) on a weighted list of Entrez Gene IDs?
I'm fine if it involves some coding, but don't want to get bogged down editing large amounts of other people's code to perform what should be a straightforward test to check that my results are comparable with the expected results.
If you want to compare your implementation of GSEA with others, I'd recommend, as @vodka suggested, using the Broad's program: it's developed by the original authors and is pretty fast compared to other implementations. Just beware that it reports a p-value of zero when no permutation gave a more extreme statistic; this is biased and is not a recommended method (http://www.statsci.org/smyth/pubs/permp.pdf).
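To make the zero p-value point concrete, here is a minimal Python sketch of the two ways of turning permutation statistics into a p-value. The function names are mine, and the `(b + 1) / (m + 1)` form is the correction discussed in the linked paper; this is only an illustration, not the code of any particular tool.

```python
import random


def naive_p_value(observed, null_stats):
    """Fraction of permutation statistics at least as extreme as observed.

    Returns exactly 0 when no permutation reaches the observed value.
    """
    b = sum(1 for s in null_stats if s >= observed)
    return b / len(null_stats)


def corrected_p_value(observed, null_stats):
    """Same count, but with the +1 correction, so the result is never 0."""
    b = sum(1 for s in null_stats if s >= observed)
    m = len(null_stats)
    return (b + 1) / (m + 1)


# Toy null distribution: 1000 draws from a standard normal (illustrative only).
rng = random.Random(0)
null_stats = [rng.gauss(0.0, 1.0) for _ in range(1000)]
observed = 10.0  # far beyond anything in the null sample

print(naive_p_value(observed, null_stats))      # 0.0
print(corrected_p_value(observed, null_stats))  # 1/1001
```

When comparing your p-values against another tool, it is worth checking which of the two conventions each side uses, since they differ most for the strongest pathways.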
However, I would recommend checking out the R package https://github.com/ctlab/fgsea (disclaimer: I'm the author of this package) rather than implementing the GSEA method yourself: your own version will likely be slow if you're testing many pathways. You'd probably be interested in this package because it implements a special algorithm that simultaneously builds the background distribution for all the gene set sizes, so it works much faster (up to a hundred times) than all the versions I'm aware of, while giving the same p-values.
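If you do keep your own implementation for testing, the quantity being compared is the weighted running-sum enrichment score of preranked GSEA. Below is a rough Python sketch, assuming the list is already sorted by decreasing weight and the usual exponent of 1. The function and argument names are hypothetical and not taken from any package, and real tools may differ in details such as tie handling.

```python
def enrichment_score(ranked_genes, weights, gene_set, exponent=1.0):
    """Weighted Kolmogorov-Smirnov-like running sum over a ranked gene list.

    ranked_genes: gene IDs sorted by decreasing weight.
    weights: the corresponding real-valued weights, in the same order.
    gene_set: set of gene IDs belonging to the pathway.
    Returns the running-sum value with the largest absolute deviation from 0.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_total = len(ranked_genes)
    n_hits = sum(in_set)
    if n_hits == 0 or n_hits == n_total:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    # Hits step up in proportion to |weight|^exponent; misses step down equally.
    hit_norm = sum(abs(w) ** exponent for w, h in zip(weights, in_set) if h)
    if hit_norm == 0:
        raise ValueError("all genes in the set have zero weight")
    miss_step = 1.0 / (n_total - n_hits)

    running = 0.0
    best = 0.0
    for w, h in zip(weights, in_set):
        if h:
            running += abs(w) ** exponent / hit_norm
        else:
            running -= miss_step
        if abs(running) > abs(best):
            best = running
    return best


genes = ["a", "b", "c", "d"]
weights = [4.0, 3.0, 2.0, 1.0]
print(enrichment_score(genes, weights, {"a", "b"}))  # set at the top of the list
print(enrichment_score(genes, weights, {"c", "d"}))  # set at the bottom
```

A set concentrated at the top of the list gives a score close to +1 and one concentrated at the bottom a score close to -1, which is a quick sanity check before comparing p-values.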
I am also trying fgsea, and it looks very easy and fast compared to GSEA, but how can I use my own geneset.gmt file and ranked file? So far I have tried it with the example data.
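On the file side, both inputs are plain tab-separated text, so it is easy to inspect them yourself before handing them to whichever tool you use. Here is a rough Python sketch, assuming the standard GMT layout (set name, description, then gene IDs on each line) and a two-column ranked file (gene ID, score). `parse_gmt` and `parse_rnk` are hypothetical helper names and are not part of fgsea.

```python
def parse_gmt(lines):
    """Map each gene set name to its set of gene IDs.

    Assumes each line is: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = {g for g in genes if g}
    return gene_sets


def parse_rnk(lines):
    """Map each gene ID to its score from a two-column tab-separated file."""
    ranks = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        try:
            gene, score = line.split("\t")[:2]
            ranks[gene] = float(score)
        except ValueError:
            continue  # skip a header row or a line without a numeric score
    return ranks


# Tiny in-memory examples; with real files, pass open(path) instead of a list.
gmt_lines = ["PATHWAY_A\tdescription\t1017\t1019\t1021\n"]
rnk_lines = ["# gene\tscore\n", "1017\t2.5\n", "1019\t-0.3\n"]
print(parse_gmt(gmt_lines))
print(parse_rnk(rnk_lines))
```

Checking that the gene IDs in the two files use the same identifier type (for example, both Entrez) avoids the most common cause of empty results.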
Perfect, just what I needed.