4 weeks ago
Bine
Good morning,
I was wondering if someone could advise me on the following: I have performed a GSEA on KEGG_LEGACY, Reactome and Hallmarks gene sets using FGSEA.
I receive a lot of significant pathways as hits:
- E.g. in Hallmarks, most of the pathways (I think 50 in total) are significant (only around 20 are not significant).
- In KEGG_LEGACY and Reactome there are also a lot of significant pathways...
I don't know if this is possible, and if so, is there a way to make the results a bit easier to interpret? How can I make sense of all this?
Thanks a lot!
What is the dataset you are analysing and what statistic are you using as input? Also what is your cutoff for significance?
Good afternoon, thank you for your answer. My dataset is TCGA colorectal cancer and I am using the stat column (which I got from my DESeq2 analysis) as input for FGSEA. My cutoff for significance is an adjusted p-value of 0.05.
Please also find my code below:
Thank you!
I am not sure DESeq2 is a good choice when you have hundreds of samples to compare in each group. https://genomebiology.biomedcentral.com/articles/10.1186/s13059-022-02648-4
FWIW, this issue wouldn't affect GSEA per se. The guideline for GSEA is to run the whole dataset ranked by a statistic (e.g. LFC) rather than prefiltering by significance, since GSEA does its own permutation testing. In that sense, an overly conservative FDR in your DE analysis shouldn't affect downstream GSEA results unless you're specifically prefiltering on significance, in which case you're already biasing your GSEA output.
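To make the "rank everything, filter nothing" point concrete, here is a minimal sketch in plain Python. It is not the original poster's code and not the fgsea API: the function name `build_ranked_list`, the row layout, and the numbers are made up for illustration. The rule of keeping the largest absolute statistic for a duplicated gene symbol is just one common convention, assumed here.

```python
import math

def build_ranked_list(de_results):
    """Rank every tested gene by its test statistic, with no padj filter.

    de_results: list of dicts with hypothetical keys "gene", "stat", "padj".
    Returns a list of (gene, stat) pairs sorted from highest to lowest stat.
    """
    best = {}
    for row in de_results:
        stat = row["stat"]
        # Drop genes with no usable statistic (missing or NaN).
        if stat is None or math.isnan(stat):
            continue
        gene = row["gene"]
        # Assumed convention: for duplicated symbols keep the largest |stat|.
        if gene not in best or abs(stat) > abs(best[gene]):
            best[gene] = stat
    # Note: padj is never consulted, so non-significant genes stay in the list.
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)

# Toy values for illustration only, not real TCGA results.
de_results = [
    {"gene": "MYC", "stat": 8.2, "padj": 1e-12},
    {"gene": "APC", "stat": -6.5, "padj": 1e-8},
    {"gene": "GAPDH", "stat": 0.3, "padj": 0.81},
    {"gene": "TP53", "stat": -1.1, "padj": 0.27},
    {"gene": "XIST", "stat": float("nan"), "padj": None},
]

ranked = build_ranked_list(de_results)
for gene, stat in ranked:
    print(gene, stat)
```

The genes with high adjusted p-values stay in the ranked list; only the entry without a statistic is dropped. That full ranked vector is what would then be handed to the enrichment step.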
I ran it on all 18,000 genes (not only the significant ones). The total list was 30,000 or so, but I did some prefiltering of the data following the DESeq2 pipeline: