Permutations FGSEA
5 weeks ago
Bine ▴ 90

Good afternoon,

I was wondering if someone could advise me on the following:

I am running FGSEA on my DESeq2 results against the Hallmark gene sets, using the code below:

# Prepare results from DESeq2: sort by the test statistic, highest first
res <- res[order(-res$stat), ]
ranks <- res$stat
names(ranks) <- res$hgnc_symbol

# Run gene set enrichment analysis
fgseaRes <- fgsea(pathways = pathways.hallmark, stats = ranks, nperm = 10000)

However, my pathway results change significantly when I change the number of permutations (e.g. from 1000 to 10000). Which number of permutations should I choose?

Thank you!

fgsea • 717 views

You can find your answer in a previous post here: nperm value in GSEA setting


The smallest p-value you can get with 1000 permutations is 1/1000; if you increase nperm to 10000, the smallest p-value you can get is 1/10000.
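
A minimal sketch of why this floor exists, in base R and without fgsea itself: a permutation p-value is a count of null statistics at least as extreme as the observed one, divided by the number of permutations, so it cannot go below roughly 1/nperm. The helper `perm_pvalue`, the standard normal null, and the observed value of 10 are assumptions for illustration only; with the +1 correction used here the floor is 1/(nperm + 1), and fgsea's own estimator differs in detail.

```r
# Toy empirical p-value: fraction of null draws at least as large as the
# observed statistic, with a +1 correction so the estimate is never exactly 0.
perm_pvalue <- function(observed, null_draws) {
  (sum(null_draws >= observed) + 1) / (length(null_draws) + 1)
}

set.seed(1)
observed <- 10  # far beyond anything a standard normal null will produce

p_1k  <- perm_pvalue(observed, rnorm(1000))
p_10k <- perm_pvalue(observed, rnorm(10000))

# No null draw reaches the observed value, so each estimate sits at its floor.
print(c(nperm_1000 = p_1k, nperm_10000 = p_10k))
```

For a strongly enriched pathway, the reported p-value can therefore drop by about a factor of ten simply because nperm was raised from 1000 to 10000.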

5 weeks ago
alserg ▴ 1000

You don't really need to specify nperm anymore (it even produces a warning if you do so). It's only supported for backward compatibility purposes and it results in using an older and less efficient algorithm.


Thank you for your reply. Do you know what number of permutations is applied then, if I do not specify it? Thanks again!


If you don't specify it, then fgseaMultilevel is used, which doesn't have an nperm parameter at all and can calculate arbitrarily small p-values that are not limited to 1/nperm.
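
For illustration, a call that omits nperm might look like the sketch below. It assumes a recent fgsea release is installed (one in which the `eps` argument is available) and uses the `examplePathways` and `exampleRanks` datasets bundled with the package in place of the `pathways.hallmark` and `ranks` objects from the question. The `minSize`, `maxSize`, and `eps` values are arbitrary choices for the example, not recommendations.

```r
library(fgsea)

# Example data bundled with fgsea, standing in for pathways.hallmark and ranks.
data(examplePathways)
data(exampleRanks)

set.seed(42)  # the procedure is randomized, so fix the seed for repeatability

# No nperm argument, so the multilevel procedure is used.
# eps is the lower bound on estimated p-values; eps = 0 asks for
# arbitrarily small p-values at the cost of extra run time.
fgseaRes <- fgsea(pathways = examplePathways,
                  stats    = exampleRanks,
                  minSize  = 15,
                  maxSize  = 500,
                  eps      = 0)

# Show the most significant pathways first.
head(fgseaRes[order(fgseaRes$pval), ])
```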


Thank you. Please see my comment below regarding the different results I get with 1000 permutations, with 10000 permutations, and without specifying nperm. Thanks a lot for your help!

5 weeks ago
Bine ▴ 90

I have run it with nperm = 1000, with nperm = 10000, and without specifying the number of permutations. The pathways are the same, but the p-values vary significantly. Which result can I trust?


True GSEA p-values are hard to calculate exactly, so FGSEA (and other similar programs) _estimate_ them with a certain level of accuracy. A higher nperm gives higher accuracy but takes more time; the fgseaMultilevel procedure (which is executed when you don't specify nperm) has a better accuracy-to-time tradeoff. In any case, all of these methods are non-deterministic and can give slightly different results each time you run them, so there can be inconsistencies near the selected significance threshold.
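
The effect of nperm on run-to-run variability can be illustrated with a toy Monte Carlo estimate in base R. This is a sketch under simplifying assumptions, not fgsea's algorithm: the null distribution is a standard normal, the observed statistic is fixed at 2, and `estimate_p` is a hypothetical helper.

```r
# Toy Monte Carlo p-value estimate: share of null draws at or above `observed`.
estimate_p <- function(observed, nperm) {
  mean(rnorm(nperm) >= observed)
}

set.seed(123)
observed <- 2  # true upper-tail p-value is pnorm(2, lower.tail = FALSE)

# Repeat each estimate 200 times to see how much it moves between runs.
spread_1k  <- sd(replicate(200, estimate_p(observed, 1000)))
spread_10k <- sd(replicate(200, estimate_p(observed, 10000)))

# The spread shrinks roughly with 1 / sqrt(nperm).
print(c(sd_nperm_1000 = spread_1k, sd_nperm_10000 = spread_10k))

# Fixing the seed before a run makes a randomized estimate repeatable.
set.seed(1); a <- estimate_p(observed, 1000)
set.seed(1); b <- estimate_p(observed, 1000)
print(identical(a, b))
```

By the same logic, calling set.seed() before fgsea() should make a single run repeatable, but a pathway whose estimated p-value lies close to the cutoff can still land on either side of it under a different seed.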


Ah ok that explains why the results between "not specify permutation" and 100.000 permutations are similar. Thank you very much for your reply.
