3.4 years ago
gabrielbaldanzi
Hi,
I am trying to run fgsea on ranked p-values, but keep getting the same error:
res <- fgsea(pathways = mylist, stats = rank(-p.val), eps = 0, scoreType = "pos",nPermSimple = 100000)
"Error in if (any(simpleFgseaRes$modeFraction < 10)) { : missing value where TRUE/FALSE needed"
I checked one hundred times already that the vector names are the same as the ones in the pathway list.
I am new to R and fgsea, but after some attempts to debug, I figured that for one pathway, I keep getting NA for results.
Browse[2]> counts[[1]][420:430,]
pathway leEs geEs leZero geZero leZeroSum geZeroSum
1: 420 99257 743 0 1e+05 0 26201.27
2: 421 1135 98865 0 1e+05 0 28324.93
3: 422 29269 70731 0 1e+05 0 32941.02
4: 423 29571 70429 0 1e+05 0 36958.29
5: 424 59529 40471 0 1e+05 0 28029.62
6: 425 NA NA NA NA NaN NaN
7: 426 83255 16745 0 1e+05 0 28093.93
8: 427 28468 71532 0 1e+05 0 32573.34
9: 428 84112 15888 0 1e+05 0 37584.64
10: 429 96643 3357 0 1e+05 0 29239.96
11: 430 497 99503 0 1e+05 0 27778.11
I believe this is the issue, but I have no idea how to solve it. I noticed that the pathway with the issue is large: 1885 genes, compared to the number of stats (1978).
Browse[2]> length(stats)
[1] 1978
I am not sure whether this could affect the results somehow. In total, there are 500 pathways in the list.
Any help is very much appreciated. =)
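The checks described above (names matching, pathway size relative to the number of stats) can be sketched in base R. This is only an illustration under assumptions: p.val and mylist below are small toy stand-ins for the objects in the question, and the gene names are made up.

```r
# Toy data standing in for the poster's objects (hypothetical values).
set.seed(1)
p.val <- setNames(runif(20), paste0("gene", 1:20))
mylist <- list(
  small  = paste0("gene", 1:5),
  large  = paste0("gene", 1:19),   # covers almost every ranked gene
  absent = c("geneX", "geneY")     # no overlap with names(p.val)
)

stats <- rank(-p.val)  # rank() keeps the names of its input

# Basic sanity checks on the statistics vector.
checks <- c(
  has_names    = !is.null(names(stats)),
  no_missing   = !anyNA(stats),
  all_finite   = all(is.finite(stats)),
  unique_names = !anyDuplicated(names(stats))
)
print(checks)

# Per-pathway overlap with the ranked genes.
overlap <- vapply(
  mylist,
  function(p) length(intersect(p, names(stats))),
  integer(1)
)
summary_df <- data.frame(
  pathway  = names(mylist),
  listed   = lengths(mylist),
  overlap  = overlap,
  fraction = overlap / length(stats)
)
print(summary_df[order(-summary_df$overlap), ])
```

A pathway whose fraction is close to 1, like the 1885-of-1978 case in the question, stands out immediately in this table.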
The value for stats should not be the actual rank indices, but, technically, you could still use these [the rank indices]. The stats should just be whatever metric you wish to use to represent enrichment. Ones that we may use are:
Why only 1978 genes? GSEA is supposed to be performed using all genes.
Irrespective, 100000 permutations is not necessary, and this could also be causing the error indirectly as a result of the large pathway / signature and your relatively low number of genes. Also, please first try without setting any value for eps, i.e., leave it at the default, and also for scoreType.
The GSEA developer usually responds quickly on the Bioconductor support site, so, if my advice does not solve the problem, you could try there.
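As a rough illustration of using a metric instead of rank indices, here is a minimal base R sketch that builds a named stats vector from -log10(p-values), the transformation suggested further down this thread. The gene names and p-values are hypothetical, and the guard against exact zeros is just one possible choice, not an fgsea recommendation.

```r
# Hypothetical p-values; one exact zero to show the edge case.
p.val <- c(geneA = 0.04, geneB = 1e-6, geneC = 0, geneD = 0.5)

# A plain transform turns p = 0 into Inf, which is not a usable statistic.
raw <- -log10(p.val)
print(raw)

# One possible guard (an assumption): cap zeros at the smallest
# positive double before transforming, so every value stays finite.
p.safe <- pmax(p.val, .Machine$double.xmin)
stats <- sort(-log10(p.safe), decreasing = TRUE)
print(stats)
```

Because these values are all positive, a one-sided score type is the natural match, which is consistent with the scoreType = "pos" setting in the original call.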
Hi Kevin,
Thank you for your suggestions. I tried to set eps and scoreType to default, but unfortunately that did not solve it. Setting scoreType to default generated the following warning:
I also tried to reduce the number of permutations to 10 000 and to 1 000, but still got the same error.
Gabriel
The message about scoreType is due to the fact that your values for stats are all positive - they are just whole integer values representing rank. Can you try -log10(p-values) or one of the other potential values (which I listed above) for stats?
Thank you again, Kevin. I tried using -log10(p.val) and other values, and still got the same error. I followed a suggestion to set maxSize=500 and it worked. So it is probably something related to the large pathways. And although setting maxSize circumvented the issue, I believe it places important limitations on the results.
I will have to look more into the data, and if I can't find anything, I'll report it on GitHub as suggested here. Thanks.
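To see which pathways a maxSize cutoff would drop, a small base R preview like the one below may help. It assumes, as I understand fgsea's behaviour, that pathway size is counted after intersecting each pathway with names(stats); the data are toy values and maxSize is set to 5 here only so the example stays small (the thread used 500).

```r
# Toy inputs (hypothetical); stats is a named numeric vector.
stats <- setNames(seq(10, 1), paste0("gene", 1:10))
pathways <- list(
  small = paste0("gene", 1:3),
  large = paste0("gene", 1:9)
)

minSize <- 1
maxSize <- 5  # the thread used 500; 5 keeps this example small

# Size of each pathway after restricting it to the ranked genes.
sizes <- vapply(
  pathways,
  function(p) sum(unique(p) %in% names(stats)),
  integer(1)
)

kept    <- names(sizes)[sizes >= minSize & sizes <= maxSize]
dropped <- setdiff(names(sizes), kept)

print(sizes)
cat("kept:", kept, "\n")
cat("dropped:", dropped, "\n")
```

Listing the dropped pathways makes the limitation concrete: they are not tested at all, so they get no p-value.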
Please report it anyway. As fgsea developer I can say that this is not an expected behavior, but we can't look into it further unless we have a reproducible example.
Please, provide your data to reproduce the error and report it to https://github.com/ctlab/fgsea/issues
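One way to package the inputs for such a report is sketched below in base R. The object names, the toy values, and the file name fgsea_repro.rds are assumptions for illustration; in practice the bundle would hold the real pathway list and stats vector that trigger the error.

```r
# Stand-ins for the real objects (hypothetical values).
stats <- setNames(c(3, 2, 1), c("geneA", "geneB", "geneC"))
pathways <- list(p1 = c("geneA", "geneB"))

# Save both inputs in a single file that can be attached to the issue.
bundle <- list(pathways = pathways, stats = stats)
path <- file.path(tempdir(), "fgsea_repro.rds")
saveRDS(bundle, path)

# Confirm the file restores exactly what was saved.
restored <- readRDS(path)
stopifnot(identical(restored, bundle))

# Package and R versions help the developers match the environment.
print(sessionInfo())
```

Including the exact fgsea() call from the question alongside the file lets the developers rerun it unchanged.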