Question

Issues with pGIG : Negative probability and unexpected P value order

0

Entering edit mode

10 weeks ago

jayeshkumarsundaram ▴ 10

I fit my data to generalized inverse gamma distribution using the gamlssML() function. After fitting, I calculated the probability for some of the range of values I am interested in (4000 to 10000). Interestingly, the P value is not constantly decreasing and also it has some negative values. Is it some error associated with estimation?

gig_fit <- gamlssML(est_var_raw[,4096], family="GIG")

gig_fit$mu; gig_fit$sigma; gig_fit$nu

[1] 4627.405 [1] 0.05937216 [1] -0.4985588

pGIG(q=seq(from=4000, to=10000, by=500), mu=gig_fit$mu, sigma=gig_fit$sigma, nu=gig_fit$nu, lower.tail = FALSE, log.p = FALSE)

[1] 9.924016e-01 6.703192e-01 9.095480e-02 1.619125e-03 4.958494e-06 3.670879e-09 7.459589e-13
[8] 4.027445e-12 8.215650e-15 -1.982858e-13 6.085132e-13 -1.243450e-14 2.109424e-15

Is it possible that the estimation of P values becomes imprecise after a certain limit from the main distribution of values? What should I do if I have to estimate P value for a value beyond a certain value?

Looks like the accuracy becomes a little weird if we have values lower than 10^-10

qGIG(p=sapply(c(1:15), FUN=function(x){10^-x},) mu=gig_fit$mu, sigma=gig_fit$sigma, nu=gig_fit$nu, lower.tail = FALSE, log.p = FALSE)

[1] 4984.207 5302.570 5547.697 5757.560 5945.821 6119.189 6281.530 6435.304 6582.189 6607.697 6607.697 6607.697
[13] 6607.697 6607.697 6607.697

Ref: https://github.com/gamlss-dev/gamlss.dist/issues/6

R distribution gamlss fitting • 500 views

ADD COMMENT • link 10 weeks ago by jayeshkumarsundaram ▴ 10

0

Entering edit mode

Do you get an actual error if you set log.p=True? The (generalized) inverse-gamma is part of the exponential family. Try setting log.p=True and exponentiating if you need it in probability space rather than likelihood.

ADD REPLY • link 10 weeks ago by LChart 4.7k

0

Entering edit mode

Yes, I tried setting the log.P=True and actually got an error.

ADD REPLY • link 10 weeks ago by jayeshkumarsundaram ▴ 10

score 1 · Answer 1 · 2024-10-17

You can see what's going on here: https://github.com/gamlss-dev/gamlss.dist/blob/cfa3030a569141865bb211d626afe5fea6a0e6e4/R/GIG.R#L182

the CDF is approximated by numerical integration; and as the inverse-gaussian is normalized by a Bessel function (which is itself approximated by either integration or a series expansion) it limits the precision probably to something above machine epsilon.

This is way off topic for the forum, but it's possible that you could get around this implementation by using an un-normalized density and properties of the integral. You know that:

f(x;a,b,p) is proportional to exp{ (p-1) log(x) - (ax + b/x)/2}

and that the integral over the whole number line is (a/b)^(p/2) / 2K_p([ab]^(1/2)); so it's probably more stable to use linear properties of the definite integral to subtract the results of numerically integrating the unnormalized density from the known normalization factor.