Issues with pGIG : Negative probability and unexpected P value order
1
0
Entering edit mode
10 weeks ago

I fit my data to generalized inverse gamma distribution using the gamlssML() function. After fitting, I calculated the probability for some of the range of values I am interested in (4000 to 10000). Interestingly, the P value is not constantly decreasing and also it has some negative values. Is it some error associated with estimation?

gig_fit <- gamlssML(est_var_raw[,4096], family="GIG")

gig_fit$mu; gig_fit$sigma; gig_fit$nu

[1] 4627.405 [1] 0.05937216 [1] -0.4985588

pGIG(q=seq(from=4000, to=10000, by=500), mu=gig_fit$mu, sigma=gig_fit$sigma, nu=gig_fit$nu, lower.tail = FALSE, log.p = FALSE)

[1] 9.924016e-01 6.703192e-01 9.095480e-02 1.619125e-03 4.958494e-06 3.670879e-09 7.459589e-13
[8] 4.027445e-12 8.215650e-15 -1.982858e-13 6.085132e-13 -1.243450e-14 2.109424e-15

Is it possible that the estimation of P values becomes imprecise after a certain limit from the main distribution of values? What should I do if I have to estimate P value for a value beyond a certain value?

Looks like the accuracy becomes a little weird if we have values lower than 10^-10

qGIG(p=sapply(c(1:15), FUN=function(x){10^-x},) mu=gig_fit$mu, sigma=gig_fit$sigma, nu=gig_fit$nu, lower.tail = FALSE, log.p = FALSE)

[1] 4984.207 5302.570 5547.697 5757.560 5945.821 6119.189 6281.530 6435.304 6582.189 6607.697 6607.697 6607.697
[13] 6607.697 6607.697 6607.697

Ref: https://github.com/gamlss-dev/gamlss.dist/issues/6

R distribution gamlss fitting • 500 views
ADD COMMENT
0
Entering edit mode

Do you get an actual error if you set log.p=True? The (generalized) inverse-gamma is part of the exponential family. Try setting log.p=True and exponentiating if you need it in probability space rather than likelihood.

ADD REPLY
0
Entering edit mode

Yes, I tried setting the log.P=True and actually got an error.

ADD REPLY
1
Entering edit mode
10 weeks ago
LChart 4.7k

You can see what's going on here: https://github.com/gamlss-dev/gamlss.dist/blob/cfa3030a569141865bb211d626afe5fea6a0e6e4/R/GIG.R#L182

the CDF is approximated by numerical integration; and as the inverse-gaussian is normalized by a Bessel function (which is itself approximated by either integration or a series expansion) it limits the precision probably to something above machine epsilon.

This is way off topic for the forum, but it's possible that you could get around this implementation by using an un-normalized density and properties of the integral. You know that:

f(x;a,b,p) is proportional to exp{ (p-1) log(x) - (ax + b/x)/2}

and that the integral over the whole number line is (a/b)^(p/2) / 2K_p([ab]^(1/2)); so it's probably more stable to use linear properties of the definite integral to subtract the results of numerically integrating the unnormalized density from the known normalization factor.

ADD COMMENT
1
Entering edit mode

Thanks LChart. I decided to change the limits of the integral to my needs to get accurate values as follows:

for(q in c(seq(from=4000, to=15000, by=1000))){ print(integrate(function(x){dGIG(x, mu = 1, sigma = gig_fit$sigma, nu = gig_fit$nu)}, q/gig_fit$mu, Inf)$value) } [1] 0.9924016 [1] 0.0909548 [1] 4.958494e-06 [1] 8.700593e-13 [1] 3.647163e-21 [1] 1.231157e-30 [1] 7.063602e-41 [1] 1.11342e-51 [1] 6.652658e-63 [1] 1.884094e-74 [1] 2.967859e-86 [1] 2.924244e-98

ADD REPLY

Login before adding your answer.

Traffic: 1593 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6