I encourage you to follow up on my points here with a statistician in your department. Always better to get more brains looking at the same thing. There are various ways to calculate confidence intervals for series of data.
You could start with a typical formula for the standard error of the mean (SE mean), which is the standard deviation divided by the square root of the sample n:
σ / sqrt (n)
Lambda is a median value, so, we have to convert the SE mean to SE median:
1.253 * (σ / sqrt (n))
(various articles on the WWW validate this)
As you are dealing with a Chi-squared quantile distribution, in order to calculate the confidence intervals, I believe you can do this with the following:
pvalue <- runif(1000000, min=0, max=1)
chisq <- qchisq(1 - pvalue, 1)
lambda <- median(chisq) / qchisq(0.5, 1)
lambda
[1] 0.9963626
SE.median <- qchisq(0.975, 1) * (1.253 * ( sd(chisq) / sqrt( length(chisq) ) ) )
SE.median
[1] 0.008914307
lambda + (SE.median / qchisq(0.5, 1))
[1] 1.015957
lambda - (SE.median / qchisq(0.5, 1))
[1] 0.976768
I increased the number of random p-values to 1 million just to make the confidence intervals tighter. At 1000 (your original post), they were pretty wide.
Take this as a first step into this. I encourage you to look up all functions / formulae that I have used.
Kevin
Thank you Kevin! I had no idea about the 1.253 to have a median from a mean. I am wondering if this is what is normally done in GWAS studies (or whether maybe Bootstrapping is more appropriate)? It seems reasonable to be able to use this approximation since we would expect a large sample size as you have (1000000).
I will look up the stats as suggested. If there are other opinions about what is normally done for GWAS please share.
Thank you
Please use
ADD REPLY/ADD COMMENT
when responding to existing posts to keep threads logically organized. This comment ideally belongs under @Kevin's answer.There is no standard way of doing this, from what I could / can see. Hence the disclaimers at the start and end of my post. I presume that you were asked to do this by a reviewer (possibly a statistician) for a manuscript? Bootstrapping it to obtain bootstrapped CIs would be a good idea. That can be as easy as random sampling the p values and then repeating 250x.