Hello,
I have a table containing SNPs and their z-score and p-values. I want to calculate their effect and SE.
How can I do this? Is there any code I can use to do this in R?
Thank you in advance.
Assuming the z-scores come from a simple linear regression, the formulas are as follows.
Var(Y|X) is the residual variance under the linear regression and N is the sample size.
Standardized beta (i.e., assuming both Y and X are transformed to have mean zero and unit variance) = Zscore * sqrt(Var(Y|X)/N)
Var(Y|X) = 1/(1 + (Zscore*Zscore)/N)
Var(beta) = Var(Y|X)/N, so SE(beta) = sqrt(Var(Y|X)/N)
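To apply these formulas directly to a table of z-scores, something like the following could work. This is only a minimal sketch: the function name `z_to_std_beta` and its argument names are made up for illustration, and the result is only meaningful under the assumptions stated above (simple linear regression, Y and X both standardized).

```r
# Approximate standardized beta and SE from a z-score and the sample size.
# Hypothetical helper; assumes simple linear regression with Y and X scaled
# to mean zero and unit variance.
z_to_std_beta <- function(z, n) {
  resid_var <- 1 / (1 + z^2 / n)   # Var(Y|X)
  se <- sqrt(resid_var / n)        # sqrt(Var(beta))
  data.frame(beta = z * se, se = se)
}

# Vectorized over SNPs: one z-score and one sample size per row.
z_to_std_beta(z = c(37.34332, -2.5), n = c(1000, 50000))
```

By construction beta / se gives back the original z-score, so the sign of the effect follows the sign of z.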
Here's the code in R that verifies this:
# Requires the data.table package.
n <- 1e3  # sample size of each simulated data set
re <- lapply(1:1e4, function(u) {
  x <- rnorm(n, 0, 2)
  y <- .6 * x + rnorm(n)
  # Fit the regression on standardized y and x
  ft <- summary(lm(scale(y, scale = TRUE) ~ scale(x, scale = TRUE)))
  t_stat <- ft$coefficients[2, 3]     # observed t statistic
  beta_o <- ft$coefficients[2, 1]     # observed standardized beta
  se_beta_o <- ft$coefficients[2, 2]  # observed standard error
  # Recover beta and SE from the t statistic and n alone
  sigma_sqrd <- 1 / (1 + (t_stat^2 / n))
  beta_est <- t_stat * sqrt(sigma_sqrd / n)
  se_beta_est <- sqrt(sigma_sqrd / n)
  data.table::data.table(beta = beta_o, se_beta = se_beta_o, t_stat = t_stat,
    betahat = beta_est, se_betahat = se_beta_est, t_stat_est = beta_est / se_beta_est)
})
re <- data.table::rbindlist(re)
# Overlay the observed (red) and formula-based (blue) densities
plot(density(re$beta), col = "red", lwd = 1)
lines(density(re$betahat), col = "blue", lwd = 1)
plot(density(re$se_beta), col = "red", lwd = 1)
lines(density(re$se_betahat), col = "blue", lwd = 1)
plot(density(re$t_stat), col = "red", lwd = 1)
lines(density(re$t_stat_est), col = "blue", lwd = 1)
Here's a snapshot of what I got when I ran it. beta, se_beta, and t_stat are the truth; betahat, se_betahat, and t_stat_est are the estimates based on the formula above.
        beta    se_beta   t_stat   betahat se_betahat t_stat_est
1: 0.7634574 0.02044428 37.34332 0.7631385 0.02043574   37.34332
2: 0.7539762 0.02079386 36.25955 0.7536504 0.02078488   36.25955
3: 0.7716880 0.02013227 38.33090 0.7713754 0.02012411   38.33090
4: 0.7674729 0.02029308 37.81944 0.7671570 0.02028473   37.81944
5: 0.7703690 0.02018282 38.16953 0.7700553 0.02017461   38.16953
You can use these equations:
Beta = z / sqrt(2p(1 - p)(n + z^2)) and
SE = 1 / sqrt(2p(1 - p)(n + z^2))
where p is the frequency of the imputed SNP, z is the z-score, and n is the sample size. You could use a reference panel to calculate p. For reference, please go to
https://images.nature.com/full/nature-assets/ng/journal/v48/n5/extref/ng.3538-S1.pdf
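In R, the two equations could be wrapped up roughly like this. It is a sketch rather than a tested pipeline: the function name `z_to_beta_se` and its argument names are my own, and it assumes p is strictly between 0 and 1 (a monomorphic SNP makes the denominator zero).

```r
# Beta and SE from a z-score, allele frequency p, and sample size n,
# following the two equations quoted above. Hypothetical helper name.
z_to_beta_se <- function(z, p, n) {
  denom <- sqrt(2 * p * (1 - p) * (n + z^2))
  data.frame(beta = z / denom, se = 1 / denom)
}

# Example call with illustrative values (not real data).
z_to_beta_se(z = 2, p = 0.5, n = 96)
```

Because both quantities share the same denominator, beta / se always equals z, which is a quick sanity check on the output.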
Hi,
First of all many thanks for your previous answers. I am using the provided formula:
gwas$se <- sqrt((2*gwas$freq)*(1-(gwas$freq))*(gwas$n+(gwas$z^2)))
Where: freq is my effect allele frequency; n the total sample size, and z my z score.
But for some reason I obtain standard error values that are far too high:
summary(gwas$se)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00   20.29   26.55   24.35   29.53   52.54
Does anyone have any suggestions as to why I get these results?
Thank you very much!
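One likely cause, comparing the line above with SE = 1 / sqrt(2p(1 - p)(n + z^2)): the posted expression computes only the denominator and leaves out the reciprocal, so the values are 1/SE rather than SE. A minimal sketch of the adjusted calculation follows. The column names come from the post, but the toy values are made up purely for illustration. The minimum of 0.00 in the summary also suggests that some rows may have freq equal to 0 or 1; those rows would give an infinite SE and may need filtering first.

```r
# Toy data using the column names from the post (values are invented).
gwas <- data.frame(freq = c(0.10, 0.30, 0.50),
                   n    = c(5000, 5000, 5000),
                   z    = c(1.2, -2.5, 4.0))

# Shared denominator from the quoted equations.
denom <- sqrt(2 * gwas$freq * (1 - gwas$freq) * (gwas$n + gwas$z^2))

gwas$se   <- 1 / denom        # the reciprocal is what the posted line omits
gwas$beta <- gwas$z / denom

summary(gwas$se)
```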
Related post, but with no answer:
Caclulate effect estimates and SE from Z scores