Question

How to calculate effect (beta) and SE from z-score and p-value?

1

7.0 years ago

anthouliouslive ▴ 10

Hello,

I have a table containing SNPs and their z-score and p-values. I want to calculate their effect and SE.

How can I do this? Is there any code I can use to do this in R?

Thank you in advance.

SNP R • 35k views

ADD COMMENT • link updated 2.7 years ago by Olliepain ▴ 50 • written 7.0 years ago by anthouliouslive ▴ 10

0

Related post, but with no answer:

Caclulate effect estimates and SE from Z scores

ADD REPLY • link 7.0 years ago by zx8754 12k

score 1 · Answer 1 · 2019-07-31

This assumes the z-score comes from a simple linear regression in which both Y and X have been standardized (mean zero, unit variance).

Let Var(Y|X) be the residual variance under the linear regression and N the sample size. Then:

Var(Y|X) = 1/(1 + (Zscore*Zscore)/N)
Standardized beta = Zscore*sqrt(Var(Y|X)/N)
Var(beta) = Var(Y|X)/N, so SE(beta) = sqrt(Var(Y|X)/N)
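
For readers who want to see where these formulas come from, here is a short derivation sketch. It assumes standardized Y and X, so that the fitted slope equals the sample correlation r, and it ignores the degrees-of-freedom correction (N versus N - 2), which is why the relations are written as approximations.

```latex
\begin{align*}
\hat\beta &= r, \qquad
\sigma^2 = \operatorname{Var}(Y \mid X) \approx 1 - \hat\beta^2, \qquad
\operatorname{SE}(\hat\beta) \approx \sqrt{\sigma^2 / N} \\
Z^2 &= \frac{\hat\beta^2}{\operatorname{SE}(\hat\beta)^2}
      \approx \frac{N\,(1 - \sigma^2)}{\sigma^2}
      \;\Longrightarrow\;
      \sigma^2 \approx \frac{1}{1 + Z^2/N} = \frac{N}{N + Z^2} \\
\hat\beta &= Z \sqrt{\sigma^2 / N} \approx \frac{Z}{\sqrt{N + Z^2}}, \qquad
\operatorname{SE}(\hat\beta) \approx \frac{1}{\sqrt{N + Z^2}}
\end{align*}
```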

Here's the code in R that verifies this:

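# Simulate 1e4 data sets of size n and compare lm() output with the formula
n <- 1e3
re <- lapply(1:1e4, function(u){
  x <- rnorm(n, 0, 2)
  y <- .6*x + rnorm(n)

  # Fit on standardized y and x; row 2 of the coefficient table is the slope
  ft <- summary(lm(scale(y, scale = TRUE) ~ scale(x, scale = TRUE)))
  t_stat <- ft$coefficients[2,3]
  beta_o <- ft$coefficients[2,1]
  se_beta_o <- ft$coefficients[2,2]

  # Recover residual variance, beta and SE from the test statistic and n only
  sigma_sqrd <- 1/(1+(t_stat^2/n))
  beta_est <- t_stat*sqrt(sigma_sqrd/n)
  se_beta_est <- sqrt(sigma_sqrd/n)

  data.table::data.table(beta = beta_o, se_beta = se_beta_o, t_stat = t_stat,
             betahat = beta_est, se_betahat = se_beta_est, t_stat_est = beta_est/se_beta_est)
})

# Stack the per-simulation rows into one table
re <- data.table::rbindlist(re)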


plot(density(re$beta), col = "red", lwd = 1)
lines(density(re$betahat), col = "blue", lwd = 1)

plot(density(re$se_beta), col = "red", lwd = 1)
lines(density(re$se_betahat), col = "blue", lwd = 1)

plot(density(re$t_stat), col = "red", lwd = 1)
lines(density(re$t_stat_est), col = "blue", lwd = 1)

Here's a snapshot of what I ran. beta, se_beta, and t_stat are the truth (taken from lm); betahat, se_betahat, and t_stat_est are the estimates based on the formula above.

      beta    se_beta   t_stat   betahat se_betahat t_stat_est
1: 0.7634574 0.02044428 37.34332 0.7631385 0.02043574   37.34332
2: 0.7539762 0.02079386 36.25955 0.7536504 0.02078488   36.25955
3: 0.7716880 0.02013227 38.33090 0.7713754 0.02012411   38.33090
4: 0.7674729 0.02029308 37.81944 0.7671570 0.02028473   37.81944
5: 0.7703690 0.02018282 38.16953 0.7700553 0.02017461   38.16953
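
The estimates in the snapshot match lm() to about three decimal places but not exactly. A plausible explanation, which the sketch below checks on a single simulated data set, is that lm() computes the standard error with the residual degrees of freedom N - 2, not N. The variable names and the seed are illustrative and not part of the original answer.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
n <- 1e3
x <- rnorm(n, 0, 2)
y <- 0.6 * x + rnorm(n)

# Standardize both variables and fit the simple regression
ys <- as.numeric(scale(y))
xs <- as.numeric(scale(x))
co <- summary(lm(ys ~ xs))$coefficients
t_stat <- co[2, 3]

# Formula from the answer (uses n)
beta_n <- t_stat / sqrt(n + t_stat^2)

# Same formula with the residual degrees of freedom n - 2
beta_df <- t_stat / sqrt(n - 2 + t_stat^2)
se_df   <- 1 / sqrt(n - 2 + t_stat^2)

# Compare the slope from lm() with both reconstructions
c(lm = co[2, 1], with_n = beta_n, with_n_minus_2 = beta_df)
```

For GWAS-sized N the difference between N and N - 2 is negligible, so the simpler form with N is usually fine.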

score 0 · Answer 2 · 2019-12-30

0

5.4 years ago

atlas.akhan • 0

You can use these equations:

Beta = z / sqrt(2p(1 - p)(n + z^2))

SE = 1 / sqrt(2p(1 - p)(n + z^2))

where p is the allele frequency of the (imputed) SNP and n is the sample size. If p is not reported, you could estimate it from a reference panel. For reference, please go to

https://images.nature.com/full/nature-assets/ng/journal/v48/n5/extref/ng.3538-S1.pdf
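
As a minimal sketch, the two equations above could be applied to a whole table like this. The function name `z_to_beta_se`, the column names, and the example values are all made up for illustration; they do not come from the paper or from any package.

```r
# Hypothetical helper: convert z-scores to approximate effect sizes and SEs.
#   z    : z-score of the SNP
#   freq : allele frequency of the effect allele (p in the equations)
#   n    : sample size
z_to_beta_se <- function(z, freq, n) {
  stopifnot(all(freq > 0 & freq < 1), all(n > 0))
  denom <- sqrt(2 * freq * (1 - freq) * (n + z^2))
  data.frame(beta = z / denom, se = 1 / denom)
}

# Made-up example values, for illustration only
gwas <- data.frame(z    = c(-2.5, 0.3, 5.1),
                   freq = c(0.12, 0.45, 0.30),
                   n    = c(10000, 10000, 9500))
gwas <- cbind(gwas, z_to_beta_se(gwas$z, gwas$freq, gwas$n))

# Sanity check: beta / se should give back the original z-score
all.equal(gwas$beta / gwas$se, gwas$z)
```

Because beta and SE share the same denominator, their ratio always reproduces z, which makes a quick check against a misplaced reciprocal.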

ADD COMMENT • link 5.4 years ago by atlas.akhan • 0

0

Link is broken.

ADD REPLY • link 5.1 years ago by Kevin Blighe 89k

0

Link works fine for me. This is the paper:

https://www.nature.com/articles/ng.3538

ADD REPLY • link 5.1 years ago by zx8754 12k

1

Hmm.... it magically works today.

ADD REPLY • link 5.1 years ago by Kevin Blighe 89k

0

Probably a silly question but better be safe than sorry - That "n" is the effective number of samples in the meta-analysis, right?

ADD REPLY • link 3.9 years ago by rodd ▴ 250

zx8754 · Answer 3 · 2021-11-24

0

3.5 years ago

paloma.jorda.b • 0

Hi,

First of all many thanks for your previous answers. I am using the provided formula:

gwas$se <- sqrt((2*gwas$freq)*(1-(gwas$freq))*(gwas$n+(gwas$z^2)))

Where: freq is my effect allele frequency; n the total sample size, and z my z score.

But for some reason the standard errors I obtain are far too high:

summary(gwas$se)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00   20.29   26.55   24.35   29.53   52.54

Does anyone have any suggestions as to why I get these results?

Thank you very much!

ADD COMMENT • link updated 3.5 years ago by zx8754 12k • written 3.5 years ago by paloma.jorda.b • 0

2

It should be 1 divided by that expression, i.e.:

gwas$se <- 1/sqrt((2*gwas$freq)*(1-(gwas$freq))*(gwas$n+(gwas$z^2)))

ADD REPLY • link 2.9 years ago by Olliepain ▴ 50

0

Hi @Olliepain - Can this formula be used for both continuous and categorical traits? (i.e., N will correspond to all individuals included in study)

ADD REPLY • link 2.7 years ago by rodd ▴ 250

2

Yes, use the effective sample size, which for binary outcomes is Neff = 4/(1/Ncases+1/Nctrls).
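
A small sketch of that effective sample size calculation, with a hypothetical helper name and made-up case/control counts:

```r
# Hypothetical helper: effective sample size for a case-control study,
# Neff = 4 / (1/Ncases + 1/Nctrls)
effective_n <- function(n_cases, n_controls) {
  4 / (1 / n_cases + 1 / n_controls)
}

effective_n(5000, 5000)  # balanced design: equals the total of 10000
effective_n(1000, 9000)  # unbalanced design: 3600, well below the total
```

The resulting value would then take the place of n in the beta and SE formulas.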