Question

Estimating per-variant heritability from summary stats

1

4 months ago

Daniel ▴ 10

Hello Biostars!

I want to look at the relationship between per-variant heritability for a GWAS trait and another genomic feature (e.g. the number of features at a locus; the specific feature doesn't really matter).

However, the summary statistics I obtained from UKB don't report these per-variant heritabilities. I have the chromosome, position, effect and non-effect alleles, p-values, allele frequencies, beta, SE, and sample size for all variants.

So I'm wondering if there's a simple equation, or some bioinformatics software, that takes summary statistics and estimates per-variant heritabilities. I've googled a lot and asked ChatGPT as well, but I only seem to find annotation-enrichment tools like LDSC. When I asked on LDSC's GitHub, they said you can maybe estimate it if you have the phenotypic variance of the trait, but I don't quite know what this means (I think if the phenotype is standardized, the variance should equal one?).

Is this possible? Or is this something for which I'd need individual genotype info?

Variants SNP GWAS • 633 views

ADD COMMENT • link updated 4 months ago by LauferVA 4.5k • written 4 months ago by Daniel ▴ 10

0

Please do not use bioinformatics as a tag unless your post is about the field of bioinformatics itself. For proper examples, please see Forum and News type posts under https://www.biostars.org/tag/bioinformatics/

I've removed the tag this time but please be more mindful in the future.

ADD REPLY • link 4 months ago by Ram 44k

score 0 · Answer 1 · 2024-07-12

0

4 months ago

Sam ★ 4.8k

Given the information, you can estimate the per-variant heritability as

t = beta / se

r = t / sqrt(n - 2 + t^2)

per_snp h2 = r^2

Note that this is only valid at the per-variant level: because of LD, adding up the per-SNP h2 values does not give you the total h2.
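As a rough illustration, the three steps above can be written as a small Python function. This is a minimal sketch, assuming beta and se come from a marginal linear regression on a quantitative trait; the function name `per_snp_r2` and the input values are hypothetical, chosen only for the example.

```python
import math


def per_snp_r2(beta: float, se: float, n: int) -> float:
    """Per-variant variance explained (r^2) from beta, SE and sample size.

    Follows the steps in the answer: t = beta / se,
    r = t / sqrt(n - 2 + t^2), per-SNP h2 = r^2.
    """
    t = beta / se                      # t-statistic of the marginal association
    r = t / math.sqrt(n - 2 + t * t)   # t converted to a correlation coefficient
    return r * r                       # squared correlation = variance explained


# Hypothetical summary-statistic row (illustrative values, not real UKB data)
h2_snp = per_snp_r2(beta=0.02, se=0.004, n=100_000)
print(f"per-SNP h2 = {h2_snp:.6f}")
```

Applied row by row to a summary-statistics table, this gives one value per variant. As noted, the values should not be summed across variants in LD.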

ADD COMMENT • link 4 months ago by Sam ★ 4.8k

score 0 · Answer 2 · 2024-07-13

I'm uncertain what Sam means by the above (I'm sure you are correct, Sam, I just don't understand what you mean).

Generally you want:

[Image: per-SNP heritability equation]

A simplified form that assumes the second term in the denominator is negligible is as follows:

[Image: simplified per-SNP heritability equation]

If that assumption holds, the denominator is nearly equal to 1. Because the simplified form drops that term, it will always give you a slight overestimate of heritability, but if the SNV in question accounts for only a tiny fraction of the variance, it is fine for a back-of-the-envelope calculation. As above, p is the allele frequency, beta is the effect size estimate, and N is the sample size of the GWA study.
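For a concrete back-of-the-envelope version, here is a minimal Python sketch. It assumes the simplified form is the standard additive-model expression, 2p(1-p)beta^2 divided by the phenotypic variance, which equals 1 for a standardized trait. That is an assumption on my part, and the function name `approx_per_snp_h2` and the `var_y` parameter are illustrative, not taken from LDSC or GCTA.

```python
def approx_per_snp_h2(beta: float, p: float, var_y: float = 1.0) -> float:
    """Approximate variance explained by one biallelic SNP.

    Assumes an additive model and Hardy-Weinberg equilibrium, so the
    genotype variance is 2p(1-p). var_y is the phenotypic variance
    (1.0 if the trait was standardized before the GWAS).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency p must be between 0 and 1")
    if var_y <= 0.0:
        raise ValueError("phenotypic variance must be positive")
    genotype_variance = 2.0 * p * (1.0 - p)
    return genotype_variance * beta ** 2 / var_y


# Hypothetical variant: effect allele frequency 0.5, beta 0.1 on a standardized trait
print(approx_per_snp_h2(beta=0.1, p=0.5))
```

If the trait was not standardized, `var_y` has to be supplied, which is presumably what the LDSC developers meant by needing the phenotypic variance.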

These types of calculations are used in LDSC, GCTA, etc.