Question

Calculating PRS from pre-computed PGS Catalog

12 months ago

user230613 ▴ 380

Hi there,

I am trying to calculate a polygenic risk score (PRS/PGS) for a given set of individuals (the input can be VCF or bgen; the format does not matter) using pre-computed scores (beta values for each variant) from the PGS Catalog. For example, say I want to calculate the PGS000001 score for my individuals using the PGS000001 weights from the PGS Catalog. My intention is only to calculate the score for these individuals, not to develop new PGS or validate existing ones.

I have two questions:

1. Should I use imputed data, or only the variants that were actually called (after germline calling)? Do I gain anything by imputing the data?
2. Should I filter the variants or individuals? For example, with filters such as missing call rate (--geno in plink2), --indep-pairwise, --hwe, or --maf. Likewise, should I discard any individuals based on some criteria? I know that when developing a PGS, the properties of the SNPs and samples must be considered strictly, but is the same true when just calculating scores using pre-calculated beta values deposited in the PGS Catalog?

Thank you!

GWAS PRS PGS • 600 views

updated 3 months ago by UnivStudent ▴ 440 • written 12 months ago by user230613 ▴ 380

score 0 · Answer 1 · 2024-08-12

You might find our tool for calculating PGS from the Catalog useful for these types of analyses (https://pgsc-calc.readthedocs.io/en/latest/ and https://github.com/pgscatalog/pgsc_calc); it automates all the variant matching and scoring steps. I think the major QC you want to do is imputation, followed by filtering out variants that are either missing or have low imputation quality. These QC steps should be done before using the genotypes within the PGS Catalog Calculator.
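To make the QC-then-score idea concrete, here is a minimal Python sketch. It is not how pgsc_calc is implemented. The function names, the toy data, and the thresholds (imputation quality of at least 0.3, missing rate of at most 0.05) are assumptions chosen for illustration. It also assumes the dosages already count the effect allele listed in the scoring file, so the variant matching that the calculator automates is not handled here, and it fills any remaining missing dosage with the variant mean, which is just one possible choice.

```python
from typing import Dict, List, Optional, Tuple

# One effect-allele dosage per individual; None marks a missing genotype.
Dosages = List[Optional[float]]


def filter_variants(genotypes: Dict[str, Dosages], info: Dict[str, float],
                    min_info: float = 0.3,
                    max_missing: float = 0.05) -> Dict[str, Dosages]:
    """Keep variants with adequate imputation quality and call rate.

    Thresholds are illustrative assumptions, not recommended values.
    """
    kept = {}
    for vid, dosages in genotypes.items():
        missing_rate = sum(d is None for d in dosages) / len(dosages)
        # Variants with no quality value are treated as failing the filter.
        if info.get(vid, 0.0) >= min_info and missing_rate <= max_missing:
            kept[vid] = dosages
    return kept


def score_individuals(genotypes: Dict[str, Dosages], weights: Dict[str, float],
                      n_individuals: int) -> Tuple[List[float], int]:
    """Return per-individual weighted dosage sums and the number of
    scoring-file variants that were found in the genotype data."""
    scores = [0.0] * n_individuals
    matched = 0
    for vid, beta in weights.items():
        dosages = genotypes.get(vid)
        if dosages is None:
            continue  # variant in the scoring file but absent after QC
        observed = [d for d in dosages if d is not None]
        if not observed:
            continue  # nothing to impute from
        mean = sum(observed) / len(observed)
        matched += 1
        for i, d in enumerate(dosages):
            # Assumption: missing dosages are replaced by the variant mean.
            scores[i] += beta * (mean if d is None else d)
    return scores, matched


# Hypothetical toy data: three individuals, four genotyped variants.
genotypes = {
    "rs1": [0.0, 1.0, 2.0],
    "rs2": [1.0, None, 1.0],  # one of three missing: fails the call-rate filter
    "rs3": [2.0, 2.0, 0.0],   # low imputation quality: fails the quality filter
    "rs4": [1.0, 0.0, 1.0],
}
info = {"rs1": 0.95, "rs2": 0.90, "rs3": 0.20, "rs4": 0.80}
weights = {"rs1": 0.5, "rs4": -0.2, "rs5": 0.1}  # rs5 is not in the genotypes

kept = filter_variants(genotypes, info)
scores, matched = score_individuals(kept, weights, n_individuals=3)
print(sorted(kept), scores, f"{matched}/{len(weights)} variants matched")
```

Reporting how many scoring-file variants were matched is worth keeping in any version of this, since a score built from only a small fraction of its variants may not be comparable to the published one.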