Question

PRS from a dataset on which the GWAS is based

1

3.7 years ago

Jenish ▴ 20

I am using the PRSice tool to calculate PRS for a dataset (d1) using a set of weights from the PGS Catalog. Although the weights were generated from a different cohort (d2), it appears that the original GWAS was based on the same dataset (d1) that I'm using. Does this inflate the apparent predictive performance of the PRS in d1? Should we even use the PRS in that case? (Still very new to the concept of PRS.)

PRSice Catalog PGS PRS • 1.4k views

updated 3.7 years ago by Sam ★ 4.8k • written 3.7 years ago by Jenish ▴ 20

score 2 · Accepted Answer · 2021-09-15

2

3.7 years ago

Sam ★ 4.8k

Unfortunately, if a data set was used to produce the GWAS estimates, you can't also use it as the target in the PRS analysis. If you do, you will get a highly inflated R2 and significance that cannot be replicated in other data sets.
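To make the inflation concrete, here is a minimal toy simulation, not a real GWAS/PRS pipeline. It assumes independent SNPs, a phenotype that is pure noise (no true genetic effect), and an unclumped, unthresholded score; every name and parameter below is made up for illustration. Because there is no signal, any R2 in the overlapping sample comes purely from overfitting.

```python
import random

def simulate_genotypes(n_samples, n_snps, rng, maf=0.3):
    """Draw independent 0/1/2 allele counts for each sample (toy data)."""
    return [
        [(rng.random() < maf) + (rng.random() < maf) for _ in range(n_snps)]
        for _ in range(n_samples)
    ]

def marginal_effects(genotypes, phenotype):
    """Per-SNP least-squares slope of phenotype on genotype (a toy 'GWAS')."""
    mean_y = sum(phenotype) / len(phenotype)
    betas = []
    for j in range(len(genotypes[0])):
        g = [row[j] for row in genotypes]
        mean_g = sum(g) / len(g)
        cov = sum((gi - mean_g) * (yi - mean_y) for gi, yi in zip(g, phenotype))
        var = sum((gi - mean_g) ** 2 for gi in g)
        betas.append(cov / var if var > 0 else 0.0)
    return betas

def polygenic_scores(genotypes, betas):
    """Weighted sum of allele counts for each sample."""
    return [sum(b * g for b, g in zip(betas, row)) for row in genotypes]

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

rng = random.Random(42)
n_samples, n_snps = 200, 200

# "GWAS" sample: phenotype is pure noise, so the true R2 is zero.
gwas_geno = simulate_genotypes(n_samples, n_snps, rng)
gwas_pheno = [rng.gauss(0, 1) for _ in range(n_samples)]

# Independent target sample drawn the same way.
indep_geno = simulate_genotypes(n_samples, n_snps, rng)
indep_pheno = [rng.gauss(0, 1) for _ in range(n_samples)]

betas = marginal_effects(gwas_geno, gwas_pheno)
r2_overlap = r_squared(polygenic_scores(gwas_geno, betas), gwas_pheno)
r2_indep = r_squared(polygenic_scores(indep_geno, betas), indep_pheno)

print(f"R2 when target == GWAS sample: {r2_overlap:.3f}")
print(f"R2 in independent target:      {r2_indep:.3f}")
```

In this setup the score "predicts" the phenotype well in the sample the weights were estimated from and close to nothing in the independent sample, which is the pattern described above.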

Do you mean that d1 is part of d2? If d1 is a subset of d2, one option is the inverse meta-analysis approach, which aims to remove d1's contribution from the d2 summary statistics. You should be able to find the equation here: https://www.biorxiv.org/content/10.1101/203257v1.full.pdf
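As a rough sketch of what "inverse meta-analysis" means per SNP: assuming d2's summary statistics came from a fixed-effects inverse-variance-weighted meta-analysis and you have d1's own per-SNP effect and standard error on the same allele and scale, the weights can be subtracted back out. This is the generic algebra under that assumption and may not match the exact equation in the linked preprint; the function name is hypothetical.

```python
import math

def subtract_cohort(beta_meta, se_meta, beta_sub, se_sub):
    """Remove one cohort from an inverse-variance-weighted meta-analysis.

    Assumes beta_meta = (w_sub*beta_sub + w_rest*beta_rest) / (w_sub + w_rest)
    with weights w = 1 / se**2, and solves for the remaining cohorts.
    """
    w_meta = 1.0 / se_meta ** 2
    w_sub = 1.0 / se_sub ** 2
    w_rest = w_meta - w_sub
    if w_rest <= 0:
        # The subset cannot carry more information than the full meta-analysis.
        raise ValueError("subset weight is not smaller than meta-analysis weight")
    beta_rest = (w_meta * beta_meta - w_sub * beta_sub) / w_rest
    se_rest = math.sqrt(1.0 / w_rest)
    return beta_rest, se_rest

# Round-trip check with made-up numbers: meta-analyse two cohorts,
# then subtract the first and recover the second.
b1, se1 = 0.10, 0.05
b2, se2 = 0.20, 0.04
w1, w2 = 1 / se1 ** 2, 1 / se2 ** 2
beta_meta = (w1 * b1 + w2 * b2) / (w1 + w2)
se_meta = math.sqrt(1 / (w1 + w2))

beta_rest, se_rest = subtract_cohort(beta_meta, se_meta, b1, se1)
print(beta_rest, se_rest)
```

This would need to be applied SNP by SNP after aligning effect alleles, and it only holds if the original meta-analysis really used inverse-variance weights without additional corrections.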

0

It appears that d1 and d2 are the same dataset. But if the GWAS estimates were based on a subset x of d1, and we included only those subjects from d1 that were not in x, would the PRS analysis then be fine?

3.7 years ago by Jenish ▴ 20

1

Rule of thumb: do not include anyone who was included in the GWAS. So if half of d1 was used for the GWAS, you can still use the remaining half for the PRS analysis. You should also remove any samples that are related to anyone included in the GWAS.
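A minimal sketch of that filtering step, assuming you already have the list of GWAS sample IDs and a list of related pairs from whatever kinship tool and threshold you use (the sample IDs, the pair list, and the function name are all hypothetical):

```python
def samples_for_prs(target_ids, gwas_ids, related_pairs):
    """Return target samples that are neither in the GWAS nor related to a GWAS sample.

    related_pairs is an iterable of (id_a, id_b) tuples flagged as related
    by an upstream kinship analysis (assumed to exist already).
    """
    gwas = set(gwas_ids)
    related_to_gwas = set()
    for a, b in related_pairs:
        if a in gwas:
            related_to_gwas.add(b)
        if b in gwas:
            related_to_gwas.add(a)
    excluded = gwas | related_to_gwas
    # Preserve the original order of the target sample list.
    return [s for s in target_ids if s not in excluded]

# Made-up example: s1 and s2 were in the GWAS, s3 is related to s2.
target_ids = ["s1", "s2", "s3", "s4", "s5"]
gwas_ids = ["s1", "s2"]
related_pairs = [("s2", "s3"), ("s4", "s5")]

keep = samples_for_prs(target_ids, gwas_ids, related_pairs)
print(keep)
```

The resulting list can be written out as a keep file for the PRS run. Note that s4 and s5 stay in even though they are related to each other, since neither is linked to a GWAS sample; relatedness within the target set is a separate question.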

3.7 years ago by Sam ★ 4.8k