I am using PRSice to calculate polygenic risk scores (PRS) for a dataset (d1) using a set of weights from the PGS Catalog. Although the weights were generated from a different cohort (d2), it appears that the original GWAS was based on the same dataset (d1) that I'm using. Does this overlap inflate the predictive performance of the PRS in d1? Should we even use the score in that case? (I'm still very new to PRS.)
It appears that d1 and d2 are the same dataset. But if the GWAS estimates were based on a subset x of d1, and we included only those subjects from d1 who were not in subset x, would the PRS analysis then be fine?
Rule of thumb: do not include anyone in the PRS analysis who was included in the GWAS. So if half of d1 was used for the GWAS, you can still use the remaining half for PRS (though you should also remove any samples that are related to anyone included in the GWAS).
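To make the filtering concrete, here is a minimal Python sketch of that rule. Everything in it is an assumption for illustration: the function name `select_prs_samples`, the sample IDs, and the idea that you already have a list of related pairs (for example from a kinship or IBD estimate computed with whatever tool and threshold you prefer). It is not part of PRSice.

```python
def select_prs_samples(target_ids, gwas_ids, related_pairs):
    """Return target samples that are neither in the GWAS nor related to a GWAS sample.

    target_ids    : sample IDs in the target dataset (d1)
    gwas_ids      : sample IDs used in the discovery GWAS (subset x)
    related_pairs : (id_a, id_b) tuples flagged as related (hypothetical input)
    """
    gwas = set(gwas_ids)

    # Collect everyone who is related to at least one GWAS participant.
    related_to_gwas = set()
    for a, b in related_pairs:
        if a in gwas:
            related_to_gwas.add(b)
        if b in gwas:
            related_to_gwas.add(a)

    excluded = gwas | related_to_gwas
    # Preserve the original order of the target samples.
    return [s for s in target_ids if s not in excluded]


# Toy example: s1-s3 were in the GWAS, and s4 is related to s3.
d1 = ["s1", "s2", "s3", "s4", "s5", "s6"]
x = ["s1", "s2", "s3"]
pairs = [("s3", "s4")]

keep = select_prs_samples(d1, x, pairs)
print(keep)  # ['s5', 's6']
```

The resulting list could then be written out as a sample keep-list for the scoring step. This sketch only removes samples related to GWAS participants; handling relatedness among the remaining target samples is a separate decision.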