5.8 years ago
javadjamshidi
Is it necessary to remove related individuals from a sample before calculating polygenic risk scores?
Most current PRS tools (PRSice, lassosum and LDpred) do not support family data. For example, PRSice uses a simple linear regression or GLM to calculate the P-value and R2, whereas lassosum uses the correlation between the PRS and the phenotype. This means that relatedness between samples might artificially inflate your R2 and lead to more overfitting, limiting the generalisability of your findings to other datasets.
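To make the R2 in question concrete, here is a minimal Python sketch of the squared Pearson correlation between a PRS and a phenotype, which for a single predictor equals the R2 of a simple linear regression. It is not the code of PRSice or lassosum; the function names `pearson_r` and `prs_r2` and the example values are made up for illustration. Note that the calculation treats every sample as independent, which is exactly the assumption that related individuals violate.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def prs_r2(prs, phenotype):
    """Squared correlation between PRS and phenotype (samples assumed independent)."""
    return pearson_r(prs, phenotype) ** 2

# Hypothetical scores and phenotype values for four unrelated individuals
example_prs = [0.5, 1.5, 1.75, 1.0]
example_phenotype = [1.0, 2.0, 3.0, 1.5]
example_r2 = prs_r2(example_prs, example_phenotype)
print(f"R2 = {example_r2:.3f}")
```

If the sample contains relatives, this number can look better than it would in an independent dataset, so it is worth comparing it with the value obtained after keeping one individual per family.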
Risk is calculated from an individual's SNPs, so why would it matter if they have relatives?
That's right. But does it bias the calculated r-squared for the explained variance?
Depending on how you are calculating your PRS, why not just do a family-specific statistical test and derive the PRS from the coefficients of that?
Thanks Kevin, would you please give me more details on how to do that? Maybe a link to a paper?
Well, most PRS that I have seen are simply derived from the beta coefficients of a fitted regression model. The beta coefficient is the 'Estimate' that appears in the output when you run
summary()
on your model object (in R). So, I do not have a paper to which to link you, as this approach is simply used in all PRS that I have seen, possibly even PRSice. Keep in mind that there is no single definition of PRS... it is as general and 'broad-sweeping' a term as 'bioinformatics'.
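As a rough illustration of deriving a score from beta coefficients, the Python sketch below computes a PRS as the sum of each SNP's effect-allele dosage multiplied by its beta. This is one common weighted-sum form and not necessarily what any particular tool does; the function name `polygenic_score`, the betas and the genotypes are all hypothetical.

```python
def polygenic_score(dosages, betas):
    """Weighted sum of effect-allele dosages (0, 1 or 2) and per-SNP betas."""
    if len(dosages) != len(betas):
        raise ValueError("dosages and betas must have the same length")
    return sum(d * b for d, b in zip(dosages, betas))

# Hypothetical betas for three SNPs, e.g. the 'Estimate' values from fitted models
betas = [0.5, -0.25, 1.0]

# Hypothetical effect-allele dosages per individual, in the same SNP order as betas
genotypes = {
    "ind1": [0, 1, 2],
    "ind2": [2, 2, 0],
    "ind3": [1, 0, 1],
}

scores = {iid: polygenic_score(g, betas) for iid, g in genotypes.items()}
for iid, score in scores.items():
    print(iid, score)
```

The betas here could come from any model, including a family-aware one, which is the point made above: the scoring step itself does not care how the coefficients were estimated.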