What is the best QC to do on imputed UK Biobank data?
3.2 years ago

I am receiving imputed data from UK Biobank to run a GWAS on. Previously I have carried out GWAS on genotype data, which I QC'd for missingness per individual and per SNP, sex discrepancy, minor allele frequency (removing SNPs with MAF < 0.05), heterozygosity rate, cryptic relatedness and population stratification.

However, from reading several papers about QC of imputed data, it seems that none of these QC checks are carried out on the imputed data. What QC procedures should I expect to carry out on UKBB data? Is it enough to filter by INFO score?

Or would it be better to carry out the initial QC checks on the genotype data and then impute myself?

ukbb uk imputation gwas qc • 1.3k views
22 months ago
ucbtep ▴ 20

The imputed UKB data has already been QC'd by UK Biobank as part of imputation (https://biobank.ndph.ox.ac.uk/showcase/ukb/docs/impute_ukb_v1.pdf). There isn't really a QC process for the imputed data as such, though you can limit (the results of) your analyses to variants above a certain minor allele frequency (MAF) threshold and above a certain INFO threshold.
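As a rough illustration of that kind of variant filter, here is a minimal Python sketch. It assumes a tab-separated, header-less per-variant table with eight columns (alternate ID, rsID, position, allele 1, allele 2, MAF, minor allele, INFO); check the layout of the files you actually receive before relying on these column positions. The function name `filter_variants`, the thresholds (MAF >= 0.01, INFO >= 0.8) and the example rows are all made up for illustration, not recommendations.

```python
import csv
import io


def filter_variants(lines, min_maf=0.01, min_info=0.8):
    """Return rsIDs of variants passing both thresholds.

    Assumed layout (hypothetical, verify against your files): tab-separated,
    no header, MAF in column 6 and INFO in column 8 (1-based).
    """
    kept = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 8:
            continue  # skip blank or malformed lines
        rsid, maf, info = row[1], float(row[5]), float(row[7])
        if maf >= min_maf and info >= min_info:
            kept.append(rsid)
    return kept


# Made-up example rows: one good variant, one too rare, one poorly imputed.
example = io.StringIO(
    "1:100_A_G\trs1\t100\tA\tG\t0.25\tG\t0.98\n"
    "1:200_C_T\trs2\t200\tC\tT\t0.0004\tT\t0.95\n"
    "1:300_G_A\trs3\t300\tG\tA\t0.12\tA\t0.41\n"
)
print(filter_variants(example))
```

The same list of retained IDs could then be used to subset GWAS results after the association run, rather than filtering the imputed files themselves.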

If you're using a much smaller subset of the UK Biobank data, then the MAF and INFO values in your sub-sample may differ from those reported for the full cohort. Perhaps then you'd want to run QC yourself... Alternatively, if you run the GWAS in REGENIE (https://rgcgithub.github.io/regenie/) it will recalculate both of these metrics for the sub-sample, and you could filter variants on those values. REGENIE also lets you include related samples, which might be helpful as there is quite a lot of relatedness within UK Biobank.
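To show why both metrics depend on which samples are included, here is a small Python sketch that computes MAF and an IMPUTE-style info measure from per-sample genotype probabilities. It assumes the info definition in question is the IMPUTE-style one (one minus the ratio of the summed per-sample genotype variance to the variance expected under Hardy-Weinberg at the estimated allele frequency). It does not reproduce REGENIE's own code, and the function name `maf_and_info` is hypothetical.

```python
def maf_and_info(probs):
    """Compute (MAF, info) for one variant in a given set of samples.

    probs: list of (p_AA, p_AB, p_BB) genotype probabilities, one per sample.
    Sketch of an IMPUTE-style info measure; assumed definition, see lead-in.
    """
    n = len(probs)
    # Expected B-allele dosage per sample.
    dosages = [p_ab + 2.0 * p_bb for _, p_ab, p_bb in probs]
    theta = sum(dosages) / (2.0 * n)  # estimated B-allele frequency
    maf = min(theta, 1.0 - theta)
    if theta <= 0.0 or theta >= 1.0:
        return maf, 1.0  # monomorphic in this sample: info set to 1 by convention
    # Sum over samples of Var(genotype) = E[g^2] - E[g]^2.
    var_sum = sum(
        (p_ab + 4.0 * p_bb) - (p_ab + 2.0 * p_bb) ** 2 for _, p_ab, p_bb in probs
    )
    info = 1.0 - var_sum / (2.0 * n * theta * (1.0 - theta))
    return maf, info


# Certain genotype calls give info 1; completely uninformative calls give info 0.
certain = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]
uninformative = [(0.25, 0.5, 0.25)] * 4
print(maf_and_info(certain))
print(maf_and_info(uninformative))
```

Running this on only the rows for your own sub-sample, instead of the full cohort, is what changes the result: both the allele frequency and the summed uncertainty are taken over whichever samples you pass in.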

Sorry this reply is probably a little late for you psyc_biostars. Also, caveating the above with the fact I'm not a GWAS expert, but this is what I was advised when asking a similar question to someone who knows a lot about GWAS!
