Sequence of steps in quality control of genetic data
4.4 years ago
kl ▴ 10

Hello,

I am performing quality control on my genetic data. I have a very simple, perhaps even silly, question, but I want to make sure I'm keeping the right number of people. Should I identify individuals with outlying genotype missingness rates before filtering for minor allele frequency (MAF)?

I did it both ways: once on a dataset already filtered at a MAF of 1%, and once on the uncleaned dataset (which still includes low-MAF variants). In the MAF-filtered dataset, more people (about 30 extra) have outlying missingness. In the dataset not filtered for MAF, fewer people have outlying missingness, and thus there are fewer people to exclude. It makes sense that more people show outlying missingness after MAF filtering, since each person's rate is then calculated over a different set of variants.

Instinctively, it seems to me that filtering for MAF should be the last step in the process, because otherwise we would be falsely classifying some people as having an outlying missingness rate. Is that right? I would greatly appreciate the advice!
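To make the comparison concrete, here is a minimal sketch of why the two orderings can give different per-person missingness. It uses made-up toy genotypes and hypothetical helper names (`minor_allele_freq`, `variants_passing_maf`, `sample_missingness`); it is not how PLINK computes these values internally and only illustrates that the rate depends on which variants remain in the dataset.

```python
# Toy genotypes: one list per individual, one entry per variant.
# Values are minor-allele counts (0, 1, 2); None marks a missing call.
# All data and names here are illustrative assumptions, not real results.
GENOTYPES = {
    "ind1": [None, None, 1, 0, 0, 0],
    "ind2": [1, 0, 1, 0, 0, 0],
    "ind3": [0, 1, 2, 0, 0, 0],
    "ind4": [1, 2, 0, 0, 0, 0],
}


def minor_allele_freq(calls):
    """MAF of one variant, computed over the non-missing calls only."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)


def variants_passing_maf(genotypes, threshold):
    """Indices of variants whose MAF is at least `threshold`."""
    n_variants = len(next(iter(genotypes.values())))
    keep = []
    for j in range(n_variants):
        column = [calls[j] for calls in genotypes.values()]
        if minor_allele_freq(column) >= threshold:
            keep.append(j)
    return keep


def sample_missingness(genotypes, keep):
    """Fraction of missing calls per individual over the variants in `keep`."""
    return {
        ind: sum(calls[j] is None for j in keep) / len(keep)
        for ind, calls in genotypes.items()
    }


all_variants = list(range(6))
maf_variants = variants_passing_maf(GENOTYPES, threshold=0.01)

before = sample_missingness(GENOTYPES, all_variants)  # no MAF filter
after = sample_missingness(GENOTYPES, maf_variants)   # MAF filter applied first

for ind in GENOTYPES:
    print(ind, round(before[ind], 3), round(after[ind], 3))
```

In this toy example the three monomorphic variants are dropped by the MAF filter, so the missing calls of `ind1` are spread over fewer variants and its missingness rate goes up, even though no genotype changed. Whether that shift pushes someone past an outlier cutoff depends on the threshold chosen.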

Thank you!

quality control imputation plink genome • 657 views