Hi, I'm working with human samples genotyped on a SNP array. So far my QC in PLINK has used a MAF threshold of 0.01 or 0.05 to filter out variants.
My aim is to filter out variants with evident genotyping errors.
I'm not looking for selection, only for structure in my populations, so I'd like to keep as many SNPs as possible in the analysis.
Choosing the MAF threshold is a bit tricky, since you have to take into account whether the populations you're looking at are more or less isolated, and how strongly drift has shifted allele frequencies in each case.
My idea is to filter out more variants in isolated populations, because of inbreeding, and to be a little more relaxed in non-isolated populations. Does that make sense?
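To make the idea concrete, here is a minimal Python sketch of what I mean by population-specific thresholds. It is only an illustration: the function names, the toy genotypes, the labels "isolated" and "outbred", and the threshold values (0.05 and 0.01) are placeholders I made up, and keeping only the SNPs that pass in every population is just one possible way to combine the results.

```python
from typing import Dict, List, Optional

# Genotypes are coded as alternate-allele counts (0, 1, 2); None marks a missing call.
Genotypes = List[Optional[int]]


def minor_allele_frequency(genotypes: Genotypes) -> float:
    """Return the MAF of one SNP, ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def snps_passing(snps: Dict[str, Genotypes], threshold: float) -> List[str]:
    """Return the SNP IDs whose MAF is at or above the threshold."""
    return [snp for snp, g in snps.items() if minor_allele_frequency(g) >= threshold]


# Hypothetical toy data: 50 samples per population, 3 SNPs.
populations: Dict[str, Dict[str, Genotypes]] = {
    "isolated": {
        "rs1": [1, 1] + [0] * 48,    # MAF 0.02
        "rs2": [1] * 20 + [0] * 30,  # MAF 0.20
        "rs3": [2] * 50,             # monomorphic, MAF 0.0
    },
    "outbred": {
        "rs1": [1, 1] + [0] * 48,                          # MAF 0.02
        "rs2": [1] * 30 + [2] * 5 + [None] * 5 + [0] * 10,  # 5 missing calls
        "rs3": [0] * 50,                                   # monomorphic, MAF 0.0
    },
}

# Assumed thresholds: stricter for the isolated population, more relaxed otherwise.
thresholds = {"isolated": 0.05, "outbred": 0.01}

kept = {pop: snps_passing(snps, thresholds[pop]) for pop, snps in populations.items()}

# One possible way to combine: keep SNPs that pass in every population.
shared = sorted(set.intersection(*(set(ids) for ids in kept.values())))

print(kept)    # {'isolated': ['rs2'], 'outbred': ['rs1', 'rs2']}
print(shared)  # ['rs2']
```

In this toy case rs1 is dropped from the isolated population but kept in the other one, so whether it survives overall depends on how the per-population lists are combined.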
Do you have any suggestions for which threshold to set? I know the usual convention is a MAF threshold of 0.05 or lower, but I'd like to hear other opinions.
Thanks.