3.7 years ago
curious
Running the command plink --het gives a column "F".
I read that F is essentially "1 - (HI/HS), where HI represents the individual's heterozygosity, and HS the subpopulation's heterozygosity".
From this definition, it would seem that the lower a sample's F value, the higher its heterozygosity (e.g. possible contamination if F is low enough, inbreeding if F is high enough). Is that right?
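To make the direction of F concrete, here is a minimal sketch. It assumes the PLINK 1.9 .het output, which reports O(HOM), E(HOM) and N(NM) alongside F, and that F is computed as (O(HOM) - E(HOM)) / (N(NM) - E(HOM)). The function names and the counts are hypothetical and for illustration only.

```python
def f_coefficient(obs_hom, exp_hom, n_nonmissing):
    """F from homozygote counts: (O(HOM) - E(HOM)) / (N(NM) - E(HOM))."""
    return (obs_hom - exp_hom) / (n_nonmissing - exp_hom)


def f_from_heterozygosity(obs_hom, exp_hom, n_nonmissing):
    """The same quantity written as 1 - (observed het / expected het)."""
    obs_het = n_nonmissing - obs_hom   # observed heterozygous genotypes
    exp_het = n_nonmissing - exp_hom   # expected heterozygous genotypes
    return 1 - obs_het / exp_het


# Hypothetical counts (not real data): 10000 non-missing genotypes,
# 7000 of which are expected to be homozygous.
excess_het = f_coefficient(6000, 7000.0, 10000)  # fewer homozygotes than expected
excess_hom = f_coefficient(8000, 7000.0, 10000)  # more homozygotes than expected

print(excess_het, excess_hom)
```

Under these assumptions the two forms are algebraically identical, so a sample with more heterozygous calls than expected gets a negative F and a sample with fewer gets a positive F.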
I am also wondering what a "normal" range of F is for a randomly sampled population. Here they say to remove samples that are more than 3 standard deviations (SD) from the mean, but what is a typical mean? 0.018?
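For reference, the 3 SD rule as I understand it can be sketched as follows. This is only an illustration: the function name, the sample IDs and the F values are all made up, and the mean and SD are taken from the cohort itself rather than from any published "typical" value.

```python
import statistics


def flag_f_outliers(f_by_sample, n_sd=3.0):
    """Return samples whose F lies more than n_sd SDs from the cohort mean."""
    values = list(f_by_sample.values())
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample standard deviation
    lower, upper = mean - n_sd * sd, mean + n_sd * sd
    return {s: f for s, f in f_by_sample.items() if f < lower or f > upper}


# Hypothetical cohort: 19 samples with F near zero and one with a high F.
cohort = {f"S{i}": 0.01 * (i % 3) for i in range(19)}
cohort["S19"] = 0.30

outliers = flag_f_outliers(cohort)
print(sorted(outliers))
```

Note that in a small cohort a single extreme sample inflates the SD itself, so the threshold depends on the cohort size as well as on the F values.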