Could somebody please help me understand:
What is the significance of Minor Allele Frequency?
How are Minor Allele Frequencies used in GWAS in the selection of SNPs?
What does a Minor Allele Frequency of zero signify, beyond saying that the population is homozygous with the dominant allele?
Loci are selected for a genotyping assay such as a SNP chip because they are expected to vary in the population. Most chips distinguish between two alleles at a given locus. We can estimate the frequency of these alleles in the total population from their frequency in a sample population, such as the HapMap samples. One of these alleles will appear less frequently than the other; that is the minor allele, and its frequency is the minor allele frequency (MAF). Typically GWAS are designed to exclude SNPs with a MAF < 5%, because a very large sample is needed to have enough statistical power to make meaningful statements about very rare alleles.
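To a first approximation, you've correctly interpreted a MAF of zero: everyone in the sample is homozygous for the major allele (the more common allele, which is not necessarily the dominant one). In truth, since we're just estimating the MAF from a sample, there may well be people in the wider population who carry the minor allele.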
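As a rough illustration of the definition above, here is a minimal Python sketch that counts alleles in a set of diploid genotypes, reports the minor allele and its frequency, and applies the 5% cutoff. The function name `minor_allele_frequency`, the two-letter genotype strings, and the toy sample are assumptions made for this example; they are not taken from any particular chip or GWAS pipeline.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return (minor_allele, maf) for diploid genotypes such as 'AG'.

    Hypothetical helper for illustration: each genotype is a two-character
    string, one character per chromosome copy.
    """
    counts = Counter(allele for g in genotypes for allele in g)
    if not counts:
        raise ValueError("no genotypes supplied")
    if len(counts) == 1:
        # Only one allele was observed: the sample is monomorphic, MAF = 0.
        return None, 0.0
    total = sum(counts.values())
    # most_common() sorts by count, highest first; the second entry is
    # the less frequent (minor) allele.
    (_major, _major_n), (minor, minor_n) = counts.most_common(2)
    return minor, minor_n / total


# Toy sample of 90 individuals (180 chromosomes); made-up numbers.
sample = ["AA"] * 80 + ["AG"] * 9 + ["GG"] * 1
minor, maf = minor_allele_frequency(sample)
passes_filter = maf >= 0.05  # the typical GWAS cutoff mentioned above
print(minor, round(maf, 4), passes_filter)
```

Note that a MAF of 0.0 here only says that no copy of a second allele was seen in this sample, which matches the caveat about estimating from a sample.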
David, Thanks much for your reply. Is it true that the minor allele remains the same for a given SNP for a given population?
I am trying to understand this observation:
I took around 900K SNPs and compared the "minor alleles" of the HapMap CEU samples (90 in number) with those of the CEU samples from an ABC GWA study (9K in number). Of the roughly 900K SNPs, ~20K did not have matching "minor alleles" but had similar Minor Allele Frequency values. For those ~20K SNPs, the MAFs are in the range of 0.35-0.5 in both the HapMap and ABC data. What could be the reasons for those 20K SNPs not matching in minor alleles?
I'm confused. There are 90 CEU samples in HapMap; if the GWAS studied the CEU samples, the genotypes must be identical. If the GWAS used new samples, then they're not the CEU samples. Is 9K a typo for 90? Assuming the samples are a new population, the minor alleles could differ either because of a genotyping/bookkeeping error or because the "minor allele" is a definition inferred from the population used as a baseline (90 CEU humans); we would have to sample a much bigger population of Caucasians to be more certain that a given allele is really "minor".
I apologize for that. In the ABC GWAS study that I mentioned, there are around 9000 CEU samples. I was comparing the minor allele obtained for 900000 SNPs in this ABC GWAS study with the minor allele obtained for the same 900000 SNPs in the 90 HapMap CEU samples. There were around 20000 SNPs that did not match in minor allele, and the MAF values of those 20000 SNPs were close to the line of differentiation (i.e. 0.35-0.5). I was just trying to understand why these 20000 did not match in minor allele. I was told that the reason could be that the sample sizes are very different (90 versus 9000). Thanks!
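The sample-size explanation can be checked with a small calculation. The sketch below estimates how often a sample would label the wrong allele as "minor" when the true frequency sits close to 0.5. It assumes that the 2n chromosomes in a sample are independent draws from the population (a simplification), and `flip_probability` is a hypothetical helper written for this example, not part of any GWAS tool.

```python
import math


def flip_probability(true_maf, n_individuals):
    """Probability that the true minor allele makes up more than half of
    the 2n sampled chromosomes, so the sample calls it the major allele.

    Assumption: chromosomes are independent draws, so the allele count is
    binomial(2n, true_maf). Terms are summed in log space to stay stable
    for large samples.
    """
    chroms = 2 * n_individuals
    log_p = math.log(true_maf)
    log_q = math.log1p(-true_maf)
    total = 0.0
    for k in range(chroms // 2 + 1, chroms + 1):
        log_term = (
            math.lgamma(chroms + 1)
            - math.lgamma(k + 1)
            - math.lgamma(chroms - k + 1)
            + k * log_p
            + (chroms - k) * log_q
        )
        total += math.exp(log_term)
    return total


small = flip_probability(0.45, 90)    # a HapMap-sized sample
large = flip_probability(0.45, 9000)  # a GWAS-sized sample
print(small, large)
```

Under these assumptions, a 90-person sample mislabels the minor allele a noticeable fraction of the time for a SNP with a true MAF of 0.45, while a 9000-person sample essentially never does. That is consistent with mismatches clustering among SNPs whose MAF is in the 0.35-0.5 range, although it does not rule out genotyping or bookkeeping errors.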
Straight. Concise. I was about to write a few lines myself, but I couldn't be more precise than this. Great and useful answer.