major allele
"the most common allele for a given SNP"... in the cohort in question. The cohort may be just 10 people, though, or it could be 2,504 like in 1000 Genomes Phase III. In addition, the major allele, by definition, could have a frequency of 50.5%, in which case, although it is more frequent, it is only more frequent by 0.5%. The point that I want to make is that the major allele only makes sense when you understand the cohort in which it is the major allele, and also the size of that cohort.
minor allele
As above but, yes, the reverse, in that it is the less frequent allele. Also, yes, the MAF is the frequency of the minor allele and, from the MAF, one can infer the frequency of the major allele if it is a bi-allelic site (some sites understandably are tri- or quad-allelic).
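To make the arithmetic concrete, here is a minimal Python sketch of how allele frequencies, and from them the major and minor alleles, could be computed from a list of diploid genotypes. The function names and the toy genotypes are made up for illustration. For multi-allelic sites it takes the second most common allele as the minor allele, which is one common convention rather than the only one.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from diploid genotypes written like 'A/G'."""
    counts = Counter()
    for gt in genotypes:
        counts.update(gt.split("/"))  # each genotype contributes two alleles
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def major_and_minor(freqs):
    """Return ((major, freq), (minor, freq)); minor is None at a monomorphic site."""
    ranked = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
    major = ranked[0]
    # Assumption: the minor allele is the second most common allele.
    minor = ranked[1] if len(ranked) > 1 else None
    return major, minor


# Hypothetical cohort of 5 people (10 chromosomes): 6 x A and 4 x G
genotypes = ["A/A", "A/G", "G/G", "A/A", "A/G"]
freqs = allele_frequencies(genotypes)
major, minor = major_and_minor(freqs)
print(major, minor)

# At a bi-allelic site the major allele frequency is simply 1 - MAF
print(1 - minor[1])
```

With only 10 chromosomes, a single extra G genotype would noticeably shift these frequencies, which is why the cohort size matters when reading a MAF.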
On what you said about the "variation of genotypes", if a site has a very low MAF in a global cohort (i.e., samples from various parts of the world), it may imply that the major allele is conserved and is 'fixed' in the human genome, but not necessarily. A very rare allele at such a site may, thus, be under selective pressure if it reflects positive gain of function, or it could be deleterious and more likely to be eliminated from the human lineage.
risk allele
What you said is correct. The risk allele is statistically significantly associated with risk of having a disease under study. Such an allele should have genome-wide significance and have an odds ratio > 1.0. A situation in which a major allele may be seen as the 'risk allele' is where the minor allele is found to be protective against disease by having an odds ratio < 1.0, coupled with a statistically significant p-value. However, such a situation is not usually interpreted from the context of the major allele being the risk allele.
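The relationship between a 'risk' minor allele and a 'protective' minor allele can be sketched with a simple allelic odds ratio from a 2x2 table of allele counts. This is only an illustration under stated assumptions: the function name and the counts are invented, and it does not compute a p-value, which would need a separate test.

```python
def allelic_odds_ratio(case_allele, case_other, control_allele, control_other):
    """Odds ratio for carrying `allele` (versus the other allele) in cases versus controls."""
    return (case_allele * control_other) / (case_other * control_allele)


# Hypothetical allele counts (2000 chromosomes in total)
cases = {"minor": 300, "major": 700}
controls = {"minor": 200, "major": 800}

# Odds ratio with the minor allele as the allele of interest
or_minor = allelic_odds_ratio(
    cases["minor"], cases["major"], controls["minor"], controls["major"]
)

# The same data with the major allele as the allele of interest
or_major = allelic_odds_ratio(
    cases["major"], cases["minor"], controls["major"], controls["minor"]
)

print(round(or_minor, 3))  # > 1.0: the minor allele looks like the risk allele
print(round(or_major, 3))  # < 1.0: the reciprocal of the value above
```

The two odds ratios are reciprocals of each other, so an odds ratio < 1.0 for the minor allele is the same result as an odds ratio > 1.0 for the major allele, just reported from the other side.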
You may have been thinking about rare and common (MAF>5%) variants. For example, it is accepted (by those who actually think) that common alleles have roles in disease. One example is the set of variants at the CCND1 locus, which have MAFs of ~15% in Caucasians but which confer increased risk of ER+ breast cancer. See "Rare and common variants: twenty arguments" for further reading.
I should add that many rare variants may be functionless, but that they can still accumulate in the human genome and eventually become functional if combined with other nearby variants. For example, variants accumulated over time can eventually form novel transcription start sites (TSSs), TF binding sites, histone binding sites, protein binding sites, etc.
------------------------------------
In relation to the above 3, you may enjoy reading a recent answer that I gave: A: SNP dataset and Z Score
effect allele
This isn't used that much. It is essentially the allele whose effects in relation to disease are being studied. The effect allele is therefore usually, though not necessarily, the minor allele.
reference allele
If you hear this term, exercise caution. The best way to view it is as the allele that is in a particular reference build, e.g., GRCh37 / hg19, GRCh38 / hg38, etc. In some cases, however, the reference allele can be a risk allele. Read here for further information: A: Alternate nucleotide is more frequent than reference nucleotide. OMG I'm dizzy.
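As a rough sketch of why caution is needed, the following Python snippet checks, for a bi-allelic site, whether the reference allele is actually the minor allele in a given cohort. The function name, the alleles, and the frequency are hypothetical; in practice the alternate allele frequency would come from your own cohort or a resource such as 1000 Genomes.

```python
def describe_site(ref, alt, alt_freq):
    """For a bi-allelic site, report the major/minor alleles and whether REF is the minor one."""
    if not 0.0 <= alt_freq <= 1.0:
        raise ValueError("alt_freq must be between 0 and 1")
    ref_freq = 1.0 - alt_freq
    if alt_freq > ref_freq:
        major, minor, maf = alt, ref, ref_freq
    else:
        # Assumption: on an exact 50/50 tie, REF is treated as the major allele
        major, minor, maf = ref, alt, alt_freq
    return {"major": major, "minor": minor, "maf": maf, "ref_is_minor": minor == ref}


# Hypothetical site where the alternate allele is the common one in the cohort
site = describe_site(ref="C", alt="T", alt_freq=0.82)
print(site["major"], site["minor"], round(site["maf"], 2), site["ref_is_minor"])
```

When `ref_is_minor` is true, 'reference' and 'major' point at different alleles, so any downstream step that assumes they are the same allele needs checking.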
wildtype allele
Not the same as the reference allele. A wildtype allele is specific to your case-control study and is merely the allele that is present in your wild-type samples. This could feasibly be a minor allele, or anything else - it's specific to your study and what you view as the wild-type condition.
Thank you.
Kevin
This saved me a lot of time! Well explained. Thanks a lot.
Well explained and comprehensive.
Thank you so much! I should have added "ancestral" allele in my question. Am I correct in saying the ancestral allele is the major allele, given a large reference population?
Yes, the ancestral allele would be the major allele. Again, however, due to the 'quirks' of the reference genome builds, the ancestral allele is not always the allele that appears in the reference genome. The reference genome has many thousands of rare alleles, at least in the case of hg19.
Edit: note that 'ancestral' will be interpreted differently depending on who you talk to. Here is another definition that more or less refers to the major allele: https://biology.stackexchange.com/questions/19159/ancestral-allele-explanation
A situation could arise, though, where a rare allele could confer gain of function and, therefore, it would eventually become more frequent than the ancestral allele. This is obviously over many many generations, though, and is more in the realm of evolution.
Thank you so much for clarifying all this for me!
I still feel a bit confused. What's the difference between the alternative allele and the effect allele? Is the effect allele the risk allele?
There may be no difference. It is study-dependent.
Thanks for the detailed explanation.
You are very welcome.
I have some points on the answer that I am not sure about. First: the effect allele is used everywhere in GWAS studies. Second: why is the minor allele necessarily the effect allele? Can't the effect allele just be the major allele, with the specific study examining the relation between it and the trait? It could even have a protective effect, which I think is found a lot in summary statistics files.
Thanks a lot
Hi, yes, the effect allele can be the minor or the major allele.