I am having trouble understanding SNPs and the related concepts of MAF, alleles and HGVS names.
I was looking at rs2476601. From the "Allele" information I understand that
- Reference Allele is A and variation results in G; and
- A is the minor allele and has a frequency of 2.7%
Does this mean that A, our reference allele, is the minor allele and G, our variant, is the major allele? Why is A called the 'reference allele' if it is the minor allele? Wouldn't it be more logical to call the major allele, in our case G, the reference allele? Or is the reference allele simply the allele present in the genome version used for mapping etc.?
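To make the distinction between "reference" and "minor" concrete, here is a minimal Python sketch. It assumes the site is biallelic, so the frequency of G is taken as the remaining 97.3%; that value and the variable names are assumptions for illustration, not figures from the dbSNP page.

```python
# Frequencies for rs2476601: A is 2.7% as quoted above; G is an ASSUMPTION
# (1 - 0.027), which only holds if the site has exactly two alleles.
frequencies = {"A": 0.027, "G": 0.973}

# Allele reported as "Reference Allele" in the Allele section.
reference_allele = "A"

# "Minor" and "major" are defined by population frequency alone.
minor_allele = min(frequencies, key=frequencies.get)
major_allele = max(frequencies, key=frequencies.get)

# "Reference" is a separate label, so it can coincide with the minor allele.
reference_is_minor = reference_allele == minor_allele

print(minor_allele, major_allele, reference_is_minor)
```

The point of the sketch is only that the two labels are computed from different information, so nothing forces them to agree.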
From 'HGVS Names':
- NC_000001.10:g.114377568A>G
- NC_000001.11:g.113834946A>G
- NG_011432.1:g.41808C=
- NG_011432.1:g.41808C>T
- NM_001193431.1:c.1858C=
- NM_001193431.1:c.1858C>T
There are sequences where A is mutated to G, but also sequences where C is mutated to T. I do not understand how this is possible. If we look at the sense strand and see an 'A' there, which changed to 'G', then on the antisense strand we will see a 'T' which changed to 'C'. So it should be T>C. Why do they show C>T instead?
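The strand argument above can be written out as a small Python sketch. The helper names are hypothetical and exist only for illustration. The sketch shows that A>G complements to T>C, and that C>T uses the same two bases with the roles of reference and variant exchanged. Whether that exchange is the reason the NG_ and NM_ records list C>T is an assumption here, not something stated on the page.

```python
# Watson-Crick base pairing.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def complement_substitution(ref, alt):
    """Describe the same substitution as seen on the opposite strand."""
    return COMPLEMENT[ref], COMPLEMENT[alt]

def swap_reference(ref, alt):
    """Treat the other allele as the reference (hypothetical helper)."""
    return alt, ref

genomic = ("A", "G")                                # NC_000001.11:g.113834946A>G
opposite = complement_substitution(*genomic)        # same change, other strand
swapped = swap_reference(*opposite)                 # other allele taken as reference

print(f"{genomic[0]}>{genomic[1]}")    # A>G
print(f"{opposite[0]}>{opposite[1]}")  # T>C
print(f"{swapped[0]}>{swapped[1]}")    # C>T
```

So T>C and C>T involve the same pair of bases; they differ only in which base the sequence being described is assumed to carry.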
Thank you in advance.
Thanks for the question and the answer. Personally, I still don't understand which allele is the wild type, in other words, which one is more prevalent in the population. In the GeneView section I can see that A is on the positive strand. Is that the best way to determine the wild type?
G is called "ancestral". Does that mean wild type? In the Population Diversity section I can see that G is the prevalent allele. Will the "ancestral" allele always be the prevalent allele?
Thanks!
Erez