How to find major and minor alleles from dbSNP?
12.2 years ago

I am new to SNPs and the dbSNP database. I am not quite sure if I understand how to find major and minor alleles for a particular position from the database.

Let's take for example rs16532.

First, the "Allele" section says that the "RefSNP Allele" is C/G. Next, the "Ancestral Allele" is G. Does this mean that the reference allele is G?

Now we come to the tables on that page. Which of the tables should I refer to in order to find the major and minor alleles for this position? There's a "Population Diversity" section which has some statistics. Should I be referring to this table? Also, how do I find the strand information for the reference allele as well as for the major and minor alleles?

Also, some more basic questions:

Are major and minor alleles ALWAYS alternate alleles? Can a minor allele be a reference allele?

Any help will be greatly appreciated.

dbsnp allele • 32k views

Hi Expert Peer/seniors

Valuable info, highly appreciated!

Thank you


Hello seniors,

Excellent clarification!

Thank you

12.2 years ago

There are many things we could discuss about this question, but you should definitely do some basic reading before asking. The first thing I would recommend is dbSNP's own description on its web page, as it answers all your doubts. I will go through the points anyway to address your question:

  1. When you see C/G in dbSNP, it just means that dbSNP has recorded two alleles, C and G, for that position. The order (C/G or G/C) is determined by the RefSeq strand purely by convention; that is the strand that was sequenced for the reference genome used, and there is no critical reason for the choice (my guess is that it is historical). The SNP may have been typed on the other strand, so there is a field indicating whether the SNP is reported on the same strand as the reference or on the opposite one.
  2. The ancestral allele is the one the chimpanzee has at that position, after aligning the chimpanzee and human genomes. We don't have ancestral alleles for the whole genome, but where they are available they are useful for evolutionary comparisons. Other databases, such as Ensembl, also report macaque alleles for this kind of use.
  3. To find the minor allele and the minor allele frequency (MAF), look at the SNP summary at the top of each SNP page, where a "MAF/MinorAlleleCount" field is reported when dbSNP has population information for that SNP; it gives the overall frequency and count across all populations. This tells you which allele is considered minor based on its frequency. For the rs16532 example, the C allele is the minor one with a MAF of 0.427, which means that G is the major allele with a frequency of 1 - 0.427 = 0.573. For a deeper population-level description of the SNP, scroll down to the "Population Diversity" table, where the same information is broken down by population. You will sometimes find that an allele that is minor in one population is major in another (e.g. for rs16532, the C allele is minor in the 1000 Genomes pilot 1 YRI sample but major in the 1000 Genomes pilot 1 CHB+JPT sample), so this is the kind of information you will have to deal with.
  4. The alternate allele is simply any allele other than the one found in the reference genome sequence at that position. Since which allele ended up in the reference is arbitrary, just a convention used to normalize data reporting, the answers to your last questions are the following:

    • NO, major and minor alleles are not always alternate alleles. Sometimes they are, sometimes they aren't.
    • YES, a minor allele can be the reference allele. Sometimes it is, sometimes it isn't.
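
To make points 1, 3 and 4 concrete, here is a minimal Python sketch of the arithmetic. It only uses the rs16532 numbers quoted above (C as the minor allele with MAF 0.427); the function names are made up for illustration, and the reference allele passed in at the end is a hypothetical input, not a value taken from dbSNP.

```python
# Complement table for moving an allele from one strand to the other.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def major_minor(freqs):
    """Return (major, minor) from a dict mapping allele -> frequency."""
    ranked = sorted(freqs, key=freqs.get, reverse=True)
    return ranked[0], ranked[-1]


def on_other_strand(alleles):
    """Report the same alleles as they would read on the opposite strand."""
    return [COMPLEMENT[a] for a in alleles]


def reference_is_minor(reference_allele, freqs):
    """True when the allele found in the reference is the rarer one."""
    _, minor = major_minor(freqs)
    return reference_allele == minor


# Numbers from the rs16532 example: C is minor with MAF 0.427,
# so the frequency of G is 1 - 0.427.
maf_c = 0.427
freqs = {"C": maf_c, "G": round(1 - maf_c, 3)}

major, minor = major_minor(freqs)
print(major, minor, freqs[major])

# C/G read on the opposite strand
print(on_other_strand(["C", "G"]))

# Hypothetical reference alleles, just to show both outcomes are possible
print(reference_is_minor("C", freqs), reference_is_minor("G", freqs))
```

Note that for a C/G SNP the complement of C is G and vice versa, so reading the alleles on the opposite strand swaps the labels. That is why the strand field matters so much for this kind of SNP.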

Thank you for your thorough explanation, Jorge. Are the populations and per-population MAFs in the Population Diversity table those from the 1000 Genomes Project, or has this table been updated to include ExAC populations and MAFs?

12.2 years ago
dfornika ★ 1.1k

Scroll down to the 'Population Diversity' section to find allele frequencies.

One very important thing to understand is that allele frequencies must be interpreted within the context of a specific population. An allele will have a different frequency in Salt Lake City, Utah than in Tokyo, Japan. And even within a geographic location, the frequency will vary depending on the ethnic background of your population of interest.
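
As a rough sketch of what "interpreting within a population" means in practice, the snippet below picks the major and minor allele separately for each population from allele counts. The population labels and counts are invented purely for illustration (they are not dbSNP or 1000 Genomes values), and `classify_by_population` is a hypothetical helper name.

```python
def classify_by_population(counts_by_pop):
    """For each population, derive major allele, minor allele and MAF
    from a dict of allele -> observed chromosome count."""
    result = {}
    for pop, counts in counts_by_pop.items():
        total = sum(counts.values())
        freqs = {allele: n / total for allele, n in counts.items()}
        minor = min(freqs, key=freqs.get)
        major = max(freqs, key=freqs.get)
        result[pop] = {"major": major, "minor": minor, "maf": freqs[minor]}
    return result


# Made-up counts for two made-up populations (illustration only).
example_counts = {
    "POP_A": {"C": 30, "G": 70},
    "POP_B": {"C": 65, "G": 35},
}

summary = classify_by_population(example_counts)
for pop, info in summary.items():
    print(pop, info)
```

With counts like these, C comes out as the minor allele in one population and the major allele in the other, which is the situation described above. An exact 50/50 tie is not handled specially here; in that case "major" and "minor" are not really meaningful labels.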
