Question

Finding the rs Number for a Specific SNP

1

13.0 years ago

Mériem ▴ 10

How can I find the rs number for my SNP?

For example, what is the rs number of the ACE gene polymorphism ACE C1237T?

snp • 15k views

Updated 13.0 years ago by Larry_Parnell 16k • written 13.0 years ago by Mériem ▴ 10

0

Are you sure there is a "C" at position 1237 of "ACE", or is it just a random number/AA?

13.0 years ago by Pierre Lindenbaum 164k

0

Cys cannot be changed into Thr with a single base change.

13.0 years ago by Larry_Parnell 16k
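The point about Cys and Thr can be checked with a small sketch. It assumes the standard genetic code (Cys = TGT/TGC, Thr = ACA/ACC/ACG/ACT) and counts the minimum number of base differences between any pair of codons; the function name `min_codon_distance` is made up for this illustration.

```shell
# Minimum number of base changes needed to turn any codon in list $1
# into any codon in list $2 (both are space-separated codon lists).
min_codon_distance() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, A, " "); nb = split(b, B, " "); best = 3
    for (i = 1; i <= na; i++)
      for (j = 1; j <= nb; j++) {
        d = 0
        # Hamming distance over the three codon positions
        for (k = 1; k <= 3; k++)
          if (substr(A[i], k, 1) != substr(B[j], k, 1)) d++
        if (d < best) best = d
      }
    print best
  }'
}

# Cys codons vs Thr codons (standard genetic code)
min_codon_distance "TGT TGC" "ACA ACC ACG ACT"   # prints 2
```

A result of 2 means at least two bases must change, which suggests that "C1237T" is more plausibly a nucleotide change (C to T) than an amino acid change.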

score 4 · Answer 1 · 2011-11-15

4

13.0 years ago

Larry_Parnell 16k

One reason Pierre's answer gives "none" is that the SNP in ACE that you're studying was described using old coordinates or some other antiquated system of numbering positions, whether nucleic acid (genomic DNA or mRNA) or protein. I have had to deal with this on countless occasions.

The best way to convert a SNP such as this to an rs entry is to trace back through the methods sections of the articles to the source where either the polymorphism or the assay to detect it is first described. In roughly 98-99% of cases, I can track this back to PCR primers or original sequence data in a figure in the paper. With those data in hand, it is a matter of running a BLASTN search against either the reference genome or dbSNP to find the rs entry.

We've genotyped ACE, but only at the insertion/deletion (I/D) variant, better known as rs4646994. Unfortunately, not at your SNP.

Because of this naming problem, I keep a table with the columns Gene, SNP, and SNP alias, where the last column holds all the common names such as I/D, C1237T, or 1237C>T.
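A minimal sketch of such an alias table, kept as tab-separated text and queried with awk, might look like the following. The function names `alias_table` and `lookup_rs` are hypothetical. The rs4646994 row comes from this answer; the rs4309 rows rest on the tentative identification given in the reply below and should be treated as unverified.

```shell
# Hypothetical alias table (tab-separated): Gene, SNP, SNP alias.
alias_table() {
  printf 'ACE\trs4646994\tI/D\n'
  printf 'ACE\trs4309\tC1237T\n'     # tentative mapping, not verified
  printf 'ACE\trs4309\t1237C>T\n'    # same variant, alternative notation
}

# Look up the rs number for a gene symbol ($1) and a legacy alias ($2).
# Prints nothing when the alias is not in the table.
lookup_rs() {
  alias_table | awk -F '\t' -v g="$1" -v a="$2" '$1 == g && $3 == a { print $2 }'
}

lookup_rs ACE C1237T   # prints rs4309
lookup_rs ACE I/D      # prints rs4646994
```

Keeping one row per alias, rather than one row per SNP, lets the same rs number be reached from every legacy name without any parsing.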

An alternative approach, which came to mind after reading Bert's response, is to search for a gene- or gene-family-specific database. I don't know of one for ACE, but I have seen these for mitochondrial genes and for cytochrome P450s. Still, I prefer to track it down as described above, because when I recreate the original experiment via something like electronic PCR, I know I have the right variant and am not relying on names that change or shift.

2

Doing this with the paper http://www.biomedcentral.com/1471-2350/11/94/, it looks like the coordinates reported for the ACE variants in this paper are 22 nt off. I am pretty confident that your C1237T variant is rs4309 at position 1215: http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs4309. When will people learn to use dbSNP identifiers? (Judging from its very low rs number, this variant has been in dbSNP for a long, long time, yet the paper is from 2010.)

13.0 years ago by Bert Overduin ★ 3.7k
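The 22 nt shift can be applied mechanically once it is known. The sketch below rewrites a legacy label such as C1237T by subtracting an offset from its position; `shift_label` is a made-up helper, and the offset of 22 applies only to this paper as described above, not to ACE variants in general.

```shell
# Rewrite a legacy label of the form <ref><position><alt> (e.g. C1237T)
# by subtracting a fixed offset ($2) from the position.
shift_label() {
  printf '%s\n' "$1" | awk -v off="$2" '{
    ref = substr($0, 1, 1)                 # first character: reference base
    alt = substr($0, length($0), 1)        # last character: variant base
    pos = substr($0, 2, length($0) - 2)    # digits in between
    printf "%s%d%s\n", ref, pos - off, alt
  }'
}

shift_label C1237T 22   # prints C1215T
```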

2

Ah, now you're beginning to uncover the same frustrations I have had for years! I think it is for historical reasons. It is also possible that the genotyping was done long ago and only recently analyzed for a new association. Nonetheless, it should be the responsibility of authors, reviewers and editors to use the standard nomenclature for genes, SNPs and other entities. Everyone should consider the database and data mining folks!

13.0 years ago by Larry_Parnell 16k

0

I've done exactly this same thing in this situation. One time I even wrote to the research team for guidance because the system was so borked. And they told me someone would get back to me. Never happened.

13.0 years ago by Mary 11k

0

Same non-response here...

13.0 years ago by Larry_Parnell 16k

0

I couldn't agree more, Larry.

13.0 years ago by Bert Overduin ★ 3.7k

Leonor Palmeira · Answer 2 · 2011-11-15

In http://plindenbaum.blogspot.com/2011/03/mapping-mutation-on-protein-to-genome.html I wrote a tool to map a protein position back to the genome.

echo -e "ACE\tC1237T" | java -jar backlocate.jar

#User.Gene  AA1 petide.pos.1    AA2 knownGene.name  knownGene.strand    knownGene.AA    index0.in.rna   codon   base.in.rna chromosome  index0.in.genomic   exon
ACE C   1237    T   uc002jau.1  +   L   3708    CTC C   chr17   61574514    Exon 25
ACE C   1237    T   uc002jau.1  +   L   3709    CTC T   chr17   61574515    Exon 25
ACE C   1237    T   uc002jau.1  +   L   3710    CTC C   chr17   61574516    Exon 25

You can pipe this result into the UCSC MySQL database:

# Keep the chromosome and 0-based genomic position (columns 11 and 12,
# assuming tab-separated output), then query snp132 in hg19 for each base.
echo -e "ACE\tC1237T" |\
java -jar backlocate.jar |\
grep -v "#" | cut -f 11,12 |\
awk '{printf("select name from snp132 where chrom=\"%s\" and chromStart=%s;\n",$1,$2);}' |\
mysql  --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19

#answer: none
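As a variation on the pipeline above, the three per-base queries can be collapsed into a single range query over the whole codon that also returns coordinates. This is only a sketch: `codon_to_sql` is a hypothetical helper, and it assumes the same `snp132` table and its `chrom`, `chromStart`, `chromEnd`, and `name` columns. It builds the SQL text without contacting the server, so the query can be inspected before it is piped to `mysql` as above.

```shell
# Read "chromosome position" pairs (one base per line, 0-based positions)
# and emit one SQL query covering the smallest to the largest position.
codon_to_sql() {
  awk '
    NR == 1 { chrom = $1; lo = $2; hi = $2 }
    $2 < lo { lo = $2 }
    $2 > hi { hi = $2 }
    END {
      if (NR > 0)
        printf("select name, chromStart, chromEnd from snp132 where chrom=\"%s\" and chromStart between %d and %d;\n", chrom, lo, hi)
    }'
}

# The three codon positions reported by backlocate for ACE position 1237
printf 'chr17 61574514\nchr17 61574515\nchr17 61574516\n' | codon_to_sql
```

Widening the window does not fix a legacy numbering problem; it only makes the lookup tolerant of which base within the codon is affected.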

score 0 · Answer 3 · 2011-11-15

0

13.0 years ago

Bert Overduin ★ 3.7k

You can use the Variation Table page for the ACE gene in Ensembl, which shows all the annotated variants in this gene, and search it for 1237 using the "Filter" option (at the top right of the table). When you do this, though, you will see that no variant is annotated at position 1237.

1

Not a surprise, as the old 1237 position may not correspond to a current gene model or the current numbering system. There are many cases where non-synonymous variants were numbered according to the active, processed peptide, thereby ignoring the positions of amino acids in the leader peptide (because it is cleaved).

13.0 years ago by Larry_Parnell 16k