Finding a SNP's rsID from older papers
5.9 years ago
ognjen011 ▴ 290

I encountered a paper that tests an IL-6 polymorphism but never specifies it by rsID; it simply calls it IL-6 -174:

https://www.ncbi.nlm.nih.gov/pubmed/18930842

I tried using online resources to find the exact chromosomal position, and thus the rsID, that corresponds to position -174 of the IL-6 gene, but I found no matches at the location I calculated. I would appreciate it if someone could check my method and help me correct it.

1) What is -174 in reference to? The start of transcription or the start of translation? Which position is that for the IL-6 gene in GRCh38, and how does one find it exactly? The Ensembl and Entrez coordinates for the start of the gene differ, which is probably not uncommon.

2) Is there a resource that lists these alternative names so that one can use it as a mapping?

3) Is such data likely to change with genome builds? The paper places IL-6 at 7p21, but in the current build it is at 7p15, adjacent to 7p21, if I am reading it correctly.

Thanks in advance!

SNP RSID assembly • 1.7k views
5.9 years ago
GenoMax 147k

@Denise: Here is a relevant line from the intro of the paper:

The IL-6 gene, located in humans in the short arm of chromosome 7 (7p21), displays a single nucleotide polymorphism (SNP) in the promoter region (−174 G/C) which seems to be associated to variations of IL-6 gene expression and serum levels (Fishman et al., 1998).

The study referred to by OP is a meta-analysis of 9 studies that were published between 2001 and 2005.

Here is the polymorphism sequence referred to in the Fishman paper:

https://www.ncbi.nlm.nih.gov/nuccore/AF005485

That sequence now maps UNIQUELY here (between the two * added to the alignment below):

blat-IL6

>chromosome:GRCh38:7:22726595:22727497:1
22726595 TTTGAGGATGGCCAGGCAGTTCTACAACAGCCGCTCACAGGGAGAGCCAGAACACAGAAG 22726654
22726655 AACTCAGATGACTGGTAGTATTACCTTCTTCATAATCCCAGGCTTGGGGGGCTGCGATGG 22726714
22726715 AGTCAGAGGAAACTCAGTTCAGAACATCTTTGGTTTTTACAAATACAAATTAACTGGAAC 22726774
22726775 GCTAAATTCTAGCCTGTTAATCTGGTCACTGAAAAAAAATTTTTTTTTTTTCAAAAAACA 22726834
22726835 TAGCTTTAGCTTATTTTTTTTCTCTTTGTAAAACTTCGTGCATGACTTCAGCTTTACTCT 22726894
22726895 *TTGTCAAGACATGCCAAAGTGCTGAGTCACTAATAAAAGAAAAAAAGAAAGTAAAGGAAG 22726954
22726955 AGTGGTTCTGCTTCTTAGCGCTAGCCTCAATGACGACCTAAGCTGCACTTTTCCCCCTAG 22727014
22727015 TTGTGTCTTGCCATGCTAAAGGACGTCACATTGCACAATCTTAATAAGGTTTCCAATCAG 22727074
22727075 CCCCACCCGCTCTGGCCCCACCCTCACCCTCCAACAAAGATTTATCAAATGTGGGATTTT 22727134
22727135 CCCATGAGTCTCAATATTAGAGTCTCAACCCCCAATAAATATAGGACTGGAGATGTCTGA 22727194
22727195 GGC*TCATTCTGCCCTCGAGCCCACCGGGAACGAAAGAGAAGCTCTATCTCCCCTCCAGGA 22727254
22727255 GCCCAGCTATGAACTCCTTCTCCACAAGTAAGTGCAGGAAATCCTTAGCCCTGGAACTGC 22727314
22727315 CAGCGGCGGTCGAGCCCTGTGTGAGGGAGGGGTGTGTGGCCCAGGGAGGGCTGGCGGGCG 22727374
22727375 GCCAGCAGCAGAGGCAGGCTCCCAGCTGTGCTGTCAGCTCACCCCTGCGCTCGCTCCCCT 22727434
22727435 CCGGCACAGGCGCCTTCGGTCCAGTTGCCTTCTCCCTGGGGCTGCTCCTGGTGTTGCCTG 22727494
22727495 CTG

Thanks for the reply. Using BLAST is a great idea. Is there a resource that could help me skip BLAST if the sequence has a known accession number? For example, if I know my variant is NM_001105580:c.1394T>C, how would one skip BLAST and directly get the absolute genomic position in the current build?


NM_001105580:c.1394T>C is in HGVS notation, so I'd expect you can annotate variants like that with the VEP (Variant Effect Predictor) to find their genomic coordinates and their functional consequences on human transcripts. VEP does accept HGVS identifiers.
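For a scripted version of this, here is a minimal sketch that sends one HGVS string to the VEP endpoint of the Ensembl REST API (`/vep/human/hgvs/`). The helper names `vep_hgvs_url` and `genomic_positions` are made up for illustration, and I have not verified how the server handles this particular RefSeq accession, so treat it as a starting point rather than a tested recipe.

```python
import json
import urllib.parse
import urllib.request

ENSEMBL_REST = "https://rest.ensembl.org"


def vep_hgvs_url(hgvs: str) -> str:
    """Build the Ensembl REST VEP URL for a single HGVS string (hypothetical helper)."""
    # Percent-encode characters such as '>' but leave ':' readable.
    encoded = urllib.parse.quote(hgvs, safe=":")
    return f"{ENSEMBL_REST}/vep/human/hgvs/{encoded}?content-type=application/json"


def genomic_positions(hgvs: str) -> list:
    """Query VEP and return (chromosome, start, end, assembly) for each record.

    Needs network access. Field names are read with .get() so that a missing
    key yields None instead of raising.
    """
    with urllib.request.urlopen(vep_hgvs_url(hgvs), timeout=30) as response:
        records = json.load(response)
    return [
        (r.get("seq_region_name"), r.get("start"), r.get("end"), r.get("assembly_name"))
        for r in records
    ]
```

Calling `genomic_positions("NM_001105580:c.1394T>C")` should then return the coordinates on whichever assembly the server uses. If the request is rejected, trying a versioned accession would be my first guess.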


Hey, I've just found the explicit variant: https://www.snpedia.com/index.php/Rs1800795. The position of this base is 22727026. Counting 174 bases back from the end of your aligned sequence, I get position 22727024 instead. Also, the "real" SNP is C > G. Could you comment on that?
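To see which base sits at each of the two candidate positions, one can index straight into the alignment rows quoted above. This is a minimal sketch: `ALIGNMENT_ROWS` and `base_at` are names I made up, and only the one row that covers both positions is copied in.

```python
# Row copied from the GRCh38 alignment above:
# start coordinate -> forward-strand sequence of that row.
ALIGNMENT_ROWS = {
    22727015: "TTGTGTCTTGCCATGCTAAAGGACGTCACATTGCACAATCTTAATAAGGTTTCCAATCAG",
}


def base_at(position: int, rows: dict = ALIGNMENT_ROWS) -> str:
    """Return the forward-strand base at a 1-based genomic coordinate."""
    for start, seq in rows.items():
        if start <= position < start + len(seq):
            # Offset within the row is simply the distance from its start.
            return seq[position - start]
    raise KeyError(f"position {position} is not covered by the copied rows")


print(base_at(22727026))  # C
print(base_at(22727024))  # G
```

The forward-strand base at 22727026 is C, which agrees with the C > G listed on SNPedia, while 22727024 is a G two bases upstream.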


I see what was done. The accession link is actually: https://www.ncbi.nlm.nih.gov/nuccore/AF039228

It was sequenced from -550 to +61, so -174 is not counted from the end of the sequence, as I had assumed. Unusual.
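The arithmetic behind that can be sketched as follows. I am assuming the usual promoter numbering with no position 0 (so -1 is followed directly by +1); the function name `fragment_index` is hypothetical.

```python
def fragment_index(rel_pos: int, first: int = -550, last: int = 61) -> int:
    """0-based index of a promoter-relative position inside a fragment
    spanning first..last, assuming first < 0 < last and no position 0."""
    if rel_pos == 0 or not first <= rel_pos <= last:
        raise ValueError(f"{rel_pos} is outside {first}..{last} or is 0")
    # Positive positions shift down by one because position 0 does not exist.
    return rel_pos - first - (1 if rel_pos > 0 else 0)


fragment_length = fragment_index(61) + 1            # 611 bases in total
snp_index = fragment_index(-174)                    # 376 (the 377th base)
bases_after_snp = fragment_length - 1 - snp_index   # 234 bases follow the SNP

print(fragment_length, snp_index, bases_after_snp)  # 611 376 234
```

So under that assumption the SNP lies 234 bases before the last base of the fragment, not 174, because the fragment runs 61 bases past +1.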

5.9 years ago
Denise CS ★ 5.2k

I can't access the full text as the paper has been published by Elsevier.

But let's have a look at what is available without paying for access. Note the publishing date: Jan 2009. GRCh37 was released in Feb 2009, so the data would have been on NCBI36 at best, perhaps NCBI35?

I searched for the term "longevity" in the newly released Open Targets Genetics, selected one study (Zeng Y 2016, Sci Rep 26912274) out of the many that match that term, and focused on the signal on chr 7. That leads me to 7_22768027_A_G (rs2069837).

Whether or not this is the rsID corresponding to IL-6 -174, I don't know. More exploratory research would be needed. But at least we know the study was not carried out on GRCh37 or GRCh38.


Thank you for clearly outlining your reasoning.
