Repeated rsIDs in dbSNP?
3
13.2 years ago

Is it possible for an rsID to be repeated in dbSNP? I recently downloaded dbSNP130 from UCSC and came across cases such as:

chr10 50325 50326 rs10221381 0 + G G G/T genomic single by-cluster 0 0 unknown exact 3

chr18 4739 4740 rs10221381 0 + G G G/T genomic single by-cluster 0 0 unknown exac

Is this expected, and what would be the explanation? Or have I made an error in downloading or parsing the file?

dbsnp • 6.0k views
0

Thanks for all the replies. I just searched for rs10221381 in the UCSC Table Browser and got the following hits in multiple databases:

snp130

rs10221381 at chr10:50076-50576
rs10221381 at chr18:4490-4990

snp129

rs10221381 at chr10:50076-50576
rs10221381 at chr18:4490-4990

snp128

rs10221381 at chr10:50076-50576
rs10221381 at chr18:4490-4990

snp126

rs10221381 at chr10:50076-50576
rs10221381 at chr18:4490-4990

2
13.2 years ago

This issue has been covered several times in the past, as you can see if you search BioStar for "dbSNP multiple" or "dbSNP duplicated", so it is never too late to suggest searching the forum before posting a question. Moreover, the "related questions" that appear before you submit a question usually work quite well, so it is always worth paying a little attention to them.

The best answer, in my opinion, is the one given by neilfws on this question, where he explains that this happens because of the way dbSNP determines a SNP's position, which is simply by mapping. If the same SNP (together with its flanking sequences, of course) maps to several genome locations, you will see multiple location entries for that SNP. I therefore wouldn't consider it an error, but a limitation of the system. Do not confuse this with the common case in which dbSNP gives different locations for the same SNP because they refer to different genome assemblies; those may all reflect the same position.
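To see how many rsIDs are affected in a downloaded file, one option is to group the rows by rsID and keep the IDs that occur at more than one location. The following is only a minimal sketch: it assumes whitespace-separated rows laid out like the ones in the question (chromosome, start, end, rsID in the first four columns), and the function name `find_multi_mapped`, the shortened sample rows, and the third row are made up for illustration. A raw UCSC database dump may carry an extra leading column, in which case the column indices would need adjusting.

```python
from collections import defaultdict


def find_multi_mapped(lines, name_col=3):
    """Return {rsID: sorted list of (chrom, start, end)} for rsIDs seen at
    more than one location. Assumes chrom/start/end are columns 0-2."""
    locations = defaultdict(set)
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment/header lines
        fields = line.split()
        if len(fields) <= name_col:
            continue  # skip blank or short lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        locations[fields[name_col]].add((chrom, start, end))
    return {rsid: sorted(locs) for rsid, locs in locations.items() if len(locs) > 1}


# First six columns of the two rows from the question, plus one made-up row.
rows = [
    "chr10 50325 50326 rs10221381 0 +",
    "chr18 4739 4740 rs10221381 0 +",
    "chr1 100 101 rsHYPOTHETICAL 0 +",  # illustrative only, not a real entry
]
multi = find_multi_mapped(rows)
for rsid, locs in multi.items():
    print(rsid, locs)
```

For a real file, passing an open file handle instead of the `rows` list should work the same way, since the function only iterates over lines.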

1
13.2 years ago

No, I do not think you've parsed the file in error. The duplicated entry is not correct, of course, but dbSNP is not perfect and it does have erroneous entries. The entry for this SNP at NCBI shows it mapping to chr10. When I look closely at the chr18 region indicated by your parsing, I see no SNPs at all.

2

BLATing the flanking sequence of this SNP shows, though, that it maps to the mentioned regions on chr10 and chr18 with 99.75% and 100% identity. So I would argue that the dbSNP entry is correct and that the SNP simply cannot be mapped to a unique genomic position.

1
13.2 years ago
User 3869 ▴ 100

It is possible. In dbSNP, some SNPs map to more than one genomic region. You can see other cases in the chr_*.txt files on the dbSNP FTP site. There are also cases where one genomic position is associated with multiple SNP IDs.

I think it is due to different alignments of the submitted FASTA sequences associated with that SNP. Biological data are usually messy and inconsistent. As more accurate reference genomes are assembled and more evidence for SNPs is submitted, this situation might improve.
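The converse case mentioned above, one genomic position carrying several SNP IDs, can be checked in a similar way by grouping on the position instead of the ID. This is just a rough sketch under the same assumptions about column layout as the rows in the question (chromosome, start, end, rsID as the first four whitespace-separated columns); `positions_with_multiple_ids` and the toy rows are hypothetical and not taken from dbSNP.

```python
from collections import defaultdict


def positions_with_multiple_ids(lines, name_col=3):
    """Return {(chrom, start, end): sorted list of IDs} for positions that
    carry more than one distinct ID."""
    ids_at = defaultdict(set)
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment/header lines
        fields = line.split()
        if len(fields) <= name_col:
            continue  # skip blank or short lines
        position = (fields[0], int(fields[1]), int(fields[2]))
        ids_at[position].add(fields[name_col])
    return {pos: sorted(ids) for pos, ids in ids_at.items() if len(ids) > 1}


# Toy rows, invented purely to exercise the function.
toy_rows = [
    "chrT 10 11 rsA",
    "chrT 10 11 rsB",
    "chrT 20 21 rsC",
]
shared = positions_with_multiple_ids(toy_rows)
for pos, ids in shared.items():
    print(pos, ids)
```

Grouping on the exact (chrom, start, end) triple is a deliberate simplification; overlapping but non-identical intervals would not be reported by this sketch.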
