dbSNP not using hg19?
10.4 years ago
lilla.davim ▴ 160

Hello,

I looked in dbSNP for existing variants in the interval chr20:256720-256730. There is one match, at position chr20:256722. The corresponding sequence displayed is the following:

GAAAGGATTCTGGAAAAGTGAGCTG[G/T]AACAGAAAAGAACTGTCTCAATGGG

However when I look at this position in UCSC genome browser the reference sequence is different:

Leads to: tcaataatctg

Is this because dbSNP does not use hg19? How can I find out which reference sequence is used? Or is there another explanation?

Thanks for your help.

SNP • 6.6k views
10.4 years ago

Try checking the current reference genome (hg38). You'll see that it matches.
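One way to make that check explicit is to compare the dbSNP flanking string against the reference bases you pull out of each assembly. The following is only a minimal sketch: `parse_flank`, `reverse_complement` and `flank_matches` are hypothetical helper names, it handles single-base alleles only, and you have to supply the reference window yourself (for example, copied from the genome browser for the assembly you want to test).

```python
import re

# Translation table used for reverse-complementing DNA.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_flank(flank):
    """Split a dbSNP-style 'LEFT[A/B]RIGHT' string into (left, alleles, right)."""
    m = re.fullmatch(r"([ACGTN]*)\[([ACGTN/]+)\]([ACGTN]*)", flank.strip().upper())
    if m is None:
        raise ValueError("expected a string of the form LEFT[A/B]RIGHT")
    left, alleles, right = m.groups()
    return left, alleles.split("/"), right


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def flank_matches(flank, ref_window):
    """Return '+', '-' or None.

    ref_window must be the reference bases covering exactly the same span as
    the flank (left flank + 1 variant base + right flank). '+' means the
    window agrees with the flank as given, '-' means it agrees after
    reverse-complementing, None means this assembly/position does not match.
    """
    left, alleles, right = parse_flank(flank)
    ref = ref_window.upper()
    if len(ref) != len(left) + 1 + len(right):
        return None
    for strand, seq in (("+", ref), ("-", reverse_complement(ref))):
        if seq.startswith(left) and seq.endswith(right) and seq[len(left)] in alleles:
            return strand
    return None
```

If the window taken from one assembly returns `None` and the window from the other returns `'+'` or `'-'`, the coordinates you searched with belong to the second assembly.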

10.4 years ago
enricoferrero ▴ 910

First of all: thank you! I'm also using dbSNP build 141 at the moment, but I didn't realise it was the GRCh38/hg20 assembly until I randomly stumbled upon this question!

You can access dbSNP build 141 in the GRCh37/hg19 assembly here: http://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b141_GRCh37p13/

while the GRCh38/hg20 assembly is here: http://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b141_GRCh38/


I see, thanks a lot. And is there a way to query directly older builds of dbSNP based on hg19, or do I necessarily need to go through an intermediary mapping step between the 2 reference assemblies (e.g. using UCSC liftOver tool)?

Specifically, I have a list of potential variants described with hg19 positions and would like to check them against dbSNP.

Thanks.
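If you download a dbSNP VCF for the GRCh37 build, a lookup by hg19 position needs no liftOver step. Here is a rough sketch, assuming a standard tab-separated VCF (CHROM, POS, ID, REF, ALT as the first five columns); `lookup_positions` and `normalise_chrom` are hypothetical names, not part of any dbSNP tool.

```python
def normalise_chrom(chrom):
    """Drop a leading 'chr' so that 'chr20' and '20' compare equal."""
    return chrom[3:] if chrom.lower().startswith("chr") else chrom


def lookup_positions(vcf_lines, queries):
    """Find VCF records located at the queried (chrom, pos) pairs.

    vcf_lines: any iterable of VCF text lines (an open file works).
    queries:   iterable of (chrom, 1-based position) pairs, in the same
               assembly as the VCF.
    Returns a dict mapping (chrom, pos) to a list of (id, ref, alt) tuples.
    """
    wanted = {(normalise_chrom(c), int(p)) for c, p in queries}
    hits = {}
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # skip malformed or empty lines
        key = (normalise_chrom(fields[0]), int(fields[1]))
        if key in wanted:
            hits.setdefault(key, []).append((fields[2], fields[3], fields[4]))
    return hits
```

For a compressed file, something like `with gzip.open(path, "rt") as fh: lookup_positions(fh, my_positions)` should work, where `path` is wherever you saved the VCF. This scans the whole file once, which is fine for a one-off check; for repeated queries an indexed lookup would be the better choice.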

10.4 years ago
Bert Overduin ★ 3.7k

As you can read in the release notes of the latest dbSNP build, they have mapped their variants to both GRCh38 (hg20) and GRCh37 (hg19):

dbSNP has released human Build 141 based on the GRCh38 assembly (http://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.26/), as well as on the GRCh37.p13 assembly.

If you go, for example, to the dbSNP page for rs123, you can see that the "Integrated Maps" section displays the genomic position in both assemblies: 7:24926827 in GRCh38 and 7:24966446 in GRCh37.p13.

However, I don't think it's possible to search dbSNP with GRCh37-based positions ....

I would suggest using the Variant Effect Predictor (VEP), which is still GRCh37-based, to check whether there are already known variants at the genomic positions of your potential variants.

Hope this helps.

PS - The official names of the assemblies are GRCh37 and GRCh38, so please use these instead of hg19 and hg20.
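For a scripted check of a few positions, one option is the Ensembl REST service rather than the VEP web form. The sketch below is an assumption-laden illustration, not the VEP itself: it uses the `overlap/region` endpoint with `feature=variation` on the GRCh37 REST server (`grch37.rest.ensembl.org`), and `overlap_url` and `known_variants` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the GRCh37 archive of the Ensembl REST service.
GRCH37_SERVER = "https://grch37.rest.ensembl.org"


def overlap_url(chrom, start, end=None, server=GRCH37_SERVER):
    """Build an overlap/region URL asking for variants in chrom:start-end."""
    end = start if end is None else end
    chrom = chrom[3:] if chrom.lower().startswith("chr") else chrom
    return f"{server}/overlap/region/human/{chrom}:{start}-{end}?feature=variation"


def known_variants(chrom, start, end=None):
    """Fetch and decode the JSON reply (needs network access)."""
    request = urllib.request.Request(
        overlap_url(chrom, start, end),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `known_variants("7", 24966446)` queries the GRCh37 position given above for rs123; I have not reproduced the reply here, so check the fields of the returned records yourself.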


There is no hg20 assembly; you've mixed it up with hg38. https://www.ncbi.nlm.nih.gov/assembly/GCF_000001405.26/
