Searching for ClinVar entries based on SNP annotation
8.4 years ago
ceruleanivy ▴ 50

I currently have a table for each patient listing all of their variants from exome sequencing. One column contains the dbSNP identifier (rsID). I would like to know how I can automatically generate another column that stores the ClinVar information for every available common single nucleotide polymorphism.

I recently searched biomaRt for R but couldn't find any attribute that sends queries to ClinVar's database. Some time ago, I downloaded every variant in dbSNP for each gene individually, but that took far too long because both my patient list and their sequencing data were very large.

R SNP snp genome • 6.5k views
8.3 years ago
Newgene ▴ 370

The MyVariant.info API is very handy for this type of query. Given an rsID, you can get the ClinVar annotation with:

http://myvariant.info/v1/query?q=rs727503873&fields=clinvar

or more explicitly:

http://myvariant.info/v1/query?q=dbsnp.rsid:rs727503873&fields=clinvar

Batch queries are also supported via POST, or, even more conveniently, through the myvariant Python or R clients:

http://docs.myvariant.info/en/latest/doc/packages.html
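
For readers who would rather not install a client, the POST batch query mentioned above could be sketched with only the Python standard library. This is a minimal sketch, assuming the same `/v1/query` endpoint as the GET examples and the `q`, `scopes`, and `fields` form parameters; the function names `build_batch_request` and `fetch_batch` are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Same endpoint as the GET examples above.
MYVARIANT_QUERY_URL = "http://myvariant.info/v1/query"


def build_batch_request(rsids, fields="clinvar"):
    """Build (but do not send) a POST request for a batch of rsIDs."""
    body = urllib.parse.urlencode({
        "q": ",".join(rsids),        # comma-separated query terms
        "scopes": "dbsnp.rsid",      # match the terms against the rsID field
        "fields": fields,            # only return the requested annotation
    }).encode("ascii")
    return urllib.request.Request(
        MYVARIANT_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_batch(rsids, fields="clinvar"):
    """Send the request and decode the JSON reply (needs network access)."""
    with urllib.request.urlopen(build_batch_request(rsids, fields)) as resp:
        return json.load(resp)
```

Calling `fetch_batch(["rs727503873"])` should return one result entry per matched variant; very long rsID lists would probably need to be split into smaller batches.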

In your case, Python code might look like this:

import myvariant

# your_rsid_list is a list of rsID strings taken from your table, e.g. ['rs727503873']
mv = myvariant.MyVariantInfo()
res = mv.querymany(your_rsid_list, scopes='dbsnp.rsid', fields='clinvar')
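
To turn the result of `querymany` into the extra column asked about in the question, something like the following might work. It is only a sketch: it assumes each hit carries the original rsID under `query`, that misses are flagged with `notfound`, and that the ClinVar annotation keeps its per-condition records under `clinvar.rcv` (a single dict or a list) with a `clinical_significance` key. The helper name `clinvar_significance_by_rsid` is hypothetical.

```python
def clinvar_significance_by_rsid(results):
    """Map each queried rsID to a '; '-joined string of ClinVar significances.

    `results` is the list of dicts returned by querymany. The field layout
    (query / notfound / clinvar.rcv / clinical_significance) is an assumption
    about the MyVariant.info response and should be checked on real output.
    """
    summary = {}
    for hit in results:
        rsid = hit.get("query")
        if rsid is None:
            continue
        labels = summary.setdefault(rsid, [])
        if hit.get("notfound"):
            continue  # rsID unknown to the service: leave the cell empty
        rcv = hit.get("clinvar", {}).get("rcv", [])
        if isinstance(rcv, dict):
            rcv = [rcv]  # a single record is returned as a dict, not a list
        for record in rcv:
            label = record.get("clinical_significance")
            if label and label not in labels:
                labels.append(label)
    return {rsid: "; ".join(labels) for rsid, labels in summary.items()}
```

The returned dictionary can then be used to look up the value for each row of the variant table, with an empty string for rsIDs that have no ClinVar annotation.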

Additionally, you might find this "Access ClinVar Data from MyVariant.info Services" tutorial useful.


I second this; MyVariant.info is a great tool. Once you have the variation IDs from MyVariant, you can use them to fetch more data from Entrez itself. In Python, if you have Biopython installed, you can do:

from Bio import Entrez

Entrez.email = 'foo@bar.com' # your email here

# Supposing 1550 is your ClinVar Variation ID
handle = Entrez.efetch(db='clinvar', id='1550', rettype='variation')
response = handle.read()

print(response)

You will end up with the XML found on pages like this one: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=clinvar&id=1550&rettype=variation

It's ugly, unparsed XML, but it does provide some extra information that MyVariant.info sometimes doesn't include.
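
The unparsed XML can be walked with the standard library instead of being read by eye. The sketch below is generic on purpose: it does not assume any particular ClinVar schema, and the tag name passed in is whatever element you find in the actual response. The helper name `collect_text` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def collect_text(xml_text, tag):
    """Return the stripped text of every element named `tag`, in document order.

    `xml_text` may be the str or bytes obtained from handle.read().
    Elements without text are skipped.
    """
    root = ET.fromstring(xml_text)
    return [
        element.text.strip()
        for element in root.iter(tag)
        if element.text and element.text.strip()
    ]
```

For example, `collect_text(response, "Description")` would list the text of every `Description` element, if the response contains elements with that name; inspect the XML first to pick the tags you need.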

8.4 years ago
EagleEye 7.6k

How about using the files directly from NCBI?

http://www.ncbi.nlm.nih.gov/clinvar/

ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/


I would prefer a more automated, programmatic method that takes advantage of the latest database entries. If the files under /tab_delimited/archive/ are the ones you are referring to, then I will probably have to download, decompress, and merge every .gz file each time I want to update my database of ClinVar entries.
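
If one compressed tab-delimited file turns out to be enough, it may not need to be decompressed on disk at all, since Python can read gzip data as a stream. The following is a sketch under assumptions: it expects a header row containing columns named `RS# (dbSNP)` (the rs number without the `rs` prefix) and `ClinicalSignificance`, as in ClinVar's `variant_summary.txt.gz`; verify the column names against the file you actually download. The function name `significance_by_rsid` is hypothetical.

```python
import csv
import gzip


def significance_by_rsid(binary_stream, wanted_rsids):
    """Scan a gzipped tab-delimited file and collect significances per rsID.

    `binary_stream` is a file object opened in binary mode (or a path).
    The column names below are assumptions about the file layout.
    """
    # The file stores bare numbers, so drop the "rs" prefix before matching.
    wanted = {r[2:] if r.lower().startswith("rs") else r for r in wanted_rsids}
    found = {}
    with gzip.open(binary_stream, mode="rt", encoding="utf-8", newline="") as text:
        reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            rs_number = row.get("RS# (dbSNP)")
            if rs_number in wanted:
                found.setdefault("rs" + rs_number, set()).add(
                    row.get("ClinicalSignificance")
                )
    return found
```

A call such as `significance_by_rsid(open("variant_summary.txt.gz", "rb"), rsid_list)` would return a dictionary of sets, one per rsID that appears in the file, so an rsID listed on several rows is reported once with all of its distinct values.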


See this link for programmatic access to ClinVar.


Thanks for the post and the link.
