I currently have a table for every patient listing all of their variants from exome sequencing. One column contains the dbSNP identifier (rs ID). I would like to know how I can automatically generate another column that stores the ClinVar information for every available common single nucleotide polymorphism.
I recently searched biomaRt for R but couldn't find any attribute that queries ClinVar's database. Some time ago, I used to download every variant in dbSNP for each gene individually, but that took far too long given how many patients I had and how much sequencing data each one came with.
I second this, MyVariant.info is a great tool. Once you have the variation IDs from MyVariant, you can use them to fetch more data from Entrez itself. In Python, if you have biopython installed, you can request the record through Bio.Entrez's efetch. You will end up with the XML found in pages like this one: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=clinvar&id=1550&rettype=variation
It's ugly, unparsed XML, but it does provide some extra info that MyVariant.info sometimes doesn't include.
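The snippet that originally went with this answer is not shown here, so the following is only a minimal sketch of what such a fetch could look like, not necessarily the exact code that was posted. It assumes biopython is installed and reuses the db=clinvar and rettype=variation parameters from the URL above. The helper names efetch_url and fetch_clinvar_xml are made up for illustration, and the e-mail address is a placeholder you would replace with your own.

```python
from urllib.parse import urlencode

# Base address of the efetch endpoint, taken from the URL quoted in the post.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(variation_id, rettype="variation"):
    """Build the same kind of efetch URL as the one linked in the post.

    Hypothetical helper: handy for checking a record in the browser.
    """
    query = urlencode({"db": "clinvar", "id": variation_id, "rettype": rettype})
    return f"{EUTILS_EFETCH}?{query}"


def fetch_clinvar_xml(variation_id, email, rettype="variation"):
    """Fetch the raw ClinVar XML for one variation ID (needs network access).

    Hypothetical helper wrapping Bio.Entrez.efetch; the rettype value is an
    assumption copied from the URL in the post.
    """
    # Imported here so that efetch_url() still works without biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks callers to identify themselves
    handle = Entrez.efetch(db="clinvar", id=str(variation_id), rettype=rettype)
    try:
        data = handle.read()
    finally:
        handle.close()
    # Depending on the biopython version the handle may yield bytes or str.
    return data.decode("utf-8") if isinstance(data, bytes) else data
```

Calling fetch_clinvar_xml(1550, email="you@example.org") should return the same XML as the page linked above, as one string. From there you can pull out whichever fields you need and add them to your table as a new column.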