I have built a set of functions on top of eutils that take the rsID of a given SNP as a character string and look up its clinical significance in ClinVar and dbSNP. When I run any of these functions with apply() over a column of >1000 whole-exome sequencing entries, I usually get a connection timeout error, whereas calling the same function on a single rsID works every time. For example:
> clinVARlookup("rs4340")
[1] "Pathogenic"
> clinVARdisease("rs4340")
[1] "Microvascular complications of diabetes 3, Ischemic stroke, susceptibility to, Myocardial infarction, ANGIOTENSIN I-CONVERTING ENZYME INSERTION/DELETION POLYMORPHISM, Stroke, hemorrhagic, susceptibility to, Severe acute respiratory syndrome, progression of, Susceptibility to progression to renal failure in IgA nephropathy"
Realistically speaking, what are the alternatives? I am currently looking into indexing methods based on downloaded databases, but I would prefer not to keep updating >1 GB databases whenever I want to produce genetic counselling tables.
Which library are you using? reutils?
Yes! Any suggestions welcome.
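One option that avoids local databases is to keep the per-rsID functions but throttle and retry the calls, since NCBI E-utilities limits clients to roughly 3 requests per second without an API key (10 with one), and an unthrottled apply() over >1000 rows will likely exceed that. Below is a minimal sketch, not a definitive fix: `lookup_with_retry` and `lookup_all` are hypothetical helper names, the `pause` and `max_tries` defaults are guesses, and it assumes the lookup function returns a single string as clinVARlookup() does above.

```r
# Hypothetical helper: call f(x), pausing between requests and retrying
# with exponential backoff when f throws an error (e.g. a timeout).
lookup_with_retry <- function(f, x, max_tries = 5, pause = 0.4) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(x), error = function(e) e)
    if (!inherits(result, "error")) {
      Sys.sleep(pause)                 # assumed pause: stays under ~3 requests/second
      return(as.character(result)[1])  # force a length-1 character result
    }
    Sys.sleep(pause * 2^attempt)       # wait longer after each failed attempt
  }
  NA_character_                        # give up on this ID after max_tries failures
}

# Hypothetical helper: query each distinct rsID once, then map the
# results back onto the original (possibly repeated) column.
lookup_all <- function(f, rsids, ...) {
  unique_ids <- unique(rsids)
  hits <- vapply(unique_ids,
                 function(id) lookup_with_retry(f, id, ...),
                 character(1))
  unname(hits[match(rsids, unique_ids)])
}
```

It would be called as `lookup_all(clinVARlookup, rsids)`, where `rsids` is the character column of rsIDs. Querying each distinct rsID only once also reduces the number of requests when the same SNP appears in many rows; IDs that still fail after all retries come back as NA so they can be re-run later.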