Question

How Can I Query A Genetics Database Of Snps (Preferably With R)?

1

11.1 years ago

Farrel ▴ 240

Starting with a few human single nucleotide polymorphisms (SNPs), how can I query a database of all known SNPs to generate a list of the 1000 or so closest SNPs? For each one I would like to know whether it is a tagSNP, what its minor allele frequency (MAF) is, and how many bases it lies from the starting SNP. The output could be a data.table or a CSV file, but preferably it would be read straight into R.

I would prefer to do this in R (although it does not have to be). Which database should I use? My only starting point would be a list of the starting SNPs (e.g. rs3091244, rs6311, etc.).

I am certain there is a nice simple Bioconductor package that could be my starting point. But what? Have you ever done it? I imagine it can be done in about 3 to 5 lines of code.
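The distance and "closest N" part of the question does not depend on any particular database. Below is a minimal base R sketch, assuming a table of candidate SNPs has already been obtained with columns named `rsid`, `chromosome`, `bp` and `maf`. The function name `nearest_snps`, the column names and the toy values are all hypothetical and used only for illustration; tagSNP status is not covered.

```r
# Hypothetical helper: rank candidate SNPs by distance from one starting SNP.
# `candidates` is assumed to be a data.frame with columns rsid, chromosome, bp, maf.
nearest_snps <- function(candidates, start_chr, start_bp, n = 1000) {
  # Distances are only meaningful between SNPs on the same chromosome.
  on_chr <- as.character(candidates$chromosome) == as.character(start_chr)
  same_chr <- candidates[on_chr, , drop = FALSE]

  # Signed offset in bases (negative = lower coordinate) and its absolute value.
  same_chr$offset_bp <- same_chr$bp - start_bp
  same_chr$distance_bp <- abs(same_chr$offset_bp)

  # Sort by distance and keep the n closest rows.
  ranked <- same_chr[order(same_chr$distance_bp), , drop = FALSE]
  head(ranked, n)
}

# Made-up identifiers and positions, purely for illustration (not real dbSNP data).
toy <- data.frame(
  rsid = c("rsA", "rsB", "rsC", "rsD"),
  chromosome = c("1", "1", "2", "1"),
  bp = c(1000, 1500, 1200, 400),
  maf = c(0.10, 0.25, 0.05, 0.40),
  stringsAsFactors = FALSE
)

closest <- nearest_snps(toy, start_chr = "1", start_bp = 1100, n = 2)
print(closest)

# The result is an ordinary data.frame, so it could be saved with, for example:
# write.csv(closest, "closest_snps.csv", row.names = FALSE)
```

With real data, the `candidates` table would come from whichever SNP source is chosen, and `n` would be set to 1000.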

Shame on me. I originally posted this to Stack Overflow because I could not find a good place to ask bioinformatics questions on Stack Exchange. I had forgotten about the great divide when "Biostar is no longer affiliated with Stack Exchange." At Stack Overflow I was advised that it was not an appropriate place to post such a question. Sorry, Biostar, for having forgotten that you are a free and independent site.

I subsequently received the suggestion that I try the Broad Institute. I was delighted with what I saw. I was going to use it and download the csv to read in. Any comments?

Should I try anything else?
Is there an R package or few lines that will allow me to interact with the Broad directly from within R?

r bioconductor snp • 3.5k views

ADD COMMENT • link 8.4 years ago by Farrel ▴ 240

score 2 · Answer 1 · 2013-10-11

2

11.1 years ago

Fabio Marroni ★ 3.0k

A former collaborator of mine developed a nice R package that could help you. It's called NCBI2R, and it is aimed at annotating SNPs, which is, I guess, what you are trying to do.

Here is the link: http://ncbi2r.files.wordpress.com/2010/09/ncbi2r-tutorial_1_3_13.pdf

0

I got this message.

Package ‘NCBI2R’ was removed from the CRAN repository. Formerly available versions can be obtained from the archive. Archived on 2015-03-13 as misuse of \donttest was not corrected.

ADD REPLY • link 8.4 years ago by Farrel ▴ 240

score 1 · Answer 2 · 2016-07-18

1

8.4 years ago

Farrel ▴ 240

I found a current package that can do this for me: rsnps, "Get SNP (Single-Nucleotide Polymorphism) Data on the Web": https://cran.r-project.org/web/packages/rsnps/index.html

The call is NCBI_snp_query(SNPs = snps), where snps is a character vector of rs numbers.
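A slightly fuller sketch of that call follows, assuming the rsnps package is installed. Recent rsnps releases spell the function in lower case as `ncbi_snp_query()`, which is the form used here; with an older release the upper-case name from the post would be needed instead. The wrapper `lookup_snps` and its `query_fun` argument are hypothetical conveniences, added so the ID cleaning can be tried without network access.

```r
# Assumes the rsnps package is installed. ncbi_snp_query() is the lower-case
# name used by recent releases; older ones exposed NCBI_snp_query().
# `lookup_snps` and `query_fun` are hypothetical names for this sketch.
lookup_snps <- function(snps, query_fun = rsnps::ncbi_snp_query) {
  stopifnot(is.character(snps), length(snps) > 0)

  # Drop stray whitespace and keep only identifiers that look like rs numbers.
  snps <- trimws(snps)
  valid <- grepl("^rs[0-9]+$", snps)
  if (!all(valid)) {
    warning("Skipping malformed IDs: ", paste(snps[!valid], collapse = ", "))
  }

  # Query each distinct rs number once.
  query_fun(unique(snps[valid]))
}

# Not run here because it needs network access:
# annotated <- lookup_snps(c("rs3091244", "rs6311"))
# str(annotated)
```

Because the default for `query_fun` is only evaluated when it is actually used, a stand-in function can be passed to exercise the wrapper offline.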

0

Have you tried the Variant Effect Predictor (aka VEP) at all? It can be called from R via the Ensembl REST API (check the Variation endpoints). The VEP accepts variant IDs such as rs123, and the annotation is done against dbSNP 146 in Ensembl 84 (although in the next release, Ensembl will have the dbSNP 147 data for human).
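As a rough illustration of the REST route, the sketch below builds a request for the VEP "by ID" endpoint and parses the JSON reply. It assumes the httr and jsonlite packages are installed and that the public server at https://rest.ensembl.org is reachable. The helper names `build_vep_url` and `fetch_vep` are hypothetical, and the structure of the returned object is not shown because it depends on the Ensembl release.

```r
# Assumes the httr and jsonlite packages are installed.
# `build_vep_url` and `fetch_vep` are hypothetical helper names for this sketch.
build_vep_url <- function(rsid, species = "human",
                          server = "https://rest.ensembl.org") {
  # Accept a single identifier of the form rs<digits>.
  stopifnot(is.character(rsid), length(rsid) == 1, grepl("^rs[0-9]+$", rsid))
  paste0(server, "/vep/", species, "/id/", rsid)
}

fetch_vep <- function(rsid) {
  # Ask the server for JSON and fail loudly on an HTTP error status.
  resp <- httr::GET(build_vep_url(rsid), httr::content_type("application/json"))
  httr::stop_for_status(resp)
  jsonlite::fromJSON(httr::content(resp, as = "text", encoding = "UTF-8"))
}

# Building the URL needs no network access:
print(build_vep_url("rs6311"))

# Not run here because it needs network access:
# vep <- fetch_vep("rs6311")
# str(vep, max.level = 1)
```

Keeping URL construction separate from the request makes it easy to check the address before sending anything, and to loop over several rs numbers with `lapply()`.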

ADD REPLY • link 8.4 years ago by Denise CS ★ 5.2k