How Can I Query A Genetics Database Of SNPs (Preferably With R)?
11.1 years ago
Farrel ▴ 240

Starting with a few human single nucleotide polymorphisms (SNPs), how can I query a database of all known SNPs to generate a list (a data.table or CSV file, but preferably read straight into R) of the 1000 or so closest SNPs, along with whether each SNP is a tagSNP, what its minor allele frequency (MAF) is, and how many bases it is from the starting SNPs?

I would prefer to do this in R (although it does not have to be). Which database should I use? My only starting point would be a list of the starting SNPs (e.g. rs3091244, rs6311, etc.).

I am certain there is a nice simple Bioconductor package that could be my starting point. But what? Have you ever done it? I imagine it can be done in about 3 to 5 lines of code.

Shame on me. I originally posted this to Stack Overflow because I could not find a good place to ask bioinformatics questions on Stack Exchange. I had forgotten about the great divide when "Biostar is no longer affiliated with Stack Exchange." At Stack Overflow I was advised that it was not an appropriate place to post such a question. Sorry, Biostar, for having forgotten that you are a free and independent site.

I subsequently received the suggestion that I try the Broad Institute. I was delighted with what I saw. I was going to use it and download the csv to read in. Any comments?

  • Should I try anything else?
  • Is there an R package or few lines that will allow me to interact with the Broad directly from within R?
r bioconductor snp • 3.5k views
11.1 years ago
Fabio Marroni ★ 3.0k

A former collaborator of mine developed a nice R package that could help you. It's called NCBI2R and is aimed at annotating SNPs, which is, I guess, what you are after.

Here is the link: http://ncbi2r.files.wordpress.com/2010/09/ncbi2r-tutorial_1_3_13.pdf


I got this message.

Package ‘NCBI2R’ was removed from the CRAN repository. Formerly available versions can be obtained from the archive. Archived on 2015-03-13 as misuse of \donttest was not corrected.

8.4 years ago
Farrel ▴ 240

I found a current package that can do this for me: rsnps, "Get SNP (Single-Nucleotide Polymorphism) Data on the Web". https://cran.r-project.org/web/packages/rsnps/index.html

Usage: NCBI_snp_query(SNPs = snps), where snps is a character vector of rs numbers.
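
For anyone landing here later, a minimal sketch of that call using the two SNPs from the question. One assumption to flag: as far as I know, newer rsnps releases export the function as ncbi_snp_query(), and the NCBI_snp_query() spelling above is from older releases, so check which one your installed version provides. The call needs network access to NCBI, so it is guarded to run only in an interactive session.

```r
# Requires the rsnps package from CRAN: install.packages("rsnps")

# Starting SNPs taken from the question.
snps <- c("rs3091244", "rs6311")

# Assumption: the installed rsnps exports ncbi_snp_query(); on an older
# release, swap in NCBI_snp_query(SNPs = snps) as written above.
if (interactive()) {
  res <- rsnps::ncbi_snp_query(snps)

  # Inspect what came back rather than assuming column names, since the
  # returned columns have changed between package versions.
  str(res)
}
```

Note that this annotates the SNPs you pass in. It does not by itself return the neighbouring SNPs or their distances from the starting SNPs, so that part of the question still needs a separate lookup.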


Have you tried the Variant Effect Predictor (VEP) at all? You can call it from R via the Ensembl REST API (check the Variation endpoints). The VEP accepts variant IDs such as rs123, and the annotation is done against dbSNP 146 in Ensembl 84 (in the next release, Ensembl will have the dbSNP 147 data for human).
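
A rough sketch of what that could look like from R, assuming the httr and jsonlite packages are installed and that the public server at https://rest.ensembl.org is the one you want. The helper names vep_id_url() and fetch_vep() are made up for this example; only the GET vep/:species/id/:id endpoint comes from the REST API itself.

```r
# Requires httr and jsonlite from CRAN.
server <- "https://rest.ensembl.org"

# Build the URL for the VEP "id" endpoint (hypothetical helper name).
vep_id_url <- function(rsid, species = "human") {
  paste0(server, "/vep/", species, "/id/", rsid)
}

# Fetch the VEP annotation for one variant ID and parse the JSON reply
# (hypothetical helper name). Stops with an error on a non-2xx response.
fetch_vep <- function(rsid) {
  resp <- httr::GET(vep_id_url(rsid), httr::content_type("application/json"))
  httr::stop_for_status(resp)
  jsonlite::fromJSON(httr::content(resp, as = "text", encoding = "UTF-8"))
}

# Needs network access, so only run the query interactively.
if (interactive()) {
  vep <- fetch_vep("rs6311")
  str(vep, max.level = 1)  # look at the top-level fields before digging in
}
```

The result reflects whichever Ensembl release the server is currently running, so the dbSNP version will differ from the one mentioned above.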
