Starting with a few human single nucleotide polymorphisms (SNPs), how can I query a database of all known SNPs to generate a list (a data.table or CSV file, but preferably read straight into R) of the 1000 or so closest SNPs? For each nearby SNP I would like to know whether it is a tagSNP, what its minor allele frequency (MAF) is, and how many bases it lies from the starting SNP.
I would prefer to do this in R (although it does not have to be). Which database should I use? My only starting point would be a list of the starting SNPs (e.g. rs3091244, rs6311, etc.).
I am certain there is a nice simple Bioconductor package that could be my starting point. But what? Have you ever done it? I imagine it can be done in about 3 to 5 lines of code.
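To make the desired output concrete, the distance-and-ranking step can be sketched as below. This is only an illustration under stated assumptions: it is written in Python rather than R, it does not query any real database, and the table `TOY_SNPS`, its SNP IDs, positions and MAF values, and the helper `nearest_snps` are all hypothetical. TagSNP status is left out because it would have to come from whichever database is eventually used.

```python
import csv
import io

# Hypothetical toy table: IDs, positions and MAFs are made up for
# illustration and are NOT taken from any real SNP database.
TOY_SNPS = """id,chrom,pos,maf
snp_start,1,1000,0.30
snp_a,1,900,0.12
snp_b,1,1250,0.45
snp_c,1,1010,0.02
snp_d,2,1005,0.20
"""


def nearest_snps(rows, start_id, n):
    """Return up to n SNPs on the same chromosome as start_id, closest first."""
    start = next(r for r in rows if r["id"] == start_id)
    start_pos = int(start["pos"])
    out = []
    for r in rows:
        # Skip the starting SNP itself and anything on another chromosome.
        if r["id"] == start_id or r["chrom"] != start["chrom"]:
            continue
        out.append({
            "id": r["id"],
            "maf": float(r["maf"]),
            "distance_bp": abs(int(r["pos"]) - start_pos),
        })
    # Rank by absolute distance in base pairs and keep the n closest.
    out.sort(key=lambda r: r["distance_bp"])
    return out[:n]


rows = list(csv.DictReader(io.StringIO(TOY_SNPS)))
result = nearest_snps(rows, "snp_start", 2)
for r in result:
    print(r["id"], r["maf"], r["distance_bp"])
```

In a real workflow the toy table would be replaced by a download from the chosen database, and `n` would be set to 1000 or so.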
Shame on me. I originally posted this to Stack Overflow because I could not find a good place to ask bioinformatics questions on Stack Exchange. I had forgotten about the great divide when "Biostar is no longer affiliated with Stack Exchange." At Stack Overflow I was advised that it was not an appropriate place to post such a question. Sorry, Biostar, for having forgotten that you are a free and independent site.
I subsequently received the suggestion that I try the Broad Institute. I was delighted with what I saw. I was going to use it and download the CSV to read in. Any comments?
- Should I try anything else?
- Is there an R package, or a few lines of code, that will allow me to interact with the Broad directly from within R?
I got this message.
Package ‘NCBI2R’ was removed from the CRAN repository. Formerly available versions can be obtained from the archive. Archived on 2015-03-13 as misuse of \donttest was not corrected.