Hi all,
I am trying to get the alleles and frequencies of some SNPs (from across the genome) for the GRCh37 assembly (build 37, positive strand). I thought the easiest way would be to download the frequency data from HapMap and then look up my SNPs, but they only have data up to build 36. I also tried sending a batch query to NCBI, but they don't support files as large as mine (I would have to cut it into chunks), and they return far more information than I need (genotypes for all submitted data, all existing populations, etc.). I'm only interested in CEU, and the frequencies from HapMap are more than good enough for my purposes. I'm thinking there must be an easier way to do it than the batch query. All ideas are welcome!
Thanks!
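For reference, the "download the frequency file and look up my SNPs" approach, plus the chunking needed for a batch query, could be sketched roughly as below. This is only a sketch under assumptions: it assumes a whitespace-delimited frequency file with a header row whose SNP ID column is named `rs#`, and the helper names `filter_frequencies` and `chunks` are made up for illustration.

```python
def filter_frequencies(lines, wanted_ids, id_column="rs#"):
    """Yield rows (as dicts keyed by header name) whose SNP ID is in wanted_ids.

    Assumes the first line is a whitespace-delimited header and that the
    SNP ID column is called `id_column` (an assumption, check your file).
    """
    lines = iter(lines)
    header = next(lines).split()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        row = dict(zip(header, fields))
        if row.get(id_column) in wanted_ids:
            yield row


def chunks(items, size):
    """Split a list of SNP IDs into pieces small enough for a batch query."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

With a real file this would be called as `filter_frequencies(open(path), set(my_rs_ids))`, where `path` points at the CEU frequency file. Note that it does nothing about the build: positions in the file stay on whatever assembly the file was released against.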
Emma, Biomart has all the data that you need (i.e. SNP information mapped to GRCh37), plus an archive of past mappings. You may have landed on one of these archives by mistake, but if you go to http://www.biomart.org/, select MartView, choose the database "Ensembl Variation 59", and choose the dataset "Homo Sapiens Variation (dbSNP131)", you will be working with up-to-date information.
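If clicking through MartView for a long SNP list gets tedious, the same selection can in principle be expressed as a BioMart XML query. The sketch below only builds the query text. The dataset name `hsapiens_snp`, the filter name `snp_filter`, and the attribute names are assumptions and not taken from this thread, so check them against the XML that MartView generates for your own selection. The function name `build_biomart_query` is hypothetical.

```python
from xml.sax.saxutils import quoteattr


def build_biomart_query(
    rs_ids,
    dataset="hsapiens_snp",        # assumed internal dataset name
    filter_name="snp_filter",      # assumed filter name for rs IDs
    attributes=("refsnp_id", "chr_name", "chrom_start", "allele"),  # assumed
):
    """Return a BioMart XML query string selecting the given rs IDs."""
    attr_xml = "".join(
        f"<Attribute name={quoteattr(name)}/>" for name in attributes
    )
    # BioMart accepts several filter values as one comma-separated string.
    value = ",".join(rs_ids)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<!DOCTYPE Query>"
        '<Query virtualSchemaName="default" formatter="TSV" header="1" '
        'uniqueRows="1" datasetConfigVersion="0.6">'
        f'<Dataset name={quoteattr(dataset)} interface="default">'
        f"<Filter name={quoteattr(filter_name)} value={quoteattr(value)}/>"
        f"{attr_xml}"
        "</Dataset></Query>"
    )
```

The resulting string would be sent as the `query` parameter to the mart's martservice URL. Population frequency attributes (e.g. for CEU) are left out on purpose because their names are not known here; they would be added to `attributes` once looked up in MartView.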
Thanks Khader for the Biomart intro.
Thanks, this is a good link to keep in mind for future use. But for now I'm afraid it has the same problem as downloading directly from the HapMap FTP, i.e. it only has release 27 data, not the build that I need.
What Jorge said!