I have a bunch of rsIDs and need to find the corresponding 1000 Genomes MAF as provided in dbSNP. I have a set of IDs as input:
rs1751034 rs1799852 rs1799983 rs1800460 rs1801030
I am looking for a server / service / database that can give me output like
rs1751034 NA
rs1799852 0.132
rs1799983 0.203
rs1800460 0.020
rs1801030 0.090
Google searches revealed several raw data-munging options:
I have tried several options, including the following two suggestions: http://www.ncbi.nlm.nih.gov/books/NBK44431/#Search.Finding_Minor_Allele_Frequencies I downloaded AlleleFreqBySsPop.bcp, but am not sure whether this table provides the 1000 Genomes MAF.
I also tried this option: http://seqanswers.com/forums/showthread.php?t=4910 I downloaded the data and checked the MAFs, but the allele frequencies are not concordant with the 1000 Genomes MAF in dbSNP.
I only need this information for a couple of SNPs, so I am looking for a simple search/retrieve option. Do you know how I can get this information?
Thanks in advance !
Something worth noting: while AF values computed from AC and AN are useful, it's better to use the ones the project provides where possible, as these will have used additional haplotype and LD information to calculate the AF and so give better estimates for low-frequency SNPs. We provide files with AF based on just the ASN, AFR or EUR individuals in the supporting directory: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20100804/supporting/
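For reference, here is a minimal Python sketch of the count-based estimate mentioned above. In the VCF INFO column, AC is the alternate allele count and AN is the total number of called alleles, so the plain estimate is AC / AN. The function names are made up for illustration, and the MAF helper assumes a biallelic site; it does not reproduce the haplotype/LD-aware AF the project publishes.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'AC=24;AN=120;DB' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value  # flag fields without '=' map to ''
    return fields


def allele_freqs_from_counts(info):
    """Return AC/AN for each alternate allele listed in the INFO string."""
    fields = parse_info(info)
    an = int(fields["AN"])
    counts = [int(c) for c in fields["AC"].split(",")]
    return [c / an for c in counts]


def biallelic_maf(info):
    """Minor allele frequency, assuming exactly one alternate allele."""
    (alt_freq,) = allele_freqs_from_counts(info)
    return min(alt_freq, 1.0 - alt_freq)


if __name__ == "__main__":
    # Synthetic INFO string, not taken from any real 1000 Genomes record.
    print(biallelic_maf("AC=90;AN=120"))
```

If the record also carries a project-supplied AF tag, prefer that value over the one computed here.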
See the VCF spec (google vcftools).
Thanks Laura. Can you please tell me what the AC and AN you mention refer to?
I guess your question differs slightly and you need to pull specific positions out of 1000 Genomes. Perhaps a Perl wrapper around tabix?
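A rough sketch of such a wrapper, written in Python rather than Perl. It assumes tabix is on your PATH, that you already know the chromosome and position of each SNP (tabix queries by region, not by rsID), and that the INFO column carries an AF tag. The function names, file name and region string are placeholders, not real values.

```python
import subprocess


def build_tabix_command(vcf_path, regions):
    """Build the argument list: tabix <file> <region> [<region> ...]."""
    return ["tabix", vcf_path] + list(regions)


def parse_vcf_records(text):
    """Extract (rsid, chrom, pos, ref, alt, af) from tab-separated VCF lines."""
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.split("\t")
        chrom, pos, rsid, ref, alt = cols[:5]
        af = None  # stays None if the INFO column has no AF tag
        for item in cols[7].split(";"):
            if item.startswith("AF="):
                af = item[3:]
        records.append((rsid, chrom, int(pos), ref, alt, af))
    return records


def fetch_records(vcf_path, regions):
    """Run tabix on a bgzipped, indexed VCF and parse what it prints."""
    result = subprocess.run(
        build_tabix_command(vcf_path, regions),
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_vcf_records(result.stdout)


if __name__ == "__main__":
    # Placeholder file name and region; substitute your own.
    for record in fetch_records("genotypes.vcf.gz", ["1:1000-1000"]):
        print(*record, sep="\t")
```

Keeping the parsing separate from the tabix call makes it easy to check the parser against a few saved lines before querying the full file.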
Thanks Stephen, that's a perfect solution for me. The MAFs are concordant with dbSNP, and I am also getting allele frequencies for the different alleles.