19 months ago
Barista
Hi all! :)
I have a list of SNP IDs for which I need the corresponding MAF (minor allele frequency) values.
I tried to do this in R using biomaRt, but it does not seem to be possible; correct me if I am wrong.
The only way I have managed to do it is in Python using the Ensembl API, but since the list contains over half a million SNPs, it takes a very long time and I keep getting connection errors, so I have never been able to get it to finish even once :(
Please help me with some tips. I would highly appreciate it, because this is my first time working on such a task.
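For anyone who wants to keep going with the API route, one common way to cope with long lists and dropped connections is to send the IDs in batches and retry a failed batch with a growing pause. The sketch below is only an illustration under assumptions: it assumes the Ensembl REST batch endpoint `POST /variation/homo_sapiens` with a JSON body of the form `{"ids": [...]}`, a response keyed by rsID that carries a `MAF` field, and a limit of 200 IDs per request (please check the current REST documentation). The function names `chunks`, `post_batch` and `fetch_mafs` are made up for this example.

```python
import json
import time
import urllib.request

# Assumption: Ensembl REST endpoint for batch variant lookup by rsID.
ENSEMBL_URL = "https://rest.ensembl.org/variation/homo_sapiens"
BATCH_SIZE = 200  # assumed per-request limit; verify against the REST docs


def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def post_batch(ids):
    """POST one batch of rsIDs and return the decoded JSON response."""
    req = urllib.request.Request(
        ENSEMBL_URL,
        data=json.dumps({"ids": ids}).encode(),
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


def fetch_mafs(rsids, fetch=post_batch, retries=5, pause=1.0):
    """Return {rsid: MAF or None}, retrying each failed batch with backoff."""
    result = {}
    for batch in chunks(list(rsids), BATCH_SIZE):
        for attempt in range(retries):
            try:
                data = fetch(batch)
                break
            except OSError:  # URLError and socket timeouts subclass OSError
                if attempt == retries - 1:
                    raise
                time.sleep(pause * 2 ** attempt)  # exponential backoff
        for rsid in batch:
            # IDs missing from the response are recorded as None
            result[rsid] = data.get(rsid, {}).get("MAF")
    return result
```

Calling `fetch_mafs(["rs56116432"])` would issue one request. For half a million IDs it is probably worth writing each batch's results to disk as you go, so that a crash does not lose the work already done.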
Download a VCF containing the frequencies (dbSNP, etc.) and use bcftools to extract the data for a set of regions.
In addition to Pierre's answer, the Ensembl Variant Effect Predictor (VEP) might be useful here, as you can retrieve allele frequencies (as well as lots of other data) for a custom list of variants:
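If you only have rsIDs rather than regions, a plain-Python alternative to bcftools is to stream through the downloaded VCF once and keep the records whose ID is in your list. This is a rough sketch, not a replacement for bcftools: it assumes the allele frequency is stored under the INFO key `AF` (other resources may use a different key, so `af_key` is a parameter), it skips multiallelic records, and it takes the MAF of a biallelic site as `min(AF, 1 - AF)`. The function name `maf_from_vcf` is hypothetical.

```python
import gzip


def maf_from_vcf(vcf_path, wanted_ids, af_key="AF"):
    """Stream a VCF (plain, .gz or .bgz) and return {rsid: MAF}."""
    wanted = set(wanted_ids)
    compressed = str(vcf_path).endswith((".gz", ".bgz"))
    opener = gzip.open if compressed else open
    mafs = {}
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # header lines
            fields = line.rstrip("\n").split("\t", 8)
            if len(fields) < 8:
                continue  # malformed record
            ids, alt, info = fields[2].split(";"), fields[4], fields[7]
            hits = [i for i in ids if i in wanted]
            if not hits or "," in alt:
                continue  # not requested, or multiallelic (skipped here)
            for entry in info.split(";"):
                key, _, value = entry.partition("=")
                if key == af_key and value not in ("", "."):
                    af = float(value)
                    for rsid in hits:
                        # minor allele frequency of a biallelic site
                        mafs[rsid] = min(af, 1.0 - af)
                    break
    return mafs
```

Keeping the wanted IDs in a set makes each lookup cheap, so one pass over the file handles the whole list without any network calls.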
https://www.ensembl.org/info/docs/tools/vep/index.html
Where can I find such a VCF file? I am currently trying to use the dbSNP API in Python, but with no success so far.
gnomAD; see the download section: https://gnomad.broadinstitute.org/downloads
Does it really take so long to download one file? I am getting an estimate of ..teen hours for one chromosome file (and I want several). Am I doing something wrong, or is there a faster way? I would highly appreciate your help, thanks in advance!