8.0 years ago
alessandrotestori7
Hi. I have a large number of SNPs and I want to get their minor allele frequencies (MAFs). Is there a file I can download with this information? I have already tried SNiPA, HaploReg and dbSNP, but I cannot find all of my SNPs because some of them have aliases. Please let me know. Thanks, Alessandro.
Does this previous post help? You may need to update the builds as this post is a few years old, but the process is the same.
Unfortunately some SNPs are missing in UCSC or dbSNP; the SNPs in my list have probably been merged into other SNPs under a different ID (i.e. they have been "renamed").
Are you just looking for something that will output the SNPs whose allele frequency (AF) is above a specific value? I wrote a parser that iterates through a VCF file and outputs the AFs, which can then be sorted.
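For readers who want to try this approach, here is a minimal sketch of such a parser. It is not the parser mentioned above. It assumes the VCF carries an `AF` key in the INFO column (many population VCFs do, but not all), and the function name, the variant IDs and the frequencies in the example are invented for illustration.

```python
import io


def iter_allele_frequencies(lines, min_af=0.0):
    """Yield (variant_id, alt_allele, af) for records whose INFO/AF >= min_af.

    Assumes a tab-separated VCF with an 'AF=' entry in the INFO column;
    records without one are skipped.
    """
    for line in lines:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:  # not a complete VCF record
            continue
        variant_id, alts, info = fields[2], fields[4].split(","), fields[7]
        af_value = None
        for entry in info.split(";"):
            if entry.startswith("AF="):
                af_value = entry[3:]
                break
        if af_value is None:
            continue
        # Multi-allelic sites list one AF per ALT allele, comma-separated.
        for alt, af_text in zip(alts, af_value.split(",")):
            try:
                af = float(af_text)
            except ValueError:
                continue
            if af >= min_af:
                yield variant_id, alt, af


# Invented example data: placeholder IDs and frequencies, not real variants.
example_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t1000\trsX1\tA\tG\t.\t.\tAF=0.12\n"
    "1\t2000\trsX2\tC\tT,G\t.\t.\tAF=0.40,0.01\n"
    "1\t3000\trsX3\tG\tA\t.\t.\tDP=10\n"
)

# Keep alleles with AF >= 0.05 and order them from most to least frequent.
records = sorted(
    iter_allele_frequencies(example_vcf, min_af=0.05),
    key=lambda record: record[2],
    reverse=True,
)
for variant_id, alt, af in records:
    print(variant_id, alt, af)
```

Note that `AF` is the frequency of the ALT allele, not necessarily the minor allele; for a biallelic site the MAF is `min(af, 1 - af)`.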
My problem is that I cannot find a database which I can download as a text file with all the SNPs I need and information about allele frequency: unfortunately some of my SNPs are probably aliases.
Then what is the format of your SNPs? What notation do they use? Which identifiers?
Just a column with rs identifiers:
rs10503665
rs8130815
rs4822193
....
Around 25,000 rs identifiers
The SNP IDs you have ('rs\d+') are dbSNP accession IDs. Do you know which release of dbSNP (132, 135, ...) your collection of SNPs is derived from? What percentage of your SNPs are you failing to find a MAF for?
Unfortunately I don't know the dbSNP release. I fail to find around 2% of my SNPs. I need a database where I can find all SNPs and their aliases, ideally one I can download locally.
This makes sense to me (others may disagree). A small number of SNPs will fail to match for the reasons you have already mentioned: dbSNP IDs get retracted or merged. If you knew the base-pair coordinates or the dbSNP release this would be easier. Because you do not, you could: i) try to get this info, ii) try to match the 2% that fail against previous releases of dbSNP, iii) accept that 2% 'got away' and move on.
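One way to approach the merged IDs is to follow a merge-history table from each old rs ID to its current one. The sketch below assumes you have already built an old-to-current mapping yourself (dbSNP has distributed merge history, for example as an RsMergeArch table, but the file name and column layout depend on the release and should be checked). The function name and all IDs in the example are hypothetical.

```python
def resolve_rs_ids(query_ids, known_ids, merged_into):
    """Map each query rs ID to an ID present in known_ids.

    known_ids:   set of IDs found in the frequency table you downloaded.
    merged_into: dict mapping a retired ID to the ID it was merged into
                 (assumed to be built from a merge-history file).
    Returns (resolved, unresolved): a dict query_id -> current_id and a
    list of query IDs that could not be matched.
    """
    resolved = {}
    unresolved = []
    for rs_id in query_ids:
        current = rs_id
        seen = set()  # guards against cycles in a malformed merge table
        while (
            current not in known_ids
            and current in merged_into
            and current not in seen
        ):
            seen.add(current)
            current = merged_into[current]  # follow one merge step
        if current in known_ids:
            resolved[rs_id] = current
        else:
            unresolved.append(rs_id)
    return resolved, unresolved


# Hypothetical IDs for illustration only.
known_ids = {"rs100", "rs200"}
merged_into = {
    "rs300": "rs250",  # rs300 was merged into rs250 ...
    "rs250": "rs100",  # ... which was later merged into rs100
    "rs400": "rs401",
    "rs401": "rs400",  # deliberately circular, to show the cycle guard
}
resolved, unresolved = resolve_rs_ids(
    ["rs100", "rs300", "rs400", "rs999"], known_ids, merged_into
)
print(resolved)
print(unresolved)
```

IDs that end up in `unresolved` were either withdrawn or never present in the merge table; those are the ones to check against older releases or to set aside.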
98% successful? That's pretty good :-) If you download or query different versions of dbSNP, you could try to hunt down which release those 2% were present in.
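A rough sketch of that hunt, assuming you have already loaded the rs IDs of each release into a set (how you obtain those sets depends on the files you download). The function name, the release contents and the IDs are made up for illustration; only the release numbers 132 and 135 come from this thread.

```python
def releases_containing(missing_ids, ids_by_release):
    """For each missing rs ID, list the releases (sorted) in which it appears.

    ids_by_release: dict mapping a release number to the set of rs IDs
                    present in that release (assumed to be loaded beforehand).
    """
    found = {}
    for rs_id in missing_ids:
        found[rs_id] = sorted(
            release for release, ids in ids_by_release.items() if rs_id in ids
        )
    return found


# Hypothetical release contents, for illustration only.
ids_by_release = {
    132: {"rsX1", "rsX2"},
    135: {"rsX2"},
}
hits = releases_containing(["rsX1", "rsX2", "rsX3"], ids_by_release)
for rs_id, releases in hits.items():
    print(rs_id, releases)
```

An ID with an empty list was not found in any release you checked, which is the "got away" case.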