Allele frequency calculation for genotype dosage value
14 months ago
Sebastian ▴ 10

Hello, I have a data set with dosage data (values between 0 and 2) for a couple million SNPs, and I would like to get the MAF for each SNP. I saw somewhere (not a very reliable source) that you can get it simply by doing:

SNP1 <- c(0.03, 0.05, 1.95, 1.21, 0.09)

MAFSNP1 <- sum(SNP1) / (2*length(SNP1))

I compared this "MAF" from my dataset with the 1000 Genomes one, and they match.

Do you know if this is the right way to get the MAF from dosage data? Do you know of a paper or book that gives the formula? Thanks a lot!

plink dosage r impute • 1.2k views

a couple million SNPs

Why not use plink to calculate the frequency? It would be a lot faster than R.

14 months ago
4galaxy77 2.9k

Almost. You also need to add this, since your formula gives the alternate allele frequency, not the minor allele frequency:

MAFSNP1 = ifelse(MAFSNP1 > 0.5, 1-MAFSNP1, MAFSNP1)
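
For a couple million SNPs, the same two steps can be applied to a whole matrix at once. The following is only a minimal sketch, assuming the dosages sit in a numeric matrix with samples in rows and SNPs in columns, and that every dosage counts the same allele on a 0-2 scale. The helper name `dosage_maf` and the second example SNP are hypothetical, not from the thread.

```r
# Hypothetical helper: per-SNP minor allele frequency from a dosage matrix.
# Assumption: rows are samples, columns are SNPs, dosages are on a 0-2 scale.
dosage_maf <- function(dosage) {
  # Frequency of the counted allele: mean dosage / 2, skipping missing values
  af <- colMeans(dosage, na.rm = TRUE) / 2
  # Fold onto the minor allele (same effect as the ifelse() step above)
  pmin(af, 1 - af)
}

dosage <- cbind(
  SNP1 = c(0.03, 0.05, 1.95, 1.21, 0.09),  # values from the question
  SNP2 = c(1.90, 2.00, 1.85, NA, 1.75)     # made-up SNP with a missing value
)

maf <- dosage_maf(dosage)
print(maf)
```

`pmin(af, 1 - af)` is just a vectorized alternative to the `ifelse()` line; either should give the same result.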

Yes, you're right. I left it out to keep the post shorter. Do you know a source where I can confirm my formula? The package gwasurvivr uses almost the same thing:

MAFSNP1 <- mean(SNP1, na.rm = TRUE) * 0.5
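
As a sketch of why the two expressions agree, assume the dosage $d_i$ of sample $i$ is the expected count of the counted allele, and that $N$ is the number of samples with a non-missing dosage (the symbols are notation introduced here, not taken from the thread or from a specific paper):

```latex
d_i = P(G_i = 1) + 2\,P(G_i = 2), \qquad 0 \le d_i \le 2

\hat{p} \;=\; \frac{\sum_{i=1}^{N} d_i}{2N} \;=\; \frac{1}{2}\,\bar{d},
\qquad
\mathrm{MAF} \;=\; \min\bigl(\hat{p},\, 1 - \hat{p}\bigr)
```

Each sample carries two alleles, so $2N$ is the total number of alleles and $\sum_i d_i$ is the expected number of copies of the counted allele. The two R expressions differ only when values are missing: `sum(SNP1)` returns `NA` if any dosage is `NA`, while `mean(SNP1, na.rm = TRUE)` drops those samples from both the numerator and $N$.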