Haplotype Frequencies And Maximum Likelihood Estimation / Expectation Maximization Algorithm
11.3 years ago

Does anyone have a numerical example on how the EM algorithm can be used to determine haplotype frequencies from genotype frequencies? I have searched a lot with Google, and I just can't find a single numerical example out there. Thank you.

haplotype • 6.3k views

Is the end goal to associate a phenotype to a haplotype? If so, BEAGLE has auxiliary scripts to do this. Do you have phased data? Can you provide a little more background?


Hi Zev. The end goal is to understand how PLINK, WDIST and other software packages estimate haplotype frequencies from genotype data using the EM algorithm; cf. http://pngu.mgh.harvard.edu/~purcell/plink/ld.shtml


Can I close this question as nobody seems to know the answer?


tommy.carstensen, these slides have a worked example on slides 14-16: http://www.slideshare.net/awais77/measures-of-linkage-disequilibrium-10238151 (found with the Google search "calculating linkage disequilibrium R squared").


Thanks. That is an example with haplotypes. I am interested in how the haplotype frequencies are calculated from the genotypes with the EM algorithm.

11.3 years ago

tommy.carstensen, the expectation is explained on the Wikipedia page for linkage disequilibrium:

http://en.wikipedia.org/wiki/Linkage_disequilibrium

Take a close look at the example:

Example: Human Leukocyte Antigen (HLA) alleles

specifically:

and the estimated frequency of haplotype xy is

If this doesn't answer your question I will attempt to explain it in other terms.


Thanks Zev. I will have a look at it this evening. Thank you.


Let us say I have 10 samples in total. At SNP1 I have these genotypes: AA CC AC AA CC AC AA CC AC AA

At SNP2: GG GT GG GG GT TT TT GT GT TT

What is then the LD between these two SNPs?


Which measure of LD do you want? R, R^2, D, D'?
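For reference, all four measures can be computed from the four two-locus haplotype frequencies. The sketch below is only an illustration under assumptions: the function name `ld_measures`, the argument names and the example frequencies are hypothetical, and D' is reported as a magnitude between 0 and 1, which is one common convention.

```python
def ld_measures(p11, p12, p21, p22):
    """LD statistics for two biallelic loci from haplotype frequencies.

    pXY is the frequency of the haplotype carrying allele X at locus 1
    and allele Y at locus 2; the four values are expected to sum to 1.
    """
    p1 = p11 + p12          # frequency of allele 1 at locus 1
    q1 = p11 + p21          # frequency of allele 1 at locus 2
    d = p11 - p1 * q1       # D: observed minus expected haplotype frequency
    denom = p1 * (1 - p1) * q1 * (1 - q1)
    if denom == 0:
        raise ValueError("a locus is monomorphic; r and D' are undefined")
    r = d / denom ** 0.5
    # D_max depends on the sign of D.
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))
    d_prime = abs(d) / d_max  # magnitude convention (assumption)
    return {"D": d, "D'": d_prime, "r": r, "r2": r * r}


# Hypothetical haplotype frequencies, chosen only to illustrate the call.
result = ld_measures(0.4, 0.1, 0.1, 0.4)
print(result)
```

When only unphased genotypes are available, the haplotype frequencies passed in here have to be estimated first, which is where the EM algorithm comes in.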


R squared. I can convert if necessary. I am not interested in the answer. That I can get with PLINK or WDIST. I am interested in the calculation. Thank you.
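A minimal sketch of the calculation for the ten samples above, assuming two biallelic SNPs, no missing genotypes and random mating. It is a textbook-style EM, not necessarily what PLINK or WDIST do internally; the function name `em_haplotype_freqs` and its parameters are hypothetical. Samples with at most one heterozygous site have a known phase and are simply counted. Double heterozygotes (AC at SNP1 and GT at SNP2) could be AG/CT or AT/CG, so the E-step splits them between the two configurations in proportion to the current haplotype frequency products, and the M-step recomputes the frequencies from the expected counts.

```python
from collections import Counter


def em_haplotype_freqs(snp1, snp2, alleles1=("A", "C"), alleles2=("G", "T"),
                       max_iter=1000, tol=1e-12):
    """Estimate the four two-SNP haplotype frequencies by EM.

    snp1, snp2: unphased genotypes such as "AC" and "GT", one per sample.
    Assumes biallelic SNPs, no missing data and random mating.
    """
    a, b = alleles1
    g, t = alleles2
    haps = [a + g, a + t, b + g, b + t]
    known = Counter()     # haplotype counts from phase-unambiguous samples
    n_double_het = 0
    for g1, g2 in zip(snp1, snp2):
        if g1[0] != g1[1] and g2[0] != g2[1]:
            n_double_het += 1          # phase unknown
        else:
            # At least one site is homozygous, so any pairing gives
            # the same two haplotypes.
            known[g1[0] + g2[0]] += 1
            known[g1[1] + g2[1]] += 1
    n_hap = 2 * len(snp1)
    freqs = {h: 0.25 for h in haps}    # uniform starting point
    for _ in range(max_iter):
        # E-step: probability that a double heterozygote is a+g / b+t.
        cis = freqs[a + g] * freqs[b + t]
        trans = freqs[a + t] * freqs[b + g]
        p_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: frequencies from known plus expected counts.
        expected = {
            a + g: known[a + g] + n_double_het * p_cis,
            b + t: known[b + t] + n_double_het * p_cis,
            a + t: known[a + t] + n_double_het * (1 - p_cis),
            b + g: known[b + g] + n_double_het * (1 - p_cis),
        }
        new = {h: expected[h] / n_hap for h in haps}
        delta = max(abs(new[h] - freqs[h]) for h in haps)
        freqs = new
        if delta < tol:
            break
    return freqs


snp1 = "AA CC AC AA CC AC AA CC AC AA".split()
snp2 = "GG GT GG GG GT TT TT GT GT TT".split()
freqs = em_haplotype_freqs(snp1, snp2)
for hap, f in freqs.items():
    print(hap, round(f, 4))
```

Only the ninth sample (AC, GT) is a double heterozygote here, so it is the only one EM has to apportion. The other nine samples contribute 5 AG, 5 AT, 4 CG and 4 CT haplotypes, and the estimates should settle at about AG = AT = 0.275 and CG = CT = 0.225. That gives D = 0.275 - 0.55 * 0.5 = 0, so r^2 is 0 for this particular data set.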

11.3 years ago

I found a paper pointing me in the direction of an answer to my question: "Linkage Disequilibrium Between Loci With Unknown Phase" http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2710162/pdf/GEN1823839.pdf


Ah, sorry, I should have sent that to you! I work with Chad Huff. I implemented his method and it works well.


Thanks for all of your replies. Is your code open source and on GitHub by any chance?
