Hypothetical question just to illustrate what I'm trying to do:
A small gene, "FOOBAR", on human chromosome 1 was genotyped last year for three SNPs (all in one LD block) in a total of 1000 unrelated Finnish samples. The summarized genotypes are listed below. Sample_1's genotypes for these three SNPs are "AG", "AC" and "CT", respectively. What are the estimated haplotypes for Sample_1, and what are their probabilities?
SNP1  SNP2  SNP3  Samples
AA    CC    TT    160
GG    AA    CC    358
GG    AA    CT    2
AG    AC    CT    480
The first 160 are homozygous for ACT, the next 358 are homozygous for GAC, the next two are (GAC, GAT), and the last 480 are (ACT, GAC).
The first 160 + 358 subjects are "obvious" homozygotes, and the last 480 are most simply explained as carrying one copy each of the two haplotypes found in those homozygotes (ACT and GAC). The remaining two require the introduction of a new haplotype (GAT). Sample_1 (AG, AC, CT) falls in the group of 480, so its estimated haplotype pair is (ACT, GAC).
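To put rough numbers on that reasoning, here is a minimal Python sketch of an EM-style haplotype frequency estimate. It is not a method named anywhere in this thread: it assumes Hardy-Weinberg equilibrium, starts from uniform frequencies over every haplotype compatible with the table, and runs a fixed number of iterations. The function names (`compatible_pairs`, `em_haplotype_frequencies`, `phase_posterior`) are made up for illustration.

```python
from collections import defaultdict
from itertools import product

# Genotype table from the question: (SNP1, SNP2, SNP3) -> number of samples.
GENOTYPE_COUNTS = {
    ("AA", "CC", "TT"): 160,
    ("GG", "AA", "CC"): 358,
    ("GG", "AA", "CT"): 2,
    ("AG", "AC", "CT"): 480,
}


def compatible_pairs(genotype):
    """Return every unordered haplotype pair consistent with an unphased genotype."""
    pairs = set()
    # Pick one allele per SNP for the first haplotype; the second gets the other.
    for choice in product((0, 1), repeat=len(genotype)):
        h1 = "".join(g[c] for g, c in zip(genotype, choice))
        h2 = "".join(g[1 - c] for g, c in zip(genotype, choice))
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def pair_weight(pair, freq):
    """Probability of an unordered haplotype pair (assumes Hardy-Weinberg)."""
    h1, h2 = pair
    factor = 1.0 if h1 == h2 else 2.0
    return factor * freq[h1] * freq[h2]


def phase_posterior(genotype, freq):
    """Posterior probability of each compatible pair, given haplotype frequencies."""
    pairs = compatible_pairs(genotype)
    weights = [pair_weight(p, freq) for p in pairs]
    norm = sum(weights)
    return {p: w / norm for p, w in zip(pairs, weights)}


def em_haplotype_frequencies(genotype_counts, n_iter=50):
    """Estimate haplotype frequencies with a plain EM loop (fixed iteration count)."""
    options = {g: compatible_pairs(g) for g in genotype_counts}
    haplotypes = sorted({h for ps in options.values() for p in ps for h in p})
    # Assumption: uniform starting frequencies over all candidate haplotypes.
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    total_chromosomes = 2 * sum(genotype_counts.values())
    for _ in range(n_iter):
        expected = defaultdict(float)
        # E-step: split each genotype class across its compatible phasings.
        for genotype, n in genotype_counts.items():
            for (h1, h2), prob in phase_posterior(genotype, freq).items():
                expected[h1] += n * prob
                expected[h2] += n * prob
        # M-step: expected haplotype counts become the new frequencies.
        freq = {h: expected[h] / total_chromosomes for h in haplotypes}
    return freq


if __name__ == "__main__":
    freq = em_haplotype_frequencies(GENOTYPE_COUNTS)
    for hap, p in sorted(freq.items(), key=lambda kv: -kv[1]):
        print(f"{hap}\t{p:.6f}")
    sample_1 = ("AG", "AC", "CT")
    for pair, p in phase_posterior(sample_1, freq).items():
        print(pair, f"{p:.6f}")
```

On this table the estimates should settle at the same counts the answer implies: ACT at 800/2000, GAC at 1198/2000 and GAT at 2/2000, with essentially all of Sample_1's posterior on (ACT, GAC). The alternative phasing (ACC, GAT) is not logically excluded, but it needs a haplotype (ACC) that no unambiguous sample supports, so its estimated probability shrinks toward zero.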
Homework problem?