Hi all,
I'm reading the book Bioinformatics for Biologists and I have a question about the D' statistic for linkage disequilibrium.
At the end of chapter 1, an exercise asks the reader to show that 0 <= D' <= 1.
Since D' is a normalized version of D, it is tempting to just say that normalized measures lie between 0 and 1. However, I wasn't sure that was enough, so I looked at the formula to work out how the normalization is done:
D' = D/Dmax
Searching the web, I realized that the formula is quite similar to feature scaling, but not exactly the same. In fact, the denominator changes depending on whether D >= 0 or D < 0:
D >= 0 --> D' = D/min{p1q2, p2q1}
D < 0 --> D' = D/-max{p1q1, p2q2}
where "p" and "q" refer to the two loci (so p1, p2, q1 and q2 are allele frequencies), with "1" denoting the mutant allele and "2" the wild-type allele.
That said, why does the formula change? I understand that the "-" sign is there to keep D' positive, but I don't understand why the denominator is min{p1q2, p2q1} for D >= 0 and max{p1q1, p2q2} for D < 0.
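To make the question concrete, here is a minimal Python sketch of the calculation as I understand it. The function name `d_prime` and its arguments are made up for illustration, and it assumes D = x11 - p1*q1, where x11 is the frequency of the haplotype carrying allele 1 at both loci. One caveat: for D < 0 the sketch divides |D| by min{p1q1, p2q2}, which is the same as writing D/max{-p1q1, -p2q2}. I believe that is the usual form, and it is not the same as the -max{p1q1, p2q2} I quoted above, so treat it as an assumption. The bounds in the comments come from requiring all four haplotype frequencies to be non-negative.

```python
def d_prime(x11, p1, q1):
    """Return (D, D') for two biallelic loci.

    x11    : frequency of the haplotype with allele 1 at both loci
             (hypothetical argument name, not from the book)
    p1, q1 : frequency of allele 1 at the first and second locus
    """
    p2, q2 = 1.0 - p1, 1.0 - q1
    d = x11 - p1 * q1

    # The four haplotype frequencies can be written as
    #   x11 = p1*q1 + D,  x12 = p1*q2 - D,
    #   x21 = p2*q1 - D,  x22 = p2*q2 + D,
    # and each of them must be >= 0, which bounds D on both sides.
    if d >= 0:
        d_max = min(p1 * q2, p2 * q1)  # from x12 >= 0 and x21 >= 0
    else:
        d_max = min(p1 * q1, p2 * q2)  # from x11 >= 0 and x22 >= 0 (assumed form)

    if d_max == 0:
        raise ValueError("one locus is monomorphic, so D' is undefined")

    return d, abs(d) / d_max
```

For example, with p1 = q1 = 0.5, both x11 = 0.5 and x11 = 0 give D' = 1, while x11 = 0.25 gives D' = 0.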
Thank you.
Edit: None of the resources (PDFs and blog posts) I've read so far about linkage disequilibrium explain this; they just state the formula and move on.