Can anyone briefly explain how subclonal evolution is modeled as a (finite or Bayesian nonparametric) mixture model? In particular, what metric or metrics are usually used to define the centroid of a cluster? Is it SNV frequency only?
In the recent review paper by Beerenwinkel, Niko, et al., Systematic Biology 64.1 (2015): e1-e25, the authors comment on this as follows: "Because it is unknown in which genome a given SNV occurred, prior to tree reconstruction, SNVs are clustered into sets of mutations with common frequency. This clustering is often performed using Bayesian mixture models, either finite ones (Larson and Fridley 2013) or nonparametric ones in which the number of mixture components is estimated together with their frequencies and densities"
If possible, a toy example would be greatly appreciated.
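To show the kind of thing I have in mind, here is my own rough attempt at a minimal Python sketch. It is an assumption on my part and not the method of the review or of any particular tool: each SNV is summarized by its variant-read count out of the total read depth, each cluster has one frequency (the "centroid") and one mixing weight, and a finite binomial mixture is fitted by plain EM rather than full Bayesian inference. The function name `fit_binomial_mixture` and the simulated data are hypothetical, and the sketch ignores copy number and tumor purity.

```python
import numpy as np

def fit_binomial_mixture(alt, depth, k, n_iter=200):
    """EM for a finite mixture of binomials over SNV read counts.

    alt[i]   : variant-supporting reads for SNV i
    depth[i] : total reads covering SNV i
    k        : number of clusters (fixed in advance; this is the 'finite' case)
    """
    alt = np.asarray(alt, dtype=float)
    depth = np.asarray(depth, dtype=float)
    # Initialise cluster frequencies at evenly spaced quantiles of the observed VAFs.
    vaf = alt / depth
    freq = np.clip(np.quantile(vaf, (np.arange(k) + 0.5) / k), 1e-6, 1 - 1e-6)
    weight = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: posterior probability that each SNV belongs to each cluster.
        # The binomial coefficient is the same for every cluster, so it cancels.
        log_p = (np.log(weight)
                 + alt[:, None] * np.log(freq)
                 + (depth - alt)[:, None] * np.log1p(-freq))
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate mixing weights and cluster frequencies.
        weight = np.clip(resp.mean(axis=0), 1e-12, None)
        weight /= weight.sum()
        freq = (resp * alt[:, None]).sum(axis=0) / (resp * depth[:, None]).sum(axis=0)
        freq = np.clip(freq, 1e-6, 1 - 1e-6)
    return freq, weight, resp.argmax(axis=1)

# Simulated toy data (hypothetical): 20 SNVs at frequency 0.1, 40 SNVs at 0.4,
# all sequenced at depth 100.
rng = np.random.default_rng(1)
depth = np.full(60, 100)
true_freq = np.repeat([0.1, 0.4], [20, 40])
alt = rng.binomial(depth, true_freq)

freq, weight, labels = fit_binomial_mixture(alt, depth, k=2)
print("estimated cluster frequencies:", np.sort(freq))
print("estimated mixing weights:", weight)
```

In this sketch the only quantity defining a cluster is its frequency, which is what prompts my question about the centroid. As I understand it, the nonparametric version would not fix `k` but would put something like a Dirichlet process prior on the clusters; please correct me if that is wrong.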
Sorry if I am confusing terms/concepts!