I was recently researching methods for constructing tumor phylogenetic trees by processing SNP frequencies from multiple samples and inferring viable models of subclonal decomposition. That approach applies to individual patients regardless of their sample size. I then ran across a paper that draws comparisons between different patients and assigns multiple numbers to every branch/transition ... (Figures 1D and 5B)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3864404/
I've searched the online methods but haven't quite figured out how to replicate their method.
Phylogenetic trees were generated based on three different genomic
events: mutations data, copy number and compound copy number events.
For mutation data, a consolidated matrix containing the mutations of
all samples (rows) with ‘1’ and ‘0’ representing the presence and
absence of a mutation in a gene (column), respectively, is generated.
The rows of this matrix represent the samples and columns represent
the genes. For copy number, the matrix consisted of segment log ratio
data for each patient-gene pair. Pearson correlation coefficients ρxy
were computed between pairs of patients x and y. The results were used
for phylogenetic analysis such that the pairwise distance of x and y
was computed as 1 − ρxy. The Neighbor-Joining method of Saitou and Nei
(Saitou and Nei, 1987) and the Unweighted Pair Group Method with
Arithmetic Mean (UPGMA) method of clustering were used to construct
the phylogenetic tree. We used ‘ape’ R package (Paradis et al., 2004)
for constructing and plotting the phylogenetic trees. For compound
copy number events, a similar procedure was performed with the
exception of the matrix construction and computation of the distance.
The matrix consisted of the weight of observing compound events: 2 for
amplified LOH (ALOH), 2 for copy neutral LOH (NLOH), 2 for homozygous
deletion (HOMD), 1 for hemizygous deletion (HETD), and zero for
diploid heterozygous (HET) and allele-specific amplification (ASCNA).
Euclidean distance was computed between pairs of tumour samples. The
tree construction was performed the same as before.
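To make the quoted procedure concrete, here is a minimal sketch in plain Python of the two distance definitions (1 − ρ for mutation or log-ratio profiles, Euclidean for weighted compound events) followed by UPGMA clustering. It is not the authors' code: they used the 'ape' R package, and Neighbor-Joining (available there as `nj`) is left out here. The function names, the `COMPOUND_WEIGHTS` dictionary layout, and the toy profiles in the usage note are all made up for illustration.

```python
import math
from itertools import combinations

# Weights for compound copy number events, as listed in the quoted methods.
COMPOUND_WEIGHTS = {"ALOH": 2, "NLOH": 2, "HOMD": 2, "HETD": 1, "HET": 0, "ASCNA": 0}

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("constant profile: correlation is undefined")
    return cov / math.sqrt(vx * vy)

def correlation_distance(x, y):
    """Distance used for mutation and copy number profiles: 1 - rho."""
    return 1.0 - pearson(x, y)

def euclidean(x, y):
    """Distance used for weighted compound copy number events."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def distance_matrix(profiles, dist):
    """profiles: dict sample -> list of numbers. Returns {frozenset({a, b}): distance}."""
    return {frozenset((a, b)): dist(profiles[a], profiles[b])
            for a, b in combinations(profiles, 2)}

def upgma(dm):
    """Cluster a pairwise distance dict with UPGMA and return a Newick string."""
    if not dm:
        raise ValueError("need at least two samples")
    d = dict(dm)
    size, height = {}, {}
    for pair in d:
        for name in pair:
            size[name], height[name] = 1, 0.0
    while len(size) > 1:
        pair = min(d, key=d.get)          # closest pair of clusters
        a, b = sorted(pair)
        h = d[pair] / 2.0                 # height of the new internal node
        merged = f"({a}:{h - height[a]:.3f},{b}:{h - height[b]:.3f})"
        for c in size:
            if c in pair:
                continue
            # Size-weighted average of the distances to the two merged clusters.
            d[frozenset((merged, c))] = (
                size[a] * d[frozenset((a, c))] + size[b] * d[frozenset((b, c))]
            ) / (size[a] + size[b])
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        size[merged] = size.pop(a) + size.pop(b)
        height[merged] = h
        del height[a], height[b]
    return next(iter(size)) + ";"
```

For example, with made-up binary mutation profiles `{"P1": [1, 1, 0, 0, 1, 0], "P2": [1, 1, 0, 0, 0, 0], "P3": [0, 0, 1, 1, 0, 1]}`, calling `upgma(distance_matrix(profiles, correlation_distance))` groups P1 with P2 first and then joins P3. For compound events, map each call through `COMPOUND_WEIGHTS` first and pass `euclidean` instead.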
Hi, just a very picky comment: if you make a tree of different patients' tumors, the tree cannot be called a phylogeny. In my understanding, a phylogeny implies ancestry: two branches joining implies that a common ancestor exists from which they diverged, which is impossible for two patients unless we include the ancestor of those patients.
Humans do all share ancestry, and the root of their tree appears to be the human reference sequence, so I think it's legit (albeit a little weird).