Question

Tumor phylogeny among different patients

1

8.6 years ago

ceruleanivy ▴ 50

I have been looking into methods for constructing tumor phylogenetic trees by processing SNP frequencies from multiple samples and inferring viable models of subclonal decomposition. That approach applies to individual patients regardless of their sample size. I recently came across a paper that draws comparisons between different patients and assigns multiple numbers to every branch/transition ... (Figures 1D and 5B): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3864404/ I've searched the online methods but haven't figured out how to replicate their approach.

R next-gen sequencing SNP sequence • 2.0k views

updated 8.6 years ago by Chris Miller 22k • written 8.6 years ago by ceruleanivy ▴ 50

0

Hi, just a very picky comment: if you make a tree of different patients' tumors, the tree cannot be called a phylogeny. In my understanding, a phylogeny implies ancestry: two branches joining implies that a common ancestor exists from which they diverged, which is impossible for two patients unless we include the ancestor of those patients.

8.6 years ago by Michael 55k

0

Humans do all share ancestry, and the root of their tree appears to be the human reference sequence, so I think it's legit (albeit a little weird).

8.6 years ago by Chris Miller 22k

score 1 · Answer 1 · 2016-04-10

The full details are there in the appendix:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3864404/bin/path0231-0021-sd1.pdf

10 Phylogenetic analysis

Phylogenetic trees were generated based on three different genomic events: mutations data, copy number and compound copy number events. For mutation data, a consolidated matrix containing the mutations of all samples (rows) with ‘1’ and ‘0’ representing the presence and absence of a mutation in a gene (column), respectively, is generated. The rows of this matrix represent the samples and columns represent the genes. For copy number, the matrix consisted of segment log ratio data for each patient-gene pair.

Pearson correlation coefficients ρxy were computed between pairs of patients x and y. The results were used for phylogenetic analysis such that the pairwise distance of x and y was computed as 1 − ρxy. The Neighbor-Joining method of Saitou and Nei (Saitou and Nei, 1987) and the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) method of clustering were used to construct the phylogenetic tree. We used ‘ape’ R package (Paradis et al., 2004) for constructing and plotting the phylogenetic trees.

For compound copy number events, a similar procedure was performed with the exception of the matrix construction and computation of the distance. The matrix consisted of the weight of observing compound events: 2 for amplified LOH (ALOH), 2 for copy neutral LOH (NLOH), 2 for homozygous deletion (HOMD), 1 for hemizygous deletion (HETD), and zero for diploid heterozygous (HET) and allele-specific amplification (ASCNA). Euclidean distance was computed between pairs of tumour samples. The tree construction was performed the same as before.
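To make the recipe above concrete, here is a minimal sketch of the two distance computations followed by UPGMA clustering. It is not the authors' code: they used the ‘ape’ R package, whereas this is plain Python with no dependencies, and it covers only UPGMA, not Neighbor-Joining. The sample names (`P1`, `T1`, ...) and all matrix values are made up for illustration, and applying the 1 − ρ distance to the binary mutation matrix is my reading of the appendix.

```python
from math import sqrt

# Weights for compound copy-number events, as listed in the appendix.
EVENT_WEIGHTS = {"ALOH": 2, "NLOH": 2, "HOMD": 2, "HETD": 1, "HET": 0, "ASCNA": 0}


def pearson_distance(x, y):
    """Return 1 - Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / sqrt(var_x * var_y)


def euclidean_distance(x, y):
    """Return the Euclidean distance between two equal-length profiles."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def upgma(profiles, metric):
    """Average-linkage (UPGMA) clustering; returns a nested tuple of sample names."""
    # Pairwise distances between the original samples.
    dist = {frozenset((a, b)): metric(profiles[a], profiles[b])
            for a in profiles for b in profiles if a < b}
    # Each cluster is (subtree, list of member sample names).
    clusters = [(name, [name]) for name in profiles]

    def average(c1, c2):
        # UPGMA distance: mean over all member pairs of the two clusters.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: average(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Toy mutation matrix (made up): rows are samples, columns are genes, 1 = mutated.
mutations = {
    "P1": [1, 1, 0, 0, 1, 0],
    "P2": [1, 1, 0, 0, 0, 0],
    "P3": [0, 0, 1, 1, 0, 1],
    "P4": [0, 0, 1, 1, 1, 1],
}
mutation_tree = upgma(mutations, pearson_distance)

# Toy compound-event calls (made up), one label per gene, converted to weights.
events = {
    "T1": ["ALOH", "HET", "HETD"],
    "T2": ["NLOH", "HET", "HET"],
    "T3": ["HET", "HOMD", "ASCNA"],
}
weights = {s: [EVENT_WEIGHTS[e] for e in calls] for s, calls in events.items()}
event_tree = upgma(weights, euclidean_distance)

print(mutation_tree)
print(event_tree)
```

The nested tuples only record which samples get grouped together; branch lengths are not tracked. For real data it is probably easier to stay in R as the paper did: build the distance matrix with `as.dist(1 - cor(t(m)))` or `dist(m)`, then pass it to `nj()` from ape, or to `hclust(d, method = "average")` for UPGMA.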