Question

R phangorn phylogenetic analysis from somatic mutations' binary table

2

9.1 years ago

Alejandro Jimenez Sanchez ▴ 180

I want to build a phylogenetic tree from a binary matrix with the R package phangorn, as done in this paper: Tracking the Genomic Evolution of Esophageal Adenocarcinoma through Neoadjuvant Chemotherapy

In the methods they say:

Trees were built using binary presence/absence matrices built from the regional distribution of variants within the tumor. The R Bioconductor package phangorn (1.99-7; ref. 36) was utilized to perform the parsimony ratchet method (18), generating unrooted trees. Branch lengths were determined using the acctran function.

I have the binary presence/absence matrix. However, phangorn works on phyDat objects, which, according to the phangorn vignettes, are derived from sequence alignments.

My question is:

How can I use a binary table to build a phylogenetic tree with the R phangorn package?

If there is a way to read the binary matrix as a phyDat object, that would solve the problem, but I don't see how that could be done.

phangorn R tumour phylogeny binary table • 7.4k views

Updated 21 months ago by Sleepy • written 9.1 years ago by Alejandro Jimenez Sanchez ▴ 180

3

9.1 years ago

poisonAlien ★ 3.2k

There are multiple ways to construct trees based on binary data.

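You can use the neighbor-joining method once phangorn is loaded (nj() and dist.gene() themselves come from ape, which phangorn attaches).

Since you already have a binary matrix (here mat, with mutations in rows and samples in columns, hence the t()):

library(phangorn) # also attaches ape
mat.nj = nj(dist.gene(t(mat))) # neighbour-joining tree from pairwise mismatch counts
plot(mat.nj, type = 'cladogram') # plot as a cladogram
write.tree(mat.nj, file = 'mat.newick') # write the tree in Newick format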

This uses the UPGMA method (I'm not sure it is appropriate for binary data):

mat.upgma = upgma(dist.gene(t(mat)))

Alternatively, use PHYLIP character parsimony, which I think is the most suitable for this kind of data.

Collapse your matrix into pars input. For example, with four samples and five mutations:
```
4 5
tumor_right
11110
tumor_left
11111
tumor_up
11001
tumor_root
00000
```
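The collapsed input above can also be generated from an R matrix rather than typed by hand. This is a minimal sketch in base R, assuming the standard sequential PHYLIP layout (a header with the number of samples and characters, then one line per sample with the name padded to 10 characters followed by the 0/1 string). The shortened sample names, the `mat` object and the output file name `infile` are illustrative assumptions, not something taken from the paper.

```r
# Toy matrix: rows are samples, columns are mutations (same values as above).
# Names are shortened because classic PHYLIP allows at most 10 characters.
mat <- matrix(
  c(1, 1, 1, 1, 0,
    1, 1, 1, 1, 1,
    1, 1, 0, 0, 1,
    0, 0, 0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("tum_right", "tum_left", "tum_up", "tum_root"), NULL)
)
stopifnot(all(nchar(rownames(mat)) <= 10))

# Header line: number of samples, number of characters
header <- paste(nrow(mat), ncol(mat))

# One line per sample: name left-justified to 10 characters, then the 0/1 string
states <- apply(mat, 1, paste, collapse = "")
rows <- paste0(sprintf("%-10s", rownames(mat)), states)

phylip_lines <- c(header, rows)
writeLines(phylip_lines, file.path(tempdir(), "infile"))
```

Writing to `tempdir()` keeps the sketch side-effect free; point `writeLines()` at the directory where you run pars instead.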
Then run PHYLIP pars with the outgroup root set to your germline sample (here it's tumor_root, the 4th sample). Setting an outgroup in phangorn is a bit difficult (I'm not sure, though).
As Chris suggested in his answer, you can also use more sophisticated methods such as LICHeE, which uses VAF information to cluster variants and construct trees (it also divides trees based on clones).
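
On the outgroup point: trees built with phangorn or ape are ordinary `phylo` objects, so ape's `root()` may be all that is needed to root on the all-zero germline sample. A minimal sketch on a made-up matrix follows; the sample names (`normal`, `R1` to `R4`) and the mutation values are assumptions for illustration only. Here samples are already in rows, so no `t()` is needed.

```r
library(ape)  # provides dist.gene(), nj(), root(), is.rooted()

# Toy presence/absence matrix: rows are samples, columns are mutations.
# "normal" stands in for the all-zero germline sample.
mat <- matrix(
  c(0, 0, 0, 0, 0, 0,
    1, 1, 1, 0, 0, 0,
    1, 1, 1, 1, 0, 0,
    1, 1, 0, 0, 1, 0,
    1, 1, 0, 0, 1, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("normal", "R1", "R2", "R3", "R4"), paste0("m", 1:6))
)

# Unrooted neighbour-joining tree from pairwise mismatch counts
tree <- nj(dist.gene(mat))

# Re-root on the germline sample; resolve.root = TRUE makes the root bifurcating
rooted <- root(tree, outgroup = "normal", resolve.root = TRUE)
```

`plot(rooted)` should then draw the tumour samples as one clade with `normal` as its sister. The same `root()` call can be applied to a tree returned by a parsimony search.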

Updated 6.6 years ago by Ram 45k • written 9.1 years ago by poisonAlien ★ 3.2k

0

Hello Alien! Do you know how to draw a rooted tree with phangorn? That is to say, how can I set the group (0,0,0,0,0) to be the root? Regards, Tang

Reply by Sleepy • 21 months ago

1

9.1 years ago

Chris Miller 22k

An alternative solution, and a more typical workflow for cancer samples, would be to feed your VAF and clustering information into a package like clonevol, which does the phylogenetic inference and produces some nice visualizations. Clustering can be done with a package like sciclone or pyclone.

Updated 6.6 years ago by Ram 45k • written 9.1 years ago by Chris Miller 22k

Klaus · Accepted Answer · 2016-02-23

Hello Alejandro,

phangorn provides generic as.phyDat() functions that transform matrices and data.frames into phyDat objects.

For example, you can read in your data with read.table() or read.csv(), but you might need to transpose it. For matrices, as.phyDat() assumes that each row holds the entries for one individual (taxon); for data.frames, each column does. For binary data you can do the conversion with a command like one of these (depending on how you coded the data):

as.phyDat(data, type="USER", levels = c(0, 1))
as.phyDat(data, type="USER", levels = c(TRUE, FALSE))

Regards,
Klaus
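
Putting the conversion together with the workflow quoted in the question (parsimony ratchet, then acctran branch lengths), a minimal sketch could look like the following. The toy matrix and the names `normal` and `R1` to `R5` are made up for illustration. The 0/1 values are converted to character and `phyDat()` is called directly with character levels, which is an assumption made here to keep the level matching explicit. This is not necessarily the exact procedure used in the paper.

```r
library(phangorn)  # also attaches ape

# Toy presence/absence matrix: rows are samples (taxa), columns are mutations
mat <- matrix(
  c(0, 0, 0, 0, 0, 0, 0, 0,
    1, 1, 1, 0, 0, 0, 1, 0,
    1, 1, 1, 0, 0, 0, 0, 0,
    1, 1, 0, 1, 1, 0, 0, 0,
    1, 1, 0, 1, 1, 1, 0, 0,
    1, 0, 0, 0, 0, 0, 0, 1),
  nrow = 6, byrow = TRUE,
  dimnames = list(c("normal", "R1", "R2", "R3", "R4", "R5"), paste0("m", 1:8))
)

# Store the states as character so they match the character levels below
chr <- matrix(as.character(mat), nrow = nrow(mat), dimnames = dimnames(mat))

# For a matrix, each row is treated as one taxon
dat <- phyDat(chr, type = "USER", levels = c("0", "1"))

set.seed(1)                        # the ratchet involves random resampling
tree <- pratchet(dat, trace = 0)   # parsimony ratchet search
if (inherits(tree, "multiPhylo")) tree <- tree[[1]]  # keep one tree if several come back

tree <- acctran(tree, dat)         # assign branch lengths under ACCTRAN
score <- parsimony(tree, dat)      # parsimony score of the resulting tree
```

`plot(tree, type = "unrooted")` draws the result. If your table has mutations in rows and samples in columns, transpose it with `t()` before the conversion.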