Question

How to check monophyly in NJ tree created with the "ape" package?

2.8 years ago

João ▴ 10

I have a script that reads a FASTA file containing sequences from species of a single genus, aligns them, computes a distance matrix, and builds an NJ tree with the "nj" function from the "ape" package.

library(msa)
library(ape)
library(Biostrings)

# Read the sequences once and keep names and sequences in a data frame
fastaFile <- readDNAStringSet("genus.fasta")
seq_name <- names(fastaFile)
sequence <- paste(fastaFile)
genus_df <- data.frame(seq_name, sequence)
mySequences <- fastaFile

# Align with ClustalW and convert the alignment to an ape DNAbin matrix
alignment1 <- msa(mySequences, type = "dna", method = "ClustalW")
converted <- msaConvert(alignment1, type = "ape::DNAbin")
class(converted) <- "matrix"
converted <- as.DNAbin(converted)

# K80 distances, then the neighbour-joining tree
matrix_dist <- dist.dna(converted, model = "K80", as.matrix = TRUE)
nj <- nj(matrix_dist)
plot(nj)

I am trying to check, for each species present in the NJ tree, whether it is monophyletic. For that I am using the "is.monophyletic" function, also from "ape":

species <- unique(genus_df$seq_name)
for (i in species) {
  print(paste(is.monophyletic(nj, tips = i), i))
}

The code seems to be working, but the output I am getting is not congruent with my interpretation of the plotted tree.

Output:

[1] "TRUE Trachinotus_blochii"
[1] "TRUE Trachinotus_baillonii"
[1] "TRUE Trachinotus_mookalee"
[1] "TRUE Trachinotus_ovatus"
[1] "TRUE Trachinotus_carolinus"
[1] "TRUE Raja_binoculata"

Tree plot (image not shown): Trachinotus ovatus does not appear to be monophyletic.

I am not sure if the problem is with my interpretation or the code. Since the tree is unrooted, I also tried to use the "root" function from "ape", using the Raja sequence as outgroup, but I still got the same output (TRUE for all species being monophyletic).

Thank you in advance for any suggestion!

tree monophyly ape R • 583 views

updated 2.8 years ago by M__ ▴ 200 • written 2.8 years ago by João ▴ 10

score 0 · Answer 1 · 2022-02-21

The output is congruent with your code.

To put it another way: I am monophyletic with myself and you are with yourself. The last time I compared me with me, I didn't find anybody else involved. This is the immediate issue in the code: each call to is.monophyletic gets a single name, and a group of one is always monophyletic.

However, I am not monophyletic with my cousins when my siblings are in the same tree. The most recent common ancestor (MRCA) of me and my cousins is our grandparents, and their descendants include my siblings too, so a group that leaves my siblings out is paraphyletic. To test a species, the function needs all of that species' tips in one call, not one name at a time.
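To make the point above concrete, here is a minimal sketch on a made-up six-tip tree. The topology and the `_1`/`_2` suffixes are assumptions for illustration, not your data. It assumes that sequences of the same species share a tip label, and it passes every tip of a species to `is.monophyletic` as a vector of indices instead of a single name.

```r
library(ape)

# Hypothetical rooted toy tree: one Trachinotus_ovatus sequence groups
# with Trachinotus_blochii, so T. ovatus should not be monophyletic.
tree <- read.tree(text = paste0(
  "(((Trachinotus_ovatus_1,(Trachinotus_ovatus_2,Trachinotus_blochii_1)),",
  "(Trachinotus_carolinus_1,Trachinotus_carolinus_2)),Raja_binoculata_1);"
))

# Drop the numeric suffix so that tips of the same species share a label
# (assumed to mirror the duplicated names in the FASTA file).
tree$tip.label <- sub("_[0-9]+$", "", tree$tip.label)

# For each species, collect the indices of ALL its tips and test them together.
species <- unique(tree$tip.label)
mono <- sapply(species, function(sp) {
  is.monophyletic(tree, tips = which(tree$tip.label == sp))
})
print(mono)
```

On this toy tree, Trachinotus_ovatus comes out as not monophyletic, while Trachinotus_carolinus does, and species with a single sequence are trivially TRUE. With your data you would use your own `nj` object in place of `tree`, ideally after rooting it on the Raja outgroup as you already tried.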

Is that ok for an answer? I am just visiting the site.