How to Handle Incorrect Clade Grouping of a Reference Genome in a Phylogenetic Tree?
5 days ago
oasiswho • 0

Hi everyone,

I encountered an issue while building a phylogenetic tree and would appreciate some guidance or discussion.

I used a reference genome to annotate my target genome, but during the phylogenetic analysis, I found that the reference genome was incorrectly grouped into the same clade as my target genome. Biologically, this reference genome should be much more distant from my species.

Here are my questions:

1) Has anyone experienced a similar issue?
2) If I want to build a robust phylogenetic tree, should I remove this reference genome and reconstruct the tree?
3) Alternatively, should I keep the reference genome in the tree and describe this anomaly in my discussion section?

Any insights or advice on how to handle this situation would be greatly appreciated!

Thank you!

Tree Clade Incorrect Phylogenetic Reference Genome • 393 views
5 days ago
Mensur Dlakic ★ 28k

Has anyone experienced a similar issue?

I am sure many people have gotten the trees they didn't expect.

Assuming you know how to do a tree reconstruction, the obtained tree doesn't lie. Either the two species are more related than you think, or something went wrong upstream of tree reconstruction.

You didn't give us enough information to offer more than this general conclusion. We don't know:

1) Is this a single-gene or a concatenated tree?
2) DNA or protein?
3) How did you align the sequences?
4) Were the alignments trimmed?
5) What were you expecting in terms of tree distances, and what did you get?
6) What are the other tree members, and how are their distances relative to your species of interest?


Dear Mensur Dlakic, thank you very much for your response. I am constructing a phylogenetic tree based on mitochondrial genomes. I included five existing mitochondrial genomes along with two outgroup species to build the tree.

Here are my results:

Figure 1

A. sachalinensis is my target genome, and A. alba is my reference genome. After annotation, I checked all genes with BLAST and corrected them manually.

(1) I used two methods: a concatenated multi-gene tree constructed with IQ-TREE2 (Figure 1A), and a coalescent-based species tree constructed with ASTRAL (Figure 1B).
(2) The analysis was performed using DNA sequences.
(3) I aligned the sequences using MAFFT.
(4) The alignments were not trimmed; they were used directly for tree construction after alignment.
(5) My primary goal is to compare the placement of species in the mitochondrial tree with that in the chloroplast tree. However, I found discrepancies between the two.
(6) The tree members and the two reference papers for comparison are provided in the attached images. I would like to refer to the chloroplast phylogenetic tree in Figure 2 of Semerikova et al., 2018, as well as the chloroplast and mitochondrial phylogenetic trees in Figures 3 and 4 of Park et al., 2024.

Thank you so much!

Figure 2

Figure 3

Figure 4


Trees don't lie, so I suspect it is something in the upstream methodology.

I am not sure we are comparing apples to apples here. If yours are concatenated trees, is the number of genes the same as in other trees? Have you looked at your alignments? I suspect something is off in them, and doing a visual inspection may give you an answer. Trimming should definitely be done unless the alignment is continuous and without any suspicious gaps.
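As a complement to visual inspection, a minimal Python sketch of a gap check might look like the following. It assumes the alignment is available as FASTA text; the function names, the toy sequences, and the 0.5 gap threshold are illustrative assumptions, and the simple gap-fraction filter is a stand-in for a dedicated trimming tool, not a replacement for one.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of {name: sequence} (insertion-ordered)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def column_gap_fractions(aln):
    """Return the fraction of gap characters ('-') in each alignment column."""
    seqs = list(aln.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences have different lengths; not an alignment")
    return [sum(s[i] == "-" for s in seqs) / len(seqs) for i in range(length)]


def trim_gappy_columns(aln, max_gap_fraction=0.5):
    """Drop columns whose gap fraction exceeds max_gap_fraction (assumed cutoff)."""
    fractions = column_gap_fractions(aln)
    keep = [i for i, f in enumerate(fractions) if f <= max_gap_fraction]
    return {n: "".join(s[i] for i in keep) for n, s in aln.items()}


def sequence_gap_fractions(aln):
    """Fraction of gaps per sequence; outliers deserve a closer look."""
    return {n: s.count("-") / len(s) for n, s in aln.items()}


# Toy alignment; the sequences are made up for illustration only.
example = """>A_sachalinensis
ATG--CGTAC
>A_alba
ATG--CGTAC
>outgroup
ATGTTCG-AC
"""
aln = read_fasta(example)
trimmed = trim_gappy_columns(aln, max_gap_fraction=0.5)
per_sequence = sequence_gap_fractions(aln)
```

Columns that are mostly gaps are dropped, and a sequence whose gap fraction is far above the others is a candidate for the "something is off" case that a visual check would also reveal.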

I can keep asking questions and you will probably keep providing answers, but I don't have the capacity to turn this into a long back-and-forth. I suggest you backtrack all the way and make sure all the steps are correct. Maybe do this first on a smaller scale, say 3-5 concatenated genes, and see what you get from that. If it looks OK, try to replicate it on a larger number of genes. Try also to replicate the exact methodology of other trees.
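The small-scale test suggested above could be set up with a sketch like this one, which concatenates a chosen handful of per-gene alignments into one supermatrix and records where each gene starts and ends. The gene names, taxon labels, and sequences are hypothetical placeholders, and padding a missing taxon with gaps is an assumption about how missing genes should be handled.

```python
def concatenate(gene_alignments, genes):
    """Concatenate the chosen gene alignments into one supermatrix.

    gene_alignments: {gene: {taxon: aligned_sequence}}
    genes: ordered list of gene names to include
    Returns (supermatrix, partitions), where partitions is a list of
    (gene, start, end) tuples with 1-based inclusive coordinates.
    """
    taxa = sorted({t for g in genes for t in gene_alignments[g]})
    matrix = {t: [] for t in taxa}
    partitions = []
    offset = 0
    for gene in genes:
        aln = gene_alignments[gene]
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences differ in length")
        length = lengths.pop()
        for taxon in taxa:
            # A taxon missing from this gene is padded with gaps (assumption).
            matrix[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((gene, offset + 1, offset + length))
        offset += length
    return {t: "".join(parts) for t, parts in matrix.items()}, partitions


# Hypothetical toy data: three genes, three taxa, one taxon lacking cox1.
gene_alignments = {
    "cox1": {"sp1": "ATGC", "sp2": "ATGA"},
    "nad1": {"sp1": "GGG", "sp2": "GGA", "sp3": "GCA"},
    "atp6": {"sp1": "TTAA", "sp2": "TTAA", "sp3": "TTCA"},
}
supermatrix, partitions = concatenate(gene_alignments, ["cox1", "nad1"])
```

Starting with two or three genes and adding more only once the small tree looks sensible makes it easier to spot which gene alignment, if any, pulls the two species together.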

Once again: trees reflect the alignments. Assuming you have done the tree reconstruction correctly, your problem is most likely somewhere in the alignment creation.
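Because the tree reflects the alignment, one quick sanity check is to compute pairwise p-distances (the proportion of differing sites) directly from each gene alignment. The sketch below is only an illustration under stated assumptions: the taxon labels and sequences are made up, and columns with a gap, N, or ? in either sequence are skipped. If the target and the reference turn out nearly identical where a distant relationship is expected, that would be one hint that the issue lies upstream of tree reconstruction.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap, N, or ? in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N?" or b in "-N?":
            continue
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else float("nan")


def pairwise_distances(aln):
    """Return {(name_x, name_y): p-distance} for every pair in the alignment."""
    return {(x, y): p_distance(aln[x], aln[y]) for x, y in combinations(aln, 2)}


# Toy alignment with hypothetical labels and sequences.
toy = {
    "target": "ATGCGTAC",
    "reference": "ATGCGTAC",
    "other": "ATGAGTTC",
}
distances = pairwise_distances(toy)
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(d, 3))
```

Running this per gene, before concatenation, shows whether a surprising grouping is supported by most genes or driven by a few.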
