Hi everyone,
I encountered an issue while building a phylogenetic tree and would appreciate some guidance or discussion.
I used a reference genome to annotate my target genome, but during the phylogenetic analysis, I found that the reference genome was incorrectly grouped into the same clade as my target genome. Biologically, this reference genome should be much more distant from my species.
Here are my questions:
(1) Has anyone experienced a similar issue?
(2) If I want to build a robust phylogenetic tree, should I remove this reference genome and reconstruct the tree?
(3) Alternatively, should I keep the reference genome in the tree and describe this anomaly in my discussion section?
Any insights or advice on how to handle this situation would be greatly appreciated!
Thank you!
Dear Mensur Dlakic, thank you very much for your response. I am currently constructing a phylogenetic tree based on mitochondrial genomes. I included five existing mitochondrial genomes along with two outgroup species to build the tree.
Here are my results:
Figure 1
A. sachalinensis is my target genome, and A. alba is my reference genome. After annotation, I ran BLAST searches for all genes and corrected the annotations manually.
(1) I used two methods: a concatenated multi-gene tree constructed with IQ-TREE2 (Figure 1 (A)) and a coalescent-based species tree constructed with ASTRAL (Figure 1 (B)).
(2) The analysis was performed using DNA sequences.
(3) I aligned the sequences using MAFFT.
(4) The alignments were not trimmed; they were used directly for tree construction.
(5) My primary goal is to compare the placement of species in the mitochondrial tree with that in the chloroplast tree. However, I found discrepancies between the two.
(6) The tree members and the two reference papers for comparison are provided in the attached image. I would like to compare against the chloroplast phylogenetic tree in Figure 2 of Semerikova et al., 2018, as well as the chloroplast and mitochondrial phylogenetic trees in Figures 3 and 4 of Park et al., 2024.
Thank you so much!
Figure 2
Figure 3 Figure 4
Trees don't lie, so I suspect the problem is somewhere in the upstream methodology.
I am not sure we are comparing apples to apples here. If yours are concatenated trees, is the number of genes the same as in other trees? Have you looked at your alignments? I suspect something is off in them, and doing a visual inspection may give you an answer. Trimming should definitely be done unless the alignment is continuous and without any suspicious gaps.
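As a rough illustration of the kind of check I mean, here is a minimal Python sketch that reports how gappy each sequence in an alignment is and drops columns that are mostly gaps. The function names, the 0.5 threshold, and the toy sequences are all assumptions made up for this example; it is not a substitute for looking at the alignment in a viewer or for a dedicated trimming tool.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def gap_fraction_per_sequence(aln):
    """Fraction of '-' characters in each aligned sequence."""
    return {n: (s.count("-") / len(s) if s else 0.0) for n, s in aln.items()}


def trim_gappy_columns(aln, max_gap_fraction=0.5):
    """Keep only columns whose gap fraction is <= max_gap_fraction.

    The 0.5 default is an arbitrary illustrative threshold.
    """
    names = list(aln)
    length = len(aln[names[0]])
    if any(len(aln[n]) != length for n in names):
        raise ValueError("sequences differ in length; is this an alignment?")
    keep = [
        i for i in range(length)
        if sum(aln[n][i] == "-" for n in names) / len(names) <= max_gap_fraction
    ]
    return {n: "".join(aln[n][i] for i in keep) for n in names}


# Toy alignment (made-up sequences, for illustration only).
example = """>A_sachalinensis
ATG--CGT
>A_alba
ATGAACGT
>outgroup
AT---CGT
"""
aln = read_fasta(example)
print(gap_fraction_per_sequence(aln))
trimmed = trim_gappy_columns(aln)
for name, seq in trimmed.items():
    print(name, seq)
```

A sequence with a much higher gap fraction than the others, or long gappy stretches that appear in only one or two taxa, is usually the first thing worth inspecting by eye.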
I can keep asking questions and you will probably keep providing answers, but I don't have the capacity to turn this into a long back-and-forth. I suggest you backtrack all the way and make sure all the steps are correct. Maybe do this first on a smaller scale, say 3-5 concatenated genes, and see what you get from that. If it looks OK, try to replicate it on a larger number of genes. Try also to replicate the exact methodology of other trees.
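To make the small-scale test concrete, the sketch below concatenates a chosen handful of per-gene alignments into one supermatrix and lists which genes each taxon actually has, so you can see whether the gene sets match before comparing trees. The data layout (a dict of gene to taxon to aligned sequence), the function names, the gene labels, and the sequences are hypothetical placeholders, not your data; taxa missing a gene are padded with gaps, which is one common convention rather than the only option.

```python
def gene_presence(gene_alignments):
    """Return {taxon: [genes in which that taxon is present]}."""
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    return {
        t: [g for g in sorted(gene_alignments) if t in gene_alignments[g]]
        for t in taxa
    }


def concatenate(gene_alignments, gene_names=None):
    """Concatenate per-gene alignments into a supermatrix.

    gene_alignments: {gene: {taxon: aligned_sequence}}
    gene_names: optional subset/order of genes (e.g. 3-5 genes for a test run)
    Returns (supermatrix, partitions), where partitions is a list of
    (gene, start, end) tuples with 1-based inclusive coordinates.
    """
    if gene_names is None:
        gene_names = sorted(gene_alignments)
    taxa = sorted({t for g in gene_names for t in gene_alignments[g]})
    pieces = {t: [] for t in taxa}
    partitions = []
    start = 1
    for g in gene_names:
        aln = gene_alignments[g]
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"gene {g}: sequences differ in length")
        length = lengths.pop()
        for t in taxa:
            # Pad taxa that lack this gene with gaps.
            pieces[t].append(aln.get(t, "-" * length))
        partitions.append((g, start, start + length - 1))
        start += length
    return {t: "".join(p) for t, p in pieces.items()}, partitions


# Toy input (made-up gene labels and sequences, for illustration only).
genes = {
    "cox1": {"A_sachalinensis": "ATG", "A_alba": "ATG", "outgroup": "AT-"},
    "nad1": {"A_sachalinensis": "CCGT", "A_alba": "CCGA"},
}
print(gene_presence(genes))
matrix, parts = concatenate(genes, gene_names=["cox1", "nad1"])
for taxon, seq in matrix.items():
    print(taxon, seq)
print(parts)
```

If a taxon ends up mostly padding in the supermatrix, or the gene lists differ from those used in the published trees, that alone can explain an unexpected placement.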
Once again: trees reflect the alignments. Assuming you have done the tree reconstruction correctly, your problem is most likely somewhere in the alignment creation.