The support values of my phylogenetic tree are very low. What are the possible reasons, and how can I improve the stability of the tree? (The peptide sequences were retrieved from the gene sets of different insect genomes.)
Tell us more. How do you produce the tree?
In general, you should (1) get the protein sequences*, (1a) remove pseudogenes if you can, (2) align them (e.g. with ClustalX), (3) manually select the conserved region(s), (4) iterate between steps (2) and (3) until the alignment is stable, and then (5) bootstrap, (6) produce trees and (7) build a consensus tree.
I suspect you might have skipped steps (3) and (4). If you still have low bootstrap values after doing them, I suspect you will have some branches with good support but lower values in the "high branches". In that case, focus on the clusters of well-conserved genes: take them out, build a tree for each cluster on its own, and then re-cluster the remaining sequences together with only one or two genes from each group.
Phylogenetic trees are a bit of a craft...
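As a rough illustration of step (3), here is a minimal Python sketch that keeps only alignment columns that are mostly ungapped and reasonably conserved. The function names, the toy alignment and both thresholds are assumptions made up for this example, not part of any tool mentioned here, and a filter like this is only a starting point for manual inspection, not a replacement for it.

```python
from collections import Counter

def select_conserved_columns(alignment, max_gap_fraction=0.2, min_major_fraction=0.5):
    """Return indices of columns that pass both (arbitrary) thresholds.

    alignment: dict mapping sequence name -> aligned sequence (equal lengths).
    """
    names = list(alignment)
    n_seqs = len(names)
    length = len(alignment[names[0]])
    kept = []
    for i in range(length):
        column = [alignment[name][i] for name in names]
        residues = [c for c in column if c != "-"]
        gap_fraction = 1 - len(residues) / n_seqs
        if not residues or gap_fraction > max_gap_fraction:
            continue  # column is mostly gaps
        # Fraction of non-gap residues that share the most common state.
        major = Counter(residues).most_common(1)[0][1] / len(residues)
        if major >= min_major_fraction:
            kept.append(i)
    return kept

def trim(alignment, columns):
    """Build a new alignment containing only the selected columns."""
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}

# Hypothetical toy alignment, for illustration only.
toy = {
    "sp1": "MKV-LAT",
    "sp2": "MKI-LGT",
    "sp3": "MRVQLST",
    "sp4": "MKV-LCT",
}
cols = select_conserved_columns(toy)
trimmed = trim(toy, cols)
print(cols)
print(trimmed)
```

Be careful with the conservation threshold: if it is too strict you throw away exactly the variable sites that carry the phylogenetic signal, so it is worth re-aligning and re-checking after each round, as in step (4).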
Using "the protein sequences" is not always the right thing to do. It depends. See also this question.
What is your gene of interest, or family of interest? If you are dealing with completely sequenced species, you can try browsing PhylomeDB and Ensembl Metazoa for your genes of interest. Both offer good-quality reconstructed alignments and trees.
Another question is how much data you have (i.e. how many well-aligned positions). If it's e.g. a single-gene alignment, it might not contain much information about the poorly supported branches, and you wouldn't necessarily have done anything wrong. Poorly aligned (or very variable) sequences, or sequences that are almost identical, might also not "be able" to resolve many of the branches on the tree. Or the sequences might be evolving in an unusual way.
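To get a quick feel for how much signal an alignment holds, one can count variable and parsimony-informative columns. The sketch below is a minimal version under a few assumptions: the function name and toy sequences are invented for illustration, gaps and the ambiguity characters `X` and `?` are simply ignored, and a column counts as parsimony-informative when at least two states each occur in at least two sequences.

```python
from collections import Counter

def site_summary(sequences):
    """Count total, variable and parsimony-informative columns.

    sequences: list of aligned sequences of equal length.
    """
    variable = 0
    informative = 0
    for column in zip(*sequences):
        # Ignore gaps and ambiguous characters when counting states.
        counts = Counter(c for c in column if c not in "-X?")
        if len(counts) > 1:
            variable += 1
        # Informative: at least two states, each seen in two or more sequences.
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            informative += 1
    return {
        "columns": len(sequences[0]),
        "variable": variable,
        "parsimony_informative": informative,
    }

# Hypothetical toy alignment, for illustration only.
toy = ["MKVLAT", "MKVLGT", "MRILGT", "MRIQAT"]
summary = site_summary(toy)
print(summary)
```

If the number of informative columns is small relative to the number of sequences, low support on many branches is to be expected whatever method you use.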
Sometimes it helps to remove the most divergent sequences. Also, the quality of the MSA is of the utmost importance; I recommend using MAFFT or ProbCons to calculate it.
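One simple way to spot candidates for removal is to rank sequences by their mean pairwise p-distance (the fraction of differing positions) to all the others. This is only a sketch: the function names and toy data are made up, positions with a gap in either sequence are skipped, and an uncorrected p-distance is a crude measure, so treat the ranking as a hint about which sequences to inspect rather than a rule for what to delete.

```python
def p_distance(a, b):
    """Fraction of differing positions, skipping columns with a gap in either sequence."""
    compared = 0
    differences = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differences += 1
    return differences / compared if compared else float("nan")

def mean_distances(alignment):
    """Mean p-distance from each sequence to every other sequence."""
    names = list(alignment)
    result = {}
    for name in names:
        others = [p_distance(alignment[name], alignment[o]) for o in names if o != name]
        result[name] = sum(others) / len(others)
    return result

# Hypothetical toy alignment, for illustration only.
toy = {
    "sp1": "MKVLATGG",
    "sp2": "MKVLATGA",
    "sp3": "MKILATGG",
    "sp4": "MRWQPSGG",
}
means = mean_distances(toy)
most_divergent = max(means, key=means.get)
for name in sorted(means, key=means.get, reverse=True):
    print(name, round(means[name], 3))
```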
Hi,
I would appreciate knowing how to read and interpret a phylogenetic tree for viruses.
What do the bootstrap numbers mean, and how are they interpreted?
Regards
Oumed
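The bootstrap values indicate how well the data support the grouping at any given node. Each value is the percentage of bootstrap replicates (trees rebuilt from alignment columns resampled with replacement) in which that grouping is recovered, so it measures how consistently the data support a node rather than proving that the node is correct. A tree is calculated from existing molecular data, and the more data one can gather, the easier it will be to relate the sequences in a well-supported phylogeny. If the sequence evidence is not good enough, the bootstrap values will be low.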
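To make the idea concrete, here is a small Python sketch of the two halves of a bootstrap analysis: resampling alignment columns with replacement, and counting how often a clade appears across replicate trees. It is a sketch under stated assumptions: the function names are invented, the tree-building step is left out entirely, and each replicate tree is represented simply as a set of clades (frozensets of taxon names), which a real program would derive from the inferred trees.

```python
import random

def bootstrap_columns(alignment, rng):
    """Build one pseudo-replicate by sampling columns with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    picks = [rng.randrange(length) for _ in range(length)]
    # The same column picks are applied to every sequence, so columns stay intact.
    return {name: "".join(alignment[name][i] for i in picks) for name in names}

def clade_support(clade, replicate_trees):
    """Percentage of replicate trees that contain the given clade."""
    clade = frozenset(clade)
    hits = sum(1 for tree in replicate_trees if clade in tree)
    return 100.0 * hits / len(replicate_trees)

# Hypothetical replicate trees, each written as the set of clades it contains.
replicates = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
    {frozenset({"A", "C"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
]
support_ab = clade_support({"A", "B"}, replicates)
support_cd = clade_support({"C", "D"}, replicates)
print(support_ab, support_cd)

# One pseudo-replicate of a tiny made-up alignment.
pseudo = bootstrap_columns({"A": "MKVL", "B": "MRVL"}, random.Random(0))
print(pseudo)
```

In this toy example the clade (A, B) appears in three of four replicates and (C, D) in one, so a node grouping A with B would be labelled 75 and one grouping C with D would be labelled 25. Real analyses typically use hundreds or thousands of replicates.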