I am trying to construct a phylogenetic tree for 80 related fish species based on their whole-genome sequences. I used BUSCO to identify the single-copy orthologs in each genome and then the single-copy orthologs common to all of them. Then I extracted the sequences of these common single-copy orthologs from the genomes. However, the problem is that, for a single BUSCO ID, the sequences from these organisms are much more diverged from each other than expected. Can anyone tell me the cause and how the problem can be solved? And if anyone has another idea for constructing a whole-genome phylogenetic tree, kindly let me know.
You can't undo/obscure the evolutionary signal in your data. It sounds like your assumptions about evolutionary distance between organisms may be wrong, or your data has a lot of spurious variation.
mtDNA is often used for phylogenetics given its slower rate of accumulating mutations.
I should proceed with protein sequences. I also have the mitochondrial sequences, so I will use them as well. Thanks for the suggestion.
Making a concatenated alignment of all your common single-copy genes might be worth considering?
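To make the concatenation idea concrete, here is a minimal sketch in plain Python. It assumes each gene has already been aligned separately (with whatever aligner you prefer) and loaded into a dictionary keyed by taxon name; the function name `concatenate_alignments` and the input layout are hypothetical, not part of BUSCO or any library. Taxa missing from a gene are padded with gaps, and the gene boundaries are returned so they can be used as partitions later.

```python
def concatenate_alignments(gene_alignments):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments: dict mapping gene_id -> {taxon: aligned_sequence}.
    Returns (supermatrix, partitions), where supermatrix maps
    taxon -> concatenated sequence and partitions is a list of
    (gene_id, start, end) with 1-based inclusive coordinates.
    """
    # Every taxon seen in at least one gene gets a row in the supermatrix.
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    pieces = {t: [] for t in taxa}
    partitions = []
    start = 1

    for gene_id in sorted(gene_alignments):
        aln = gene_alignments[gene_id]
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene_id}: sequences are not aligned to equal length")
        length = lengths.pop()

        for taxon in taxa:
            # A taxon absent from this gene is filled with gap characters.
            pieces[taxon].append(aln.get(taxon, "-" * length))

        partitions.append((gene_id, start, start + length - 1))
        start += length

    supermatrix = {t: "".join(parts) for t, parts in pieces.items()}
    return supermatrix, partitions
```

For example, `concatenate_alignments({"g1": {"A": "AC-T", "B": "ACGT"}, "g2": {"A": "GG", "C": "GA"}})` gives taxon `B` the row `ACGT--` and reports `g2` as columns 5 to 6.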
That is what I am trying to do. But when I extracted the nucleotide CDS sequences of the single-copy genes and merged them per gene across all organisms, I found that they are diverged from each other and contain stop codons.
Can you elaborate on the stop codon issue you mention?
If annotated genes include stop codons, it's because the gene (or genome) annotation was done incorrectly.
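A quick way to see how widespread the stop codon problem is would be to scan each extracted CDS for in-frame stops. The sketch below is only an illustration: it assumes the standard nuclear genetic code (mitochondrial genes use a different table), and the function name `internal_stop_positions` is made up for this example. A stop at the very end of the sequence is treated as the normal terminator and ignored.

```python
# Stop codons of the standard nuclear genetic code (an assumption here;
# vertebrate mitochondrial genes use a different set).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def internal_stop_positions(cds, frame=0):
    """Return 0-based start positions of in-frame stop codons in `cds`,
    ignoring a stop codon that terminates the sequence."""
    seq = cds.upper().replace("U", "T")
    # Only complete codons are considered; trailing 1-2 bases are skipped.
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]  # the terminal stop is expected
    return [frame + 3 * i for i, codon in enumerate(codons) if codon in STOP_CODONS]


def clean_frames(cds):
    """Return the reading frames (0, 1, 2) that have no internal stop."""
    return [f for f in range(3) if not internal_stop_positions(cds, f)]
```

If a sequence only comes out clean in frame 1 or 2, or in no frame at all, that hints that what was extracted is not a properly spliced, in-frame CDS (for example a wrong start coordinate, retained introns, or the wrong strand).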
Is there a reason you are not using the protein sequences identified by BUSCO?
I was unaware that a time tree could be fitted with protein sequences, so I considered nucleotide sequences, but now I think I should proceed with protein sequences only. Thank you for your response.
Why do you think that being "much more diverged from each other than expected" is a problem?
Also, are you sure that you took the sequences in the proper orientation? Sequences can seem more divergent than they actually are if one of them was taken as the reverse complement of what it should be.
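One rough, alignment-free way to check orientation is to count how many k-mers a sequence shares with a trusted reference ortholog in each orientation and keep the orientation that shares more. This is a sketch under assumptions, not a standard tool: the names `best_orientation` and `kmers` and the default `k=8` are arbitrary choices for illustration, and very divergent sequences may share too few k-mers for the result to be meaningful.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse-complement a DNA string; characters such as N are left as-is."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq, k):
    """Set of all overlapping k-mers in `seq` (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_orientation(query, reference, k=8):
    """Return ("forward" | "reverse", sequence) for the orientation of
    `query` that shares more k-mers with `reference`. Ties keep forward."""
    ref_kmers = kmers(reference, k)
    rc = reverse_complement(query)
    forward_hits = len(kmers(query, k) & ref_kmers)
    reverse_hits = len(kmers(rc, k) & ref_kmers)
    if reverse_hits > forward_hits:
        return "reverse", rc
    return "forward", query
```

Running every extracted sequence for a given BUSCO ID against one reference sequence for that ID, and flipping the ones reported as "reverse" before aligning, should show whether orientation explains the apparent divergence.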