I am reconstructing a gene tree for geneA of a bacterial strain. Some of the sequences are identical. Should I include those sequences in the tree analysis? If yes, what is a possible explanation for the identical sequences?
Two interesting posts:
How do exactly identical sequences in the alignment effect the log(likelihood) score?
Phylogenetic tree editing: Reinserting removed identical sequences
I will quote the response by Alexandros Stamatakis (RAxML author) from the first link, so my answer consists of more than just links:
That's a good way to do it, I'd remove the duplicate seqs though because:
- they don't provide additional info
- the analyses will run faster with less seqs
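If you do remove the duplicates, it helps to record which names each retained sequence stands for, so the removed ones can be put back on the tree afterwards (the topic of the second link). Below is a minimal sketch of that bookkeeping in plain Python. The function names `parse_fasta` and `collapse_identical` and the example strain names are hypothetical, not part of RAxML or any other tool. It assumes the input is an aligned FASTA, so "identical" means identical alignment rows, gaps included.

```python
def parse_fasta(text):
    """Yield (name, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the name.
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def collapse_identical(records):
    """Group sequence names by identical sequence (case-insensitive).

    Returns a dict mapping each unique sequence to the list of names
    that share it, in order of first appearance.
    """
    groups = {}
    for name, seq in records:
        groups.setdefault(seq.upper(), []).append(name)
    return groups


# Hypothetical example alignment: strain1 and strain2 are identical.
example = """>strain1
ATGCCGTA
>strain2
ATGCCGTA
>strain3
ATGACGTA
"""

groups = collapse_identical(parse_fasta(example))

# Write one representative per group; keep the full name list for later.
for seq, names in groups.items():
    print(">" + names[0])
    print(seq)
    if len(names) > 1:
        print("# also represents:", ", ".join(names[1:]))
```

The first name in each group serves as the representative in the reduced alignment; the remaining names can later be attached next to it as zero-length branches.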
Yes, you should include them. Either those bacteria are very close
relatives, or this is an important gene under purifying selection: mutations
that spoil its function are removed because bacteria carrying them do not
survive. Both explanations can hold at once.
What do you think the explanation is?