Hi,
I would like to build a single phylogenetic tree from a multiple alignment of sequences from different species.
For the sequences, should I use the whole transcript or only the CDS part? Or does it not matter?
Thanks for your help.
It does matter.
Go for the CDS (or, even better, the protein). Evolutionary constraints/conservation act mainly on those sequences (and less on the UTRs, which are what you add when using the complete mRNA/transcript).
In the unlikely event that the CDS/protein does not give you enough 'resolution', you might consider adding the UTRs to the analysis, but keep in mind that this comes with a whole bunch of other issues :/ (you might even go very wild and add all the introns, but let's not go there yet :) )
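To make the CDS/protein suggestion concrete, here is a minimal sketch in plain Python of trimming a transcript down to its CDS and translating it before alignment. The function names, the toy sequence and the coordinates are all made up for illustration; in practice the CDS start/end would come from your annotation, and the sketch assumes the standard genetic code and 0-based, end-exclusive coordinates.

```python
# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def extract_cds(transcript, cds_start, cds_end):
    """Return the CDS slice of a transcript, dropping the 5' and 3' UTRs.

    Hypothetical helper: coordinates are 0-based and end-exclusive.
    """
    return transcript[cds_start:cds_end].upper().replace("U", "T")


def translate(cds):
    """Translate a CDS codon by codon, stopping at the first stop codon.

    Codons that are not in the table (e.g. containing N) become 'X'.
    """
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Toy transcript: 6 nt of 5' UTR, a 12 nt CDS, 6 nt of 3' UTR (invented data).
transcript = "GGCACC" + "ATGGCCAAGTAA" + "TTTGCA"
cds = extract_cds(transcript, 6, 18)
protein = translate(cds)
print(cds)
print(protein)
```

You would do this for every species, then align either the CDSs or the proteins and build the tree from that alignment.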
Thanks for your answer.
I am not sure I understand this. Do you mean that the algorithms used by phylogenetic software (such as RAxML, PhyML or whatever) use models that work on the CDS (DNA level) and/or on amino acids?
Well, yes and no.
I meant that the biological constraints imposed by evolution act mainly on those kinds of sequences. As a consequence, the algorithms that model/analyse those things also perform best on (are optimised for) those sequences.