A rather general question:
The phylogeny of many gene families has been studied using only the coding sequences (CDS). The usual workflow is to identify the relevant genes, align them, build trees, and interpret the results.
How can you adapt this process to newly annotated full-length transcripts? Do introns matter?
I guess you can do the same work with full-length transcripts and compare the trees obtained with both methods, classify the genes by the length and structure of their 3' UTR, or look for regulatory motifs. Are there other analyses worth doing, or relevant tools?
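To make the tree comparison concrete, here is a rough sketch of what I have in mind: count the clades that appear in one tree but not the other (a Robinson-Foulds-style distance on rooted trees). The nested-tuple tree format, the function names and the OR1..OR4 gene labels are all made up for illustration; real trees would come from whatever tree-building program is used, and a dedicated phylogenetics library would be the safer choice in practice.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of leaf names.

    A tree is a nested tuple of leaf-name strings, e.g. (("OR1", "OR2"), "OR3").
    This representation is a toy assumption, not a standard file format.
    """
    found = set()

    def leaves(node):
        # A string is a leaf; a tuple is an internal node with children.
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)
        return members

    all_leaves = leaves(tree)
    # The root clade contains every leaf, so it carries no information.
    found.discard(all_leaves)
    return found


def rooted_rf_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical trees for the same four genes, built from two data sets.
cds_tree = ((("OR1", "OR2"), "OR3"), "OR4")
transcript_tree = ((("OR1", "OR3"), "OR2"), "OR4")

print(rooted_rf_distance(cds_tree, transcript_tree))
```

A distance of zero would mean both data sets support the same groupings; larger values mean the CDS and full-length trees disagree on more clades.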
I would be interested in any advice, article, or textbook reference. This kind of phylogeny often seems to be done on viruses...
Thanks for the comment. I was considering olfactory receptor genes, which are poorly annotated. They tend to be a bit of a mess (genomic clusters with many similar sequences). Zhang and Firestein wrote some papers on their CDS in 2002/2004.
An interesting question is whether these genes have conserved regions in their introns/UTRs, and whether those regions evolve faster than the CDS.
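As a crude first pass at the rate question, one could compare the uncorrected pairwise distance (p-distance) of the same gene pair in the CDS and in the UTR or intron alignment. The sketch below assumes the two regions have already been aligned separately; the function name and the toy sequences are invented for illustration, and a p-distance ignores multiple substitutions, so it can only give a rough indication.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ("-") are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


# Toy aligned fragments for one hypothetical gene pair (not real data).
cds_pair = ("ATGGCCTTAA", "ATGGCTTTAA")
utr_pair = ("TTAC-GGATA", "TAACTGCATT")

print("CDS p-distance:", p_distance(*cds_pair))
print("UTR p-distance:", p_distance(*utr_pair))
```

If the UTR or intron distance is consistently higher than the CDS distance across many gene pairs, that would suggest those regions evolve faster, although a proper test would need a substitution model.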