I have orthologous clusters of viral genes between related species. I am working with Vaccinia virus and want to compare Vaccinia genes from the Copenhagen strain to other Vaccinia strains (e.g. WR) and determine dN/dS, which hopefully hints at which genes are undergoing rapid evolution and which are constrained. I would then also like to compare Vaccinia genes to MPXV and see how these values change.
I have all the translation-based alignments done at the codon level, and have even computed trees with RAxML for groups of 4-5 strains.
I can run codeml with NSsites = 0, model = 0 and runmode = -2, which calculates dN/dS in a pairwise manner. However, reading the PAML documentation confuses me, as it seems model = 0 is not meant to look for selective pressure or rapid evolution. A paper looking at pro-/antiviral factors for HIV used model 8 (M8) according to their methods, but also somehow calculated p-values to determine whether genes are undergoing statistically significant positive selection (and at which amino acid positions) - https://www.nature.com/articles/s41467-022-29346-w
As much as I try to read the PAML documentation to understand what they did and what I need to do, I can't get anywhere meaningful.
If anyone has a pipeline or scripts to do something similar please let me know. Thanks
When I did a similar gene-level analysis, I used Datamonkey (https://www.datamonkey.org/). As far as I know it runs HyPhy methods rather than CodeML, but it tests for the same kind of selection signal. It is a web interface that is very easy to use: you choose your general conditions/criteria and it suggests an evolutionary model for you. If I recall correctly, it only takes the aligned sequences of the gene of interest. Have you tried using this service?
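If you would rather stay with codeml, p-values for gene-wide positive selection usually come from a likelihood ratio test between two nested site models, typically M7 (NSsites = 7, no positive selection) against M8 (NSsites = 8, allows a class of sites with dN/dS > 1). I am assuming that is what the paper did; I have not checked their methods in detail. Below is a minimal Python sketch, not a tested pipeline: the file names are hypothetical, the log-likelihoods in the usage example are made up, and the 2 degrees of freedom are the usual convention for M7 vs M8.

```python
import math


def site_model_ctl(seqfile, treefile, outfile):
    """Return the text of a codeml control file that fits M7 and M8 in one run."""
    lines = [
        f"seqfile = {seqfile}",    # codon alignment (hypothetical path)
        f"treefile = {treefile}",  # tree, e.g. from RAxML (hypothetical path)
        f"outfile = {outfile}",
        "noisy = 3",
        "verbose = 1",
        "runmode = 0",       # use the supplied tree (runmode = -2 is pairwise)
        "seqtype = 1",       # codon sequences
        "CodonFreq = 2",     # F3x4 codon frequencies
        "model = 0",         # one set of omega classes shared by all branches
        "NSsites = 7 8",     # site models M7 and M8
        "icode = 0",         # universal genetic code
        "fix_kappa = 0",
        "kappa = 2",
        "fix_omega = 0",
        "omega = 0.4",       # starting value only
        "cleandata = 0",
    ]
    return "\n".join(lines) + "\n"


def lrt_pvalue(lnl_null, lnl_alt):
    """Likelihood ratio test with 2 degrees of freedom.

    For a chi-square distribution with df = 2 the survival function is
    exactly exp(-x / 2), so no external statistics library is needed.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, math.exp(-stat / 2.0)


if __name__ == "__main__":
    # Write the control file, then run: codeml sites.ctl
    with open("sites.ctl", "w") as handle:
        handle.write(site_model_ctl("gene.codon.phy", "gene.tree", "gene.mlc"))

    # Made-up lnL values; read the real ones from the codeml output file.
    stat, p = lrt_pvalue(lnl_null=-1000.0, lnl_alt=-995.0)
    print(f"2*delta(lnL) = {stat:.2f}, p = {p:.4g}")
```

When M8 fits significantly better, the codeml output also lists Bayes Empirical Bayes (BEB) posterior probabilities per site, which is where the "which amino acid positions" part comes from. With many genes you would still need to correct the p-values for multiple testing.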