Hi, I am trying to use codeml to test for positive selection, using a multiple nucleotide sequence alignment (PHYLIP format) and a maximum-likelihood tree (Newick format, from RAxML).
I am using the branch-site model, specified with model = 2 and NSsites = 2, and I compare the null model (omega fixed) against the alternative model (omega estimated).
To mark a branch as the foreground branch, I need to add #1 to that branch in the tree file. I know which taxa I am interested in. If I want to label the branch leading to the smallest monophyletic group that contains those taxa (see the example below), how can I do it automatically? I have hundreds of trees to run through codeml, so I need an automatic way to label them. Searching by hand for the right branch in a tree file with hundreds of taxa is also painful.
( (A37:0.0392824443,A36:0.2301499674):0.0070550356, (A35:0.0345817897, (A34:0.0028844901,A33:0.0086649064):0.0413749361 ):0.0479157566)
For example, say I am interested in A35. The smallest monophyletic group that contains A35 is:
(A35:0.0345817897, (A34:0.0028844901,A33:0.0086649064):0.0413749361 )
So I will add #1 right after the closing parenthesis of that group. The final labeling should be:
( (A37:0.0392824443,A36:0.2301499674):0.0070550356, (A35:0.0345817897, (A34:0.0028844901,A33:0.0086649064):0.0413749361 )#1:0.0479157566)
But how can I do this automatically?
The following codeml wrapper may interest you: http://etetoolkit.org/docs/2.3/tutorial/tutorial_adaptation.html (see "Easy branch selection and inline visualization of results").
It is also available as an interactive command-line tool: http://etetoolkit.org/documentation/ete-evol/
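If installing an extra toolkit is not an option, the labeling step itself can be sketched in plain Python. This is only a minimal illustration, not part of codeml or of the wrapper linked above: the function name `label_foreground` is made up here, and it assumes unquoted taxon names, no labels already present in the tree, and that the mark belongs directly after the closing parenthesis, as in the question. It scans the Newick string once and puts the mark after the first clade to close that contains all requested taxa, which is the smallest such clade.

```python
import re

# Characters that can make up an unquoted taxon name (assumption: no quoted names).
NAME = re.compile(r"[^\s:,();]+")


def label_foreground(newick, taxa, mark="#1"):
    """Insert `mark` after the ')' closing the smallest clade containing all `taxa`."""
    targets = set(taxa)
    stack = []           # one set of leaf names per currently open clade
    expect_leaf = False  # True right after '(' or ',', where a leaf name may start
    i = 0
    while i < len(newick):
        ch = newick[i]
        if ch == "(":
            stack.append(set())
            expect_leaf = True
        elif ch == ",":
            expect_leaf = True
        elif ch == ")":
            leaves = stack.pop()
            if targets <= leaves:
                # Inner clades close first, so this is the smallest match.
                return newick[:i + 1] + mark + newick[i + 1:]
            if stack:
                stack[-1] |= leaves  # pass the leaves up to the enclosing clade
            expect_leaf = False
        elif expect_leaf and not ch.isspace():
            m = NAME.match(newick, i)
            if m and stack:
                stack[-1].add(m.group(0))
                i = m.end() - 1  # jump to the last character of the name
            expect_leaf = False
        i += 1
    raise ValueError("no clade contains all requested taxa")


example_tree = (
    "( (A37:0.0392824443,A36:0.2301499674):0.0070550356, "
    "(A35:0.0345817897, (A34:0.0028844901,A33:0.0086649064):0.0413749361 )"
    ":0.0479157566)"
)
print(label_foreground(example_tree, ["A35"]))
```

For the example tree, asking for A35 places #1 after the group (A35, (A34, A33)), while asking for A34 and A33 places it after the inner (A34, A33) group. To process hundreds of trees, the same function can be called in a loop over the tree files, with the taxa of interest for each one.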