I am interested in calculating the dN/dS ratio at every codon site across ~45 aligned RNA virus genomes (including non-coding regions). I am not sure of the best way to go about this. Should I run the whole genome at once, or specific regions one at a time? How should gaps in the alignment be handled? I have seen many programs that calculate these values, but I am not sure which would be best (PAUP, PAML, MrBayes, etc.).
I am especially interested in seeing whether there is any selection to conserve synonymous codons in the non-coding regions. So it is better to use individual regions. Should I just cut these regions from the whole-genome alignment, or align them independently? I agree that PAML is difficult to use; it has lots of parameters.
I think it would be better to locate the regions using synteny and align each of them independently, similar to looking for orthologs when working with genes. If you perform the whole-genome alignment and then cut it, it will take a lot of time and computational power, and you will introduce many errors, because alignment programs are imperfect and will try to align non-homologous regions.
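If you already have the whole-genome alignment, one way to get inputs for the independent per-region alignments is to slice out the columns of a region and strip the gaps, then hand the result to your aligner of choice. This is only a minimal sketch in plain Python: the function names, the toy sequences and the column coordinates are all made up for illustration, and it assumes the alignment is held as a dict of equal-length strings.

```python
def extract_region(alignment, start, end, gap_chars="-."):
    """Slice alignment columns [start, end) and strip gap characters.

    `alignment` maps sequence names to aligned strings of equal length.
    The result holds unaligned fragments, ready to be re-aligned.
    """
    region = {}
    for name, seq in alignment.items():
        fragment = seq[start:end]
        region[name] = "".join(ch for ch in fragment if ch not in gap_chars)
    return region


def to_fasta(records, width=60):
    """Format a name -> sequence dict as FASTA text."""
    lines = []
    for name, seq in records.items():
        lines.append(">" + name)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


# Toy alignment (hypothetical names and sequences, for illustration only).
toy = {
    "virusA": "ATG--CCGTAA",
    "virusB": "ATGAACCG---",
}
region = extract_region(toy, 0, 8)
fasta_text = to_fasta(region)
```

Note that the coordinates here are alignment columns, not genome positions, so they have to be read off the alignment itself.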
The biggest problem I have with PAML is its parameterization. Any suggestions on that front?
Actually, the link that fransua posted contains a tutorial on running the free-ratio model, which estimates dN/dS (a separate ratio for each branch). I would recommend using ete2, or at least following the instructions in the tutorial to configure your PAML control file.
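For the parameterization question, it may help to see how few settings a basic codeml run actually needs. The sketch below writes a control file for the free-ratio model (model = 1, NSsites = 0) from Python. The option names are standard codeml ones, but the file names are hypothetical and the starting values for kappa and omega are just assumptions to be adjusted for your data; this is not the tutorial's exact configuration.

```python
def codeml_control(seqfile, treefile, outfile, model=1, nssites=0, cleandata=0):
    """Return the text of a codeml control file.

    Defaults select the free-ratio branch model. Starting values for
    kappa and omega are illustrative assumptions, not recommendations.
    """
    settings = [
        ("seqfile", seqfile),      # codon alignment
        ("treefile", treefile),    # tree in Newick format
        ("outfile", outfile),      # main results file
        ("noisy", 3),              # amount of screen output
        ("verbose", 1),            # detailed output file
        ("runmode", 0),            # 0: use the user-supplied tree
        ("seqtype", 1),            # 1: codon sequences
        ("CodonFreq", 2),          # 2: F3x4 codon frequencies
        ("model", model),          # 0: one ratio, 1: free-ratio per branch
        ("NSsites", nssites),      # 0: no variation in omega among sites
        ("icode", 0),              # 0: universal genetic code
        ("fix_kappa", 0),          # 0: estimate kappa
        ("kappa", 2),              # starting value (assumed)
        ("fix_omega", 0),          # 0: estimate omega
        ("omega", 0.4),            # starting value (assumed)
        ("cleandata", cleandata),  # 1: drop sites with gaps or ambiguity
    ]
    return "\n".join(f"{key:>10} = {value}" for key, value in settings) + "\n"


# Hypothetical file names, for illustration only.
ctl_text = codeml_control("region1.phy", "region1.nwk", "region1.out")

# To use it, save the text and pass the file to codeml, for example:
# with open("codeml.ctl", "w") as handle:
#     handle.write(ctl_text)
```

Setting cleandata=1 is one way to deal with the gap question, since codeml then removes sites containing gaps or ambiguity characters; leaving it at 0 keeps those sites and treats the gaps as missing data.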