How To Best Calculate dN/dS Ratios Of Viral Genomes
11.3 years ago
pld 5.1k

I am interested in calculating the dN/dS ratio at every codon site across an alignment of ~45 RNA virus genomes (the alignment also includes non-coding regions). I am not sure of the best way to go about this. Should I run the whole genome at once, or specific regions at a time? How should gaps in the alignment be handled? I have seen many programs that calculate these values (PAUP, PAML, MrBayes, etc.), but I am not sure which would be best.

rna phylogeny msa paml • 5.5k views
11.3 years ago
fransua ▴ 390

The ETE2 package (http://ete.cgenomics.org) recently released a new version that completely wraps CodeML and lets you visualize the results. In my experience, CodeML may be easier to use this way (see the tutorial for this new feature: http://pythonhosted.org/ete2/tutorial/tutorial_adaptation.html).

11.3 years ago
Biojl ★ 1.7k

Although PAML can be a bit difficult to use, I think it's the best option. Read all the options in the configuration file very carefully. PAML only uses positions where there are no gaps, and it does that automatically; you only have to provide the alignment. That covers the technical part of the question. I don't understand why you would want to run the whole genome at once. To me it would be much more interesting to run specific regions and compare them, to try to work out why particular regions are evolving faster.
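To make the gap handling concrete, here is a minimal Python sketch of what removing gapped positions means for a codon alignment: any codon column in which at least one sequence has a gap or an ambiguous base is dropped from every sequence. As far as I understand, in codeml this behaviour is governed by the `cleandata` setting in the control file (1 removes such sites, 0 keeps them), so it is worth checking that option rather than relying on a default. The function name `drop_gapped_codons` and the toy sequences are made up for illustration; this is not PAML's own code.

```python
def drop_gapped_codons(alignment):
    """Drop every codon column where any sequence has a gap or ambiguous base.

    `alignment` maps sequence names to aligned, in-frame coding sequences
    of equal length (a multiple of 3). Illustrative only, not PAML's code.
    """
    seqs = {name: seq.upper() for name, seq in alignment.items()}
    lengths = {len(seq) for seq in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()
    if length % 3 != 0:
        raise ValueError("alignment length must be a multiple of 3")

    unambiguous = set("ACGTU")
    kept = {name: [] for name in seqs}
    for start in range(0, length, 3):
        codons = {name: seq[start:start + 3] for name, seq in seqs.items()}
        # Keep the column only if every codon is made of plain bases.
        if all(set(codon) <= unambiguous for codon in codons.values()):
            for name, codon in codons.items():
                kept[name].append(codon)
    return {name: "".join(parts) for name, parts in kept.items()}


# Toy alignment (hypothetical data): column 2 has an N, column 3 has a gap.
example = {
    "virus_A": "ATGGCC---TTTGGA",
    "virus_B": "ATGGCAAAATTCGGA",
    "virus_C": "ATGNCAAAATTTGGA",
}
cleaned = drop_gapped_codons(example)
```

Here two of the five codon columns are removed, which shows how quickly usable sites can disappear when ~45 genomes with scattered gaps are analysed together.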


I am especially interested in seeing whether there is any selection to conserve synonymous codons in the non-coding regions. So it is better to use individual regions. Should I just cut these regions out of the whole-genome alignment, or align them independently? I agree that PAML is difficult to use; it has lots of parameters.


I think it would be better to locate the regions by synteny and align each one independently, similar to looking for orthologs when working with genes. If you align the whole genomes and then cut the alignment, it will take a lot of time and computational power, and you will introduce a lot of errors, because alignment programs are imperfect and will try to align non-homologous regions.
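The per-region approach could be sketched roughly like this in Python: cut the same region out of each unaligned genome using per-genome coordinates, then write the pieces to FASTA so they can be aligned on their own. The coordinates are assumed to come from your own annotation, and the function names, genome names and sequences are all hypothetical. Keep in mind that dN/dS is only defined for protein-coding sequence, so for codeml the region should be a coding region cut in frame.

```python
def extract_region(genomes, coords):
    """Cut one region out of each unaligned genome.

    genomes: dict of name -> unaligned genome sequence
    coords:  dict of name -> (start, end), 1-based inclusive coordinates
             of the region in that genome (assumed to come from annotation)
    """
    regions = {}
    for name, genome in genomes.items():
        if name not in coords:
            continue  # region not annotated in this genome, so skip it
        start, end = coords[name]
        if not (1 <= start <= end <= len(genome)):
            raise ValueError(f"bad coordinates for {name}: {start}-{end}")
        regions[name] = genome[start - 1:end]
    return regions


def to_fasta(records, width=60):
    """Format a dict of name -> sequence as FASTA text."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Toy genomes and coordinates (hypothetical data).
genomes = {
    "virus_A": "CCATGGCCAAATTTGGATT",
    "virus_B": "ATGGCAAAATTCGGACC",
}
coords = {"virus_A": (3, 17), "virus_B": (1, 15)}

regions = extract_region(genomes, coords)
fasta = to_fasta(regions)
```

The resulting FASTA text can then be passed to whichever aligner you prefer, ideally one that aligns by codon, so that each region gets its own alignment.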


The biggest problem I have with PAML is its parameterization. Any suggestions on that front?


Actually, the link that fransua posted contains a tutorial on running the free-ratio model, which is the one used to calculate dN/dS. I would recommend using ete2, or at least following the instructions in the tutorial to configure your PAML parameter file.
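If you end up writing the parameter file by hand, a small Python helper like the one below can keep the settings in one place. It is only a sketch: the option names are the usual codeml control-file keys, but the values are assumptions to adapt to your data, the file names are placeholders, and the helper `codeml_control` is made up for this example. With `model = 1` and `NSsites = 0` codeml fits the free-ratio model (one dN/dS per branch); for variation among sites you would instead keep `model = 0` and choose one or more `NSsites` models.

```python
def codeml_control(seqfile, treefile, outfile, model=1, nssites=0, cleandata=1):
    """Build the text of a codeml control file (illustrative values only)."""
    options = [
        ("seqfile", seqfile),      # codon alignment
        ("treefile", treefile),    # tree in Newick format
        ("outfile", outfile),      # main result file
        ("noisy", 3),
        ("verbose", 1),
        ("runmode", 0),            # 0: user-supplied tree
        ("seqtype", 1),            # 1: codon sequences
        ("CodonFreq", 2),          # 2: F3x4 codon frequencies
        ("model", model),          # 0: one ratio, 1: free ratio per branch
        ("NSsites", nssites),      # 0: one omega across sites
        ("icode", 0),              # 0: universal genetic code
        ("fix_kappa", 0),          # estimate kappa
        ("kappa", 2),              # initial kappa
        ("fix_omega", 0),          # estimate omega
        ("omega", 0.4),            # initial omega
        ("cleandata", cleandata),  # 1: remove sites with gaps/ambiguity
    ]
    return "\n".join(f"{key:>12} = {value}" for key, value in options) + "\n"


# Placeholder file names, not files from this thread.
ctl = codeml_control("region1.phy", "region1.nwk", "region1_free_ratio.out")
```

Writing `ctl` to a file and running codeml on it is left out here, since how you invoke codeml depends on your installation.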

