Question

How do I find the best model for calculating dN/dS with a large dataset (727 sequences)?


4.3 years ago

mauricio.1313 • 0

Hi!

I'm working on my thesis project, in which I calculate dN/dS with CODEML. I have 3 different proteins from different viruses, with 727 sequences for each protein. I have run several analyses in parallel: pairwise comparison (Yang and Nielsen 2000), model 0 (M0), and site models.

I have also evaluated the likelihood of the sequences over a range of ω values, and the sensitivity of ω to the transition/transversion ratio (κ). My main goal now is to find the best model for my data in order to estimate ω. I know that I can use the lnL values in a likelihood ratio test (LRT) to choose the best model, but my problem is obtaining a single lnL for all 727 sequences under a specific model, for example one for M7 (LnL = 0,84848....) and another for M8 (LnL = ....).

The difficulty is that every analysis I have run so far returns an lnL for each pairwise comparison, which means 727 * 727 different lnL values instead of one per model.
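A possible reading of this symptom, stated as an assumption and not a confirmed diagnosis: in the codeml control file, `runmode = -2` requests pairwise comparisons, while `runmode = 0` with a user-supplied tree covering all sequences fits each site model to the whole alignment and reports one lnL per model. The sketch below only writes such a control file; the function name and the file names `protein1.phy`, `protein1.tree` and `protein1_sites.out` are hypothetical placeholders, and the option values are common defaults that should be checked against the PAML manual.

```python
def make_codeml_ctl(seqfile, treefile, outfile, nssites=(0, 1, 2, 7, 8)):
    """Build the text of a codeml control file for a site-model run.

    Assumption: one codon alignment and one tree containing all sequences.
    """
    options = [
        ("seqfile", seqfile),      # codon alignment (hypothetical file name)
        ("treefile", treefile),    # tree for all sequences (hypothetical file name)
        ("outfile", outfile),      # main result file
        ("noisy", 3),
        ("verbose", 0),
        ("runmode", 0),            # 0 = user tree; -2 would be pairwise
        ("seqtype", 1),            # 1 = codon sequences
        ("CodonFreq", 2),          # 2 = F3x4
        ("model", 0),              # one omega across branches
        ("NSsites", " ".join(str(n) for n in nssites)),  # site models to fit
        ("icode", 0),              # universal genetic code
        ("fix_kappa", 0),          # estimate kappa
        ("kappa", 2),              # initial kappa
        ("fix_omega", 0),          # estimate omega
        ("omega", 1),              # initial omega
        ("cleandata", 0),
    ]
    return "".join(f"{key} = {value}\n" for key, value in options)


ctl_text = make_codeml_ctl("protein1.phy", "protein1.tree", "protein1_sites.out")
with open("codeml.ctl", "w") as handle:
    handle.write(ctl_text)
```

Running codeml with a file like this would need a phylogeny for the 727 sequences, which has to be inferred separately beforehand.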

I'm looking for something like this:

Model 0: one-ratio
...
lnL(ntime: 23 np: 25): -1133.892429 +0.000000
...
kappa (ts/tv) = 2.27532
omega (dN/dS) = 0.91089
branch = 14..1 t = 0.024 N = 234.4 S = 38.6 dN/dS = 0.9109 dN = 0.0078 dS = 0.0086 NdN = 1.8 SdS = 0.3

Model 1: NearlyNeutral (2 categories)
lnL(ntime: 23 np: 26): -1110.361397 +0.000000
...
kappa (ts/tv) = 2.28035
dN/dS (w) for site classes (K=2)
p: 0.49440 0.50560
w: 0.08871 1.00000
...
branch = 14..1 t = 0.025 N = 234.4 S = 38.6 dN/dS = 0.9109 dN = 0.0078 dS = 0.0086 NdN = 1.8 SdS = 0.3

Model 2.... Model 7 ... Model 8...
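Once output of this shape is available, the LRT step is plain arithmetic: twice the lnL difference is compared against a chi-square distribution whose degrees of freedom equal the difference in `np`. The sketch below assumes the `lnL(ntime: ... np: ...)` line format shown above; the function names are hypothetical and the chi-square tail is implemented only for 1 and 2 degrees of freedom. The M0 and M1 values are reused purely to illustrate the arithmetic; the comparisons usually made for site models are M1a vs M2a and M7 vs M8.

```python
import math
import re

# Matches lines such as: lnL(ntime: 23  np: 25):  -1133.892429  +0.000000
LNL_RE = re.compile(r"lnL\(ntime:\s*(\d+)\s+np:\s*(\d+)\):\s*(-?\d+\.\d+)")


def parse_lnl(text):
    """Return one (np, lnL) tuple per lnL line found in codeml output text."""
    return [(int(m.group(2)), float(m.group(3))) for m in LNL_RE.finditer(text)]


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (closed forms for df 1 and 2)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def lrt(null_model, alt_model):
    """Likelihood ratio test between two (np, lnL) tuples, simpler model first."""
    (np0, lnl0), (np1, lnl1) = null_model, alt_model
    stat = 2.0 * (lnl1 - lnl0)
    df = np1 - np0
    return stat, df, chi2_sf(max(stat, 0.0), df)


# lnL lines copied from the example output in this post.
sample = """
Model 0: one-ratio
lnL(ntime: 23  np: 25):  -1133.892429      +0.000000
Model 1: NearlyNeutral (2 categories)
lnL(ntime: 23  np: 26):  -1110.361397      +0.000000
"""

models = parse_lnl(sample)
stat, df, p_value = lrt(models[0], models[1])
print(f"2*delta lnL = {stat:.6f}, df = {df}, p = {p_value:.3g}")
```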

At the moment I can't continue with my thesis, because I can't find a way to determine the best model for estimating ω.

Any ideas or comments are welcome!

Best wishes!

gene PAML CODEML evolution sequence • 904 views
