Question

How to optimize a distance matrix using a differential evolution algorithm?

7.8 years ago

eng_samar_zayed • 0

I need to optimize a matrix of distances between species using a differential evolution algorithm.

I also need to be sure that my implementation is correct.

Distance Matrix Optimization • 2.4k views

updated 7.8 years ago by Frank Wright (BioSS) ▴ 80 • written 7.8 years ago by eng_samar_zayed • 0

I need to optimize a matrix of distances between species using a differential evolution algorithm.

Why? (homework?) What is the input data?

What about your other problem with the phylogenies, is this related? I suggest you focus on solving one problem at a time, and finish following up on your previous posts.

7.8 years ago by Michael 55k

Yes, it is related, but I separated them into two parts to avoid confusion ... but since you have been following, you will surely understand :)

I just need to be sure about my work. I used ML tests to compare my optimized tree with my starting tree, and the starting tree was better, so I suspect there is an optimization problem in my code. I am using differential evolution to optimize the distance matrix, and I need to be sure that my code is correct.

7.8 years ago by eng_samar_zayed • 0

Why do you want to use differential evolution to optimize a phylogenetic tree? You would have a hard time beating the best available methods, ML and Bayesian inference. Is there a publication that uses this technique for phylogeny?

7.8 years ago by Michael 55k

score 1 · Answer 1 · 2017-02-22

I assume that you are developing a new method that estimates a distance matrix and are then using a distance-based method to estimate a phylogenetic tree.

You might do best to compare your distance matrix directly with other distance matrices, rather than moving from distance matrix to tree and then doing a tree-to-tree comparison. By moving to trees and then comparing a tree built from distances with a tree estimated directly from an alignment, I think you are adding two unnecessary steps.

You can compare distance matrices using a least-squares fit (as used by Fitch-Margoliash). See the sum of squares calculation: http://evolution.genetics.washington.edu/phylip/doc/fitch.html
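As a rough illustration of that least-squares comparison, here is a minimal Python sketch. It assumes both matrices are square, symmetric lists of lists with the species in the same order. The function name `weighted_sum_of_squares` is made up for this example and is not part of PHYLIP. The weighting follows the general form (D - d)^2 / D^P, where P = 2 corresponds to the Fitch-Margoliash criterion and P = 0 to unweighted least squares. Note that this sketch counts each pair of species once.

```python
def weighted_sum_of_squares(observed, expected, power=2.0):
    """Sum of (D_ij - d_ij)^2 / D_ij^power over each pair of species i < j.

    observed: matrix D (e.g. distances estimated by your method)
    expected: matrix d (e.g. distances from a reference tree or method)
    power:    2.0 for Fitch-Margoliash weighting, 0.0 for unweighted
    """
    n = len(observed)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            diff = observed[i][j] - expected[i][j]
            # An observed distance of zero with power > 0 raises
            # ZeroDivisionError; handle such pairs before calling this.
            total += diff * diff / observed[i][j] ** power
    return total


# Hypothetical 3-species example (values invented for illustration).
dm_observed = [[0, 2, 4], [2, 0, 4], [4, 4, 0]]
dm_expected = [[0, 1, 4], [1, 0, 5], [4, 5, 0]]
print(weighted_sum_of_squares(dm_observed, dm_expected))
```

A smaller value means the two matrices agree more closely, so you could track this number before and after your optimization step.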

You can also use the Mantel test (from statistics) to compare two distance matrices: https://en.wikipedia.org/wiki/Mantel_test
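To make the idea concrete, here is a minimal sketch of a permutation-based Mantel test in plain Python. It assumes two square, symmetric matrices over the same species in the same order, uses the Pearson correlation of the upper triangles as the statistic, and reports a one-sided p-value for positive association. The function names and the defaults (999 permutations, fixed seed) are choices made for this example only.

```python
import math
import random


def upper_triangle(matrix, order=None):
    """Flatten the entries above the diagonal, optionally relabelling species."""
    n = len(matrix)
    idx = list(range(n)) if order is None else order
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(dm_a, dm_b, permutations=999, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    rng = random.Random(seed)
    a = upper_triangle(dm_a)
    r_obs = pearson(a, upper_triangle(dm_b))
    order = list(range(len(dm_b)))
    hits = 0
    for _ in range(permutations):
        # Permute rows and columns of dm_b together by shuffling the labels.
        rng.shuffle(order)
        if pearson(a, upper_triangle(dm_b, order)) >= r_obs:
            hits += 1
    # The observed arrangement is counted as one of the permutations.
    return r_obs, (hits + 1) / (permutations + 1)
```

The labels are shuffled, not the individual entries, because the entries of a distance matrix are not independent of each other. With only a handful of species the number of distinct permutations is small, so the p-value will be coarse.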

You should also consider how distance matrices used in phylogenetics are produced. For example, if you use a 2-parameter nucleotide substitution model (e.g. with transition and transversion parameters, usually expressed as a ratio), do you allow that ratio to vary for each pairwise distance calculation, or do you constrain all pairwise distances to use the same value? You can also vary the substitution model from one parameter (Jukes-Cantor) to six parameters (GTR).

Since you seem to be developing a method, why not simulate alignment data from a tree using a stated model, using the SEQGEN program?
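For reference, here is a small sketch of the standard closed-form Jukes-Cantor and Kimura 2-parameter distances for one pair of aligned sequences. It assumes DNA sequences of equal length and simply skips any site where either sequence has a gap or an ambiguity code. The function names are invented for this example. Note that the closed-form Kimura distance uses the transition and transversion proportions of each pair separately, so the implied ratio is free to differ between pairs.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq1, seq2):
    """Return (P, Q): proportions of transition and transversion sites."""
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions / compared, transversions / compared


def jukes_cantor(seq1, seq2):
    """One-parameter distance: d = -3/4 ln(1 - 4p/3), p = P + Q."""
    P, Q = count_differences(seq1, seq2)
    return -0.75 * math.log(1 - 4 * (P + Q) / 3)


def kimura_2p(seq1, seq2):
    """Two-parameter distance: d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q))."""
    P, Q = count_differences(seq1, seq2)
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))
```

Both formulas break down (logarithm of a non-positive number) when the sequences are too divergent, which is one reason very different sequences can behave badly in distance methods.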
http://tree.bio.ed.ac.uk/software/seqgen/

You will then know the "truth" and can see how well your method works. You can choose a tree (branching order, branch lengths) and directly write down the distance matrix from it (DM-expected) by adding branch lengths together. Then simulate an alignment. Then estimate your distance matrix (DM-observed). You then assess the performance of your method by comparing DM-observed with DM-expected. Your method may work better with alignments with many or few species, may have problems with short alignments, and may work better with sequences that are very different. You can vary the tree scale (i.e. divide or multiply branch lengths by a factor) and vary alignment length using SEQGEN.
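Writing down DM-expected by adding branch lengths can be sketched as follows. This assumes the tree is given as a list of (node, node, branch length) edges, which is a representation chosen for this example and not something SEQGEN reads or writes. The node names and branch lengths are invented.

```python
from collections import defaultdict


def patristic_distances(edges, leaves):
    """Return the matrix of path lengths between leaves of a tree.

    edges:  list of (node_a, node_b, branch_length) tuples
    leaves: list of leaf names, fixing the row and column order
    """
    adjacency = defaultdict(list)
    for a, b, length in edges:
        adjacency[a].append((b, length))
        adjacency[b].append((a, length))

    matrix = []
    for start in leaves:
        # Walk the tree from this leaf, summing branch lengths along the way.
        dist = {start: 0.0}
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour, length in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + length
                    stack.append(neighbour)
        matrix.append([dist[leaf] for leaf in leaves])
    return matrix


# Hypothetical 4-species tree: ((A:0.1,B:0.2):0.05,C:0.3,D:0.4)
edges = [
    ("A", "n1", 0.1),
    ("B", "n1", 0.2),
    ("n1", "n2", 0.05),
    ("C", "n2", 0.3),
    ("D", "n2", 0.4),
]
dm_expected = patristic_distances(edges, ["A", "B", "C", "D"])
```

Multiplying every branch length by a constant before calling the function gives DM-expected for a rescaled tree, which matches the idea of varying the tree scale.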

This approach may be of interest to you. See "PANDIT-Based simulation" section near the end of the paper: https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/mst024

This involves simulating data based on real alignments.