Hi,
I am using a distance-based method for phylogenetic tree construction. I have optimized the distance matrix, and I compare the distance matrices by their standard deviation, but when I increase the number of iterations the standard deviation increases. What does this mean?
I am using a distance-based method. I pass the distance matrix used to construct the phylogenetic tree to an optimization algorithm, so after a given number of iterations I have a new distance matrix. I compare these matrices by their standard deviation, but it increases as the number of iterations of the optimization algorithm increases. I want to know what this means for the phylogenetic tree.
First - I'm not a statistician so check this answer before you go with it.
So you are taking a matrix...improving it...then improving the result... then improving that result and so on? hmmm...
SD tells you about the spread of the population
--> therefore is useful to predict the range of the values in the matrix
--> 1 SD either side of the mean includes around 68% of a normally distributed population.
Your matrices are very similar (they are just improved versions of themselves) so if you plot them you would expect to see overlapping SD bars.
So in the matrix with the greatest SD, a wider RANGE of values between -1SD and +1SD (from the mean) is needed to cover that 68% of the population.
In a nutshell, increasing SD indicates that a wider range of values is included in the roughly 68% of the sample (matrix) that lies within 1 SD of the mean.
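To make the point about spread concrete, here is a minimal Python sketch, assuming NumPy is available. The 4x4 matrix and the function names are made up for illustration and are not from the question. It takes the off-diagonal values of a symmetric distance matrix (each pair counted once), computes their mean and sample SD, and reports what fraction actually falls within 1 SD of the mean. The 68% figure only holds for normally distributed values, so the measured fraction can differ.

```python
import numpy as np

def pairwise_values(dist):
    """Return the i < j entries of a symmetric distance matrix (each pair once)."""
    dist = np.asarray(dist, dtype=float)
    rows, cols = np.triu_indices(dist.shape[0], k=1)
    return dist[rows, cols]

def sd_summary(dist):
    """Mean, sample SD, and fraction of pairwise distances within 1 SD of the mean."""
    vals = pairwise_values(dist)
    mean = vals.mean()
    sd = vals.std(ddof=1)  # sample SD; use ddof=0 for the population SD
    within = np.mean(np.abs(vals - mean) <= sd)
    return mean, sd, within

# Hypothetical 4-taxon distance matrix, for illustration only
example = np.array([
    [0.0, 0.2, 0.5, 0.6],
    [0.2, 0.0, 0.4, 0.7],
    [0.5, 0.4, 0.0, 0.3],
    [0.6, 0.7, 0.3, 0.0],
])

mean, sd, within = sd_summary(example)
print(f"mean={mean:.3f} sd={sd:.3f} within_1sd={within:.2f}")
```

Running the same summary on the matrix from each iteration would show whether the mean distance is growing along with the SD or only the spread is changing.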
While this sounds ok...what you are basically doing in these iterations is increasing the error bar. My guess is that if you keep applying the improvement algorithm, you'll end up with 2 values (one at each extreme end of the scale) and a huge SD - so think carefully about repeatedly using this improvement algorithm.
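The "2 values at the extremes" guess can be checked with a tiny Python sketch, assuming NumPy and assuming the distances are bounded (here between 0 and 1, which is my assumption and not something stated in the question). For bounded values the SD is largest when half sit at each end of the range, so a steadily rising SD is at least consistent with values being pushed towards the extremes.

```python
import numpy as np

# Six hypothetical distances spread evenly over the range [0, 1]
evenly_spread = np.linspace(0.0, 1.0, 6)

# Six hypothetical distances collapsed onto the two ends of the range
two_extremes = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

sd_even = evenly_spread.std()      # population SD of the evenly spread values
sd_extremes = two_extremes.std()   # population SD of the two-extremes case

print(f"evenly spread: {sd_even:.3f}, two extremes: {sd_extremes:.3f}")
```

This only shows the limiting case; it does not say that your optimization algorithm actually behaves this way.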
As I said, I'm not a statistician, and whenever I've constructed a tree I have never used SD. So I am only answering the question "what is SD?" based on my own limited understanding. Unfortunately, because of this I can argue both sides of the coin.
In favour of decreasing SD: see my original answer.
In favour of increasing SD: To me, SD tells me the possible range of means I will observe between samples that have no significant difference (or distance) between them. So increasing SD indicates a greater "distance" between samples (i.e. the row and column names of the matrix). So maybe the increasing SD is telling you that the values in the improved matrix are more distinct from each other and so will produce a more differentiated phylogenetic tree.
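One caveat on the "more differentiated" reading, as a hedged Python sketch (assuming NumPy; the matrix and names are invented for illustration). If the optimization merely rescales every distance by the same factor, the SD grows by that factor while the rank order of the pairwise distances stays the same. Dividing the SD by the mean (the coefficient of variation) gives a scale-free number that does not change under such rescaling, so it may be a fairer quantity to compare between iterations.

```python
import numpy as np

def upper(dist):
    """Return the i < j entries of a symmetric distance matrix."""
    d = np.asarray(dist, dtype=float)
    return d[np.triu_indices(d.shape[0], k=1)]

# Hypothetical distance matrix and a copy with every distance doubled
base = np.array([
    [0.0, 0.2, 0.5, 0.6],
    [0.2, 0.0, 0.4, 0.7],
    [0.5, 0.4, 0.0, 0.3],
    [0.6, 0.7, 0.3, 0.0],
])
scaled = 2.0 * base

sd_base = upper(base).std(ddof=1)
sd_scaled = upper(scaled).std(ddof=1)

# Coefficient of variation: SD relative to the mean distance
cv_base = sd_base / upper(base).mean()
cv_scaled = sd_scaled / upper(scaled).mean()

# Which pairs are closest or farthest is unchanged by uniform rescaling
same_order = np.array_equal(np.argsort(upper(base)), np.argsort(upper(scaled)))

print(f"SD: {sd_base:.3f} -> {sd_scaled:.3f}, CV: {cv_base:.3f} -> {cv_scaled:.3f}, same order: {same_order}")
```

If the coefficient of variation also rises across iterations, the relative spread of the distances really is changing; whether that gives a better tree is a separate question that SD alone cannot answer.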
Again, please, you need a statistician or someone experienced in applying SD to trees to check this.
Do you not have the computer resources to use a maximum likelihood method instead of a distance matrix method?
Thanks for your efforts, but I am a master's student with limited time for my publication. I started with the distance method a long time ago, and the metric for comparison is my problem now.
Why do you have more than 1 matrix? Why would you compare distance matrices? What is the name of the method you are using?
I am using a distance-based method. I pass the distance matrix used to construct the phylogenetic tree to an optimization algorithm, so after a given number of iterations I have a new distance matrix. I compare these matrices by their standard deviation, but it increases as the number of iterations of the optimization algorithm increases. I want to know what this means for the phylogenetic tree.