RAXML phylogenetics analysis
6.5 years ago
MAPK ★ 2.1k

I tried to run RAxML (https://sco.h-its.org/exelixis/resource/download/NewManual.pdf) to generate a maximum likelihood (ML) phylogenetic tree using the commands below. I ran 100 bootstraps and got the tree, but the bootstrap value on the tree is 100 for every branch. I built an ML tree from the same data with MEGA, which gave a similar topology but completely different bootstrap values. Could someone please tell me whether I am doing anything wrong in the commands below? Thanks in advance for your help.

Here is my aligned fasta file: https://www.dropbox.com/s/9s05msdpn67tqnh/test_mpk.fas?dl=0

Performed model test using PROTGAMMAAUTO command:

raxmlHPC-PTHREADS -s test_mpk.fas -n mpktreeml -m PROTGAMMAAUTO -p 84381764921 -T 20

Then, I ran 100 bootstrap trees:

raxmlHPC-PTHREADS -s test_mpk.fas -n mpktreeml_bootstrap_r -N 100 -m PROTGAMMAJTT -p 427482396541 -T 20

concatenated all tree files: cat mpktreeml_bootstrap_r* > allBootstraps

Tested majority rule consensus:

raxmlHPC-PTHREADS -z allBootstraps -m PROTGAMMAJTT -I autoMRE -n TEST -p 3824142315 -T 20

Then finally, got the tree:

raxmlHPC-PTHREADS -f b -z allBootstraps -t mpktreeml_bootstrap -m PROTGAMMAJTT -n mpkBOOTSTRAP.txt

I then used iTOL to view mpkBOOTSTRAP.txt, which looks like this: https://www.dropbox.com/s/gi6i6vkisu8eue4/0FklaRAXqssnABHmZR-Cow.pdf?dl=0

The tree above looks good, except that it doesn't show the correct bootstrap values compared to the tree generated by MEGA: https://www.dropbox.com/s/xy06glvud0izr1q/mega_mpk_tree.pdf?dl=0

6.5 years ago
Joe 21k

Are you sure your commands are right?

According to the manual:

Step 4: Bootstrapping

Now let's conduct a simple bootstrap analysis. Initially, let's try to find the best-scoring ML tree for a DNA alignment. We refer to this as the best-scoring tree because the ML search problem is computationally hard and we can thus generally not find the optimal tree under ML for a given alignment.

Let's execute:

raxmlHPC -m GTRGAMMA -p 12345 -# 20 -s dna.phy -n T13

This command will generate 20 ML trees on distinct starting trees and also print the tree with the best likelihood to a file called RAxML_bestTree.T13. Now we will want to get support values for this tree, so let's conduct a bootstrap search:

raxmlHPC -m GTRGAMMA -p 12345 -b 12345 -# 100 -s dna.phy -n T14

We need to tell RAxML that we want to do bootstrapping by providing a bootstrap random number seed via -b 12345 and the number of bootstrap replicates we want to compute via -# 100. Note that, RAxML also allows for automatically determining a sufficient number of bootstrap replicates, in this case you would replace -# 100 by one of the bootstrap convergence criteria -# autoFC, -# autoMRE, -# autoMR, -# autoMRE_IGN.

Having computed the bootstrap replicate trees that will be printed to a file called RAxML_bootstrap.T14 we can now use them to draw bipartitions on the best ML tree as follows:

raxmlHPC -m GTRCAT -p 12345 -f b -t RAxML_bestTree.T13 -z RAxML_bootstrap.T14 -n T15

This call will produce two output files that can be visualized with Dendroscope: RAxML_bipartitions.T15 (support values assigned to nodes) and RAxML_bipartitionsBranchLabels.T15 (support values assigned to branches of the tree). Note that, for unrooted trees the correct representation is actually the one with support values assigned to branches and not nodes of the tree!

The -# flag governs bootstrap replicates, but it looks to me like you're using -N?
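For reference, here is a rough sketch of the manual's three steps adapted to the protein alignment from the question. The run names (ML, BS, SUPPORT), the seeds, and the `run` dry-run helper are placeholders made up for illustration, and nothing here has been tested against the poster's data. Going by the manual text quoted above, it is the -b seed that requests bootstrapping, while a replicate count on its own asks for repeated ML searches on the original alignment, so that difference may be worth checking.

```shell
#!/usr/bin/env bash
# Sketch of the manual's workflow for the alignment in the question.
# Run names and seeds are illustrative placeholders, not the poster's values.

ALN="test_mpk.fas"          # alignment file from the question
MODEL="PROTGAMMAJTT"        # model used in the question
THREADS=20
RAXML="raxmlHPC-PTHREADS"

# Hypothetical helper: print each command by default (DRY_RUN=1);
# set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# 1. Best-scoring ML tree: 20 searches on the original alignment (no -b).
best_tree() {
    run "$RAXML" -T "$THREADS" -m "$MODEL" -p 12345 -# 20 -s "$ALN" -n ML
}

# 2. Bootstrap replicates: -b supplies the bootstrap random number seed.
bootstrap() {
    run "$RAXML" -T "$THREADS" -m "$MODEL" -p 12345 -b 12345 -# 100 -s "$ALN" -n BS
}

# 3. Draw the bootstrap bipartitions onto the best ML tree from step 1.
map_support() {
    run "$RAXML" -T "$THREADS" -m "$MODEL" -p 12345 -f b \
        -t RAxML_bestTree.ML -z RAxML_bootstrap.BS -n SUPPORT
}

best_tree
bootstrap
map_support
```

Run as-is, this only prints the three commands so they can be inspected first. Note that step 3 takes the best tree from step 1 via -t and the replicate file from step 2 via -z, following the manual's T13/T14/T15 example.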


Isn't -# the same as -N? Here is what the manual says: "The current MPI version only works properly if you specify the -# or -N option in the command line, since it has been designed to do multiple inferences or rapid/standard BS (bootstrap) searches in parallel!"


You might be right. I've only ever seen the hash used personally, though I doubt that's really the problem.

Do you really need to use the PTHREADS binary anyway? Your tree isn't very large. I haven't looked at the FASTA file, but at 18k I'd guess that's not very large either.

Personally, ever since I discovered IQ-Tree, I stopped using RAxML (which to my mind has one of the most confusing CLIs in existence).


Yes, I need to do this on an HPC cluster. The data I am sharing here is just mock data. I have a very large dataset to analyze, so I would like the multithreading option.


I can't see anything on that manual page about -N, though I'm not at a terminal to check it myself. Might be worth just trying it with -# instead to see if that solves it?
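A quick way to see whether a re-run changed anything, without opening a tree viewer, is to tabulate the support values in the output file. This is only a sketch: `support_table` is a made-up helper name, and it assumes the supports are written as integers directly after a closing parenthesis (e.g. `)87:0.0123`), which is how node labels normally appear in a Newick file.

```shell
# Tabulate node support values in a Newick file.
# Prints one "count value" line per distinct support value.
# Assumes supports are integers written right after ")", e.g. ")87:0.0123".
support_table() {
    grep -oE '\)[0-9]+' "$1" | tr -d ')' | sort -n | uniq -c
}
```

Usage would be something like `support_table RAxML_bipartitions.SUPPORT`, where the file name is a placeholder for whatever the -f b step wrote. If the table contains a single line for 100, every labelled node has full support; a spread of values suggests the bootstrap replicates really do differ.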
