Question

Compute bootstrap values on consensus tree

7 months ago

pablo ▴ 310

Dear all,

I am working on a 16S PacBio dataset. I used DADA2 to assign taxonomy, and now I would like to use raxml-ng to build a phylogeny. I first ran DECIPHER to align my sequences, and the alignment check passes:

./raxml-ng --check --msa ./output/alignment.fasta --model GTR+G

(...)
[00:00:00] Reading alignment from file:./output/alignment.fasta
[00:00:00] Loaded alignment with 18299 taxa and 10649 sites

Alignment comprises 1 partitions and 10649 sites

Partition 0: noname
Model: GTR+FO+G4m
Alignment sites: 10649
Gaps: 86.32 %
Invariant sites: 45.54 %

Alignment can be successfully read by RAxML-NG.

Getting a bootstrapped tree is very time-consuming with:

./raxml-ng --all --msa ./output/alignment.fasta --model GTR+G --prefix test --seed 2 --threads 128 --bs-metric fbp,tbe

That is why I am trying a different approach: first build a consensus tree from 100 trees, then compute the bootstrap values:

#consensus
./raxml-ng --msa ./output/alignment.fasta --model GTR+G --prefix test --tree pars{50},rand{50} --threads 128 --start

./raxml-ng --consense MRE --prefix consensus --tree test.raxml.startTree --threads 128

#bootstrap
./raxml-ng --msa ./output/alignment.fasta --model GTR+G --prefix T7 --seed 2 --threads 128 --bootstrap

The bootstrap step is still running. If it finishes in time, I will use the --support option to map these values onto the tree.

Is this a correct way to do it?
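
For reference, the mapping step could look roughly like the sketch below. It assumes that the bootstrap run above writes its replicates to T7.raxml.bootstraps and that the consensus run writes consensus.raxml.consensusTreeMRE. Both file names, the "mapped" prefix and the thread count are assumptions, so check what is actually on disk first. By default the script only prints the command; set DRY_RUN=0 to run it.

```shell
#!/bin/sh
# Sketch: map bootstrap support values onto a reference tree with --support.
# File names are assumptions derived from the --prefix values used above.
RAXML=${RAXML:-./raxml-ng}
REF_TREE=${REF_TREE:-consensus.raxml.consensusTreeMRE}
BS_TREES=${BS_TREES:-T7.raxml.bootstraps}
THREADS=${THREADS:-2}   # support mapping is light; a small value is assumed

SUPPORT_CMD="$RAXML --support --tree $REF_TREE --bs-trees $BS_TREES --bs-metric fbp,tbe --prefix mapped --threads $THREADS"

if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$SUPPORT_CMD"     # dry run: show the command only
else
    $SUPPORT_CMD            # real run: requires raxml-ng and both input files
fi
```

With --bs-metric fbp,tbe this should produce one support tree per metric under the "mapped" prefix.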

Best

phylogeny tree raxml • 624 views

ADD COMMENT • link 7 months ago by pablo ▴ 310

How did you decide to use 128 threads? Have you run raxml-ng with the --parse flag? If not, I would recommend doing so and then setting the number of threads to the recommended value. Using more threads than recommended may actually make the analysis slower than it otherwise would be.

ADD REPLY • link 7 months ago by Dave Carlson ★ 1.9k

I chose this arbitrarily, thinking that increasing the number of threads would improve computing speed. The --parse command gives me: "* Recommended number of threads / MPI processes: 9". I'll check that. Thanks.
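
A small sketch of how the recommended value could be picked up automatically and reused. The sample line is the one quoted above; in a real run it would be grepped from the log written by --parse (the exact log file name depends on the --prefix you pass, so it is left out here). The script only prints the resulting bootstrap command.

```shell
#!/bin/sh
# Sketch: extract the recommended thread count from a --parse log line
# and reuse it in the bootstrap command from the question.
# In practice LOG_LINE would come from something like:
#   grep 'Recommended number of threads' <prefix>.raxml.log
# (log file name is an assumption; it depends on --prefix).
LOG_LINE='* Recommended number of threads / MPI processes: 9'

# Drop everything up to the last colon, then keep digits only.
THREADS=$(printf '%s\n' "$LOG_LINE" | sed 's/.*: *//' | tr -cd '0-9')

# Print the command instead of running it.
echo "./raxml-ng --bootstrap --msa ./output/alignment.fasta --model GTR+G --prefix T7 --seed 2 --threads $THREADS"
```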

ADD REPLY • link 7 months ago by pablo ▴ 310

I finally reduced my input dataset from 19k ASVs to 3600 ASVs because the run still got stuck. raxml-ng runs well on it with the number of threads recommended by --parse. Thanks.

ADD REPLY • link 7 months ago by pablo ▴ 310

Reducing my input dataset was the key. The run took about 2 days with the default 20 starting trees. I need to set 100 bootstrap replicates, but I suppose that is not enough (since the default value is 1000)? I also found this topic: https://stats.stackexchange.com/questions/86040/rule-of-thumb-for-number-of-bootstrap-samples#:~:text=Example%20(Table%20V%2C%20ibid.,95%25%20sure%2C%20850%20replications. which says "to be 90% sure that the relative CI length discrepancy does not exceed 10%, 700 replications are sufficient in half of the cases, and to be 95% sure, 850 replications."
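
Instead of relying on a general rule of thumb, one option is to let raxml-ng test whether the replicates already computed have converged (--bsconverge), or to let it stop by itself with --bs-trees autoMRE{N}. The sketch below assumes the replicates are in T7.raxml.bootstraps; the prefixes "conv" and "T7auto", the thread counts and the 0.03 cutoff are assumptions to adapt (re-run --parse on the reduced dataset for the thread count). It prints both commands and only runs the convergence test when DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: check bootstrap convergence instead of guessing a replicate count.
# File names, prefixes, cutoff and thread counts are assumptions.
RAXML=${RAXML:-./raxml-ng}
BS_TREES=${BS_TREES:-T7.raxml.bootstraps}
CUTOFF=${CUTOFF:-0.03}
THREADS=${THREADS:-9}   # value reported by --parse earlier in this thread

# Option 1: a posteriori convergence test on existing replicates.
CONVERGE_CMD="$RAXML --bsconverge --bs-trees $BS_TREES --prefix conv --seed 2 --threads 1 --bs-cutoff $CUTOFF"

# Option 2: let the bootstrap run stop once converged, up to 1000 replicates.
AUTO_CMD="$RAXML --bootstrap --msa ./output/alignment.fasta --model GTR+G --prefix T7auto --seed 2 --threads $THREADS --bs-trees autoMRE{1000}"

if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$CONVERGE_CMD"    # dry run: show both options
    echo "$AUTO_CMD"
else
    $CONVERGE_CMD           # real run: requires raxml-ng and the bootstrap file
fi
```

If the convergence test reports that 100 replicates are not enough, further replicates can be computed in a separate run with a different seed and concatenated with the existing file before mapping support values.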

ADD REPLY • link 7 months ago by pablo ▴ 310