Greetings :)
I'm having difficulty feeding RAxML many MSAs. My final goal is to create a single ML majority-rule consensus tree based on 6,405 alignments of orthologous genes. My pipeline so far is OrthoMCL > MUSCLE > trimAl > RAxML, and RAxML is the bottleneck. Here's what I tried (as well as variations of this loop):
Run full BS and ML analysis in raxml
for f in $(ls raxmltest/vbro*.phy); do
raxmlHPC -f a -x 12345 -p 12345 -# 100 -m PROTGAMMAJTTF -s $f >${f/%.phy} -n Test;
done
My issue is that running $f -n Test only writes output for one MSA. I would like to write output for all MSAs. Any advice or assistance is much appreciated. Even better if you can help me with the next step as well: using the 6,405 trees to build a single consensus. I know the following works for one MSA.
Use bootstrap replicates to build majority-rule consensus tree
for f in $(ls raxmltest/vbro*.phy); do
raxmlHPC -m PROTGAMMA -J MR -z RAxML_bootstrap.Test -n Test;
done
Thank you!
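One likely cause: every iteration uses the same run ID (-n Test), and RAxML refuses to start when output files with that run ID already exist, so only the first alignment produces results. The >${f/%.phy} part only redirects the screen log to a file; it does not rename RAxML's output files. Here is a minimal sketch that gives each alignment its own run ID, assuming file names like raxmltest/vbro_0001.phy (a made-up example name) and the same options as in the question:

```shell
#!/usr/bin/env bash
# Sketch: one RAxML run per alignment, each with its own run ID (-n).
# Assumption: alignments live in raxmltest/ and match vbro*.phy, as in the question.

# Derive a unique run ID from an alignment path,
# e.g. raxmltest/vbro_0001.phy -> vbro_0001 (hypothetical file name)
run_id() {
    basename "$1" .phy
}

for f in raxmltest/vbro*.phy; do
    [ -e "$f" ] || continue     # skip the literal pattern if the glob matched nothing
    id=$(run_id "$f")
    # Rapid bootstrap + ML search, same options as the original command
    raxmlHPC -f a -x 12345 -p 12345 -# 100 -m PROTGAMMAJTTF -s "$f" -n "$id"
done
```

Each run should then write its own RAxML_bestTree.<id>, RAxML_bootstrap.<id> and so on, instead of colliding on Test. Looping over the glob directly also avoids parsing the output of ls.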
Why not concatenate all the alignments? As far as I know, concatenation is a standard approach in multi-locus phylogeny reconstruction. Or do you have different taxon sampling in each alignment?
Thank you for the reply. I think concatenating the sequences would produce an accurate tree and I will do that; however, a few recent publications are big on using many trees to create a consensus tree and I wanted to try it out as well. I do not know how much they will vary. I'll continue posting what I learn to this thread.
I think you're talking about supertree approaches. They're usually used when taxon sampling can't be matched across loci and you'd like to keep as much information as possible (concatenation would throw away alignments that have missing taxa).
My bacteria are especially adept at lateral gene transfer so you're correct that not every strain matched locus for locus. It would be nice to retain as much data as possible in the tree.
Lateral gene transfer is a major problem in phylogeny reconstruction of prokaryotes. Indeed, supertree methods might be better in this case, to resolve the lateral gene transfers where you see strong topological disagreement among loci.
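On the consensus step from the original question: the loop there never uses $f, so it just repeats the same command on RAxML_bootstrap.Test. If the gene trees all contain the same set of taxa, a rough sketch would be to pool the per-gene best trees into one file and run -J MR once. The file name all_gene_trees.tre and the run ID GeneTreeConsensus are placeholders. As far as I know, RAxML requires -m on the command line here, but the model should not change the consensus itself. With unequal taxon sampling I would not expect this to work as is, and a supertree method would be needed instead.

```shell
#!/usr/bin/env bash
# Sketch: pool the best ML tree of every gene and compute one majority-rule consensus.
# Assumption: each per-gene run left a RAxML_bestTree.<run ID> file in the current directory.

# Concatenate all per-gene best trees into one file (one Newick tree per line)
# and print the number of trees collected.
collect_trees() {
    local out=$1 t
    : > "$out"                      # start from an empty file
    for t in RAxML_bestTree.*; do
        [ -e "$t" ] || continue     # glob matched nothing
        cat "$t" >> "$out"
    done
    wc -l < "$out" | tr -d ' '
}

n=$(collect_trees all_gene_trees.tre)   # placeholder output file name
if [ "$n" -gt 0 ]; then
    # -J MR builds a majority-rule consensus from the trees passed with -z
    raxmlHPC -m PROTGAMMAJTTF -J MR -z all_gene_trees.tre -n GeneTreeConsensus
fi
```

The same pattern would work with the RAxML_bootstrap.* files if you prefer to pool the bootstrap replicates rather than the best trees.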