Hi all,
I have a very large multi-FASTA alignment (~30,000 sequences, each ~17,000 bases), and I am wondering whether it is too large for constructing a phylogenetic tree. If not, which program would be most appropriate for this use case?
Thank you!
Unless you are starting a new classification (new tree of life?) or building some sort of public database, 30K sequences is completely unnecessary. For just about any other purpose I can think of, that many sequences is overkill. For publications or grants, it is not practical to inspect trees that have more than a few hundred branches, and even those would have to be collapsed into groups.
Your purpose for doing this aside, it will be difficult to get this tree to converge. With IQ-TREE in the fast bootstrap mode (a minimum of 1000 bootstraps, which may not be enough for you) and 20-40 CPUs, it takes half a day for a protein alignment of ~150 sequences that are ~15,000 residues each. This may give you some idea of the time needed when you scale it up to what you have, and I don't think it scales up linearly.
If you still want to do it, you may want to give this a look:
https://cme.h-its.org/exelixis/web/software/examl/index.html
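To put rough numbers on the scaling point above, here is a back-of-envelope sketch comparing the two alignments by size alone. It only counts alignment characters (sequences × columns) using the approximate figures quoted in this thread. It ignores that the benchmark was a protein alignment while this one is nucleotide, and tree search generally grows faster than linearly with the number of sequences, so treat the ratio as a loose indication rather than a runtime prediction.

```python
# Back-of-envelope size comparison; all figures are the approximate ones quoted in this thread.
def alignment_cells(n_sequences: int, n_columns: int) -> int:
    """Number of characters in an alignment matrix (sequences x columns)."""
    return n_sequences * n_columns

benchmark = alignment_cells(150, 15_000)     # protein alignment from the IQ-TREE timing above
target = alignment_cells(30_000, 17_000)     # the nucleotide alignment in the question

size_ratio = target / benchmark              # how many times more characters
taxa_ratio = 30_000 / 150                    # how many times more sequences

print(f"cells: {benchmark:,} vs {target:,}")
print(f"alignment is ~{size_ratio:.0f}x larger, with {taxa_ratio:.0f}x more sequences")
```

Even under the optimistic assumption of linear scaling, half a day multiplied by a factor in the hundreds lands in the range of months on the same hardware.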
How was the multifasta generated? Generally I would be very skeptical of the quality of any MSA of that size. Most tools break down long before that.
It was generated with MAFFT. I agree; the construction of the tree is actually part of post-processing/quality checking.
I would suggest using RAxML-NG or iqtree. I believe that iqtree is faster than RAxML though.
Unless OP has thousands of cores, I think he would be better off with e.g. fasttree.
IIRC iqtree has a fast mode which performs comparably to fasttree.
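For reference, a minimal sketch of how the three programs mentioned in this thread might be invoked. The file name, the GTR+G model, and the thread count are assumptions for illustration, not recommendations from the posters. The IQ-TREE flags shown are the IQ-TREE 2 spellings (IQ-TREE 1.x uses `-fast` and `-nt` instead), and the FastTree binary name varies by installation (for example `FastTree` or `FastTreeMP`). The script only builds and prints the command lines; it does not run anything.

```python
import shlex

def tree_commands(alignment: str, threads: int = 16, model: str = "GTR+G") -> dict[str, list[str]]:
    """Build example command lines for a nucleotide alignment (illustrative defaults)."""
    return {
        # FastTree writes the Newick tree to stdout; -nt = nucleotide input, -gtr = GTR model.
        "fasttree": ["FastTree", "-nt", "-gtr", alignment],
        # IQ-TREE 2 fast search mode with a fixed model and thread count.
        "iqtree_fast": ["iqtree2", "-s", alignment, "-m", model, "--fast", "-T", str(threads)],
        # RAxML-NG maximum-likelihood tree search.
        "raxml_ng": ["raxml-ng", "--search", "--msa", alignment,
                     "--model", model, "--threads", str(threads)],
    }

# "alignment.fasta" is a placeholder name for the MAFFT output.
for name, cmd in tree_commands("alignment.fasta").items():
    print(f"{name}: {shlex.join(cmd)}")
```

When running the FastTree command in a shell, redirect its output to a file (for example `> tree.nwk`) to save the tree.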
Just curious: any reason you have and use two accounts?
Oh sorry, I forgot I had already made an account this summer to ask a question (before getting my DTU email). I will go delete the old one.