Construction Of Big Phylogenetic Tree
11.3 years ago
sanchezcavani ▴ 220

I am trying to construct a large phylogenetic tree involving about 800 OTUs. I guess I need to reduce the number of OTUs in the tree by choosing some representative species, but I am not sure how to do it. Does anyone know how representative species should be chosen? Or is there a paper on this issue? Thanks a lot!


Well, sequence type and length, as well as your computational resources, come into play here. Also consider the quality of the alignment and of the resulting tree. Large, diverse sets of species are generally preferable.

11.3 years ago
Joseph Hughes ★ 3.0k

You can easily generate phylogenies of 800 OTUs using tools such as RAxML. If you really want to reduce your set of OTUs, you will need to choose a similarity threshold between the sequences, and that choice is rather subjective. If you do go that route, useful tools are CD-HIT and its companion cdhit-cluster-consensus, which collapse sequences that are more similar than your (arbitrary) threshold into a non-redundant set.
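To make the threshold idea concrete, here is a minimal Python sketch of greedy representative selection. It is not CD-HIT's actual algorithm or code: it assumes the sequences are already aligned to equal length, uses plain percent identity, and the names `identity`, `pick_representatives`, and the toy `seqs` data are all hypothetical.

```python
def identity(a, b):
    """Fraction of identical columns between two aligned, equal-length sequences.

    Columns where both sequences have a gap are ignored.
    """
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def pick_representatives(seqs, threshold=0.97):
    """Greedy clustering of aligned sequences.

    Sequences are visited longest first (gaps not counted, ties broken by
    name). Each one joins the first existing representative it matches at
    >= threshold identity; otherwise it becomes a new representative.
    Returns a dict mapping representative name -> list of cluster members.
    """
    order = sorted(seqs, key=lambda name: (-len(seqs[name].replace("-", "")), name))
    clusters = {}
    for name in order:
        for rep in clusters:
            if identity(seqs[name], seqs[rep]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            clusters[name] = [name]
    return clusters


# Toy data (made up for illustration only).
seqs = {
    "sp_A": "ACGTACGTAC",
    "sp_B": "ACGTACGTAT",  # one mismatch against sp_A
    "sp_C": "TTGTACGGAC",  # three mismatches against sp_A
}
clusters = pick_representatives(seqs, threshold=0.9)
for rep, members in clusters.items():
    print(rep, members)
```

Raising or lowering `threshold` changes how many representatives survive, which is exactly the subjective choice mentioned above. For real data you would run CD-HIT itself rather than a sketch like this.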

Yup, 800 OTUs is certainly feasible, especially for single-gene phylogenies. More diversity is usually better, although be careful of things like paralogs in your dataset (unless that is something you intend to look at). Otherwise, as suggested, I would only remove sequences that add little sequence diversity to your dataset.

Thanks! RAxML will take more than two weeks, which is still a little long. I am also concerned that the tree figure may not be very clear if I use the full species tree.

There is also RAxML-Light, which will run faster. FastTree2 should also give you an approximation of the ML tree quite rapidly. Assuming groups resolve well in your tree (not guaranteed for a single-gene phylogeny), you can collapse groups afterwards with a program like FigTree when making final figures, and then include the full tree in your supplementary materials.
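As a rough illustration of what collapsing groups means, here is a small Python sketch. It is not how FigTree works internally (FigTree does this interactively). The tree is assumed to be a nested tuple of leaf names without branch lengths, and `leaves`, `collapse`, `to_newick`, and the example taxa are hypothetical. A clade is only collapsed when all of its leaves belong to one group, which is why the groups need to resolve cleanly first.

```python
def leaves(tree):
    """Return the leaf names of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def collapse(tree, group_of):
    """Replace each internal clade whose leaves all share one group
    with a single label of the form '<group>_n<leaf count>'.

    Single leaves and mixed clades are left as they are.
    """
    if isinstance(tree, str):
        return tree
    members = leaves(tree)
    groups = {group_of.get(leaf) for leaf in members}
    if len(groups) == 1 and None not in groups:
        return f"{groups.pop()}_n{len(members)}"
    return tuple(collapse(child, group_of) for child in tree)


def to_newick(tree):
    """Serialise a nested-tuple tree as a Newick topology (no branch lengths)."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(to_newick(child) for child in tree) + ")"


# Toy tree and group assignments (made up for illustration only).
tree = ((("cat", "lion"), "tiger"), (("dog", "wolf"), "bear"))
group_of = {
    "cat": "Felidae", "lion": "Felidae", "tiger": "Felidae",
    "dog": "Canidae", "wolf": "Canidae", "bear": "Ursidae",
}
collapsed = collapse(tree, group_of)
print(to_newick(collapsed) + ";")
```

If a group is split across the tree (not monophyletic), no single clade contains all of its members, so it would show up as several smaller collapsed pieces rather than one tip.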

As an aside, two weeks isn't a long time for doing a good phylogenetic analysis. Large phylogenies often take weeks to months, even on clusters, to do properly for some analyses. Anything worth doing is worth doing the best possible way.

Thanks for your helpful suggestions!
