Best way to build a phylogenetic tree with missing data
4.6 years ago
Picasa ▴ 650

Hi all,

I have 50 species and 8 genes. My goal is to build a phylogenetic tree, but I don't have all 8 genes for every species. Some species have only one gene, for example, and most of the genes are fragmented (so for the same gene, the sequence length differs between species).

What is the best way to do this:

1) Concatenate all genes for each species and perform a single multiple alignment (so with 50 FASTA sequences)

2) Align each gene separately (so with 8 FASTA files) and then concatenate the alignments per species?

3) Other methods?

Thanks for your help.

phylogenetic tree • 1.6k views
4.6 years ago

I don't know what the best way is, but you can align the genes separately, build one tree per gene, and then compare the 8 trees using treespace. You can select the most representative tree yourself or let treespace pick one for you based on the input trees. This approach will also tell you whether all of the genes point to the same evolutionary history, or whether there was recombination or horizontal gene transfer. Treespace is easy to use and has great vignettes that walk you through the whole process.

4.6 years ago
Mensur Dlakic ★ 28k

Your option 2) is the way to do it. Trimming the alignment after concatenation is always a good idea.
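
A minimal sketch of the concatenation step in option 2, assuming each gene has already been aligned on its own and loaded as a dict mapping species name to aligned sequence. The function name `concatenate_alignments` and the toy data are hypothetical, chosen only for illustration. Species that lack a gene are padded with gaps so that every row ends up the same length.

```python
def concatenate_alignments(gene_alignments):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments: list of dicts {species: aligned_sequence}, one per gene.
    A species absent from a gene gets '-' for that gene's full length.
    Returns (supermatrix, partitions), where partitions holds the 1-based
    (start, end) column range of each gene in the concatenated alignment.
    """
    species = sorted({sp for aln in gene_alignments for sp in aln})
    rows = {sp: [] for sp in species}
    partitions = []
    offset = 0
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("each gene alignment needs sequences of equal length")
        length = lengths.pop()
        for sp in species:
            # Pad with gaps when this species has no sequence for this gene.
            rows[sp].append(aln.get(sp, "-" * length))
        partitions.append((offset + 1, offset + length))
        offset += length
    supermatrix = {sp: "".join(parts) for sp, parts in rows.items()}
    return supermatrix, partitions


# Toy example (hypothetical data): sp_B lacks gene 2, sp_C lacks gene 1.
gene1 = {"sp_A": "ATG-C", "sp_B": "ATGGC"}
gene2 = {"sp_A": "TTA", "sp_C": "TTG"}
matrix, parts = concatenate_alignments([gene1, gene2])
for name, seq in matrix.items():
    print(name, seq)
print(parts)
```

The returned column ranges describe the untrimmed concatenation; if columns are removed afterwards, the ranges need to be recomputed.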

I would exclude any species that doesn't have at least half the genes, or half the combined sequence length in case there are large differences in gene sizes.
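
A rough sketch of that filter, assuming the per-gene alignments are held as dicts of species name to aligned sequence. The name `filter_species` and the `min_gene_fraction` parameter are hypothetical. This version counts genes; the length-based variant would compare non-gap characters to total length instead.

```python
def filter_species(gene_alignments, min_gene_fraction=0.5):
    """Keep only species present in at least min_gene_fraction of the genes.

    gene_alignments: list of dicts {species: aligned_sequence}, one per gene.
    A gene counts as present only if the sequence has a non-gap character.
    Returns (filtered alignments, sorted list of kept species).
    """
    n_genes = len(gene_alignments)
    counts = {}
    for aln in gene_alignments:
        for sp, seq in aln.items():
            if seq.replace("-", ""):
                counts[sp] = counts.get(sp, 0) + 1
    keep = {sp for sp, c in counts.items() if c / n_genes >= min_gene_fraction}
    filtered = [
        {sp: seq for sp, seq in aln.items() if sp in keep}
        for aln in gene_alignments
    ]
    return filtered, sorted(keep)


# Toy example (hypothetical data): sp_B has only 1 of 3 genes and is dropped.
genes = [
    {"sp_A": "ATGC", "sp_B": "ATGA", "sp_C": "ATGT"},
    {"sp_A": "TTA", "sp_C": "TTG"},
    {"sp_A": "GGCC"},
]
filtered, kept = filter_species(genes)
print(kept)
```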


Thanks! Do you have any recommendations for the trimming step?


I usually trim at a 50% gap threshold.
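
As an illustration, here is one way such a threshold could be applied. It assumes "50% gap threshold" means dropping every alignment column in which more than half of the sequences have a gap; that reading, the function name `trim_gappy_columns` and the toy alignment are assumptions, not a description of any particular trimming tool.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Remove columns whose gap fraction exceeds max_gap_fraction.

    alignment: dict {species: aligned_sequence}, all sequences equal length.
    Returns (trimmed alignment, list of kept 0-based column indices).
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("all sequences must have the same length")
    n_seqs = len(seqs)
    kept = [
        i for i in range(len(seqs[0]))
        if sum(s[i] == "-" for s in seqs) / n_seqs <= max_gap_fraction
    ]
    trimmed = {name: "".join(alignment[name][i] for i in kept) for name in names}
    return trimmed, kept


# Toy example (hypothetical data): columns 1 and 3 are 75% gaps and are removed;
# column 2 is exactly 50% gaps and is kept.
aln = {"a": "A-C-", "b": "A-CG", "c": "AT--", "d": "A---"}
trimmed, kept_cols = trim_gappy_columns(aln)
print(trimmed, kept_cols)
```

Returning the kept column indices makes it possible to map gene boundaries from the untrimmed alignment onto the trimmed one.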


Thanks,

To infer the phylogeny, do you run a partitioned analysis using the coordinates of each gene?
