Filtering MSA for Phylogenetic Tree Construction
11.6 years ago
Pappu ★ 2.1k

I am trying to filter an MSA for phylogenetic tree construction. I removed columns with >33% gaps and also sequences with >33% gaps. I also removed sequences with >70% sequence identity to reduce the number of sequences. I am wondering whether this is the correct way of doing it. Thanks.
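For readers who want to see the gap filtering described above in concrete terms, here is a minimal pure-Python sketch. The function names, the toy alignment, and the choice to filter columns before sequences are all assumptions made for illustration; the question does not say which order was used or which tool did the filtering.

```python
def gap_fraction(chars, gap_chars="-."):
    """Fraction of symbols in chars that are gap characters."""
    return sum(c in gap_chars for c in chars) / len(chars) if chars else 0.0


def filter_columns(aln, max_gap=0.33):
    """Drop alignment columns whose gap fraction exceeds max_gap.

    aln maps sequence name -> aligned sequence (all the same length).
    """
    names = list(aln)
    seqs = [aln[n] for n in names]
    keep = [i for i, col in enumerate(zip(*seqs))
            if gap_fraction(col) <= max_gap]
    return {n: "".join(s[i] for i in keep) for n, s in zip(names, seqs)}


def filter_sequences(aln, max_gap=0.33):
    """Drop sequences whose gap fraction exceeds max_gap."""
    return {n: s for n, s in aln.items() if gap_fraction(s) <= max_gap}


# Toy alignment, made up for illustration only.
toy = {
    "seqA": "MK-LVA",
    "seqB": "MK-LVA",
    "seqC": "MR-LIA",
    "seqD": "----IA",
}

# Assumed order: columns first, then sequences.
trimmed = filter_sequences(filter_columns(toy))
for name, seq in trimmed.items():
    print(name, seq)
```

Note that the order matters: removing gappy sequences first changes the per-column gap fractions, so the two orders can give different alignments.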

phylogenetics msa • 5.4k views

It's difficult to say whether fixed parameters like that are a good idea; it depends on how similar the sequences you are analyzing are. It is probably better to use one of the available tools that also select sites based on, e.g., biochemical similarity, or that adapt "masking" parameters to the overall conservation of the alignment: trimAl, GUIDANCE, BMGE, Gblocks, or one of several other possibilities. You could use these programs, then take a look at the alignment to get a feel for the effect they are having. Ideally, you would infer at least a couple of trees using more and less conservative filtering to see how robust your results are to site selection.
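One cheap way to get a feel for how conservative a given cutoff is, before running any tree inference, is to count how many columns survive at several thresholds. The sketch below does that on a made-up toy alignment; the function name and the thresholds are illustrative assumptions, and it only looks at gap fraction, not at the conservation-aware criteria that the tools mentioned above apply.

```python
def columns_kept(seqs, max_gap):
    """Count columns whose gap fraction is at most max_gap."""
    kept = 0
    for col in zip(*seqs):
        if col.count("-") / len(col) <= max_gap:
            kept += 1
    return kept


# Toy alignment, made up for illustration only.
toy = ["MK-LVA", "MK-LVA", "MR-LIA", "----IA"]

# Sweep from very strict (no gaps allowed) to no filtering at all.
sweep = {t: columns_kept(toy, t) for t in (0.0, 0.33, 0.5, 1.0)}
for t, n in sweep.items():
    print(f"max gap fraction {t:.2f}: {n} of {len(toy[0])} columns kept")
```

Each threshold that yields a noticeably different alignment is a candidate for a separate tree, which can then be compared for robustness.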

11.6 years ago
Aldo ▴ 60

You should check out the trimAl tool at http://trimal.cgenomics.org/

Here is the associated reference: http://www.ncbi.nlm.nih.gov/pubmed/19505945

11.6 years ago
Biojl ★ 1.7k

I agree with Aldo that trimAl is a very good tool to, let's say, "cut" alignments and get rid of the gappy regions. It might be OK for phylogeny, but it might not be optimal for other analyses.

I don't think removing sequences with >70% similarity is a good idea, since you are losing a lot of information, although that may depend on the species you are using to build the MSA and their divergence times from each other. Especially for very closely related species, the most informative alignments will be the most similar ones (>90%).
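To see how much a redundancy cutoff actually discards, you can compare what survives at 70% versus 90% identity. The following is a rough sketch under stated assumptions: identity is computed only over columns where neither sequence has a gap (other definitions exist and give different numbers), the greedy pass depends on input order, and the names and toy sequences are hypothetical.

```python
def pairwise_identity(a, b):
    """Identity over aligned columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def greedy_nonredundant(aln, max_identity=0.70):
    """Keep a sequence only if its identity to every sequence already
    kept is at most max_identity (order-dependent greedy pass)."""
    kept = {}
    for name, seq in aln.items():
        if all(pairwise_identity(seq, k) <= max_identity
               for k in kept.values()):
            kept[name] = seq
    return kept


# Toy aligned sequences, made up for illustration only.
toy = {
    "sp1": "MKLVAG",
    "sp2": "MKLVAG",  # identical to sp1
    "sp3": "MKLIAG",  # 5 of 6 positions match sp1
    "sp4": "MRTIPG",  # 2 of 6 positions match sp1
}

for cutoff in (0.70, 0.90):
    names = list(greedy_nonredundant(toy, cutoff))
    print(f"cutoff {cutoff:.2f}: kept {names}")
```

In this toy case the 70% cutoff drops sp3, which differs from sp1 at one site and would carry signal among close relatives, while the 90% cutoff keeps it.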

You may be interested in selecting isoforms before building the MSA, thus obtaining better alignments that do not require as much trimming. That will not only improve your phylogenetic reconstruction but will also give you better results if you do further analyses with those MSAs, such as tests for positive selection. Reference: http://gbe.oxfordjournals.org/content/5/2/457

11.6 years ago
cts ★ 1.7k

It sounds good to me. These kinds of filtering steps can be specific to the alignment that you're using. You might need to vary the cutoffs for gaps and sequence similarity to get it right. You'll know from the topology of the tree if you've removed too many positions: removing too many variable positions will cause nodes to sit on top of each other. Based on the settings that you've used, though, I don't think that will happen.
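A quick sanity check before building the tree is to count how many variable columns remain after filtering. This is only a sketch: the function name and the toy alignments are assumptions, and "variable" here simply means at least two distinct non-gap residues in a column, which is a cruder criterion than parsimony informativeness.

```python
def count_variable_columns(seqs):
    """Count columns with at least two distinct non-gap residues."""
    n = 0
    for col in zip(*seqs):
        residues = {c for c in col if c != "-"}
        if len(residues) >= 2:
            n += 1
    return n


# Toy alignments, made up for illustration only.
before = ["MK-LVA", "MK-LVA", "MR-LIA", "----IA"]
moderate = ["MKLVA", "MKLVA", "MRLIA"]   # gappy column and sequence removed
over_trimmed = ["MLA", "MLA", "MLA"]     # variable columns trimmed away too

for label, aln in [("before", before),
                   ("moderate", moderate),
                   ("over-trimmed", over_trimmed)]:
    print(f"{label}: {count_variable_columns(aln)} variable columns")
```

If the count falls to zero or close to it, the sequences become indistinguishable and the collapsed nodes described above are the likely result.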

7.5 years ago
al-ash ▴ 210

I suggest checking this article on the benefits of MSA filtering for subsequent phylogeny reconstruction. You might decide not to filter at all after reading it :)
