How To Compute A Local Multiple Protein Sequence Alignment?
13.2 years ago
Christian ★ 3.1k

Given a set of related protein sequences based on pairwise BLAST searches, how do I construct a multiple sequence alignment that contains only the similar subsequences (presumably representing the domain) and ignores unrelated parts of the input sequences?

I assume this boils down to the problem of local multiple sequence alignment.

local multiple protein sequence domain • 7.2k views
13.2 years ago
Iain ▴ 260

You could use T-Coffee to do this: http://tcoffee.crg.cat/. By default, it combines global and local alignment, but you can choose to use only local alignment. You could also use the M-Coffee option to combine only the output of local alignment programmes.

Alternatively, you could input your own pairwise alignments and use T-Coffee to generate an optimum multiple alignment from them.

This is a wonderful resource. A first try of PSI-T-Coffee gave me very promising results. +1!

I should probably add that T-Coffee does not get rid of unrelated sequences in the alignment (as asked for in my question), but at least it seems to correctly align the related subsequences. I attribute this to the pairwise local alignment algorithms used for constructing the library.
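To illustrate what a pairwise local alignment does with unrelated flanking sequence, here is a minimal Smith-Waterman sketch. It is not T-Coffee's code: the function name and the simple match/mismatch/linear-gap scores are assumptions for illustration, whereas real tools use substitution matrices such as BLOSUM62 and affine gap penalties.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one best local alignment.

    Illustrative scoring only: identity match/mismatch and a linear gap cost.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor at 0 is what makes the alignment local:
            # a poorly scoring prefix is simply discarded.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until the score drops to 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Toy sequences: only the shared "WGHE" segment is reported,
# the unrelated flanks are left out of the alignment.
print(smith_waterman("QQQQWGHELLLL", "PPWGHETT"))
```

Because every cell is floored at zero, the unrelated ends never drag the score down, which is why a library built from such alignments can line up the related subsequences even when most of each sequence is unrelated.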

You can use the CORE option on the website to identify which parts of your sequences are well aligned. You should then refine the alignment manually by excluding those sequences that don't align well.

A multiple sequence alignment programme will only work as well as the input sequences.

I don't think "subsequences" means removal of unrelated sequences.
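If you want to cut the alignment down to its well-aligned block programmatically rather than by hand, a rough sketch is below. Note that it uses the fraction of gaps per column as a crude stand-in; it does not compute or read T-Coffee's CORE index, and the function name and the 0.5 cutoff are assumptions chosen for illustration.

```python
def trim_gappy_columns(aligned, max_gap_fraction=0.5):
    """Keep only alignment columns whose gap fraction is at most the cutoff.

    aligned: dict mapping sequence name -> gapped sequence (equal lengths).
    The gap fraction is a crude proxy for column quality, not a CORE score.
    """
    seqs = list(aligned.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    keep = [
        col
        for col in range(length)
        if sum(s[col] == "-" for s in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {name: "".join(s[col] for col in keep) for name, s in aligned.items()}


# Toy alignment: only the shared block and one mostly filled column survive.
toy = {"a": "MK--WGHE--", "b": "--LLWGHEQQ", "c": "----WGHE-R"}
print(trim_gappy_columns(toy))
```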

I regard this question as solved, because the proposed solution basically works and one can simply ignore or remove unrelated sequences that have a low consistency score.
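For anyone who wants to script that last step, here is a minimal sketch. It assumes you have already obtained one score per sequence (for example, read off the T-Coffee output by hand); parsing that output is not shown, and the function name and the threshold of 50 are hypothetical choices, not recommended values.

```python
def drop_low_scoring(aligned, scores, min_score):
    """Remove sequences scoring below min_score, then drop all-gap columns.

    aligned: dict name -> gapped sequence (equal lengths).
    scores:  dict name -> per-sequence score, supplied by the caller.
    Sequences without a score are treated as scoring 0 (an assumption).
    """
    kept = {n: s for n, s in aligned.items() if scores.get(n, 0) >= min_score}
    if not kept:
        return {}

    length = len(next(iter(kept.values())))
    # Columns that only held residues of the removed sequences are now empty.
    cols = [c for c in range(length) if any(s[c] != "-" for s in kept.values())]
    return {n: "".join(s[c] for c in cols) for n, s in kept.items()}


toy = {"a": "WGHE--", "b": "WGHE--", "junk": "----PP"}
toy_scores = {"a": 80, "b": 75, "junk": 12}  # made-up numbers
print(drop_low_scoring(toy, toy_scores, min_score=50))
```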

13.2 years ago

For protein multiple alignment, you can try COBALT: www.ncbi.nlm.nih.gov/tools/cobalt/

13.2 years ago
Adrian ▴ 700

The gaps inserted in a multiple sequence alignment allow it to 'ignore' the unrelated parts of the input sequences. So your concern about making alignments local is already implicitly addressed, though depending on what you're after you might want to fiddle with the penalties.

Another software option is Clustal Omega / ClustalW, from http://clustal.org

ClustalW failed miserably on my sequences, which is why I posted this question. In the case of ClustalW, the attempted global alignment is badly misled by the large amount of unrelated sequence data (my sequences show only weak similarity over about 20% of their length). Fiddling with the gap penalties did not improve the alignment significantly.
