How can I identify regions that are exclusive to some groups in a protein alignment?
11 months ago

Hi,

I made a protein alignment and then built a phylogenetic tree, which separated the sequences into two groups. Is there a program that reports which regions are exclusive to each group, in other words, which regions are responsible for separating the proteins into the two groups?

Thank you

protein alignment • 890 views

Basically, I want to be able to see the regions that the alignment algorithms use to group protein or DNA sequences.


I'm not sure exactly what you are looking for. The homologous regions will differ from alignment to alignment, and they become apparent once you run the MSA. To get meaningful results, make sure the sequences you are aligning actually share some homology. The idea is that the sequences in each clade shared a common ancestor in the past and have diverged over time into the extant sequences you see in the alignment.


Thank you for your reply. There are conserved sequences, and they share homology, with roughly 70% similarity. I'm analysing the kinase domain of 10 plant receptors, which group into two major clades. I'm looking for a method or program that would identify the main region of the kinase domain responsible for the separation of these two groups, something more objective than going through the sequences by eye. Do you have any ideas?

10 months ago
Michael 55k

First off, there might not be any insertion sequences that are present in only one group. Such sequences would create gaps in the alignment, and gapped columns might have been ignored when the tree was built. If you just want to look for conserved sequences, open the alignment in a viewer like Jalview or MEGA and sort the rows by the tree, then visually inspect the alignment for regions that are conserved within each group. Another possibility is to calculate the consensus sequence for each group and compare the two. You could also extract the variable sites (those that are not identical across all taxa), since only these can be informative for the tree.
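To make the consensus idea concrete, here is a minimal Python sketch. It assumes you have already split the aligned sequences into two lists by clade, all of the same length; the sequences and the function names (`group_consensus`, `diagnostic_sites`) are made up for illustration and do not come from any particular package. It builds a majority consensus per group and lists the columns that are fully conserved within each group but differ between them.

```python
from collections import Counter

def column_consensus(column, gap="-"):
    """Most common non-gap residue in a column; the gap character if all are gaps."""
    residues = [c for c in column if c != gap]
    if not residues:
        return gap
    return Counter(residues).most_common(1)[0][0]

def group_consensus(seqs):
    """Majority consensus of a list of aligned, equal-length sequences."""
    return "".join(column_consensus(col) for col in zip(*seqs))

def diagnostic_sites(group_a, group_b):
    """Columns (0-based) that are invariant within each group but differ between groups."""
    sites = []
    for i, (col_a, col_b) in enumerate(zip(zip(*group_a), zip(*group_b))):
        set_a, set_b = set(col_a), set(col_b)
        if len(set_a) == 1 and len(set_b) == 1 and set_a != set_b:
            sites.append((i, col_a[0], col_b[0]))
    return sites

# Toy aligned sequences (hypothetical, not real kinase domains)
clade_a = ["MKVLAGDS", "MKVLAGDS", "MKVIAGDS"]
clade_b = ["MRVLSGDT", "MRVLSGDT", "MRVLSGDT"]

print(group_consensus(clade_a))
print(group_consensus(clade_b))
print(diagnostic_sites(clade_a, clade_b))
```

Note that a column that is all gaps in one group and a single residue in the other is also reported, which would flag a group-specific insertion. Requiring strict invariance within each group is a deliberately conservative choice; with only 10 sequences you may want to relax it.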

Another, possibly more viable, approach is to run ancestral sequence reconstruction on the tree with FastML and then compare the reconstructed ancestral sequences at the root of each of the two clades.
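Once you have the two ancestral sequences in aligned form, the comparison itself is simple. The sketch below is only an illustration: the two strings are invented, not FastML output, and `differing_regions` with its `max_gap` parameter is a hypothetical helper. It collects the positions where the sequences differ and merges nearby differences into contiguous regions.

```python
def differing_regions(seq_a, seq_b, max_gap=0):
    """Return (start, end) 0-based inclusive ranges where two aligned sequences differ.

    Differences separated by at most `max_gap` identical positions are merged
    into a single region.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    regions = []
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a == b:
            continue
        if regions and i - regions[-1][1] <= max_gap + 1:
            regions[-1][1] = i  # extend the current region
        else:
            regions.append([i, i])  # start a new region
    return [tuple(r) for r in regions]

# Invented example sequences standing in for the two clade ancestors
anc_a = "MKVLAGDSTT"
anc_b = "MRVLSADSTA"

print(differing_regions(anc_a, anc_b))
print(differing_regions(anc_a, anc_b, max_gap=2))
```

A larger `max_gap` gives fewer, longer regions, which may be closer to what you mean by "the main region" of the kinase domain.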


Thank you for your reply. There are conserved sequences. I'm analysing the kinase domain of 10 plant receptors. They are roughly 70% similar and group into two major clades. I'm looking for a method or program that would identify the main region of the kinase domain responsible for the separation of these two groups, something more objective than going through the sequences by eye. The site that you recommended is down, but thank you anyway.
