How to determine consensus sequence and conservation score for each protein?
6.6 years ago
Kian ▴ 50

How can I determine the consensus sequence and conservation score ("percent conservation") for some proteins? I have some genes and I want to determine these. I know this can be done in CLC using Create Alignment, but I want another tool for this work. Thanks

protein consensus sequence conservation score • 5.4k views

If these are known proteins, then you could take a look at the HomoloGene database at NCBI. That should give you pre-computed alignments of homologous sequences.

A BLAST search, followed by extraction of the related sequences and a multiple sequence alignment (using Clustal Omega, MAFFT, T-Coffee or MUSCLE), would get you the alignments/consensus.

6.6 years ago
Joe 21k

I wrote a pretty basic script which is just a wrapper to BioPython's dumb consensus method.

https://github.com/jrjhealey/bioinfo-tools/blob/master/Consensus.py
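To make the idea concrete, here is a minimal pure-Python sketch of a threshold consensus over an existing alignment. It is not the code from the linked script; it only follows the same general idea as a simple consensus call (take the most common residue in each column if it is frequent enough, otherwise emit an ambiguity character). The function name `threshold_consensus`, the default threshold of 0.7 and the toy alignment are assumptions for illustration.

```python
from collections import Counter


def threshold_consensus(aligned, threshold=0.7, ambiguous="X", gaps="-."):
    """Return a consensus string for equal-length aligned sequences.

    Hypothetical helper: a column contributes its most common residue only
    if that residue is unique at the top and reaches `threshold` among the
    non-gap characters; otherwise `ambiguous` is emitted.
    """
    if not aligned:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")

    consensus = []
    for column in zip(*aligned):
        # Count residues in this column, ignoring gap characters.
        counts = Counter(res.upper() for res in column if res not in gaps)
        if not counts:
            consensus.append(ambiguous)  # column contains only gaps
            continue
        ranked = counts.most_common()
        top_residue, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        frequent_enough = top_count / sum(counts.values()) >= threshold
        consensus.append(top_residue if frequent_enough and not tied else ambiguous)
    return "".join(consensus)


# Toy alignment (made up for illustration, not real protein data).
msa = ["MKT-LV", "MKTALV", "MRTALI"]
print(threshold_consensus(msa, threshold=0.6))
print(threshold_consensus(msa))
```

With only three sequences, a 2-of-3 majority is about 0.67, so a threshold of 0.7 marks those columns as ambiguous while 0.6 accepts them.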

I'd maybe look at using HMMER (hmmemit) if you want a very sensitive consensus, though I think it tends to work better if there are more sequences in the multiple alignment (but then again, that's true for most MSAs).

Just to be clear, the expected input for your script is a multiple sequence alignment (MSA). So @Kian needs to provide that.

Yep, I assumed that's what they were getting at by saying they have "some sequences" and they are "using create alignment".

i want another tools for this work

Not sure if that wording indicates a desire to not use CLC (or a pre-existing alignment). We will have to wait for OP to clarify.

Thanks for the responses. If I have 3 protein sequences, should I align them to get the consensus sequence? What about the conservation score?

You will need to elaborate on what you mean by conservation score. Something like average global pairwise sequence identity?
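As one possible reading of "average pairwise identity", the sketch below computes percent identity for every pair of sequences in an existing alignment and averages the results. The function names and the toy alignment are hypothetical, and the denominator used here (columns where neither sequence has a gap) is only one of several conventions, so numbers from other tools may differ.

```python
from itertools import combinations


def pairwise_identity(a, b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x.upper() == y.upper():
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_pairwise_identity(aligned):
    """Average percent identity over all unordered pairs of sequences."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two aligned sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy alignment (made up for illustration, not real protein data).
msa = ["MKT-LV", "MKTALV", "MRTALI"]
print(round(mean_pairwise_identity(msa), 2))
```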

Maybe, yes. I want to compare the amino acid sequences of the three proteins that I have, obtain the consensus sequence, and generate homology models for these proteins.

If you align them with a tool like Clustal Omega, it outputs scores for all the pairwise combinations, which you could use. I have done that in the past. It would also generate a decent MSA that you can use to build the consensus sequence.
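For a local install, a command along these lines should produce both the alignment and a pairwise percent-identity matrix. This is a sketch that assumes the `clustalo` command-line program is on your PATH and supports the `--distmat-out`, `--percent-id` and `--full` options; the helper name and all file names are hypothetical placeholders.

```python
def clustalo_command(fasta_in, aln_out, pid_out):
    """Build a Clustal Omega command line (argument list for subprocess)."""
    return [
        "clustalo",
        "-i", fasta_in,                # unaligned protein sequences (FASTA)
        "-o", aln_out,                 # alignment output file
        "--outfmt=clu",                # write the alignment in Clustal format
        "--distmat-out=" + pid_out,    # pairwise matrix output file
        "--percent-id",                # report percent identities, not distances
        "--full",                      # full matrix, needed for --distmat-out
        "--force",                     # overwrite existing output files
    ]


# Hypothetical file names; replace them with your own.
cmd = clustalo_command("proteins.fasta", "proteins.aln", "proteins.pim")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the input file exists; check `clustalo --help` for the options available in your version.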

I looked at Clustal Omega, and it can align the amino acid sequences of the proteins. Should I build the consensus sequence from the amino acids shared between the 3 sequences (if 2 of the 3 sequences have the same amino acid at an aligned position, can it go in the consensus)? And how can I calculate the conservation score?

I just told you, Clustal outputs that information when you do the alignment itself.

You should really use more than 3 sequences if you want a decent consensus though, but yes, I suppose you could use 2 out of 3.

The problem you’ll have is what if all 3 sequences have different amino acids? This is why you should consider using an HMM for this.
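To see where a 2-out-of-3 rule breaks down, the sketch below reports, for each alignment column, the most common residue and the percentage of sequences that carry it, and marks columns with no single winner as "X". It is a simple illustration of "percent conservation", not what Clustal or HMMER compute; the function name and toy alignment are assumptions.

```python
from collections import Counter


def column_conservation(aligned, gap="-", undecided="X"):
    """Return a (residue, percent) pair for each alignment column.

    `percent` is the share of all sequences carrying the top residue, so
    gaps lower the score. Columns where the top count is tied (for example
    three different residues in three sequences) get `undecided`.
    """
    n = len(aligned)
    scores = []
    for column in zip(*aligned):
        counts = Counter(res for res in column if res != gap)
        if not counts:
            scores.append((gap, 0.0))  # column contains only gaps
            continue
        ranked = counts.most_common(2)
        residue, count = ranked[0]
        if len(ranked) == 2 and ranked[1][1] == count:
            residue = undecided  # no single majority residue
        scores.append((residue, 100.0 * count / n))
    return scores


# Toy alignment (made up): column 3 has three different residues.
msa = ["MKTAL", "MKSAL", "MRGAL"]
scores = column_conservation(msa)
for position, (residue, percent) in enumerate(scores, start=1):
    print(position, residue, round(percent, 1))
print("mean conservation:", round(sum(p for _, p in scores) / len(scores), 1))
```

With more sequences in the alignment, fewer columns end up undecided, which is the practical reason to go beyond three sequences.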
