I am working with a protein sequence file, for example:
5 592
Homo_sapie MEMQDLTSPH SRLSGSSESP SGPKLGNSHI NSNSMTPNGT EVKTEPMSSS
Macaca_mul MEMQDLTSPH SRLSGSSESP SGPKLDNSHI NSNSMTPNGT EVKTEPMSSS
Mus_muscul MEMQDLTSPH SRLSGSSESP SGPKLDSSHI NSTSMTPNGT EVKTEPMSSS
Danio_reri ---------- ---------- ---------- ---------M SWILMWSLLS
Ciona_inte ---------- ---------- ---------- ------MLFS VYIVMMIVTS
My question is whether anyone has information about physicochemically similar amino acids. Some amino acids are considered similar because they share physicochemical properties, and sequence alignments take this similarity into account. What I want is the list of amino acids that fall into each such category.
For example, in a Clustal-format file of aligned sequences, how are the amino acids categorized as [*, ., :]? I only know that * means the aligned amino acids are identical. Which amino acids fall under the . or : category?
Homo ESPSGPKLGNSHINSNSMTPNGTEVKTEPMSSSETASTTADGSLNNFSGSAIGSSSFSPR
Macaca ESPSGPKLDNSHINSNSMTPNGTEVKTEPMSSSETASTTADGSLDNFSGSAIGSSNFSPR
Canis ESPSGPKLDNSHRNSNSMTPNGTEVKTEPMSSSEIVSTTADGSLDNFSGSAIGSSSFSPR
Mus ESPSGPKLDSSHINSTSMTPNGTEVKTEPMSSSEIASTAADGSLDSFSGSALGSSSFSPR
Rattus ESPSGPKLDSSHINSTSMTPNGTEVKTEPMSSSEIASTAADGSLDSFSGSALGSSSFSPR
********..** **.****************** .**:*****:.*****:***.****
This information can be found in any basic biochemistry book (and by doing a Google search).
https://biology.stackexchange.com/questions/71272/reading-an-amino-acid-physicochemical-properties-diagram
So far I have only found the properties of amino acids, not a list of which ones are considered similar. Do you know of any helpful material?
Alright, but I still do not understand what I need to know.
From clustal help page:
Yes, I understand, but doesn't it mean that when certain specific amino acids are aligned, they are considered strongly/weakly conserved?
For example, if the same amino acids are aligned, this means full conservation and a (*) symbol is placed there. Then maybe there are other amino acids which are considered strongly/weakly conserved residues.
You are looking for an amino acid similarity matrix, e.g. http://2.bp.blogspot.com/-hxhLatiONEk/U0sIZkl8GpI/AAAAAAAAAL0/tYMbcjVKzoY/s1600/_17545_tabular891.gif
In amino acid alignments, it's not as simple as saying that a match is perfect or not, because different residues can play the same role. For example, tryptophan, phenylalanine, and tyrosine are all aromatic because they contain an aromatic ring, which means that in a protein structure all 3 amino acids are capable of forming pi-pi stacking interactions, which may be critical to a protein's structure. Any one of those 3 amino acids might therefore occupy that position, and you'd score proteins carrying any of these 3 amino acids there as 'more similar' to each other than to one which puts a lysine at that site. This is just one very specific example; amino acids can be similar or different in many, many ways.
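To make the scoring idea concrete, here is a minimal sketch. The numbers are a small excerpt hand-copied from the BLOSUM62 substitution matrix, which is my own choice for illustration and not the matrix linked in this thread; `SCORES` and `pair_score` are hypothetical names.

```python
# Excerpt of BLOSUM62 substitution scores (assumption: hand-copied subset,
# used only to illustrate the idea; other matrices give different numbers).
SCORES = {
    ("F", "F"): 6, ("W", "W"): 11, ("Y", "Y"): 7, ("K", "K"): 5,
    ("F", "Y"): 3, ("F", "W"): 1, ("W", "Y"): 2,     # aromatic vs aromatic
    ("F", "K"): -3, ("K", "W"): -3, ("K", "Y"): -2,  # aromatic vs lysine
}


def pair_score(a, b):
    """Look up the symmetric substitution score for residues a and b."""
    key = tuple(sorted((a.upper(), b.upper())))
    return SCORES[key]


if __name__ == "__main__":
    for a, b in [("F", "Y"), ("F", "W"), ("W", "Y"), ("F", "K")]:
        print(a, b, pair_score(a, b))
```

In this excerpt every aromatic/aromatic substitution scores above zero and every aromatic/lysine substitution scores below zero, which is the 'more similar' ranking described above.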
So does it mean that whether amino acids are weakly or strongly similar is decided on the basis of the alignment score, and that the symbols (. or :) are assigned according to that scoring? Am I getting it right?
Are you reading what people are posting here?
Well, of course I am reading everything and trying to be responsive. I am very thankful to everyone who is so ready to help every time and gives very useful suggestions.
(About the above question: I was quite confused about the concept, which is why I was not getting the point.)
It is not based on the alignment score; the alignment score is based on the similarities identified and the extent of the alignment. The amino acid similarity score precedes the alignment score, as far as I’m aware.
This is the matrix on which Clustal bases its decisions about “strongly” or “weakly” similar residues. Combine this with the info in genomax’s post:
https://slideplayer.com/slide/1661678/7/images/44/The+PAM+250+Scoring+Matrix.jpg
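For what it's worth, here is a rough sketch of how the consensus line could be reproduced. The "strong" and "weak" residue groups are the ones I recall from the Clustal W/X documentation (derived there from a Gonnet PAM250 matrix), so treat them as an assumption and check them against your Clustal version. The names `column_symbol` and `consensus_line` are hypothetical, and the gap handling is my own simplification.

```python
# Residue groups as I recall them from the Clustal W/X documentation
# (assumption: verify against the version you are using).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column):
    """Return '*', ':', '.' or ' ' for one alignment column."""
    residues = {r.upper() for r in column}
    if "-" in residues:          # simplification: any gap means no symbol
        return " "
    if len(residues) == 1:       # every sequence has the same residue
        return "*"
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"               # all residues fit inside one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."               # all residues fit inside one weak group
    return " "


def consensus_line(sequences):
    """Build the consensus line for equal-length aligned sequences."""
    return "".join(column_symbol(col) for col in zip(*sequences))


if __name__ == "__main__":
    # The five aligned sequences from the question.
    aligned = [
        "ESPSGPKLGNSHINSNSMTPNGTEVKTEPMSSSETASTTADGSLNNFSGSAIGSSSFSPR",
        "ESPSGPKLDNSHINSNSMTPNGTEVKTEPMSSSETASTTADGSLDNFSGSAIGSSNFSPR",
        "ESPSGPKLDNSHRNSNSMTPNGTEVKTEPMSSSEIVSTTADGSLDNFSGSAIGSSSFSPR",
        "ESPSGPKLDSSHINSTSMTPNGTEVKTEPMSSSEIASTAADGSLDSFSGSALGSSSFSPR",
        "ESPSGPKLDSSHINSTSMTPNGTEVKTEPMSSSEIASTAADGSLDSFSGSALGSSSFSPR",
    ]
    print(consensus_line(aligned))
```

With these groups, a column such as T/A gets ":" (both are in the strong group STA), a column such as G/D gets "." (both are in the weak group SGND), and a column such as I/R gets a blank.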
There are many ways to cluster amino acids based on multiple physicochemical properties. If you Google "amino acid classification", or check the chapter on amino acids in any popular biochemistry textbook, you will be able to read up on this.