How to extract the probability of replacing one amino acid by another from a BLOSUM matrix?
10.5 years ago
shark_1 ▴ 10

Every element of the BLOSUM matrix is computed by the formula (from Wikipedia):

S_{ij}= \left( \frac{1}{\lambda} \right)\log{\left( \frac{p_{ij}}{q_i * q_j} \right)}

p_{ij} is the probability of two amino acids i and j replacing each other in a homologous sequence, and q_i and q_j are the background probabilities of finding the amino acids i and j in any protein sequence. The factor \lambda is a scaling factor.
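Rearranging the formula gives the probabilities in terms of the scores. This is only a sketch under two assumptions that are not stated in the question: the logarithm is taken as the natural log (for another base b, replace e by b), and p_{ij} is treated as a symmetric joint probability over ordered pairs, so that \sum_j p_{ij} = q_i.

```latex
p_{ij} = q_i \, q_j \, e^{\lambda S_{ij}}
\qquad\Longrightarrow\qquad
P(j \mid i) = \frac{p_{ij}}{q_i} = q_j \, e^{\lambda S_{ij}}
```

So the score alone is not enough: recovering p_{ij} or the conditional probability P(j | i) also requires the background probabilities q_i and q_j.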

I would like to compute the probability of replacing one amino acid i by another amino acid j. The BLOSUM matrix is implemented in the Biopython module, but unfortunately I have not found the probabilities q_i and q_j there. Is there an easy way to obtain or compute them?
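Numerically, going from a score back to a probability could look like the minimal sketch below. The function names are hypothetical, and the default scaling factor assumes the matrix is in half-bit units (lambda = ln(2)/2), which is the usual convention for BLOSUM62 but should be checked for the matrix you use. Published scores are rounded to integers, so the result is only approximate. In practice the score would be read from the matrix (for example from the one shipped with Biopython) and the q values supplied separately.

```python
import math

# Assumption: scores are in half-bit units, i.e. S = 2 * log2(p / (q_i * q_j)).
HALF_BIT_LAMBDA = math.log(2) / 2


def pair_probability(score, q_i, q_j, lam=HALF_BIT_LAMBDA):
    """Joint probability p_ij implied by a log-odds score S_ij."""
    return q_i * q_j * math.exp(lam * score)


def replacement_probability(score, q_j, lam=HALF_BIT_LAMBDA):
    """Conditional probability P(j | i) = p_ij / q_i = q_j * exp(lam * S_ij)."""
    return q_j * math.exp(lam * score)


# Illustrative inputs only: an example score and example background frequencies.
example_score = -1
q_ala, q_arg = 0.074, 0.042
p_joint = pair_probability(example_score, q_ala, q_arg)
p_ala_to_arg = replacement_probability(example_score, q_arg)
print(f"p_ij = {p_joint:.5f}, P(Arg | Ala) = {p_ala_to_arg:.5f}")
```

A negative score gives a joint probability below q_i * q_j, meaning the pair is seen less often than expected by chance; a score of zero gives exactly q_i * q_j.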

biopython BLOSUM • 4.1k views
10.5 years ago
Hugues ▴ 250

The background probabilities are the probabilities of occurrence of each amino acid.

These are observed probabilities. You can gather a set of representative proteins for your particular organism and count how often each amino acid occurs. Just to give an idea:

AA observed probabilities in vertebrates

Alanine          7.4%
Arginine         4.2%
Asparagine       4.4%
Aspartic Acid    5.9%
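The counting approach described above can be sketched in a few lines. The function name and the toy sequences are hypothetical; a real estimate would use a large, representative set of proteins, and this version simply ignores any character that is not one of the 20 standard amino acids.

```python
from collections import Counter

STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def background_frequencies(sequences):
    """Estimate q_i as the fraction of residues that are amino acid i."""
    counts = Counter()
    for seq in sequences:
        # Count only standard one-letter codes; skip X, gaps, etc.
        counts.update(aa for aa in seq.upper() if aa in STANDARD_AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found in the input")
    return {aa: counts[aa] / total for aa in STANDARD_AMINO_ACIDS}


# Toy input for illustration only, not real proteins.
toy_proteins = ["MKTAYIAKQR", "MAAARNDW"]
q = background_frequencies(toy_proteins)
print({aa: round(freq, 3) for aa, freq in q.items() if freq > 0})
```

The returned dictionary always has all 20 keys, so it can be used directly as the q_i and q_j in the log-odds formula.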

Update1: You will therefore get a different BLOSUM matrix for each organism, and also if you are interested in comparing proteins that have non-standard compositions (see this paper for instance)

About your question: if you really want, you could try to compute them. For example, there is only one codon for Tryptophan (UGG), while there are three codons for Isoleucine (AUU, AUA, AUC). Therefore we could say that Trp is about three times less likely than Ile. Knowing the nucleotide frequencies (22.0% U, 30.3% A, 21.7% C, and 26.1% G), you could in principle compute the probabilities in silico. Of course, Nature works differently and you should expect discrepancies with the observed probabilities, in particular for Arginine, which does not follow those rules at all.
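The in-silico estimate described above can be sketched as follows. The assumptions are mine: the standard genetic code, independent bases at the three codon positions, and stop codons excluded before normalizing. The function name is hypothetical, and the input frequencies are rescaled to sum to 1 because the quoted percentages add up to 100.1%.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order ('*' marks a stop).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codon_based_frequencies(base_freqs):
    """Amino acid probabilities implied by base composition alone."""
    total = sum(base_freqs.values())
    f = {base: value / total for base, value in base_freqs.items()}
    aa_prob = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # stop codons do not encode an amino acid
        p = f[codon[0]] * f[codon[1]] * f[codon[2]]
        aa_prob[aa] = aa_prob.get(aa, 0.0) + p
    coding = sum(aa_prob.values())
    return {aa: p / coding for aa, p in aa_prob.items()}


freqs = codon_based_frequencies({"U": 0.220, "A": 0.303, "C": 0.217, "G": 0.261})
print(f"Ile/Trp ratio: {freqs['I'] / freqs['W']:.2f}")
```

With these base frequencies the Ile/Trp ratio comes out at roughly 3.3 rather than exactly 3, because the bases are not equally frequent. Comparing the output with observed frequencies shows where this simple model breaks down.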

Ref


Please upvote and "accept" if this answered your question.
