blastp threshold value does not seem to work
3.1 years ago

Hello folks,

I have heard that blastp truncates the query sequence even when there is an exact match. Let me illustrate. Let's say my query is "ABCDEFG" and it matches the subject from D to G. The preceding letters do not meet the threshold and get dropped, so my result becomes:

query:    DEFG
subject:  DEFG

But here is the thing: I was told that when the subject sequence is inspected, it is actually CDEFG, but somehow the C gets dropped.

So I was trying to simulate a situation like this and came across something I cannot understand. My input query is LNRNQPAATALANTIE, searched against the pdbaa database, and the command I am using is blastp -query query.txt -db pdbaa -taxidlist negative.list -matrix PAM30 -word_size 2 -threshold 21. It gives me the output below.

Query  3   RNQPAATALANTI  15
           R QP AT    TI
Sbjct  45  RSQPEATNASQTI  57 

I even set the threshold to 2000, but I keep receiving this alignment. I was wondering if any of you could enlighten me?

Thank you

threshold blastp word_size • 1.2k views
3.1 years ago

First, blast is a local aligner: it is not a threshold that drops sections from sequences, it is the scoring. The aligner will produce the local alignment that has the maximal score.
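To make the point about local scoring concrete, here is a minimal Python sketch. It is not how blast is implemented: it assumes the two sequences are already lined up position by position with no gaps, and it uses made-up match/mismatch scores (+2 / -3) rather than a real substitution matrix. The function name best_local_segment is hypothetical. It only shows that the maximal-scoring segment leaves out flanking mismatches because they would lower the score, not because of any threshold.

```python
def best_local_segment(query, subject, match=2, mismatch=-3):
    """Return (start, end, score) of the highest-scoring ungapped segment.

    Assumes query and subject are already aligned position by position
    (same length, no gaps). match/mismatch are toy scores, not a real matrix.
    """
    if len(query) != len(subject):
        raise ValueError("sequences must have the same length")
    best = (0, 0, 0)              # the empty segment scores 0
    run_start, run_score = 0, 0
    for i, (q, s) in enumerate(zip(query, subject)):
        run_score += match if q == s else mismatch
        if run_score <= 0:
            # A non-positive prefix can never improve a later segment,
            # so start a fresh segment after this position.
            run_start, run_score = i + 1, 0
        elif run_score > best[2]:
            best = (run_start, i + 1, run_score)
    return best


# Subject differs in the first three positions: only DEFG is kept.
start, end, score = best_local_segment("ABCDEFG", "XYZDEFG")
print("ABCDEFG"[start:end], score)

# Subject also matches at C: under this toy scoring the C is kept as well.
start, end, score = best_local_segment("ABCDEFG", "XYCDEFG")
print("ABCDEFG"[start:end], score)
```

Under these toy scores a matching C raises the segment score, so a maximal-scoring local alignment would include it.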

Also, don't confuse the blast search strategy with the properties of the alignments produced by a strategy.

The words and thresholds describe the search properties: how blast goes about searching for alignments. Various choices there can speed up or slow down the search. They may also have some effect on which hits are found, but not directly in the way you describe.


Thanks for the reply.

Doesn't the score have to go above the threshold before the alignment gets extended? To my knowledge, with a word size of 2, each pair of amino acids is scored, and if the score is not above the threshold the word is dropped and the search moves on to the next one. In this case,

the RN - RS score is 8, which does not pass the threshold, so it moves on to the next one, NQ - SQ, and so on. None of the 2-letter word scores is above the threshold, so the above result should not be presented, no?
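The per-word calculation described above can be written out as a short Python sketch. The names word_pair_scores and toy_score are hypothetical, and toy_score is a stand-in (+8 for identical residues, 0 otherwise), not PAM30; a real substitution matrix would have to be supplied as the score function to get the actual values. The sketch only tabulates word scores along the reported alignment and says nothing about which alignments blastp will or will not report.

```python
def word_pair_scores(query, subject, word_size, score):
    """Score each aligned pair of words of length word_size.

    query and subject are the two aligned strings (same length, no gaps);
    score(a, b) returns the substitution score for residues a and b.
    """
    results = []
    for i in range(len(query) - word_size + 1):
        q_word = query[i:i + word_size]
        s_word = subject[i:i + word_size]
        total = sum(score(a, b) for a, b in zip(q_word, s_word))
        results.append((q_word, s_word, total))
    return results


def toy_score(a, b):
    # Hypothetical scoring for illustration only; this is NOT PAM30.
    return 8 if a == b else 0


query = "RNQPAATALANTI"      # aligned query segment from the original post
subject = "RSQPEATNASQTI"    # aligned subject segment from the original post
threshold = 21               # value passed to -threshold in the post

pairs = word_pair_scores(query, subject, 2, toy_score)
passing = [p for p in pairs if p[2] >= threshold]

for q_word, s_word, total in pairs:
    print(q_word, s_word, total)
print("word pairs at or above threshold:", passing)
```

If Biopython is available, the toy function could be replaced by a lookup into its PAM30 matrix; that substitution is left out here so the sketch runs with the standard library alone.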


It is not the alignment score, it is the word extension scoring.

Just because an algorithm starts the search in one location versus another does not mean that it won't find certain alignments. It still finds them, just on a different path.

As I said before, it is not a parameter that we usually set to control the resulting alignments; it is a parameter that controls the speed of the search.


Hello again Istvan,

I have been trying to understand what you have explained. Thank you for that.

There is still something that is not clear to me. I ran blast with the parameters -matrix PAM30 -word_size 2 -threshold 21, and I understand that the threshold controls the speed of the search, as you said. However, shouldn't the matching alignment (given in the original post) be skipped, as it does not meet my search criteria? As far as I know, blast scores each word of length word_size using a substitution matrix, and only when that score is above the defined threshold does it extend the alignment, for as long as the score stays positive.

As explained here.

Additionally, when I add window_size 3 and set the threshold to 9, the given alignment disappears, and it comes back when the threshold is below 9. On the other hand, with window_size 4 the threshold does not make any difference: no matter how low or high I set it, the alignment always appears.
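One way to check systematically how the hit responds to these settings is to run the same search over a grid of values and compare the outputs. The Python sketch below only builds the command lines by default. It reuses the flags from the command in the original post and varies -word_size and -threshold; the output file names and the helper names blastp_command and sweep are hypothetical. Actually running it (run=True) assumes BLAST+ is installed and that query.txt, negative.list and the pdbaa database are available in the working directory.

```python
import itertools
import subprocess


def blastp_command(word_size, threshold, out_path):
    """Build the blastp call from the original post with two settings varied."""
    return [
        "blastp",
        "-query", "query.txt",
        "-db", "pdbaa",
        "-taxidlist", "negative.list",
        "-matrix", "PAM30",
        "-word_size", str(word_size),
        "-threshold", str(threshold),
        "-out", out_path,
    ]


def sweep(word_sizes, thresholds, run=False):
    """Return one command per (word_size, threshold) pair; optionally run them."""
    commands = []
    for w, t in itertools.product(word_sizes, thresholds):
        out_path = f"hits_w{w}_t{t}.txt"   # hypothetical output file name
        cmd = blastp_command(w, t, out_path)
        if run:
            # Requires BLAST+ on PATH; raises CalledProcessError on failure.
            subprocess.run(cmd, check=True)
        commands.append(cmd)
    return commands


# Dry run: print the commands without executing blastp.
for cmd in sweep([2, 3], [9, 21]):
    print(" ".join(cmd))
```

Comparing the resulting files side by side shows at which settings the alignment appears or disappears, without relying on memory of individual runs.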
