My understanding is that the bias value in hmmscan output is supposed to range between 0 and 1:
Bias - The bias composition correction (ranging between 0 and 1), is the bit score difference contributed by the null2 model. High bias scores may be a red flag for a false positive. It is difficult to correct for all possible ways in which nonrandom but nonhomologous biological sequences can appear to be similar, such as short-period tandem repeats, so there are cases where the bias correction is not strong enough (creating false positives).
However, for many hits against the PfamA database, I see biases above 1 in both the "full sequence" and "this domain" categories. Am I missing something?
updated 2.7 years ago by
Ram
44k
•
written 10.0 years ago by
pld
5.1k
From the user guide:
The next number, the bias, is a correction term for biased sequence composition that has been applied to the sequence bit score. For instance, for the top hit MYG_PHYCA that scored 222.7 bits, the bias of 3.2 bits means that this sequence originally scored 225.9 bits, which was adjusted by the slight 3.2 bit biased-composition correction. The only time you really need to pay attention to the bias value is when it's large, on the same order of magnitude as the sequence bit score.
After reading this part, it makes more sense why the bias value can be above 1, but now I'm not sure why the documentation on the webpage says it ranges between 0 and 1.
The user guide example and definition are correct. The bias field in HMMER results is the number of bits by which the bias-composition correction changed the bit score, so it can be greater than 1.
I just heard back from them; they are fixing the website now.
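For anyone who wants to check their own hits, here is a rough sketch of the arithmetic from the user guide example: add the bias back to the reported score to get the uncorrected score, and flag hits whose bias is comparable to the score. It assumes the `--tblout` column order (full-sequence score and bias as the 6th and 7th whitespace-separated fields). The function names and the 0.5 ratio are arbitrary choices for illustration, not anything HMMER defines. The sample line is made up apart from the 222.7 score and 3.2 bias quoted above.

```python
def parse_tblout_line(line):
    """Parse one non-comment line of a --tblout file (assumed column order)."""
    fields = line.split(maxsplit=18)  # the last field is a free-text description
    return {
        "target": fields[0],
        "query": fields[2],
        "score": float(fields[5]),  # full-sequence bit score, after correction
        "bias": float(fields[6]),   # full-sequence bias correction, in bits
    }


def uncorrected_score(hit):
    """Bit score before the bias correction was subtracted."""
    return hit["score"] + hit["bias"]


def bias_is_large(hit, ratio=0.5):
    """True if the bias is at least `ratio` times the score (hypothetical cutoff)."""
    return hit["bias"] > 0 and hit["bias"] >= ratio * abs(hit["score"])


# Illustrative line: only the score (222.7) and bias (3.2) come from the user guide.
sample = ("MYG_PHYCA - globins4 - 1e-60 222.7 3.2 1e-60 222.5 3.2 "
          "1.0 1 0 0 1 1 1 1 placeholder description")

hit = parse_tblout_line(sample)
print(hit["target"], round(uncorrected_score(hit), 1), bias_is_large(hit))
```

With a real file you would skip lines starting with `#` before calling `parse_tblout_line`. A hit with, say, a score of 20 and a bias of 15 would be flagged, while the example above would not.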