bbmap.sh can take both minid and minratio. Which parameter is prioritized, since they both filter out reads? Suppose you had minratio=0.56 (as in bbsplit.sh) and also minid=0.76. Which one would be prioritized?
In an e-mail exchange with the author, Brian Bushnell:
Sorry for the confusion; they set the same thing, just with different units. "minratio" is the internal number using the internal scoring system; if you use the "minid" flag it converts it to the equivalent minratio. So if you set "minid=0.76" then it will be converted to approximately "minratio=0.56" and then the minratio variable will be set to 0.56. So there is no reason to set both of them. If you do set both of them, the last value will be used. I recommend minid over minratio because it is easier to read, but they accomplish the same thing.
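To illustrate the "last value wins" behaviour described in the e-mail, here is a minimal Python sketch. It is not BBMap's actual parser: the function name `resolve_minratio` and the lookup table are hypothetical. The e-mail does not give the real minid-to-minratio formula, so the conversion is stubbed with the single data point it mentions (minid=0.76 is roughly minratio=0.56).

```python
def resolve_minratio(args, minid_to_minratio):
    """Process key=value arguments in order; a later flag overwrites an earlier one."""
    minratio = None
    for arg in args:
        key, _, value = arg.partition("=")
        if key == "minratio":
            # Already in internal-score units, so store it directly.
            minratio = float(value)
        elif key == "minid":
            # Converted to the same internal variable, as the e-mail describes.
            minratio = minid_to_minratio(float(value))
    return minratio


# Assumption: stand-in for the real conversion, which is not stated in the e-mail.
KNOWN_POINTS = {0.76: 0.56}


def lookup(minid):
    return KNOWN_POINTS[minid]


print(resolve_minratio(["minid=0.76", "minratio=0.40"], lookup))  # 0.4
print(resolve_minratio(["minratio=0.40", "minid=0.76"], lookup))  # 0.56
```

Under this model neither flag has priority as such: both write to the same variable, and whichever appears last on the command line determines the value.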
"minid" is approximate since it adjusts minratio, or the ratio of the maximum possible internal score, but the internal score is not directly related to percent identity because of different penalties for substitutions, deletions, and insertions, which vary by length. If you need an exact %identity cutoff, use "idfilter=0.76" which will eliminate all alignments below 76% identity, as a postfilter. "minratio/minid" actually affect mapping, and prevent the creation of any alignments with an internal score below "minratio". As such, a high value of minid will make the program faster and a low value will make it slower. Whereas "idfilter" does not affect speed.
That said, I don't recommend using "idfilter" unless you have a good reason (like, a paper with a graph showing %identity cutoffs, for clarity; not for something like variant calling). "minid/minratio" are proportional to the internal score which should better model actual mutations or evolution, by trying to minimize the number of independent events rather than the edit distance; "idfilter=0.5", for example, will disallow a 100bp read from mapping with a 400bp deletion in the middle, since that would be only 20% identity. "minid=0.5" would allow this mapping because the internal score would actually still be quite high (maybe 0.9 of the maximum possible ratio).
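The 20% figure in the last paragraph of the e-mail can be reproduced with a short Python sketch. It assumes identity is computed as matching bases divided by all alignment columns, with every deleted base counted as a column. That assumption is consistent with the number quoted in the e-mail, but it is not confirmed to be BBMap's exact definition, and `percent_identity` is a hypothetical helper.

```python
def percent_identity(matches, substitutions=0, inserted_bases=0, deleted_bases=0):
    """Identity as matches over total alignment columns (assumed definition)."""
    columns = matches + substitutions + inserted_bases + deleted_bases
    return matches / columns


# A 100bp read that matches perfectly apart from a 400bp deletion in the middle.
identity = percent_identity(matches=100, deleted_bases=400)
print(identity)         # 0.2
print(identity >= 0.5)  # False: an exact 50% identity cutoff would reject it
```

The deletion is a single event but accounts for 400 of the 500 columns, which is why an exact identity postfilter rejects an alignment that an event-based score would still rate highly.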
Brian Bushnell may be the only person who can answer that authoritatively and he unfortunately no longer participates on forums. You may want to do some testing if you must know.
Ah man, thanks for the heads up. I SUSPECT that minid takes priority, since minratio isn't listed in the argument docs but can still be used. I've tried reading the source code, but it looks like it imports some Java modules and I definitely don't have the knowledge to interpret those.