Hi everyone,
I would like to know if there is a way to set a %coverage threshold using hmmsearch. I've read the user guide, but I haven't found it.
Regards,
C.
You'll need to provide a definition of "%coverage" to be sure, but the answer is almost certainly "no".
When I hear "%coverage", I think of a measure of shared length, probably "what percent of the sequence is aligned to the model" (or vice versa). If this is what you're after, you may want to use hmmsearch's --tblout flag. This will give you an easy-to-parse space-delimited file that includes start/end positions for both the query model and the target sequence, along with the total target sequence length. Knowing the model length as well, the %coverage math from there is pretty easy to pull off.
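The "%coverage math" mentioned above can be sketched in a few lines of Python. This is only an illustration under one assumed definition (percent of a sequence or model spanned by a single 1-based, inclusive alignment interval); the function name and the example numbers are made up for this sketch.

```python
def percent_coverage(start, end, total_length):
    """Percent of a sequence (or model) of total_length that is spanned
    by the 1-based, inclusive interval start..end."""
    if total_length <= 0:
        raise ValueError("total_length must be positive")
    return 100.0 * (end - start + 1) / total_length

# Hypothetical hit: the alignment spans positions 11..60
# of a 200-residue target sequence.
print(percent_coverage(11, 60, 200))  # 25.0
```

The same function gives model coverage if you pass the model start/end positions and the model length instead. A sequence with several separate hits would need those intervals combined first, which this sketch does not attempt.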
Hello Travis,
That's a great answer. Do you have any advice on which parser I should use? Is there one for Python or Perl?
Thanks again!
I recommend writing your own "parser". First run hmmsearch.
Then look at the format of the file produced when using hmmsearch's --tblout flag (in the example above: my_output.tblout). You'll find that this is likely the easiest parsing task you'll ever do. Simply walk through the lines of the table file one at a time, (a) ignoring lines starting with #, and (b) accessing the content of each line using either split or a regex (I'm imagining Perl here). Heck, you could even use awk to parse the file, if you're a fan of awk.
Ok. Thanks for your time and your answer. I'll try. (I'd give you a 'biostar gold' if it was available)
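For anyone who prefers Python to Perl, the line-by-line approach described above might look roughly like the sketch below. It assumes the per-domain table written by --domtblout in HMMER 3, and the column positions are my assumption about that layout, so check them against the # header lines of your own file. The function name and the example row are made up for illustration.

```python
import io

# Assumed 0-based column positions in an HMMER 3 --domtblout file;
# verify them against the "#" header lines of your own output.
TLEN, QLEN = 2, 5
HMM_FROM, HMM_TO, ALI_FROM, ALI_TO = 15, 16, 17, 18

def parse_domtblout(handle):
    """Yield one dict per domain hit, skipping comment and blank lines."""
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        # The last column (target description) may contain spaces,
        # so split off only the first 22 whitespace-delimited fields.
        f = line.split(None, 22)
        yield {
            "target": f[0],
            "query": f[3],
            "tlen": int(f[TLEN]),
            "qlen": int(f[QLEN]),
            "hmm_from": int(f[HMM_FROM]),
            "hmm_to": int(f[HMM_TO]),
            "ali_from": int(f[ALI_FROM]),
            "ali_to": int(f[ALI_TO]),
        }

# Made-up example row, for illustration only (not real hmmsearch output).
example = io.StringIO(
    "# comment line\n"
    "seq1 - 200 modelA - 100 1e-30 100.0 0.1 1 1 1e-32 1e-30 99.5 0.1 "
    "1 80 11 90 9 92 0.95 hypothetical protein\n"
)
for hit in parse_domtblout(example):
    model_cov = 100.0 * (hit["hmm_to"] - hit["hmm_from"] + 1) / hit["qlen"]
    print(hit["target"], hit["query"], model_cov)
```

With a real file you would replace the io.StringIO example with something like open("domains.txt"), and keep only the hits whose coverage is above whatever threshold you choose.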
Notably, you actually need to use --domtblout domains.txt, because --tblout hits.txt doesn't give you per-alignment fields such as the position. If you do --domtblout, you get a table with this header (and therefore these fields):