Hi, I'm currently generating new bacterial genomes and I want to identify all ankyrin-domain-containing proteins (and the number of repeats each one has). What approaches can I use for this?
Because of the scale, I don't want to use web tools. I've been looking at the HMMER documentation, which looks quite complicated to parse, so I'm hoping there is a simpler way.
Ideally, I'd like to process a gbff file, but I can always convert between file types.
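For reference, the conversion I have in mind is roughly the following. It is a minimal sketch in plain Python, assuming every CDS of interest in the GBFF carries `/locus_tag` and `/translation` qualifiers; the function names are hypothetical and only those two qualifiers are read reliably.

```python
import re

# A feature line has exactly five leading spaces, then the feature key.
FEATURE_START = re.compile(r"^ {5}(\S+)\s+\S")
# A qualifier line looks like /key="value" (or a bare /key such as /pseudo).
QUALIFIER = re.compile(r'^/(\w+)(?:="?(.*))?$')


def iter_cds_qualifiers(gbff_text):
    """Yield a dict of qualifiers for each CDS feature in GenBank flat-file text."""
    current = None  # qualifiers of the CDS being read, or None outside a CDS
    key = None      # last qualifier seen, so continuation lines can be appended
    for line in gbff_text.splitlines():
        match = FEATURE_START.match(line)
        if match or not line.startswith(" " * 21):
            # A new feature (or a non-feature line) ends the previous CDS.
            if current is not None:
                yield current
            current = {} if (match and match.group(1) == "CDS") else None
            key = None
            continue
        if current is None:
            continue
        body = line.strip()
        qualifier = QUALIFIER.match(body)
        if qualifier:
            key = qualifier.group(1)
            current[key] = (qualifier.group(2) or "").rstrip('"')
        elif key:
            # Continuation of a wrapped value; joined without spaces, which is
            # right for /translation but not for free-text qualifiers.
            current[key] += body.rstrip('"')
    if current is not None:
        yield current


def gbff_to_protein_fasta(gbff_text):
    """Return protein FASTA text, one record per CDS that has a translation."""
    lines = []
    for quals in iter_cds_qualifiers(gbff_text):
        if "locus_tag" in quals and "translation" in quals:
            lines.append(f">{quals['locus_tag']}")
            lines.append(quals["translation"])
    return "\n".join(lines) + ("\n" if lines else "")
```

The resulting FASTA would then be the input to whatever domain search is recommended.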
Thank you!
Note: I can't just extract this from the annotation, as Bakta, the package I'm using, doesn't provide it in all cases.
Brilliant, thank you!