Hi everyone,
I just ran my first hhblits search (hhblits -cpu 4 -M first -i MSA/g_1.fa.out -d my_databases/my_db) and noticed that there are multiple hits to the same cluster in my results file (e.g. see column 2 below). I'm guessing these represent different domains with homology to my query MSA that are all significant, but I wanted to double-check that this makes sense. Has anyone run this before and seen similar output?
 No Hit             Prob E-value P-value  Score   SS Cols Query HMM  Template HMM
  1 cluster_id_124 100.0   1E-42 6.7E-46  242.0  0.0  201 13-221     101-350 (396)
  2 cluster_id_124 100.0 1.6E-42   1E-45  241.0  0.0  202 7-219      48-261 (396)
  6 cluster_id_124 100.0 9.2E-37 6.1E-40  211.5  0.0  198 11-218     142-391 (396)
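One way to check this guess is to group the hits by template and compare the aligned ranges. Below is a minimal Python sketch, assuming the hit lines are whitespace-separated exactly as pasted here (real .hhr files use fixed-width columns, and hit descriptions may contain spaces). The names HITS, group_hits, and overlap are illustrative only and not part of HH-suite.

```python
from collections import defaultdict

# The three hit lines from the results table, copied as plain text.
HITS = """\
1 cluster_id_124 100.0 1E-42 6.7E-46 242.0 0.0 201 13-221 101-350 (396)
2 cluster_id_124 100.0 1.6E-42 1E-45 241.0 0.0 202 7-219 48-261 (396)
6 cluster_id_124 100.0 9.2E-37 6.1E-40 211.5 0.0 198 11-218 142-391 (396)
"""


def parse_range(text):
    """Turn a 'start-end' string such as '13-221' into a pair of ints."""
    start, end = text.split("-")
    return int(start), int(end)


def group_hits(table):
    """Group hit lines by template name.

    Assumption: each line ends with nine whitespace-separated fields
    (Prob, E-value, P-value, Score, SS, Cols, query range,
    template range, template length), so the name is whatever sits
    between the hit number and those nine fields.
    """
    groups = defaultdict(list)
    for line in table.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or header-like lines
        hit_no = int(fields[0])
        name = " ".join(fields[1:-9])
        query_range = parse_range(fields[-3])
        template_range = parse_range(fields[-2])
        groups[name].append((hit_no, query_range, template_range))
    return dict(groups)


def overlap(a, b):
    """Number of positions shared by two inclusive ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


if __name__ == "__main__":
    for name, hits in group_hits(HITS).items():
        print(name)
        for i, (no_a, q_a, t_a) in enumerate(hits):
            for no_b, q_b, t_b in hits[i + 1:]:
                print(f"  hits {no_a} and {no_b}: "
                      f"query overlap {overlap(q_a, q_b)}, "
                      f"template overlap {overlap(t_a, t_b)}")
```

For these three hits the query ranges are almost identical, while the template ranges are shifted relative to each other, so what differs between the hits is the aligned region of the template rather than of the query.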
Also, my database is made up of ~2k HMMs. Why, then, does the output results file report only 136 searched HMMs?
Query g_1
Match_columns 229
No_of_seqs 1529 out of 22987
Neff 11.9485
Searched_HMMs 136
Thank you for any input.
Is this from a custom database?
The output looks reasonable at a glance, but I’ve not seen cluster_id_xxx before. I typically use hhsearch too, so there could be some difference in the program that I’m not accounting for. I usually run my searches against the PDB, so I get PDB hits back.
Yes, this is from a custom database. Each HMM in my database is produced from a multiple sequence alignment of an ortholog group.
Do you also see duplicate hits when you search against the PDB?