Missing hhm_db and a3m_db for HHSuite
6.0 years ago
briantn97 • 0

Hey guys, a bit new to Linux here, and I've been desperately trying for days now to get HHsearch working for a batch FASTA file. Compiling hasn't worked out too well for me, so I resorted to simply running sudo apt install hhsuite. HHsearch is there and I have the database downloaded; however, when I try to run a command such as

hhsearch -cpu 6 -i /home/example.faa -d /home/b/Downloads/pdb70/pdb70_hhm.ffdata -B 10 -Z 10 -E 1E-03

I end up with "Could find neither hhm_db nor a3m_db!" I've looked into this and found a mention of assigning the config file, but there is none. Could anyone please give me some guidance?

hhsearch hhpred hhsuite • 6.0k views

When you downloaded the pdb70 database from the HHsuite servers, did you unzip everything inside the same folder?

What’s inside your pdb70 folder?


Hello jrj.healey! Thank you for helping me out. So I downloaded pdb70_from_mmcif_181114.tar.gz from http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/

Which had 12 files:

pdb70_a3m_db
pdb70_a3m_db.index
pdb70_a3m_db.index.sizes
pdb70_a3m.ffdata
pdb70_a3m.ffindex
pdb70.cs219
pdb70.cs219.sizes
pdb70_hhm_db
pdb70_hhm_db.index
pdb70_hhm_db.index.sizes
pdb70_hhm.ffdata
and one more.

These are what is in my pdb70 folder. I have also tried making a Databases folder where I believe dpkg installed HHsuite, /usr/lib/HHSuite/Databases/, but to no avail, as it still gives me the same "could find neither" error.

EDIT: Not sure if this is relevant but in my scripts folder I can't seem to find the hhpred folder, which I would like for homology modeling.


Assuming there’s nothing wrong with your install config itself, I can’t see anything obviously wrong with your command. I use a similar set up, but maybe your file path is wrong in a way I don’t know.

Try what I do:

# Locate the HHM database file once, then search every .faa file against it.
db=$(find ~/Applications/HHSuite/databases/pdb70 -type f -name "pdb70_hhm.ffdata")

for file in ./*.faa ; do
   hhsearch -dbstrlen 50 -B 1 -b 1 -p 60 -Z 1 -E 1E-03 -nocons -nopred -nodssp -cpu 30 -i "$file" -d "$db"
done

HHPred has changed versions a few times since I installed it, and I had to spend a lot of time messing with environment variables. In the newer versions it's meant to be easier to install, so I don't think you will need to have done everything I had to do, but I’m not 100% sure.


So I tried your command. I'm not sure if I did it correctly, so I attached a screenshot of exactly what I did, as well as where my files were. I also tried moving the database file to the desktop to see if it could find it there.

I have the newest version, 3.0 beta 3, so I tried leaving out the _hhm suffix and just typing pdb70; however, it still says it could find neither hhm_db nor a3m_db.

I'm beginning to think perhaps it's because I didn't set up the config at step 2 here: https://github.com/soedinglab/hh-suite/tree/master/scripts/hhpred However, I don't even have an hhpred folder or the config file.

Again, thank you for your help.


briantn97, in the databases folder (on your desktop) you have pdb70_hhm.ffdata, not pdb79. Also remove -nocons; it is not working.

Your command searches with db=$(find ~/Applications/HHSuite/databases/pdb70 -type f -name "pdb79_hhm.ffdata") when it should be db=$(find ~/Applications/HHSuite/databases/pdb70 -type f -name "pdb70_hhm.ffdata")


It looks, from the GitHub page, like you only need the config file if you’re using the hhsearch perl script. The binary may not need this.

I’ll have to go and install 3.0 and see what’s up as I’m stumbling around in the dark now.


Hello cpad0112,

I fixed the pdb79 typo, removed -nocons, and cleared the other errors; sadly, it still didn't work and still reports the missing files. I installed version 3.0 on Ubuntu using apt-get, which makes me worry because it seems this method doesn't include all the proper files. So if you do install it this way, please let me know if you get it working.


It appears they’ve removed or changed -nocons in 3.0.x, but that seems to just be a warning. Did it successfully run or were there more errors?


So I also removed -nocons and all the other unrecognised options; however, it still gives the same missing-database error.


Which version are you using? With my v3.0.3, hhsearch doesn't need a suffix like _hhm.ffdata, so I think

hhsearch -cpu 6 -i /home/example.faa -d /home/b/Downloads/pdb70/pdb70 -B 10 -Z 10 -E 1E-03

will work (I didn't check because I don't have enough disk space to download pdb70, sorry).

PS. I can't remember well, but in version 2 or so it needs the full filename of the HMM database, like this:

hhsearch -cpu 6 -i /home/example.faa -d /home/b/Downloads/pdb70/pdb70_db.hmm -B 10 -Z 10 -E 1E-03


Version 2.x requires you to point the command at the _hhm.ffdata file. hhblits and other tools in the suite require the other files. If OP is on a 2.x version, their specification of the DB name should be correct as far as I can work out.


I see. thank you. I haven't used v2.x these days.


Hello fishgolden,

I do indeed have 3.0.3. I just tried without the suffix; however, it still can't find it. I'm beginning to believe it's because I installed hhsuite with apt-get. May I ask how you installed it?


Sorry, I remember that I tried both compiling from source and apt-get, and I'm not sure which one I'm using now.

By the way, according to your screenshot, the path to the database is different from the one you showed in the original post:

/home/b/Downloads/pdb70/pdb70

is now

/user/lib/hhsuite/databases/pdb70

/home/b/Desktop/databases/pdb70

There may be typos in my text, so please don't copy and paste from it, and do not use "find" or a shell script (you should not use complicated methods when you are tracking down bugs). Please check that you are really pointing at the correct path.


I have the database folder in both places to see if it would work if it was on my personal account.


Please share (copy and paste) the command you used and the error messages you got.

Please do not use "find".

6.0 years ago
Joe 21k

I finally got around to testing this on a clean Ubuntu server.

I did the following:

1. Install via apt-get:

$ sudo apt-get install hhsuite

This installed the hhsearch binary into /usr/local/bin/hhsearch, as one would expect. It also left the main hhsuite directory in my home directory, with the path /home/username/hhsuite-2.0.16-linux-x86_64/. This is important.

Note also how the version in apt is not version 3.x - for that, I believe you will need to compile from source.

2. Set environment variables:

This step is absolutely required, so ensure that this variable is set, and is pointing at the right base directory. In my case:

$ export HHLIB=/home/username/hhsuite-2.0.16-linux-x86_64/

To make this change permanent, also copy the command in to your .bashrc or similar dotfile (or run the following command):

$ echo 'export HHLIB=/home/username/hhsuite-2.0.16-linux-x86_64/' >> ~/.bashrc
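
If it helps, here is a minimal sketch for confirming the variable took effect in the current shell. check_hhlib is a hypothetical helper name of my own, not something shipped with HHsuite, and it only checks that HHLIB is set and names an existing directory; it cannot tell whether that directory is the right one.

```shell
# Sketch only: check_hhlib is a hypothetical helper, not part of HHsuite.
# It reports whether HHLIB is set and whether it names an existing directory.
check_hhlib() {
    if [ -z "${HHLIB:-}" ]; then
        echo "HHLIB is not set"
        return 1
    fi
    if [ ! -d "$HHLIB" ]; then
        echo "HHLIB points at a missing directory: $HHLIB"
        return 1
    fi
    echo "HHLIB looks OK: $HHLIB"
}
```

After sourcing ~/.bashrc (or opening a new terminal), running check_hhlib should print the directory exported above.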

If you are unsure which version you have (since you may have confused things by both compiling from source and installing from apt), run which hhsearch, and then run the whole path as a command to get the version; e.g. in my case:

$ /usr/local/bin/hhsearch

(This will correspond to the first binary found in your PATH, and therefore the one you're invoking when you issue the hhsearch command alone)

Gives the output (header only).

HHsearch version 2.0.15 (June 2012)
Search a database of HMMs with a query alignment or query HMM
(C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser
Soding, J. Protein homology detection by HMM-HMM comparison. Bioinformatics 21:951-960 (2005).
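
If you suspect more than one copy is installed, a rough sketch like the following lists every match in PATH rather than just the first. list_binaries is a hypothetical helper name I made up for illustration; it walks PATH by hand so it does not depend on any particular shell's which or type options.

```shell
# Sketch only: list_binaries is a hypothetical helper name.
# Prints every executable file called "$1" found in PATH, in lookup order,
# so the first line printed is the one a bare command name resolves to.
# The body runs in a subshell so the IFS and globbing changes stay local.
list_binaries() (
    name=$1
    IFS=:
    set -f
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            echo "$dir/$name"
        fi
    done
)
```

Running list_binaries hhsearch and then invoking each printed path in turn shows which version each copy reports.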

I think it's important we clarify what binaries and directories are at play here, since it seems there may be a bit of confusion.

3. Download and extract the databases

I made a databases folder inside the HHSuite folder discussed in Step 1 (you can put it anywhere), changed to that directory, then downloaded and extracted the archive.

$ mkdir /home/username/hhsuite-2.0.16-linux-x86_64/databases ; cd !$
$ wget http://wwwuser.gwdg.de/~compbiol/data/hhsuite/databases/hhsuite_dbs/pdb70_from_mmcif_181114.tar.gz
$ tar xvzf pdb70_from_mmcif_181114.tar.gz

If you want to be certain there are no files missing and your database download was correct, the md5sum for the tar.gz for me was:

71627c864e61cf5b338bf821ccfeb9ac  pdb70_from_mmcif_181114.tar.gz
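
To compare against that value without eyeballing 32 characters, something along these lines should do. verify_md5 is a hypothetical helper name, and the sketch assumes the md5sum tool from GNU coreutils, which Ubuntu ships by default.

```shell
# Sketch only: verify_md5 is a hypothetical helper; it relies on md5sum
# from GNU coreutils.
# Usage: verify_md5 EXPECTED_HASH FILE
verify_md5() {
    expected=$1
    file=$2
    # md5sum prints "<hash>  <filename>"; keep only the hash field.
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (got $actual)"
        return 1
    fi
}
```

For the download above that would be: verify_md5 71627c864e61cf5b338bf821ccfeb9ac pdb70_from_mmcif_181114.tar.gz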

So, I have the following

$ ls /home/username/hhsuite-2.0.16-linux-x86_64/databases
md5sum              pdb70_a3m.ffindex   pdb70_cs219.ffindex             pdb70_hhm_db.index
pdb70_a3m_db        pdb70_clu.tsv       pdb70.cs219.sizes               pdb70_hhm.ffdata
pdb70_a3m_db.index  pdb70.cs219         pdb70_from_mmcif_181114.tar.gz  pdb70_hhm.ffindex
pdb70_a3m.ffdata    pdb70_cs219.ffdata  pdb70_hhm_db                    pdb_filter.dat
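
For anyone comparing their own folder against this listing, here is a small sketch that checks a database prefix for the _hhm and _a3m .ffdata/.ffindex pairs. check_db_prefix is a hypothetical helper name, and the choice of these four files is my assumption based on the error message and the listing above, not something confirmed against the HHsuite source.

```shell
# Sketch only: check_db_prefix is a hypothetical helper, not an HHsuite tool.
# Given a database prefix such as /path/to/databases/pdb70, it reports which
# of the _hhm/_a3m .ffdata and .ffindex files sit next to it.
# Returns 0 if all four are present, 1 otherwise.
check_db_prefix() {
    prefix=$1
    missing=0
    for suffix in _hhm.ffdata _hhm.ffindex _a3m.ffdata _a3m.ffindex; do
        if [ -f "${prefix}${suffix}" ]; then
            echo "found    ${prefix}${suffix}"
        else
            echo "MISSING  ${prefix}${suffix}"
            missing=1
        fi
    done
    return "$missing"
}
```

With the layout shown here, the call would be check_db_prefix /home/username/hhsuite-2.0.16-linux-x86_64/databases/pdb70. It uses plain file tests only, so a wrong path shows up as four MISSING lines.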

Then the command:

$ hhsearch -d /home/username/hhsuite-2.0.16-linux-x86_64/databases/pdb70_hhm.ffdata -i protein.faa -cpu 32

Worked without error:

Search results will be written to protein.hhr
protein.faa is in A2M, A3M or FASTA format
Read protein.faa with 1 sequences
Alignment in protein.faa contains 508 match states
1 out of 1 sequences passed filter (90% max pairwise sequence identity)
Effective number of sequences exp(entropy) = 1.0
.................................................. 1000 HMMs searched
<abridged>
.................................................. 61000 HMMs searched
..
Realigning 40 database HMMs using HMM-HMM Maximum Accuracy algorithm
...

Query        hypothetical protein 3919442:3920968 reverse MW:51681
Match_columns 508
No_of_seqs    1 out of 1
Neff          1.0
Searched_HMMs 61044
Date          Tue Nov 20 06:47:31 2018
Command       hhsearch -d /home/ubuntu/hhsuite-2.0.16-linux-x86_64/databases/pdb70_hhm.ffdata -i PAU_03380.faa -cpu 32

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM  Template HMM
  1 1PDI_R Short tail fiber protei  93.7   0.012   2E-07   55.7   0.0   26  323-348    90-116 (278)
 <abridged>
 40 6GAP_A Outer capsid protein si  32.0      30 0.00049   32.3   0.0   60  130-191   182-241 (261)


Done

I haven't tested HHSuite 3.x but I can give it a go if you desperately want 3.


Thanks again jrj.healey,

At this point I owe you a beer, so let me know how I can donate to you for all your help! apt-get kept installing 3.0 for me, so I had to manually install 2.0.16 from a deb file. For some reason I did not need to set HHLIB, nor do I have any clue where that directory is; however, hhsearch finally ran successfully! Anyway, I just have a few other questions. With the resulting hhr file, how would I generate a homology model? I read somewhere that I could use HHpred in the suite with Modeller to generate a 3D model. Also, is there any way to run a FASTA file with 37 different sequences? I tried to run it but it said "sequences in ... do not have the same number of columns".


Hmm, perhaps it's my server running an older Ubuntu version, or a less up-to-date apt-get, as I was definitely getting 2.x through apt! Perhaps you had a previous HHLIB variable set that coincidentally was correct.

No need to donate anything ;) just upvote/accept the comments/answers that worked for you.

On your questions:

  1. I’ve never personally used the modeller plugin for HHsuite (save for on their web service). You’ll have to get your hands dirty with the manual I’m afraid to see what needs to happen with the file formats (unless anyone else has information). You’ll need to make sure you have MODELLER installed for starters though.

  2. I can’t quite remember if hhsearch allows multi-FASTAs. I think not, as it assumes that multiple sequences in one file are an alignment. You’ll need to split your FASTA into 37 separate files, then just run hhsearch in a loop or in parallel.
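
For the splitting step in point 2, a minimal sketch using awk might look like this. split_fasta and the seq_N.faa output names are hypothetical choices of mine; the only assumption about the input is that each record starts with a ">" header line.

```shell
# Sketch only: split_fasta is a hypothetical helper.
# Writes each record of a multi-FASTA file to its own numbered file
# (seq_1.faa, seq_2.faa, ...) inside the given output directory.
split_fasta() {
    input=$1
    outdir=$2
    mkdir -p "$outdir"
    awk -v dir="$outdir" '
        /^>/ {
            # New header: close the previous file and start the next one.
            if (out != "") close(out)
            n++
            out = dir "/seq_" n ".faa"
        }
        # Lines before the first header are skipped.
        out != "" { print > out }
    ' "$input"
}

# Then, for example (the database path is a placeholder):
#   split_fasta all_proteins.faa split
#   for f in split/*.faa; do hhsearch -i "$f" -d /path/to/pdb70_hhm.ffdata; done
```

Each resulting file holds a single sequence, so hhsearch should no longer treat the input as an alignment.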
