Given the following:
- query.fasta -> single entry
- reference.fasta -> multiple entries
I now want to 'fake' (or just be able to get) an output that looks like a proper *.hhr alignment file, i.e. as if I aligned the query.fasta against the profiles of the sequences in the reference.fasta. I think this can be achieved with
hhsearch -i query.fasta -d reference.ff{data,index}
However, I really don't want the extra steps of aligning the reference sequences against a database and building HMMs. I just want to align the query against the reference sequences and get the results in the same .hhr format (ordering, alignments, statistics).
I cannot figure out how to do that. Everything in hhsuite feels very long-winded.
For a single sequence in the reference.fasta, hhalign does the job. To clarify, the sequences in reference.fasta might share some similarity, but they might also be very different, so doing an MSA on them first doesn't make much sense. I am probably missing something super obvious, but I cannot figure out how to get this .hhr file for the independent sequences in the reference.fasta.
Thanks!
I totally get your point; maybe long-winded was not the right description for my issue. One example: custom databases are very clunky to create (naming files *wo_ss, even though the ss prediction is not recommended anyway according to the guide, plus the re-sorting and renaming). That's almost impossible to work out from the CLI help. Let's also not start on the output format of most applications.
Anyway, I think you might have misunderstood my information about the sequences in the reference.fasta. They were chosen because they have some similarity with the query, but not necessarily a high degree of similarity with each other. hhalign works for a single entry in the reference, but it does not give me, out of the box, a format I can use to determine which of the sequences has the highest similarity.
So my initial idea was to build HMM profiles from single sequences and then run hhblits against that fake database (profile depth for each sequence in the reference.fasta = 1). The second option I see is to use hhalign and clumsily assemble the individual result files from the 1-vs-1 alignments into one hhr file, but that would lead to wrong E-values. Thirdly, maybe phmmer / jackhmmer are better suited to the task, but I'll need to check whether they provide hhr output / information.
What is important to me is to get the top x matches in the reference.fasta and, in addition, the information from hhsearch about the alignments exactly as it is provided in the hhr format.
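For the second option, the first step could look roughly like the Python sketch below: split reference.fasta into single-entry files and prepare one hhalign call per entry. This is only a sketch under assumptions: the helper names and the ref_0001.fasta naming scheme are made up for illustration, and the hhalign options used (-i query, -t template, -o output) should be checked against `hhalign -h` for the installed version. It does not merge the resulting files, and, as noted above, E-values from separate 1-vs-1 runs are not comparable to those of a real database search.

```python
from pathlib import Path
import tempfile


def split_fasta(text):
    """Split multi-FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, seq = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(seq)))
            header, seq = line[1:], []
        elif header is not None:
            seq.append(line)
    if header is not None:
        records.append((header, "".join(seq)))
    return records


def write_single_entries(records, out_dir):
    """Write each record to its own FASTA file (hypothetical naming scheme)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, (header, seq) in enumerate(records, start=1):
        path = out_dir / f"ref_{i:04d}.fasta"
        path.write_text(f">{header}\n{seq}\n")
        paths.append(path)
    return paths


def hhalign_commands(query, ref_paths):
    """Build (but do not run) one hhalign command per reference entry.

    Assumption: hhalign takes -i <query>, -t <template>, -o <output.hhr>.
    """
    return [
        ["hhalign", "-i", str(query), "-t", str(p), "-o", str(p.with_suffix(".hhr"))]
        for p in ref_paths
    ]


# Tiny made-up example instead of a real reference.fasta.
demo = ">seqA some description\nMKT\nAYI\n>seqB\nGGS\n"
records = split_fasta(demo)
with tempfile.TemporaryDirectory() as tmp:
    ref_paths = write_single_entries(records, tmp)
    commands = hhalign_commands("query.fasta", ref_paths)
    for cmd in commands:
        print(" ".join(cmd))
```

With hhalign on the PATH, each command list could be passed to `subprocess.run(cmd, check=True)`; ranking the resulting files against each other would still have to rely on the raw score rather than the E-value.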
Pretty sure that nothing other than HHsuite outputs results in .hhr format.