hmmsearch output multiple alignment with initial query sequences
3.7 years ago
el97004 ▴ 80

Hi all!

I'm running an hmmsearch using a list of proteins, and I'm wondering whether the program can output a final multiple alignment that includes my query proteins + all significant sequences found in the run. I was able to output a multiple alignment of all significant hits using the -A flag:

hmmsearch -A output_msa proteins.hmm my_protein_database

but I can't find a way to output a final model that includes my query + the new hits.

Thanks!

hmmer hmmsearch homology • 1.7k views
3.7 years ago
Mensur Dlakic ★ 28k

This can be done using hmmalign. You would need to create a FASTA file with all significant hits + your query. Chances are your query may already be in the list of significant hits. If not, this will do the trick: esl-reformat, which comes with HMMER, converts the alignment you already have from the -A option into unaligned FASTA:

esl-reformat -u fasta output_msa > output_msa.fasta

Now open the output_msa.fasta file and add your query sequences to it, then run hmmalign:

hmmalign -o output_msa_new proteins.hmm output_msa.fasta
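
The manual "open the file and add your query" step can also be scripted. Below is a minimal sketch, not a tested pipeline: query.fasta, combined.fasta and both function names are hypothetical, and it assumes that the hits written by -A are named like name/start-end, so the part before the slash is compared against the query IDs. A query whose ID is already among the hits is skipped. If your sequence names themselves contain a slash, the ID comparison would need adjusting.

```shell
# Append query records to the hit FASTA, skipping any query whose ID
# already appears among the hits (hit names are assumed to look like
# "name/start-end", so everything from the first "/" or space is dropped).
append_query() {
    hits="$1"; query="$2"; out="$3"
    cp "$hits" "$out"
    awk -v hits="$hits" '
        BEGIN {
            # Collect the IDs that are already present in the hit file
            while ((getline line < hits) > 0)
                if (line ~ /^>/) { split(substr(line, 2), f, "[ /]"); seen[f[1]] = 1 }
        }
        # On each query header, decide whether to keep the record
        /^>/ { split(substr($0, 2), f, " "); keep = !(f[1] in seen) }
        keep
    ' "$query" >> "$out"
}

# Full workaround: unaligned FASTA from the -A output, add the query, realign.
# Arguments: <-A alignment> <profile HMM> <query FASTA> <output alignment>
realign_with_query() {
    esl-reformat -u fasta "$1" > "$1.fasta" &&
    append_query "$1.fasta" "$3" combined.fasta &&
    hmmalign -o "$4" "$2" combined.fasta
}
```

With the file names used in this thread, the call would be realign_with_query output_msa proteins.hmm query.fasta output_msa_new.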

Thanks for this information! I was hoping it could be done within the hmmsearch command but this is a valid workaround. May I ask what you mean when you say:

Chances are your query may already be in the list of significant hits

Does hmmsearch actually include original query sequences into the output models?


Does hmmsearch actually include original query sequences into the output models?

I was assuming that my_protein_database is a modern protein database of good size. If so, it should have proteins that are at least similar if not identical to your query, even though they may come from a different organism and have a different accession number. hmmsearch can't output your query because your query is an HMM, so the only "query" it knows would be a consensus sequence emitted by that model. That consensus sequence is likely not identical to your query.
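
If you want to see that consensus for yourself, hmmemit (also part of HMMER) can write it out so you can compare it with your original query. A small sketch, assuming hmmemit is on your PATH; the function name and consensus.fasta are hypothetical:

```shell
# Write the majority-rule consensus sequence of a profile HMM to a FASTA file.
emit_consensus() {
    hmm="$1"; out="$2"
    [ -f "$hmm" ] || { echo "no such HMM file: $hmm" >&2; return 1; }
    command -v hmmemit >/dev/null 2>&1 || { echo "hmmemit not found on PATH" >&2; return 1; }
    # -c emits a single majority-rule consensus sequence; -o sends it to a file
    hmmemit -c -o "$out" "$hmm"
}
```

For the model in this thread that would be emit_consensus proteins.hmm consensus.fasta.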


I see, makes sense! Thanks for the explanation!
