How does hmmbuild create a profile from .sto or .msf file? what method or procedure is followed?
I suggest elaborating a bit more on your question. Are you asking "what steps do I need to take in order to create a profile HMM from an alignment using hmmbuild?" Or are you asking "what magical things does hmmbuild do when you run it?"
For the first question, the answer is simple: read ftp://selab.janelia.org/pub/software/hmmer3/3.1b1/Userguide.pdf, especially the tutorial (search for "hmmbuild").
For the second question: what level of detail are you hoping for, and what do you already understand? It's impossible to tell whether you're looking for "it counts observations of the various letters in each column of the alignment, and uses them to estimate the typical position-specific composition of the family" or something more in-depth involving Bayesian mixtures with Dirichlet priors, sequence weighting, entropy weighting ... and more jargony things. For Dirichlet mixtures, see http://bioinformatics.oxfordjournals.org/content/12/4/327.abstract. For entropy weighting, see chapter 3 of ftp://selab.janelia.org/pub/publications/Johnson06/Johnson06-phdthesis.pdf. For alternative sequence-weighting approaches, see "Options Controlling Relative Weights" in the user guide (ftp://selab.janelia.org/pub/software/hmmer3/3.1b1/Userguide.pdf).
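To make the simple version concrete, here is a minimal Python sketch of the "count letters per column" idea. It is not what hmmbuild actually runs: it skips sequence weighting and entropy weighting, and it swaps the Dirichlet mixture priors for a flat add-one pseudocount. The function name `column_probabilities` and the toy alignment are made up for illustration.

```python
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def column_probabilities(alignment, pseudocount=1.0):
    """Estimate per-column residue probabilities from aligned sequences.

    Gap characters are ignored because only letters in ALPHABET are counted.
    A flat pseudocount is added to every residue; this is a simple stand-in
    for the Dirichlet mixture priors mentioned above, not the real method.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    profile = []
    for i in range(length):
        # Count how often each letter appears in column i.
        counts = Counter(seq[i].upper() for seq in alignment)
        total = sum(counts[a] for a in ALPHABET) + pseudocount * len(ALPHABET)
        profile.append({a: (counts[a] + pseudocount) / total for a in ALPHABET})
    return profile


# Toy alignment (invented): three sequences, four columns, one gap.
toy_alignment = ["ACD-", "ACDE", "GCDE"]
profile = column_probabilities(toy_alignment)
```

Each entry of `profile` is one alignment column, and the probabilities in a column sum to 1. The pseudocount keeps residues that were never observed in a column from getting a probability of exactly zero.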
Hi,
You might wanna read up a bit on Hidden Markov models and then check out this paper.
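Broadly, a profile is a matrix of numbers. Each number is the probability of finding amino acid X at position i of the alignment. A profile HMM also stores, for each position, the probabilities of moving between match, insert, and delete states, which is how it handles gaps.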
By combining the probabilities across all positions, the profile gives an overall score for how likely a new protein is to belong to an existing family for which the alignment and profile have already been built.
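As a rough illustration of that scoring idea, the Python sketch below sums per-position log-odds for an ungapped sequence. It is a deliberate simplification: real profile HMM searches also allow insertions and deletions and use a non-uniform background, none of which is modelled here. The names `log_odds_score`, `BACKGROUND`, and `floor`, and the numbers in `toy_profile`, are invented for the example.

```python
import math

BACKGROUND = 0.05  # assumption: uniform background over the 20 amino acids


def log_odds_score(profile, sequence, floor=1e-3):
    """Sum log2(p_i(x) / background) over an ungapped sequence.

    `profile` is a list of dicts, one per position, mapping residue to
    probability. Residues missing from a position get the small `floor`
    probability (an arbitrary choice made for this sketch).
    """
    if len(sequence) != len(profile):
        raise ValueError("sequence and profile must have the same length")
    score = 0.0
    for column, residue in zip(profile, sequence):
        p = column.get(residue, floor)
        score += math.log2(p / BACKGROUND)
    return score


# Hand-made three-position profile (numbers invented for illustration).
toy_profile = [
    {"A": 0.8, "G": 0.2},
    {"C": 1.0},
    {"D": 0.5, "E": 0.5},
]

family_like = log_odds_score(toy_profile, "ACD")  # matches the profile well
unrelated = log_odds_score(toy_profile, "WWW")    # matches nowhere
```

A positive score means the sequence fits the profile better than the background model does, and a negative score means it fits worse, so `family_like` comes out positive and `unrelated` negative.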