Hi all,
My protein of interest is about 300 residues long. I ran a global multiple sequence alignment with a set of homologs and built a phylogenetic tree from it. So far so good.
Now, for a subsequent task, I need an accurate MSA of a region of interest of about 50 amino acids, and I do not care much about the rest. What would you guys do in this case? Should I build (or fetch from Pfam) an HMM profile of this region and align with hmmalign? Or should I perform a local alignment? If so, how, and with what input?
Thanks a lot!
Definitely. Grab the HMM from Pfam and then use HMMER3 to realign the domain.
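Roughly, that comes down to two HMMER3 commands: `hmmfetch` to pull one profile out of the Pfam HMM file, and `hmmalign` to align your homologs to it. Here is a minimal Python sketch that only assembles the command lines. All file names and the accession `PF00000` are placeholders I made up, and the key passed to `hmmfetch` has to match the name or accession exactly as it is stored in your copy of the Pfam file.

```python
import shlex


def hmmfetch_cmd(hmm_db, key, out_hmm):
    """Build: hmmfetch -o <out_hmm> <hmm_db> <key>."""
    return ["hmmfetch", "-o", out_hmm, hmm_db, key]


def hmmalign_cmd(hmm, seqs, out_sto, trim=True):
    """Build: hmmalign [--trim] -o <out_sto> <hmm> <seqs>.

    --trim drops the unaligned residues flanking the domain, which is
    usually what you want when only the region matters.
    """
    cmd = ["hmmalign"]
    if trim:
        cmd.append("--trim")
    cmd += ["-o", out_sto, hmm, seqs]
    return cmd


if __name__ == "__main__":
    # Placeholder inputs: adjust to your own files and Pfam entry.
    commands = [
        hmmfetch_cmd("Pfam-A.hmm", "PF00000", "domain.hmm"),
        hmmalign_cmd("domain.hmm", "homologs.fasta", "region.sto"),
    ]
    for cmd in commands:
        # Print the commands; swap in subprocess.run(cmd, check=True) to execute.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The output of `hmmalign` is in Stockholm format by default, so you may need to convert it before feeding it to your downstream tool.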
Would jackhmmer work as well? I would give my region of interest as the query and all sequences of interest as the database, and then iteratively align the region using the HMM profiles built at each round. That way I would skip the predefined Pfam domains, which is what I want in some cases.
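To make it concrete, the run I have in mind would look roughly like the command assembled below. This is only a sketch: the file names are placeholders, `-N` sets the maximum number of iterations and `-A` saves the alignment of the included hits from the last round. As far as I understand, only sequences that pass the inclusion threshold would end up in that alignment.

```python
import shlex


def jackhmmer_cmd(query_fasta, target_fasta, out_sto, max_iter=5):
    """Build: jackhmmer -N <max_iter> -A <out_sto> <query> <target db>."""
    return [
        "jackhmmer",
        "-N", str(max_iter),   # maximum number of search iterations
        "-A", out_sto,         # save the alignment of hits to this file
        query_fasta,           # the ~50-residue region of interest
        target_fasta,          # all homologs, unaligned
    ]


if __name__ == "__main__":
    cmd = jackhmmer_cmd("region.fasta", "homologs.fasta", "region_hits.sto")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```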
Thanks!
You could. I've never done it that way, so I'm not sure how well it would work. Are there cases where you don't have a Pfam profile for your region of interest or domain? I have had some success with iterative HMMER alignments: start with, say, MAFFT to generate the initial alignment, which will usually get the domain approximately right anyway; chop out the region; build a profile; and realign. Then build a new profile, realign, and so on. Repeat a few times until the alignment seems to converge.
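The chop, build and realign loop can be sketched in Python as below. Treat it as an illustration under my own assumptions rather than a finished pipeline: the alignment is a plain dict of id to aligned sequence, the region is given as 1-based residue positions on one reference sequence, and the helper and file names are all made up. The initial alignment would come from something like `mafft --auto homologs.fasta > global.afa`.

```python
def region_columns(aligned_ref, start, end):
    """Map residues start..end (1-based, inclusive, ungapped reference
    numbering) to a half-open range of alignment columns."""
    cols = [i for i, ch in enumerate(aligned_ref) if ch not in "-."]
    if not 1 <= start <= end <= len(cols):
        raise ValueError("region lies outside the reference sequence")
    return cols[start - 1], cols[end - 1] + 1


def chop(alignment, ref_id, start, end):
    """Cut the region of interest out of every row of the alignment."""
    lo, hi = region_columns(alignment[ref_id], start, end)
    return {name: seq[lo:hi] for name, seq in alignment.items()}


def round_commands(region_msa, all_seqs, round_no):
    """Commands for one round: build a profile from the current region
    alignment, then realign the full-length sequences to it."""
    hmm = f"region_round{round_no}.hmm"
    out = f"region_round{round_no}.sto"
    commands = [
        ["hmmbuild", hmm, region_msa],
        ["hmmalign", "--trim", "-o", out, hmm, all_seqs],
    ]
    # The output alignment becomes the input of the next round.
    return commands, out


if __name__ == "__main__":
    # Toy alignment standing in for the MAFFT output.
    toy = {"ref": "MK-TAYIA", "hom1": "MKQTA-IA"}
    print(chop(toy, "ref", 3, 5))

    msa = "region_round0.afa"  # the chopped region, written out by you
    for n in range(1, 4):
        cmds, msa = round_commands(msa, "homologs.fasta", n)
        for cmd in cmds:
            print(" ".join(cmd))
```

To decide when to stop, compare the alignment from one round with the previous one and quit once it no longer changes.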