2.6 years ago
symela.lazaridi
Hello everyone,
I am currently working on splitting protein sequences into their domains. The problem is that HMMER tools such as hmmscan and hmmsearch do not seem to report the exact location of each domain within the query protein. Is there another way to run a batch of millions of proteins against the Pfam database and get both the domains and their exact positions within each protein, so that I can do the splitting? I cannot use BLAST because it would be far too time-consuming for such a large dataset.
Thanks
Have you tried InterProScan (https://www.ebi.ac.uk/interpro/search/sequence/)? It also has a locally installable version that lets you search for domains in proteins in bulk, and as far as I remember the output does contain the location of the predicted domains. If you are specifically looking for Pfam-based predictions, you can specify this when running the tool.
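As a rough sketch of how the locations could be pulled out of a local InterProScan run: assuming the run was restricted to Pfam and written as TSV (for example `interproscan.sh -i proteins.fasta -appl Pfam -f tsv`, where `proteins.fasta` is a placeholder name), each row should carry the protein accession in column 1, the analysis name in column 4, the signature accession in column 5, and the start and stop positions in columns 7 and 8. The function name below is hypothetical, and the column layout is worth checking against the output of your installed version.

```python
def parse_interproscan_tsv(lines, analysis="Pfam"):
    """Collect (signature accession, start, stop) per protein from InterProScan TSV rows.

    Assumes the standard TSV column order: protein accession, MD5, length,
    analysis, signature accession, description, start, stop, ...
    Positions are 1-based and inclusive.
    """
    domains = {}
    for line in lines:
        row = line.rstrip("\n").split("\t")
        # Skip short/blank rows and rows from other member databases.
        if len(row) < 8 or row[3] != analysis:
            continue
        domains.setdefault(row[0], []).append((row[4], int(row[6]), int(row[7])))
    # Order the hits of each protein by start position.
    for hits in domains.values():
        hits.sort(key=lambda hit: hit[1])
    return domains
```

Passing an open file handle as `lines` streams the file row by row, which matters for a dataset with millions of proteins.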
Also, hmmscan does report the location of each predicted domain; see this example scan: https://www.ebi.ac.uk/Tools/hmmer/results/240A7018-CD33-11EC-ACFB-5661F75AEC3D/score
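For a local run, the per-domain coordinates are written to the table produced by the `--domtblout` option of hmmscan and hmmsearch. The following is a minimal sketch, not a definitive pipeline, of reading that table and cutting a sequence at the reported positions. It assumes hmmscan output (so the query is the protein and the target is the Pfam profile), uses the envelope coordinates (`env from`/`env to`) rather than the alignment coordinates, and applies an arbitrary independent E-value cutoff. The function names, the cutoff, and the toy data are all made up for illustration.

```python
from collections import defaultdict

def parse_domtblout(lines, max_ievalue=1e-5):
    """Return {protein: [(pfam_name, env_from, env_to), ...]} from hmmscan --domtblout rows.

    Assumed 0-based column layout: 0 target (profile) name, 3 query (protein)
    name, 12 independent E-value, 19 env from, 20 env to. Coordinates are
    1-based and inclusive. For hmmsearch output, target and query swap roles.
    """
    hits = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.split(maxsplit=22)  # column 22 is a free-text description
        if len(fields) < 22:
            continue  # malformed row
        if float(fields[12]) > max_ievalue:
            continue  # domain hit not significant enough
        hits[fields[3]].append((fields[0], int(fields[19]), int(fields[20])))
    for domain_list in hits.values():
        domain_list.sort(key=lambda hit: hit[1])  # order along the protein
    return dict(hits)

def split_into_domains(sequence, domains):
    """Cut a protein sequence into (pfam_name, subsequence) pieces."""
    return [(name, sequence[start - 1:end]) for name, start, end in domains]

# Toy row with invented values, only to show the expected shape of the input.
toy_rows = [
    "# target name  accession  tlen  query name ...",
    "ToyDom PF00000.1 6 prot1 - 20 1e-9 30.0 0.1 1 1 1e-10 2e-9 29.5 0.1 "
    "1 6 4 7 3 8 0.95 Toy domain for illustration",
]
toy_hits = parse_domtblout(toy_rows)
toy_pieces = split_into_domains("MKTAYIAKQRQISFVKSHFS", toy_hits["prot1"])
```

Swapping columns 19 and 20 for 17 and 18 would use the alignment coordinates instead of the envelope, which usually gives slightly shorter segments.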