I want to compare the domain composition of a few of my cell samples.
After mapping peptide sequences against Pfam, I have thousands of distinct domain hits across my dataset. With that many individual domains, a domain-by-domain comparison across my samples hardly makes sense.
Are there any references/databases for domain categories? Similar to the clans/superfamilies in Pfam but something broader, e.g. kinases, zinc fingers, ankyrins...?
To my knowledge, no - I don't think it gets much broader than "Zinc-finger containing protein" if you're specifically interested in the domains themselves.
The next level up would be pathway ontologies through KEGG/GO etc., but it's not entirely the same thing.
Thank you!!!
So no way to categorise them other than manually going through the domain names? - Currently they're like zf-C3HC4, zf-C2H2, zf-RING2....
AFAIK, no; but I won’t claim to be an expert here.
If your dataset is reasonably well organised/curated you may be able to get some of the way with assorted regex magic, but annotations are rarely that tidy.
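To give a rough idea of what I mean by regex magic, here is a minimal Python sketch. It assumes your hits are Pfam short names like the ones you listed. The category names and patterns are made up by me for illustration and are not from any curated resource, so you would need to extend and check them against your own data.

```python
import re
from collections import Counter

# Hypothetical category patterns keyed on Pfam short names.
# These are illustrative guesses, not a curated classification.
CATEGORY_PATTERNS = [
    ("zinc finger", re.compile(r"^zf-", re.IGNORECASE)),
    ("kinase", re.compile(r"kinase", re.IGNORECASE)),
    ("ankyrin", re.compile(r"^Ank(_\d+)?$", re.IGNORECASE)),
]


def categorise(domain_name):
    """Return the first category whose pattern matches the domain name."""
    for category, pattern in CATEGORY_PATTERNS:
        if pattern.search(domain_name):
            return category
    return "uncategorised"


def category_counts(domain_hits):
    """Count how many hits fall into each broad category."""
    return Counter(categorise(name) for name in domain_hits)


# Example hits for one sample (names chosen for illustration).
sample_hits = ["zf-C3HC4", "zf-C2H2", "zf-RING2", "Pkinase", "Ank_2", "WD40"]
counts = category_counts(sample_hits)
print(counts)
```

Anything that ends up as "uncategorised" still needs a manual look, which is where this approach tends to break down on messy annotations.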