A protein kinase phosphorylates other proteins. The definitive way to show that your protein is a kinase is to test experimentally whether it phosphorylates another protein.
You're asking how to do this bioinformatically, which would not be definitive. I would look at the amino acid sequence, the predicted fold, and any other structural features, and check how similar these are to those of known kinases annotated in databases such as GO. Does your protein share motifs with proteins that have been fully characterized as kinases?
You shouldn't use GO terms to define protein kinases. There are more sophisticated ways to make these predictions, such as scanning sequences at the domain level. One solution is to use HMMER combined with the Kinomer HMM library. This is similar to what Pfam does, except that it predicts at the kinase family level.
Run hmmsearch (part of HMMER) using the downloaded Kinomer library:
hmmsearch --domtblout predictions.txt allPK.hmm seqs.fasta
where seqs.fasta is a file containing the protein sequences you want to scan in FASTA format, allPK.hmm is the HMM library downloaded from Kinomer, and predictions.txt is the file containing the resulting output.
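If it helps, here is a minimal Python sketch for reading the per-domain table that --domtblout writes and keeping the best-scoring kinase HMM for each sequence. It assumes the standard HMMER3 domtblout layout (22 whitespace-separated columns followed by a free-text description). The function names and the 1e-3 E-value cutoff are my own choices for illustration, not something Kinomer prescribes.

```python
from typing import Dict, Iterable, List, NamedTuple


class DomainHit(NamedTuple):
    sequence_id: str   # column 1: target name (your protein)
    hmm_name: str      # column 4: query name (the kinase HMM)
    i_evalue: float    # column 13: independent E-value of this domain
    score: float       # column 14: bit score of this domain
    ali_from: int      # column 18: alignment start on the sequence
    ali_to: int        # column 19: alignment end on the sequence


def parse_domtblout(lines: Iterable[str]) -> List[DomainHit]:
    """Parse hmmsearch --domtblout lines, skipping comments and blank lines."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        # The last column (description) may contain spaces, so cap the split.
        fields = line.split(maxsplit=22)
        if len(fields) < 22:
            continue  # malformed or truncated line
        hits.append(
            DomainHit(
                sequence_id=fields[0],
                hmm_name=fields[3],
                i_evalue=float(fields[12]),
                score=float(fields[13]),
                ali_from=int(fields[17]),
                ali_to=int(fields[18]),
            )
        )
    return hits


def best_hit_per_sequence(
    hits: Iterable[DomainHit], max_i_evalue: float = 1e-3
) -> Dict[str, DomainHit]:
    """Keep the lowest E-value domain hit per sequence (cutoff is an assumption)."""
    best: Dict[str, DomainHit] = {}
    for hit in hits:
        if hit.i_evalue > max_i_evalue:
            continue
        current = best.get(hit.sequence_id)
        if current is None or hit.i_evalue < current.i_evalue:
            best[hit.sequence_id] = hit
    return best
```

To use it, open predictions.txt, pass the file handle to parse_domtblout, and pass the result to best_hit_per_sequence. Sequences that do not appear in the returned dictionary had no kinase HMM hit below the cutoff.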
For more help on hmmer:
hmmsearch -h
Alternatively, you could use the web interface provided by Kinomer: http://www.compbio.dundee.ac.uk/kinomer/bin/runHMMer.pl
Good luck
If you want a list of "known" kinases according to various authorities you can use the 'Browse Categories' function of DGIdb and navigate to the generic kinase category, or more specifically to Tyrosine, Serine Threonine, PI3, or Lipid kinases.
Alternatively, if you want to query a single gene or list of genes against these categories you can use the 'Search Categories' function of DGIdb and enter your gene(s) and filter against which sources you trust as a definition of kinases.
Finally, you can use the DGIdb API's Genes in Category endpoint to get at this information programmatically.
Kinases in DGIdb are defined in various ways by different sources (including the Gene Ontology). A recent "reliable" list would be the dGene list, which is based on a review of the literature. See the sources and their corresponding publications for more details on how they defined kinases.
Slightly surprised no one already suggested http://www.ebi.ac.uk/interpro/interproscan.html (Pfam HMM matching is part of it). UniProt and Ensembl run this by default. As noted for EC number assignment tools, GO may happily assign function to "dead" kinases, so look carefully at the catalytic site residues (see Regarding Pseudokinases). Then, as Josh says, get someone to do the enzymology experiments (kinases are popular, see http://cdsouthan.blogspot.se/2013/11/drug-target-time-tracking.html, so your chances are good).
Catalytically active protein kinases typically have a DFG motif, so you can quickly screen for that. If your protein has one, you can go deeper into domain analysis.
Domain annotation might be the best way, I think. Just check whether a kinase domain is present.
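A quick way to do that DFG screen is a few lines of Python. This is only a rough sketch: the function names are made up for illustration, and it looks for the literal tripeptide DFG. A three-residue string turns up by chance in plenty of non-kinases, so treat a match as a reason to look closer, not as evidence.

```python
import re
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (sequence_id, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Use the first word after '>' as the sequence identifier.
            parts = line[1:].split()
            header = parts[0] if parts else ""
            chunks = []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def find_motif(sequence: str, motif: str = "DFG") -> List[int]:
    """Return 1-based start positions of every (possibly overlapping) match."""
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() + 1 for match in pattern.finditer(sequence)]
```

For a file such as seqs.fasta, open it and loop over read_fasta(handle), calling find_motif on each sequence. Sequences with an empty result have no DFG at all and deserve a closer look before you call them active kinases.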
You will have to include more info. What "kind" of protein do you have? Accession numbers or unannotated sequences? Model organisms or non-model organisms?
Found the solution: UniProt has keyword annotations, and some of these cover kinases.