Tools Of Choice For Functional Annotation Of Genes And Proteins
13.3 years ago
NPalopoli ▴ 290

I would like to know which tools you would choose for the functional annotation of genes and proteins. It would be very helpful if you could describe the capabilities, the pros and cons, and your own experience with any method you would or would not recommend.

I am particularly looking for alternatives to the Blast2GO annotation pathway, which I already know is widely used.

annotation function • 14k views
13.3 years ago
Leszek 4.2k

I use InterProScan for that purpose. Installation isn't easy, and you will need a computer cluster to run it. But it pays off, because InterProScan integrates multiple databases: PROSITE, PRINTS, Pfam, ProDom, SMART, TIGRFAMs, PIR superfamily, SUPERFAMILY, Gene3D, PANTHER and HAMAP. This means you will often get functional annotation from multiple independent sources. In addition, other tools such as SignalP can optionally be launched by InterProScan.
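To illustrate what "annotation from multiple, independent sources" can look like in practice, here is a minimal sketch that groups InterProScan hits by protein and InterPro entry, and records which member databases support each entry. It assumes the tab-separated output layout of InterProScan 5 (protein accession in column 1, analysis name in column 4, InterPro accession in column 12, present when the InterPro lookup option is used). The function name and the sample rows are invented for illustration only.

```python
import csv
import io

def summarize_interproscan(tsv_handle):
    """Return {protein: {interpro_accession: set of member databases}}."""
    support = {}
    reader = csv.reader(tsv_handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Rows without an InterPro lookup column are skipped (assumed layout).
        if len(row) < 12:
            continue
        protein, analysis, interpro = row[0], row[3], row[11]
        # "-" marks a signature that is not integrated into any InterPro entry.
        if not interpro or interpro == "-":
            continue
        support.setdefault(protein, {}).setdefault(interpro, set()).add(analysis)
    return support

# Invented rows in the assumed 13-column layout, for illustration only.
SAMPLE_ROWS = [
    ["protA", "md5a", "350", "Pfam", "PF00001", "7tm_1", "40", "300",
     "1e-50", "T", "01-01-2024", "IPR000276", "GPCR, rhodopsin-like"],
    ["protA", "md5a", "350", "PRINTS", "PR00237", "GPCRRHODOPSN", "45", "290",
     "1e-20", "T", "01-01-2024", "IPR000276", "GPCR, rhodopsin-like"],
    ["protB", "md5b", "120", "Gene3D", "G3DSA:0.0.0.0", "-", "5", "100",
     "1e-5", "T", "01-01-2024", "-", "-"],
]
SAMPLE_TSV = "\n".join("\t".join(row) for row in SAMPLE_ROWS)

support = summarize_interproscan(io.StringIO(SAMPLE_TSV))
for protein, entries in support.items():
    for interpro, databases in entries.items():
        print(protein, interpro, sorted(databases))
```

An InterPro entry supported by two or more member databases is arguably a stronger basis for annotation than a single hit, although where to draw that line is a judgment call.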

Besides InterPro, what we normally do is phylogenetic reconstruction. We build a phylogenetic tree for every protein from a given species, predict one-to-one orthologs, and transfer annotation from a closely related model species. In this tree, you could transfer annotation from Phy000CVNF_YEAST|SHP1 onto orthologs from closely related species such as C. glabrata or other Saccharomyces. However, high-throughput phylogenetic reconstruction is a difficult and computationally intensive process. Have a look at the PhylomeDB paper, which discusses this matter briefly.
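The transfer step itself is simple once orthology has been predicted. The sketch below assumes you already have a list of (query gene, reference gene) ortholog pairs, for example derived from the trees; it keeps only pairs where each gene has exactly one partner and copies the reference annotation across. All gene identifiers except SHP1, and all annotation strings, are placeholders.

```python
from collections import Counter

def one_to_one(pairs):
    """Keep only pairs in which both genes occur in exactly one pair."""
    unique_pairs = sorted(set(pairs))
    query_counts = Counter(query for query, _ in unique_pairs)
    ref_counts = Counter(ref for _, ref in unique_pairs)
    return [
        (query, ref)
        for query, ref in unique_pairs
        if query_counts[query] == 1 and ref_counts[ref] == 1
    ]

def transfer_annotation(pairs, reference_annotation):
    """Copy annotation from reference genes onto their one-to-one orthologs."""
    return {
        query: reference_annotation[ref]
        for query, ref in one_to_one(pairs)
        if ref in reference_annotation
    }

# Placeholder ortholog predictions: cgla_2/cgla_3 share a reference gene
# (many-to-one) and cgla_4 has two partners (one-to-many), so neither
# case receives an annotation.
ORTHOLOG_PAIRS = [
    ("cgla_1", "SHP1"),
    ("cgla_2", "YEAST_B"),
    ("cgla_3", "YEAST_B"),
    ("cgla_4", "YEAST_C"),
    ("cgla_4", "YEAST_D"),
]
REFERENCE_ANNOTATION = {
    "SHP1": "annotation A",
    "YEAST_B": "annotation B",
    "YEAST_C": "annotation C",
}

transferred = transfer_annotation(ORTHOLOG_PAIRS, REFERENCE_ANNOTATION)
print(transferred)
```

Dropping the many-to-one and one-to-many cases is deliberately conservative: after a duplication, the copies may have diverged in function.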

Concerning Blast2GO, it's an easy-to-use approach. But it relies entirely on sequence similarity, so I recommend being very careful with those predictions. I would rely on more specific methods such as protein profiles (hmmer) or, ideally, phylogenetic trees, and transfer annotation only among one-to-one orthologs.
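As a rough illustration of the profile-based route, the sketch below picks the best-scoring profile per protein from an HMMER per-target table (the `--tblout` output of `hmmscan`). It assumes the usual whitespace-separated layout with the profile name in column 1, the query sequence name in column 3 and the full-sequence E-value in column 5. The 1e-5 cutoff, the function name and the sample lines are my own assumptions, not a recommended setting.

```python
def best_profile_hits(tblout_lines, max_evalue=1e-5):
    """Return {query: (profile, evalue)} for the lowest E-value hit per query."""
    best = {}
    for line in tblout_lines:
        # HMMER writes header and footer lines starting with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        profile, query, evalue = fields[0], fields[2], float(fields[4])
        if evalue > max_evalue:
            continue
        if query not in best or evalue < best[query][1]:
            best[query] = (profile, evalue)
    return best

# Invented table lines in the assumed layout, for illustration only.
SAMPLE_TBLOUT = """\
# target name accession query name accession E-value score bias ...
famA - protA - 1e-40 140.0 0.1 2e-40 139.0 0.1 1.0 1 0 0 1 1 1 1 hypothetical family A
famB - protA - 1e-12 50.0 0.0 3e-12 48.0 0.0 1.0 1 0 0 1 1 1 1 hypothetical family B
famC - protB - 0.5 5.0 0.0 0.9 4.0 0.0 1.0 1 0 0 1 1 0 0 hypothetical family C
"""

hits = best_profile_hits(SAMPLE_TBLOUT.splitlines())
print(hits)
```

In this sample, protB has only a weak hit and therefore stays unannotated, which is usually preferable to a doubtful assignment.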


InterProScan searches can also be launched from within Blast2GO but, as Leszek said, you usually need a cluster for that; it takes forever on a desktop. For phylogeny-based methods, Ensembl Compara may be worth a look.

13.3 years ago

As far as I know, semi-automatic annotation of proteins or genes is done by either sequence similarity or similarity in the naming scheme, with varying degrees of success.

Sequence similarity based: These methods use sequence similarity to assign a gene/protein to a GO term or an EC number.

Name similarity based: These are mostly used in computational modelling, but the general approach, and the tools for it, apply to any name -> resource identifier mapping.

In general, have a look at MIRIAM (publication, Wikipedia), which provides a framework and guidelines for annotating entities in models but can be used in other contexts as well, and at identifiers.org, which is essentially the same service but the newer and broader approach.
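A name -> resource identifier mapping can be sketched in a few lines. The example below normalizes free-text names, looks them up in a small synonym table, and builds identifiers.org URIs in the compact `prefix:accession` form. The synonym table and function names are hypothetical; a real tool would draw on a curated dictionary and probably fuzzy matching.

```python
import re

def normalize(name):
    """Lowercase a name and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()

def identifiers_org_uri(prefix, accession):
    """Build an identifiers.org URI from a namespace prefix and an accession."""
    return f"https://identifiers.org/{prefix}:{accession}"

# Hypothetical synonym table: name -> (namespace prefix, accession).
RAW_SYNONYMS = {
    "Homo sapiens": ("taxonomy", "9606"),
    "human": ("taxonomy", "9606"),
    "Saccharomyces cerevisiae": ("taxonomy", "4932"),
    "baker's yeast": ("taxonomy", "4932"),
}
SYNONYMS = {normalize(name): target for name, target in RAW_SYNONYMS.items()}

def annotate_names(names, synonyms=SYNONYMS):
    """Map each name to an identifiers.org URI, or None when nothing matches."""
    annotated = {}
    for name in names:
        target = synonyms.get(normalize(name))
        annotated[name] = identifiers_org_uri(*target) if target else None
    return annotated

print(annotate_names(["Human", "Baker's yeast", "unknown organism"]))
```

Returning None for unmatched names keeps the gaps visible, so they can be curated by hand rather than silently guessed.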

9.9 years ago
Yannick Wurm ★ 2.5k

Also have a look at Alexie Papanicolau's JAMP: Just Annotate my Proteins, which features an intelligent HMM-based approach.
