Question

Where/how do proteins with "-like" in their names get their names?

2

4.4 years ago

Dunois ★ 2.9k

Perhaps the title of my post is misleading, but for instance consider this "Period-like" protein. According to this link:

Proteins of unknown function which exhibit significant sequence similarity to a defined protein family have been named in accordance with other members of that family, e.g. "Holliday junction resolvase family endonuclease".

It is also possible to use "-like" in the name. Bear in mind that this should only be used for cases that are outliers to a tight homomorphic family, e.g. "Holliday junction resolvase-like protein".

(Emphasis mine.)

So I could presume that the "Period-like" protein was named so because it exhibited "significant" similarity to the Period protein family, but its function was not/has not been (experimentally?) confirmed.

However, this reductase was probably also assigned a name based on sequence similarity. UniProt indicates it was also "predicted", just like the Period-like protein, as evidenced by the Status line on the UniProt webpage. It is probable that this particular protein does not have any additional evidence backing its identification either. So why wasn't this named "NADH:ubiquinone reductase-like" instead?

In general I would like to know under what circumstances a protein gets suffixed with the "-like" moniker (and when it wouldn't). Additionally, when one encounters a "X-like" protein, which properties of the protein can one presume to be related to the family that protein is purported to belong to? (Fold? Function?)

(I apologize if this is something that is supposed to be obvious and/or trivial.)

Edit: this NCBI link does explain the usage of the "-like" suffix in the context of equivalog-type HMMs, but the first and last points under that section appear to contradict one another somewhat. (The first point states that equivalogs are homologs that share a specific function but whose evolutionary relationship is unknown, whereas the last point says that, despite obvious sequence similarity to XXX, the protein may or may not have the same role and function as XXX.)

protein nomenclature homology • 1.3k views

2

I could be wrong, but in my experience annotation with "-like" doesn't appear to follow an overly systematic/objective process. I don't think there are obvious criteria such as "ABC-like proteins must be 30-50% identical to ABC proteins, and lack experimental evidence". I think it's a slightly more holistic way of 'organising' the information.

4.4 years ago by Joe 22k

1

So I could presume that the "Period-like" protein was named so because it exhibited "significant" similarity to the Period protein family, but its function was not/has not been (experimentally?) confirmed.

That is a good assumption. It is a placeholder that an annotator puts on the name when they can't be reasonably certain that the protein is what they think it is. All indications point to the new protein having that function/characteristics/fold, but it remains a hypothesis until someone experimentally proves it to be so.

4.4 years ago by GenoMax 150k

score 1 · Answer 1 · 2020-12-03

In a UniProt entry, it is important to look at the evidence label in order to distinguish between:

expert-curated and reviewed annotation
automatic annotation
information imported from an external database

See https://www.uniprot.org/help/evidences
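As a rough illustration of the three categories above, here is a minimal Python sketch that maps a few ECO evidence codes to them. The three codes and their grouping reflect my reading of the evidence help page linked above and are far from a complete list; the names `EVIDENCE_CATEGORY` and `classify_evidence` are made up for this example.

```python
# Assumed (incomplete) mapping from ECO evidence codes to the three
# categories listed above; check the UniProt evidence help page for
# the authoritative and complete list.
EVIDENCE_CATEGORY = {
    # experimental evidence used in manual assertion
    "ECO:0000269": "expert-curated and reviewed annotation",
    # sequence model / rule evidence used in automatic assertion (e.g. ARBA)
    "ECO:0000256": "automatic annotation",
    # imported information used in automatic assertion (e.g. from EMBL)
    "ECO:0000313": "information imported from an external database",
}


def classify_evidence(eco_code: str) -> str:
    """Return the category for an ECO code, or a fallback if it is not listed."""
    return EVIDENCE_CATEGORY.get(eco_code.strip(), "not covered by this sketch")


for code in ("ECO:0000269", "ECO:0000256", "ECO:0000313", "ECO:9999999"):
    print(code, "->", classify_evidence(code))
```

The point is only that the evidence tag, not the protein name itself, tells you how much weight a name deserves.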

Looking at the entries you cite:

https://www.uniprot.org/uniprot/A0A075IMJ3 is an unreviewed entry (i.e. in UniProtKB/TrEMBL), and the protein name has been imported from nucleotide sequence database entry EMBL:AIF31262.1. Since ENA/GenBank/DDBJ is an archive, protein names and other annotations are provided directly by the submitters, without major intervention by a curator. The submitter may or may not have made the effort to follow the international protein naming guidelines that you found.

https://www.uniprot.org/uniprot/A0A2D1GRS1 is also an unreviewed entry, but the protein name was assigned by an automatic annotation pipeline, ARBA, in this case (https://www.uniprot.org/help/arba). You can click on the ARBA rule name in the evidence tag for more information: https://www.uniprot.org/arba/ARBA00012944, and you will see that the name is based on the EC number EC:7.1.1.2. You can also consult the entry history to find out how the name evolved, in particular what the submitted name was before automatic annotation: https://www.uniprot.org/uniprot/A0A2D1GRS1?version=6&version=7&diff=true

If you want to see reviewed entries with "-like" in the protein name, try this query: https://www.uniprot.org/uniprot/?query=name%3A*like+reviewed%3Ayes&sort=score
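The query in that URL is percent-encoded, which makes it hard to read. A small sketch using only the Python standard library decodes it so you can see the actual search terms and adapt them. Nothing here contacts UniProt; it just parses the string.

```python
from urllib.parse import parse_qs, urlsplit

# The query URL quoted above, copied verbatim.
url = "https://www.uniprot.org/uniprot/?query=name%3A*like+reviewed%3Ayes&sort=score"

# Split off the query string and decode its parameters.
params = parse_qs(urlsplit(url).query)
query_text = params["query"][0]  # the search expression itself
sort_key = params["sort"][0]     # how the results are ordered

print(query_text)  # name:*like reviewed:yes
print(sort_key)    # score
```

So the search is a wildcard match on the name field (`name:*like`) restricted to reviewed entries (`reviewed:yes`).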

Note that many of the "-like" names are not the recommended names, but appear among the alternative names.
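To make the recommended/alternative distinction concrete, here is a minimal sketch that reports where a "-like" name occurs in an entry. The `Entry` class, the accessions, and the protein names are all hypothetical, made up for illustration, and do not correspond to UniProt's data model or to real entries.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    """Hypothetical, simplified stand-in for a protein entry."""
    accession: str
    recommended_name: str
    alternative_names: List[str] = field(default_factory=list)


def like_name_location(entry: Entry) -> Optional[str]:
    """Return 'recommended', 'alternative', or None depending on where '-like' appears."""
    def is_like(name: str) -> bool:
        return "-like" in name.lower()

    if is_like(entry.recommended_name):
        return "recommended"
    if any(is_like(name) for name in entry.alternative_names):
        return "alternative"
    return None


# Made-up example records (not real accessions or curated names).
entries = [
    Entry("HYPO_1", "Holliday junction resolvase-like protein"),
    Entry("HYPO_2", "Example kinase 2", ["Kinase-like protein 2"]),
    Entry("HYPO_3", "Example reductase"),
]

for entry in entries:
    print(entry.accession, like_name_location(entry))
```

A search on the name field may match either kind of name, so it is worth checking which one you are looking at before drawing conclusions from a "-like" hit.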

Don't hesitate to contact the UniProt helpdesk if you have any additional questions about UniProt.