On December 18, 2019
To reduce database redundancy, the UniProtKB/Swiss-Prot policy is to describe, whenever possible, all protein products that are encoded by one gene in a given species in a single entry. This includes isoforms generated by alternative promoter usage, alternative splicing, alternative initiation and ribosomal frameshifting. We assign a name and a unique identifier to each isoform and choose one of them as the canonical sequence, which is the one shown in the UniProtKB text and XML formats (the RDF format shows all sequences). All positional annotations in the entry currently refer to this canonical sequence. Some gene products are precursors that are processed by proteolytic cleavage to generate the biologically active product(s). Each of these products is described by its location on the canonical sequence, a name and a unique identifier.
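The entry layout described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class names, field names, identifiers and the sequence are made up for this example and are not the UniProtKB schema or real data. Positions are assumed to be 1-based and inclusive.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Isoform:
    isoform_id: str   # hypothetical identifier, for illustration only
    name: str
    sequence: str


@dataclass
class CleavageProduct:
    product_id: str   # hypothetical identifier, for illustration only
    name: str
    start: int        # assumed 1-based, inclusive, on the canonical sequence
    end: int


@dataclass
class Entry:
    accession: str
    canonical_id: str
    isoforms: Dict[str, Isoform] = field(default_factory=dict)
    products: List[CleavageProduct] = field(default_factory=list)

    def canonical_sequence(self) -> str:
        # The canonical sequence is simply the isoform chosen as reference.
        return self.isoforms[self.canonical_id].sequence

    def product_sequence(self, product: CleavageProduct) -> str:
        # Convert the 1-based inclusive range to a Python slice.
        return self.canonical_sequence()[product.start - 1 : product.end]


# Toy entry: one gene, two isoforms, one cleavage product.
entry = Entry(
    accession="X00000",
    canonical_id="X00000-1",
    isoforms={
        "X00000-1": Isoform("X00000-1", "1", "MKTLLVAGSAEQRRKW"),
        "X00000-2": Isoform("X00000-2", "2", "MKTLLVAGSA"),
    },
    products=[CleavageProduct("PRO_TOY_1", "Toy peptide", 5, 10)],
)

print(entry.product_sequence(entry.products[0]))  # LVAGSA
```

Because the cleavage product stores only a location, its sequence is always derived from the canonical isoform, which mirrors the statement that positional annotations refer to the canonical sequence.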
When isoforms, or products of proteolytic cleavage, are known to differ in their function or other characteristics, we generally describe this in the text of the respective annotations. To make this information also accessible to software applications, we are going to change how we curate it and adapt the UniProtKB text format so that the product to which an annotation applies is described in a computer-processable way. The schemas of the XML and RDF formats already support this and require no changes.
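As a rough illustration of what a computer-processable product reference makes possible, the Python sketch below attaches an optional product identifier to each annotation and filters on it. This is not the UniProtKB text, XML or RDF representation: the `Annotation` class, the `applies_to` field, the identifiers and the rule that an annotation without a product reference applies to every product are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Annotation:
    kind: str
    text: str
    # Hypothetical field: identifier of the isoform or cleavage product
    # the annotation applies to; None is assumed to mean "whole entry".
    applies_to: Optional[str] = None


def annotations_for(annotations: List[Annotation], product_id: str) -> List[Annotation]:
    """Return general annotations plus those scoped to product_id."""
    return [a for a in annotations if a.applies_to in (None, product_id)]


# Toy annotations with made-up identifiers and texts.
annotations = [
    Annotation("function", "Binds substrate."),
    Annotation("subcellular location", "Nucleus.", applies_to="X00000-2"),
    Annotation("function", "Lacks catalytic activity.", applies_to="X00000-2"),
    Annotation("function", "Active peptide.", applies_to="PRO_TOY_1"),
]

for annotation in annotations_for(annotations, "X00000-2"):
    print(annotation.kind, "-", annotation.text)
```

With the product stored as a field rather than mentioned in free text, a program can select the annotations relevant to one isoform or cleavage product without parsing prose.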
See https://www.uniprot.org/help/product_annotation_change for details.
Please contact the UniProt helpdesk if you have any questions.