If I go to the UniProt website and look up O00142, for example, the Ensembl section lists all of the ENSTs that this protein maps to. It also shows ENSTs that map to the protein's isoforms (O00142-2, ...). So the list looks something like this:
ENSTX -> O00142
ENSTY -> O00142-1
ENSTZ -> O00142-2
Is there a difference between O00142 and O00142-1? I thought O00142-1 was the canonical (non-isoform) protein, so what is the plain O00142 there for? This also causes practical problems. For example, if you take ENSTX's sequence (from the Ensembl database) and translate each of its codons into an amino acid, the resulting sequence will NOT match O00142's amino acid sequence. However, ENSTY's translated sequence does match O00142-1. I have seen this every time both a UniProt accession and its -1 version are listed. Should I just ignore the UniProt accession without the isoform suffix?
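The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is the transcript's coding sequence (CDS) rather than the full cDNA with UTRs, the standard nuclear genetic code applies, and the function names `translate_cds` and `matches_isoform` are made up for this example.

```python
from itertools import product

# Standard genetic code in the conventional compact TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds):
    """Translate a coding sequence codon by codon, stopping at the first stop codon.

    Assumes `cds` starts at the first base of the start codon (no 5' UTR).
    Codons containing ambiguous bases are translated as 'X'.
    """
    cds = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def matches_isoform(cds, isoform_sequence):
    """Return True if the translated CDS equals the given isoform sequence."""
    return translate_cds(cds) == isoform_sequence.strip().upper()
```

A mismatch from a check like this does not by itself say which side is wrong; it can also mean the transcript was compared against a different isoform's sequence than the one it encodes.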
Note that the canonical sequence used in the UniProtKB entry is not always the '-1' isoform, and may change. So if you need to distinguish between the described isoforms, always use the isoform identifier; do not assume that O00142 == O00142-1.
Joe is right: O00142 is the general accession number for this protein and does not refer to one specific isoform. O00142-1, -2, -3, -4 (and -5) refer to the specific isoforms.
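To make that distinction explicit in code, one option is to split every identifier into its base accession and an optional isoform number, and never treat a missing suffix as "-1". The sketch below is illustrative only; `split_accession` is a hypothetical helper, and the pattern is deliberately loose rather than a full validation of UniProt accession syntax.

```python
import re

# Base accession, optionally followed by "-<isoform number>".
ACCESSION_RE = re.compile(r"^([A-Z0-9]+)(?:-(\d+))?$")


def split_accession(identifier):
    """Split 'O00142-2' into ('O00142', 2) and 'O00142' into ('O00142', None).

    None means "the entry as a whole", not isoform 1.
    """
    match = ACCESSION_RE.match(identifier.strip())
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    base, isoform = match.groups()
    return base, (int(isoform) if isoform is not None else None)
```

With this, `split_accession("O00142")` and `split_accession("O00142-1")` share the same base accession but compare as different identifiers, which is the behaviour the answers above recommend.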
I also don't understand where you see an Ensembl transcript for O00142.
The only thing I see is:
ENST00000417693; ENSP00000407469; ENSG00000166548. [O00142-4]
ENST00000451102; ENSP00000414334; ENSG00000166548. [O00142-1]
ENST00000525974; ENSP00000434594; ENSG00000166548.
ENST00000527284; ENSP00000435312; ENSG00000166548. [O00142-2]
ENST00000545043; ENSP00000438143; ENSG00000166548. [O00142-3]
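Lines in the format listed above can be turned into a transcript-to-isoform mapping that keeps the isoform suffix and records transcripts with no bracketed isoform as unassigned. This is a rough sketch that assumes exactly the line layout shown in this post; `parse_ensembl_xrefs` is a hypothetical helper, not part of any UniProt or Ensembl tool.

```python
import re

# Matches e.g. "ENST00000417693; ENSP00000407469; ENSG00000166548. [O00142-4]"
# The trailing "[isoform]" part is optional.
XREF_RE = re.compile(
    r"^(ENST\d+);\s*(ENSP\d+);\s*(ENSG\d+)\.\s*(?:\[([^\]]+)\])?\s*$"
)


def parse_ensembl_xrefs(text):
    """Map each ENST identifier to its isoform identifier, or None if none is given."""
    mapping = {}
    for line in text.splitlines():
        match = XREF_RE.match(line.strip())
        if match:
            transcript, _protein, _gene, isoform = match.groups()
            mapping[transcript] = isoform
    return mapping


# The cross-references quoted in this post.
XREFS = """\
ENST00000417693; ENSP00000407469; ENSG00000166548. [O00142-4]
ENST00000451102; ENSP00000414334; ENSG00000166548. [O00142-1]
ENST00000525974; ENSP00000434594; ENSG00000166548.
ENST00000527284; ENSP00000435312; ENSG00000166548. [O00142-2]
ENST00000545043; ENSP00000438143; ENSG00000166548. [O00142-3]
"""

for transcript, isoform in parse_ensembl_xrefs(XREFS).items():
    print(transcript, "->", isoform or "(no isoform given)")
```

Keeping `None` for the unannotated transcript, rather than defaulting it to the base accession or to "-1", avoids the assumption the first answer warns against.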