While analysing protein sequences, I often encounter fragments, which I usually remove from the analysis. I am wondering whether I am losing information this way, since I was told that fragments are due to sequencing errors.
In UniProtKB, the "sequence status" is set to 'fragment' if the protein sequence associated with the entry is considered to be incomplete. There are many causes of this, but the most typical are: only a fragment of the protein has been sequenced, or the CDS from which the UniProtKB entry is derived is annotated as incomplete (it has missing exons, or is not completely contained within the INSDC entry).
Where UniProt have identified sequencing errors, this will be recorded in the "Sequence caution" annotation.
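If you would rather set fragments aside than discard them, here is a minimal sketch in Python. It assumes your sequences are in UniProtKB FASTA format and that fragment entries carry "(Fragment)" or "(Fragments)" at the end of the protein name, just before "OS=" in the header; please check this against your own files. The function names are my own, not part of any UniProt tool.

```python
import re

# Assumption: fragment entries end their protein name with "(Fragment)" or
# "(Fragments)", immediately before the " OS=" field of the FASTA header.
FRAGMENT_RE = re.compile(r"\(Fragments?\)\s+OS=")


def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_fragment_flag(text):
    """Return (complete, fragments), each a list of (header, sequence)."""
    complete, fragments = [], []
    for header, seq in read_fasta(text):
        target = fragments if FRAGMENT_RE.search(header) else complete
        target.append((header, seq))
    return complete, fragments
```

Keeping the two lists separate lets you run the analysis with and without fragments and see whether the results change, instead of dropping them up front.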
Any examples?
There are many (about 10%)