Reference proteomes from UniProt in FASTA format: why is there only one sequence per gene?
23 months ago
Jobbe • 0

While downloading the human proteome in FASTA format from the UniProt site, I noticed that the download is described as containing one protein sequence per gene (20,594 sequences). However, the protein count shown above it is 81,837, and this made me wonder. I need this file to interpret spectra obtained from a bottom-up proteomics experiment. Doesn't the one-sequence-per-gene file give a very poor representation of the proteins actually present? Additionally, how is it decided which sequence is shown when a gene undergoes alternative splicing? Lastly, is there an alternative approach that searches the entire proteome rather than the gene-centred subset?

uniprot FASTA proteome • 1.0k views

You could use the "unreviewed" human set (186K): https://www.uniprot.org/uniprotkb?facets=reviewed%3Afalse%2Cmodel_organism%3A9606&query=Human

Use the Protein Existence filters in the left-hand column to trim this down (for example, to entries with evidence at transcript level).
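If you have already downloaded a FASTA file, the same kind of trimming can be done offline, because UniProt FASTA headers carry a `PE=` field (1 = evidence at protein level, 2 = transcript level, 3 = inferred from homology, 4 = predicted, 5 = uncertain). Below is a minimal sketch, not an official UniProt tool; the function names and the default threshold are my own choices for illustration.

```python
import re
from typing import Iterator, List, Tuple

# Matches the protein existence field in a UniProt FASTA header, e.g. "PE=2".
PE_PATTERN = re.compile(r"\bPE=(\d)\b")


def read_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_existence(text: str, max_level: int = 2) -> List[Tuple[str, str]]:
    """Keep records whose PE level is <= max_level (hypothetical helper).

    Records without a PE field are dropped.
    """
    kept = []
    for header, seq in read_fasta(text):
        match = PE_PATTERN.search(header)
        if match and int(match.group(1)) <= max_level:
            kept.append((header, seq))
    return kept
```

With `max_level=2` this keeps only entries with evidence at protein or transcript level; raise the threshold if you also want homology-inferred or predicted sequences in your search database.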

23 months ago

In the human proteome page, https://www.uniprot.org/proteomes/UP000005640, both protein count and gene count are provided. The gene count is only provided for reference proteomes, and is algorithmically computed: for each gene, a single representative protein sequence is chosen from the proteome. Where possible, reviewed (Swiss-Prot) protein sequences are chosen as the representatives. For more detail, I suggest you look at this help page: https://www.uniprot.org/help/gene_centric_isoform_mapping
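To see the difference between the two counts for yourself, you can tally how many entries share each gene name in a downloaded FASTA file. This is only a rough sketch: it relies on the `GN=` field in UniProt FASTA headers, assumes gene names contain no spaces, and does not reproduce UniProt's own gene-centric mapping algorithm. The function name is hypothetical.

```python
import re
from collections import Counter

# Matches the gene name field in a UniProt FASTA header, e.g. "GN=TP53".
GENE_PATTERN = re.compile(r"\bGN=(\S+)")


def entries_per_gene(fasta_text: str) -> Counter:
    """Count FASTA entries per gene name; entries lacking GN= are pooled."""
    counts = Counter()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            match = GENE_PATTERN.search(line)
            counts[match.group(1) if match else "(no gene name)"] += 1
    return counts
```

On the one-sequence-per-gene file most genes should appear once, whereas on the complete proteome file many genes will have several entries.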

There are use cases for both approaches - some users prefer seeing only one entry per gene, others prefer using the complete proteome set with potentially several entries per gene. The latter can be downloaded from the website by clicking on the "Protein count" link in https://www.uniprot.org/proteomes/UP000005640 - or directly at https://www.uniprot.org/uniprotkb?query=proteome:UP000005640
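If you would rather script the download of the complete set than click through the website, something like the following may work. It is a sketch under an assumption: that the UniProt REST service exposes a `stream` endpoint at `rest.uniprot.org` accepting `query` and `format` parameters. Please check the current UniProt API documentation before relying on it; the function names are my own.

```python
import shutil
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed REST endpoint; verify against the current UniProt API documentation.
UNIPROT_STREAM = "https://rest.uniprot.org/uniprotkb/stream"


def proteome_fasta_url(proteome_id: str) -> str:
    """Build a URL requesting all UniProtKB entries of a proteome as FASTA."""
    params = {"query": f"proteome:{proteome_id}", "format": "fasta"}
    return f"{UNIPROT_STREAM}?{urlencode(params)}"


def download_proteome(proteome_id: str, out_path: str) -> None:
    """Stream the FASTA response to a local file (requires network access)."""
    with urlopen(proteome_fasta_url(proteome_id)) as response:
        with open(out_path, "wb") as out:
            shutil.copyfileobj(response, out)
```

For the human proteome you would call `download_proteome("UP000005640", "human_proteome.fasta")` and then point your search engine at the resulting file.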

Please don't hesitate to contact the UniProt helpdesk if you have any additional questions.
