Human Swiss-Prot, latest release (2017_01) = 20,171 entries
I'd like a retro-backfill for 10 years or so, but I only need maybe 2 releases per year for a nice chart. I'm guessing this is not available from a simple query in the interface.
If anyone can supply this, they are welcome to an acknowledgment in the protein number review I am just writing, if they want, and/or certainly a beer if we find ourselves in the same conference bar.
(Getting the same data from neXtProt is fine also.)
Why not email their support and ask (help at uniprot.org)? They probably have this information available.
Old UniProt releases are here (starting from release 1.0)
EDIT: See my answer below for links to the stats.
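For anyone who would rather count directly from the archived release files, here is a minimal sketch. It assumes each old release provides Swiss-Prot as a (possibly gzipped) flat file in which every entry ends with a `//` line and carries an `OX   NCBI_TaxID=...;` taxonomy line; very early releases may not have the OX line, so check a file by hand first. The function names are my own and not part of any UniProt tool.

```python
import gzip
import re
from typing import Iterable

# Taxonomy cross-reference line of a flat-file entry, e.g.
# "OX   NCBI_TaxID=9606;" (9606 is the NCBI taxon ID for Homo sapiens).
# Assumption: the release being parsed uses OX lines in this form.
OX_HUMAN = re.compile(r"^OX\s+NCBI_TaxID=9606\b")


def count_human_entries(lines: Iterable[str]) -> int:
    """Count entries whose OX line points at taxon 9606."""
    count = 0
    is_human = False
    for line in lines:
        if OX_HUMAN.match(line):
            is_human = True
        elif line.startswith("//"):
            # "//" terminates an entry: tally it, then reset the flag.
            if is_human:
                count += 1
            is_human = False
    return count


def count_in_file(path: str) -> int:
    """Count human entries in a plain or gzipped flat file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        return count_human_entries(handle)
```

Calling `count_in_file("uniprot_sprot.dat.gz")` (a hypothetical local file name) on two releases per year would give the points for the chart. Host lines for viral entries start with `OH`, not `OX`, so they are not counted as human.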
I have a second question, closely related to this one. The plot of total Swiss-Prot (or human) entries vs. year shows almost an asymptote after about 2009. What is the cause of this?
The effect you are seeing from 2009 onward - a slowing of the rate of growth of the number of reviewed UniProtKB/Swiss-Prot entries - is due to a deliberate change in our curation policies.
Prior to 2009, we were using the HAMAP system for the rapid annotation by homology of uncharacterized protein sequences in UniProtKB/TrEMBL. HAMAP uses a rule-based system that leverages experimentally characterized templates from UniProtKB/Swiss-Prot (see our paper for more details). UniProtKB/TrEMBL entries annotated by HAMAP were subject to spot checks and subsequently integrated into UniProtKB/Swiss-Prot.
From 2009 the pipeline for UniProtKB/TrEMBL annotation was modified to include HAMAP data, and since that time all entries annotated by HAMAP have been made available as part of UniProtKB/TrEMBL without any further review or checks. HAMAP mimics many of the checks performed by Swiss-Prot curators, and providing the HAMAP data in UniProtKB/TrEMBL without further review allows curators to concentrate on the curation of experimental data from the literature. This has allowed us to develop our curation workflows in other ways; 2009 also marked the year in which the Swiss-Prot group began to systematically curate Gene Ontology terms for all proteins, and it now contributes some 25,000 annotations per year to the GO.
Thanks. So, to paraphrase: HAMAP makes a big improvement to TrEMBL, allowing Swiss-Prot to increase the curation time per record, hence an apparent slowdown in growth but higher quality/utility.
If you are referring to human entries, I guess no new proteins have been added since 2009, or the numbers of proteins removed and added are in equilibrium.
If you are referring to the total number of entries (any species) in Swiss-Prot, I doubt it has been asymptotic since 2009. I would expect it to keep growing. Curators haven't stopped working.
I have checked http://web.expasy.org/docs/relnotes/relstat.html and there has been a slowdown in the incorporation of entries into Swiss-Prot since 2009. I do not know why. They may have decided to limit the incorporation of more whole proteomes. Depending on what you want, it might be better to look at the entire UniProt database, rather than restricting to the "curated" entries in UniProtKB (Swiss-Prot).