Differences Between Ensembl Genome Releases?
11.7 years ago

Every two weeks or every month, Ensembl releases genomes (human or any other species) with a release number. What are the differences between the previous release and the latest one?

  1. Will they add some new genes, or delete some?
  2. What care should we take when handling data from two different releases?
  3. Will they revise ENSP/ENSG/ENST IDs? Please explain.

Thanks in advance.

human ensembl • 4.8k views
11.7 years ago
Emily 24k

Hi Aravind

Ensembl releases come out every 2-3 months. In each release we introduce new features. For example, in the most recent release (71) we introduced a new transcript comparison view, a new expression view and a new scrollable overview panel.

We also bring out new species, as well as new assemblies and genebuilds for existing species. In 71, we have a new genebuild for Anole lizard using RNASeq data, and a new assembly for chicken and C. elegans. In every release, we do a new genebuild for human, which may result in the gain, loss or refinement of some of our gene and transcript models. In alternate releases, we do a new genebuild of mouse or zebrafish. In 71 we did zebrafish, so we'll do mouse for 72, due out around June/July.

The stable IDs (ENSG etc) are designed to last outside of the releases. For the most part they should stay the same. The only time they may change is when there has been a massive change in the gene model. Within the gene tab, there is also a gene history option that will show you changes in a gene model over time.
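Because stable IDs are meant to persist, one simple way to see what was gained or lost between two releases is to compare the sets of IDs you downloaded from each. The snippet below is only a minimal sketch of that idea: the function name `compare_releases` is hypothetical, and the IDs are made-up placeholders rather than real Ensembl genes.

```python
def compare_releases(old_ids, new_ids):
    """Return the stable IDs lost, gained and shared between two releases."""
    old_ids, new_ids = set(old_ids), set(new_ids)
    return {
        "lost": sorted(old_ids - new_ids),    # present only in the older release
        "gained": sorted(new_ids - old_ids),  # present only in the newer release
        "shared": sorted(old_ids & new_ids),  # present in both releases
    }


# Placeholder IDs for illustration only (not real Ensembl genes).
release_a = ["ENSG00000000001", "ENSG00000000002", "ENSG00000000003"]
release_b = ["ENSG00000000002", "ENSG00000000003", "ENSG00000000004"]

diff = compare_releases(release_a, release_b)
print("lost:", diff["lost"])
print("gained:", diff["gained"])
```

An ID that appears as "lost" is not necessarily a deleted gene. It may have been retired after a large change to the model, so it is worth checking the gene history before drawing conclusions.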

The benefit of having releases in this manner, rather than rolling out changes as we go along, is that you are able to keep track of changes. It is always a good idea to note down the release number when you get data from Ensembl. You can access previous releases of Ensembl through our archive sites. At the bottom of any page in Ensembl, there is a button saying "View in archive site", which lets you choose a previous Ensembl version and takes you to the same page in that version. Another easy way to get to archives is to take any Ensembl URL and add "e70" or "e54", or whatever release number you need, in front of "ensembl". We only keep a limited number of releases as full websites, but we have data files on our FTP site all the way back to release 19.
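The URL trick can be sketched in a few lines of Python. This is a rough illustration, not an official tool: the function name `archive_url` is hypothetical, and it assumes that putting "e70" in front of "ensembl" means replacing the first part of the host name (for example "www") so that the host becomes "e70.ensembl.org". Whether a given release is still available as a full archive site is not checked.

```python
from urllib.parse import urlsplit, urlunsplit


def archive_url(url, release):
    """Rewrite an Ensembl URL so it points at the archive for `release`.

    Assumption: the archive host has the form e<release>.ensembl.org.
    """
    parts = urlsplit(url)
    host = parts.netloc
    if host != "ensembl.org" and not host.endswith(".ensembl.org"):
        raise ValueError(f"not an Ensembl URL: {url}")
    # Keep the scheme, path and query; swap only the host name.
    return urlunsplit(parts._replace(netloc=f"e{int(release)}.ensembl.org"))


current = "http://www.ensembl.org/Homo_sapiens/Info/Index"
print(archive_url(current, 70))
```

Storing the release number next to any data you download makes it easy to rebuild a link like this later.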

We have a help document on our release cycle that explains all this in detail. To keep track of our releases, when they come out and what has changed, we put all this information on the Ensembl blog.

If there's anything you don't understand in a new release, you can contact the Ensembl helpdesk to ask us about it.


Hi Emily. Funny thing, I'm looking at release 95 of ENSEMBL and overall it looks like HGNC changed the ENSG for ~950 entries. For them, the b37 ENSG cannot be found in the b38 ENSEMBL site and vice versa. E.g., WHAMMP3--ENSG00000276141--ENSG00000187667. Curious!

11.7 years ago
H@rry ▴ 30

Hi

Each genome release has some modifications in the form of additions and removals. Several databases, such as Ensembl and NCBI, update their information because of newly submitted genes or proteins from various authors. The annotation process is ongoing for many genomes, and I am not sure when it will stabilize.

  1. Yes, they sometimes add new genes/proteins that are better annotated. They sometimes remove redundancy to reduce the chance of errors.
  2. In my opinion, you should state the version of the data you are using in any publication to avoid errors, because nobody can keep up to date with every new release.
  3. Yes, they sometimes revise IDs, and sometimes make them obsolete as well.

I hope this helps you proceed.

Thanks
