Firstly, just to say, I think GenoMax's overview is fantastic. I wanted to add a section about the data that's available in Ensembl and what we've got planned for the future.
As GenoMax pointed out, Ensembl takes the reference genome assemblies for a number of species from the publicly available primary databases (NCBI/ENA/DDBJ) and adds annotation in four broad categories:
Gene Annotation:
Gene and transcript models are annotated onto the reference genome assemblies using an automated gene annotation pipeline. For selected species (i.e. human, mouse, zebrafish and rat), gene annotation may also include manual curation, i.e. transcripts are reviewed and determined on a case-by-case basis by the Ensembl-Havana curators. More information: http://www.ensembl.org/info/genome/genebuild/index.html
We link our gene, transcript and peptide features to features in other databases such as UniProt and RefSeq to help in comparison across different databases. More information: http://www.ensembl.org/info/genome/genebuild/xrefs.html
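As a rough illustration of how these cross-references can be pulled out programmatically, here is a minimal Python sketch against the public REST server's `xrefs/id` endpoint. The helper names (`xrefs_url`, `fetch_json`, `group_by_database`) are made up for this example, and the gene ID in the usage note is just a convenient human example (BRCA2); treat it as a starting point rather than a recommended client.

```python
import json
import urllib.parse
import urllib.request
from collections import defaultdict

REST_SERVER = "https://rest.ensembl.org"  # public Ensembl REST server


def xrefs_url(stable_id, external_db=None):
    """Build the URL for the xrefs/id endpoint for one Ensembl stable ID."""
    url = REST_SERVER + "/xrefs/id/" + urllib.parse.quote(stable_id, safe="")
    if external_db:
        # Optional filter restricting results to one external database.
        url += "?" + urllib.parse.urlencode({"external_db": external_db})
    return url


def fetch_json(url, timeout=30):
    """GET a REST URL and decode the JSON response (needs network access)."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def group_by_database(xrefs):
    """Map each external database name to the primary IDs it links to."""
    grouped = defaultdict(list)
    for record in xrefs:
        grouped[record["dbname"]].append(record["primary_id"])
    return dict(grouped)
```

With network access, `group_by_database(fetch_json(xrefs_url("ENSG00000139618")))` should return a dictionary keyed by external database name for that gene.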
For human annotation, we are currently involved in the MANE collaboration with NCBI to annotate an agreed-upon, conserved, highly expressed and biologically relevant transcript for each human gene: https://www.ensembl.org/info/genome/genebuild/mane.html
Variation data:
Ensembl imports small and large-scale sequence variants from a number of primary sources (e.g. dbSNP and EVA), as well as additional supporting data relating to phenotype, allele frequency and citations. For each variant, we then calculate predicted molecular consequences using Sequence Ontology (SO) terms, as well as pathogenicity and conservation scores. More information: http://www.ensembl.org/info/genome/variation/index.html
We also have a tool for annotating the molecular consequences of your own variation datasets called the Variant Effect Predictor (VEP):
http://www.ensembl.org/info/docs/tools/vep/index.html
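For a handful of known variants, VEP can also be queried over the REST API instead of being installed locally. The sketch below builds a request for the `vep/:species/id/:id` endpoint and tallies the `most_severe_consequence` field of the results. The function names are invented for this example, and it assumes the JSON response is a list of per-variant records; for whole datasets the command-line tool linked above is likely the better fit.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter

REST_SERVER = "https://rest.ensembl.org"  # public Ensembl REST server


def vep_id_url(species, variant_id):
    """Build the URL for a VEP lookup of one known variant ID (e.g. an rsID)."""
    path = "/vep/{}/id/{}".format(
        urllib.parse.quote(species, safe=""),
        urllib.parse.quote(variant_id, safe=""),
    )
    return REST_SERVER + path


def fetch_json(url, timeout=30):
    """GET a REST URL and decode the JSON response (needs network access)."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def most_severe_consequences(vep_results):
    """Count the most severe SO consequence term across VEP result records."""
    return Counter(
        record["most_severe_consequence"]
        for record in vep_results
        if "most_severe_consequence" in record  # skip records without the field
    )
```

With network access, `most_severe_consequences(fetch_json(vep_id_url("human", "rs56116432")))` would summarise the predicted consequence for that one variant.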
Comparative genomics:
We perform a number of comparative analyses between the genes and genome sequences of species present in Ensembl to predict gene trees and homology relationships as well as whole genome alignments. More information: https://www.ensembl.org/info/genome/compara/index.html
Regulatory data:
For human and mouse, we predict the position and activity of regulatory features in a variety of cell types through an analysis of datasets from the ENCODE, RoadMap and BluePrint epigenomics projects. More information: https://www.ensembl.org/info/genome/funcgen/index.html
All of this data is available through the web interface, and you can also access it at a variety of scales through the BioMart tool, the REST API and FTP download.
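To give a flavour of the REST route, here is a small Python sketch that looks up a gene by symbol using the `lookup/symbol` endpoint. The function names are invented for illustration, and only the standard library is used; BioMart or the FTP site will usually be more appropriate for bulk queries.

```python
import json
import urllib.parse
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # public Ensembl REST server


def lookup_symbol_url(species, symbol, expand=False):
    """Build the URL for looking up a gene by species and symbol."""
    path = "/lookup/symbol/{}/{}".format(
        urllib.parse.quote(species, safe=""),
        urllib.parse.quote(symbol, safe=""),
    )
    # expand=1 asks the server to include child features such as transcripts.
    query = "?" + urllib.parse.urlencode({"expand": 1}) if expand else ""
    return REST_SERVER + path + query


def fetch_json(url, timeout=30):
    """GET a REST URL and decode the JSON response (needs network access)."""
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With network access, `fetch_json(lookup_symbol_url("homo_sapiens", "BRCA2"))` should return a dictionary describing the gene, including its stable ID and genomic coordinates.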
The Ensembl resources mentioned above provide data primarily for vertebrate species, but we have a sister project called 'Ensembl Genomes', which provides genome annotation data and visualisation for non-vertebrate species, divided into plant, fungi, protist, bacteria and non-vertebrate metazoan categories.
We have also recently launched the Ensembl Rapid Release genome browser, which provides rapid access to gene annotation data for newly sequenced genomes, without relying on the traditional Ensembl release cycle.
We are also in the process of designing a brand new Ensembl website. The site is currently available to view but has limited functionality, which we hope to add to over time: http://2020.ensembl.org