The GENCODE 23 general statistics page (http://www.gencodegenes.org/stats/current.html) lists 19797 protein-coding genes in human, whereas Ensembl 81 returns 22017 unique protein-coding genes when I filter on Gene type (biotype) "protein_coding" and "GENCODE basic annotation" in Ensembl BioMart (http://www.ensembl.org/biomart/martview/034b08dbbcea12ff30614193d4d293a0). As I understand it, Ensembl imports the GENCODE gene set and annotation, so the gene counts should match. Could anyone please explain what I am missing and why the gene counts differ?
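One way to cross-check the two numbers independently of the web interfaces is to count genes directly from a GTF file. The following is only a minimal sketch, not how either the GENCODE statistics page or BioMart computes its totals. It assumes a standard 9-column GTF in which the biotype is stored under `gene_type` (GENCODE files) or `gene_biotype` (Ensembl files). The function name `count_genes_by_seqname` is made up for illustration. It reports the count per sequence name, so it is easy to see where the counted genes are located.

```python
import re
from collections import defaultdict

# Matches key "value" pairs in the GTF attribute column (column 9).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def count_genes_by_seqname(gtf_lines, biotype="protein_coding"):
    """Count unique gene IDs of one biotype, grouped by sequence name.

    Hypothetical helper: assumes 9 tab-separated GTF columns and that the
    biotype attribute is called gene_type (GENCODE) or gene_biotype (Ensembl).
    """
    genes = defaultdict(set)
    for line in gtf_lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Only "gene" feature rows are counted, not transcripts or exons.
        if len(fields) < 9 or fields[2] != "gene":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene_type = attrs.get("gene_type") or attrs.get("gene_biotype")
        if gene_type != biotype:
            continue
        # Drop the version suffix (e.g. ".4") so IDs compare across files.
        gene_id = attrs.get("gene_id", "").split(".")[0]
        genes[fields[0]].add(gene_id)
    return {seqname: len(ids) for seqname, ids in genes.items()}
```

To use it on a real file, pass an open text handle (for a compressed file, `gzip.open(path, "rt")`) and sum the returned values for the overall total.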
We don't import Gencode, we make Gencode.
I'm not sure who deleted this post, but I've undeleted it.