I want to get information on all genes on the human Y chromosome, but I found that the statistics in different databases (Ensembl/GENCODE, NCBI, HGNC) do not agree.
For example, the number of protein-coding genes:
CCDS 63
HGNC 45
Ensembl 63
NCBI 73
So what causes these numbers to differ?
By the way, is the RefSeq gene data the same as the NCBI Homo sapiens annotation release?
Ultimately, HGNC is responsible for all human gene nomenclature. Other databases may add database-specific annotation, but if you need a list of approved genes for the Y chromosome, then HGNC is the authoritative source.
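As a rough illustration of pulling such a list from an HGNC download, here is a minimal Python sketch. It assumes a tab-separated file with columns named `symbol`, `status`, `locus_group` and `location`, which is how I understand the HGNC complete-set download to be laid out; check the header of the file you actually download. The function name `y_protein_coding` and the rows in `SAMPLE` are made-up placeholders, not real HGNC records.

```python
import csv
import io


def y_protein_coding(tsv_text):
    """Return approved protein-coding symbols whose location mentions chromosome Y."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    symbols = []
    for row in reader:
        # A location may name two chromosomes (e.g. "Xp22.3 and Yp11.3"),
        # so check every whitespace-separated part for a leading "Y".
        on_y = any(part.startswith("Y") for part in row["location"].split())
        if (row["status"] == "Approved"
                and row["locus_group"] == "protein-coding gene"
                and on_y):
            symbols.append(row["symbol"])
    return sorted(symbols)


# Made-up placeholder rows for illustration only (assumed column layout).
SAMPLE = (
    "symbol\tstatus\tlocus_group\tlocation\n"
    "GENE1\tApproved\tprotein-coding gene\tYq11.2\n"
    "GENE2\tApproved\tpseudogene\tYp11.2\n"
    "GENE3\tApproved\tprotein-coding gene\t1p36.3\n"
    "GENE4\tApproved\tprotein-coding gene\tXp22.3 and Yp11.3\n"
    "GENE5\tEntry Withdrawn\tprotein-coding gene\tYq11.2\n"
)

print(y_protein_coding(SAMPLE))  # ['GENE1', 'GENE4']
```

Whether genes with a location on both X and Y are counted is one of the choices that can shift the totals between resources, so it is worth deciding that explicitly.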
So other databases will contain all official gene symbols and names from HGNC and add their specific annotations?
Yes, but other resources sometimes lag behind HGNC. HGNC occasionally updates official symbols and/or names, and the old ones become synonyms. These changes are not always picked up immediately by other resources; it depends on their update cycle.
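Because of that lag, it can help to map outdated symbols to their current approved symbols before comparing gene lists from two resources. Below is a minimal Python sketch of the idea. The table `PREVIOUS`, the helper names `build_lookup` and `normalize`, and all gene symbols are hypothetical placeholders; in practice the mapping would come from HGNC's previous-symbol information.

```python
def build_lookup(approved_to_previous):
    """Map every approved symbol and each of its previous symbols to the approved one."""
    lookup = {}
    for approved, previous in approved_to_previous.items():
        # An approved symbol always resolves to itself.
        lookup[approved] = approved
        for old in previous:
            # Keep the first mapping if an old symbol is listed more than once.
            lookup.setdefault(old, approved)
    return lookup


def normalize(symbols, lookup):
    """Replace outdated symbols with approved ones; leave unknown symbols unchanged."""
    return {lookup.get(s, s) for s in symbols}


# Hypothetical placeholder data, not real gene symbols.
PREVIOUS = {"GENEA": ["GENEA_OLD"], "GENEB": []}
hgnc_list = {"GENEA", "GENEB"}
other_list = {"GENEA_OLD", "GENEB", "GENEC"}

lookup = build_lookup(PREVIOUS)
other_norm = normalize(other_list, lookup)

print(sorted(hgnc_list & other_norm))  # shared after normalization: ['GENEA', 'GENEB']
print(sorted(other_norm - hgnc_list))  # only in the other resource: ['GENEC']
```

Symbols that remain only in one list after this step point to real annotation differences rather than naming lag.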