Based on Ensembl v75, there are 72 coding genes on the Y chromosome. Does anybody know how I can get a list of these genes' symbols?
Thanks for your time and attention.
First, as far as I can gather there are only 54 protein-coding genes on the human Y chromosome:
mysql -u anonymous -h ensembldb.ensembl.org
mysql> use homo_sapiens_core_75_37;
mysql> SELECT gene.biotype, COUNT(*)
    -> FROM gene, seq_region, coord_system
    -> WHERE gene.seq_region_id = seq_region.seq_region_id
    ->   AND seq_region.name = 'Y'
    ->   AND seq_region.coord_system_id = coord_system.coord_system_id
    ->   AND coord_system.name = 'chromosome'
    -> GROUP BY gene.biotype;
+----------------------+----------+
| biotype              | COUNT(*) |
+----------------------+----------+
| antisense            |       10 |
| lincRNA              |       48 |
| miRNA                |       14 |
| misc_RNA             |        5 |
| processed_transcript |        2 |
| protein_coding       |       54 |
| pseudogene           |      335 |
| rRNA                 |        7 |
| snoRNA               |        3 |
| snRNA                |       17 |
+----------------------+----------+
10 rows in set (0.02 sec)
Why the website says 72, I don't know; that would be something to ask the Ensembl team (helpdesk@ensembl.org).
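As an aside, the same core database can also list the symbols directly, since each gene row points at a display name in the xref table. Here is a minimal sketch in Python, assuming the release-75 core schema has gene.stable_id, gene.display_xref_id and xref.display_label as I remember them, and assuming the third-party PyMySQL package is installed. The function name fetch_gene_symbols is my own invention, not part of any Ensembl tooling.

```python
# Sketch only: lists (stable_id, symbol) pairs instead of counting biotypes.
# Column names are assumptions about the Ensembl release-75 core schema.
GENE_SYMBOL_SQL = """
SELECT gene.stable_id, xref.display_label
FROM gene
JOIN seq_region ON gene.seq_region_id = seq_region.seq_region_id
JOIN coord_system ON seq_region.coord_system_id = coord_system.coord_system_id
LEFT JOIN xref ON gene.display_xref_id = xref.xref_id
WHERE seq_region.name = %s
  AND coord_system.name = 'chromosome'
  AND gene.biotype = %s
ORDER BY xref.display_label
"""


def fetch_gene_symbols(chromosome="Y", biotype="protein_coding",
                       database="homo_sapiens_core_75_37"):
    """Return a list of (stable_id, symbol) rows.

    Hypothetical helper: needs network access to the public Ensembl
    MySQL server and the PyMySQL package.
    """
    import pymysql  # imported lazily so the module loads without the package

    conn = pymysql.connect(host="ensembldb.ensembl.org", user="anonymous",
                           database=database)
    try:
        with conn.cursor() as cur:
            # %s placeholders are filled in safely by the driver
            cur.execute(GENE_SYMBOL_SQL, (chromosome, biotype))
            return list(cur.fetchall())
    finally:
        conn.close()
```

Calling fetch_gene_symbols() with the defaults should return one row per protein-coding gene on Y, so the row count ought to match the protein_coding line in the table above. The LEFT JOIN keeps genes that have no display name; their symbol comes back as NULL (None in Python).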
To get the gene symbols for these genes, your best option is BioMart:
Start with all human Ensembl genes:
Now, filter for the genes on the Y chromosome:
This should give you 495 / 64162 Genes.
Now filter further for genes that are protein-coding:
This should give you 54 / 64162 Genes.
Specify the attributes to be included in the output (note that a number of attributes will already be selected by default):
Have a look at a preview of the results (only 10 rows of the results will be shown):
If you are happy with how the results look in the preview, output all the results:
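The same BioMart steps can also be scripted, because BioMart accepts an XML description of the dataset, filters and attributes. What follows is a rough sketch in Python, not a tested recipe. The archive URL for release 75 is an assumption. So are the filter names chromosome_name and biotype and the attribute names ensembl_gene_id and external_gene_id. I believe the symbol attribute was renamed external_gene_name in later releases, so check the names in the BioMart web interface before relying on this.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumed endpoint: the Ensembl archive site that corresponds to release 75.
MARTSERVICE_URL = "http://feb2014.archive.ensembl.org/biomart/martservice"


def build_biomart_query(filters, attributes, dataset="hsapiens_gene_ensembl"):
    """Build a BioMart XML query string from filters and attributes."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    header = '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>'
    return header + ET.tostring(query, encoding="unicode")


def fetch_biomart(xml_query, url=MARTSERVICE_URL):
    """POST the query to the mart service and return the TSV text."""
    data = urllib.parse.urlencode({"query": xml_query}).encode("ascii")
    with urllib.request.urlopen(url, data=data) as resp:
        return resp.read().decode("utf-8")


# Same selection as the steps above: chromosome Y, protein-coding only.
Y_QUERY = build_biomart_query(
    filters={"chromosome_name": "Y", "biotype": "protein_coding"},
    attributes=["ensembl_gene_id", "external_gene_id"],
)
```

Printing fetch_biomart(Y_QUERY) should then give a tab-separated table with one gene per line, assuming the archive server is reachable and the names above are right.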