Y chromosome's gene list
10.3 years ago

Based on Ensembl v75, there are 72 coding genes on the Y chromosome. Does anybody know how I can get a list of these genes' symbols?

Thanks for your time and attention.

genome gene RNA-Seq • 4.5k views
10.3 years ago
Bert Overduin ★ 3.7k

First, as far as I can gather there are only 54 protein-coding genes on the human Y chromosome:

mysql -u anonymous -h ensembldb.ensembl.org
mysql> use homo_sapiens_core_75_37;

mysql> SELECT gene.biotype, COUNT(*) FROM gene, seq_region, coord_system WHERE gene.seq_region_id = seq_region.seq_region_id AND seq_region.name = 'Y' AND seq_region.coord_system_id = coord_system.coord_system_id AND coord_system.name = 'chromosome' GROUP BY biotype;
+----------------------+----------+
| biotype              | COUNT(*) |
+----------------------+----------+
| antisense            |       10 |
| lincRNA              |       48 |
| miRNA                |       14 |
| misc_RNA             |        5 |
| processed_transcript |        2 |
| protein_coding       |       54 |
| pseudogene           |      335 |
| rRNA                 |        7 |
| snoRNA               |        3 |
| snRNA                |       17 |
+----------------------+----------+
10 rows in set (0.02 sec)

Why the website says 72, I don't know; that would be something to ask the Ensembl team (helpdesk@ensembl.org).

To get the gene symbols for these genes, your best option is BioMart:

Start with all human Ensembl genes:

  • Choose the 'Ensembl Genes 75' database.
  • Choose the 'Homo sapiens genes (GRCh37.p13)' dataset.

Now, filter for the genes on the Y chromosome:

  • Click on 'Filters' in the left panel.
  • Expand the 'REGION' section by clicking on the + box.
  • Select 'Chromosome - Y'. Make sure the check box in front of the filter is ticked, otherwise the filter won't work.
  • Click the [Count] button on the toolbar.

This should give you 495 / 64162 Genes.
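As a quick sanity check, the 495 genes reported by BioMart should equal the sum of the per-biotype counts from the MySQL query above. A minimal Python sketch, with the counts copied by hand from that output (the dictionary name is just for illustration):

```python
# Per-biotype gene counts on chromosome Y, copied from the MySQL output
# for homo_sapiens_core_75_37 shown earlier in this answer.
biotype_counts = {
    "antisense": 10,
    "lincRNA": 48,
    "miRNA": 14,
    "misc_RNA": 5,
    "processed_transcript": 2,
    "protein_coding": 54,
    "pseudogene": 335,
    "rRNA": 7,
    "snoRNA": 3,
    "snRNA": 17,
}

# Total number of genes on Y across all biotypes.
total = sum(biotype_counts.values())
print(total)  # 495, the same number BioMart reports for the chromosome filter
```

So the database query and the BioMart count agree on the total as well as on the 54 protein-coding genes.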

Now filter further for genes that are protein-coding:

  • Expand the 'GENE' section by clicking on the + box.
  • Select 'Gene type - protein_coding'.
  • Click the [Count] button on the toolbar.

This should give you 54 / 64162 Genes.

Specify the attributes to be included in the output (note that a number of attributes will already be selected by default):

  • Click on 'Attributes' in the left panel.
  • Expand the 'GENE' section by clicking on the + box.
  • Deselect 'Ensembl Transcript ID'.
  • Select 'Associated Gene Name' (and any other info you're interested in).

Have a look at a preview of the results (only 10 rows of the results will be shown):

  • Click the [Results] button on the toolbar.

If you are happy with how the results look in the preview, output all the results:

  • Select 'View All rows as HTML' or export all results to a file.
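
If you would rather script this than click through the interface, BioMart also accepts the same filters and attributes as an XML query. The sketch below is an untested outline, not the official recipe: the archive URL for release 75 and the internal attribute name `external_gene_id` (which I believe is what 'Associated Gene Name' maps to in this release; later releases use `external_gene_name`) are assumptions you should verify, and `build_query` and `fetch_tsv` are hypothetical helper names.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# ASSUMPTION: archive host serving Ensembl release 75; check before relying on it.
ARCHIVE_URL = "http://feb2014.archive.ensembl.org/biomart/martservice"


def build_query(chromosome="Y", biotype="protein_coding"):
    """Build a BioMart XML query mirroring the filters and attributes above."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    dataset = ET.SubElement(query, "Dataset", {
        "name": "hsapiens_gene_ensembl",
        "interface": "default",
    })
    # Same two filters as in the web interface: chromosome and gene type.
    ET.SubElement(dataset, "Filter", {"name": "chromosome_name", "value": chromosome})
    ET.SubElement(dataset, "Filter", {"name": "biotype", "value": biotype})
    # ASSUMPTION: 'external_gene_id' is the release 75 name for the gene symbol.
    for attribute in ("ensembl_gene_id", "external_gene_id"):
        ET.SubElement(dataset, "Attribute", {"name": attribute})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE Query>\n' + body


def fetch_tsv(query_xml, url=ARCHIVE_URL):
    """POST the query to the martservice endpoint and return the TSV text."""
    data = urllib.parse.urlencode({"query": query_xml}).encode("ascii")
    with urllib.request.urlopen(url, data=data) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_tsv(build_query())` should, if the assumed names are right, return a header line plus one row per protein-coding gene on Y, which you can compare against the 54 counted above.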