Hello,
Given an Ensembl exon ID, how can I determine the exon number? For example, ENSE00003492976
should map to 3.
Exon(exon_id=ENSE00003492976, gene_name=SOD1, contig=21, start=31663790, end=31663886)
Thanks,
Joshua
Hi Joshua,
I am not sure what you mean by "exon number", but I guess it is the rank: the position of an exon within a transcript, starting at 1 for the first exon, even if that exon is non-coding. Bear in mind that the rank is only defined relative to a transcript, because an exon can have different ranks in different transcripts.
Using the Ensembl Perl API a solution could look like this:
use strict;
use warnings;
use Bio::EnsEMBL::Registry;
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
-host => 'ensembldb.ensembl.org', # alternatively 'useastdb.ensembl.org'
-user => 'anonymous'
);
#get the exon you are interested in:
my $exon_id = "ENSE00003492976";
my $exon_adaptor = $registry->get_adaptor( "human", 'Core', 'Exon' );
my $exon = $exon_adaptor->fetch_by_stable_id($exon_id);
#get the transcript for which you want to know the exon's rank:
my $transcript_id = "ENST00000270142";
my $transcript_adaptor = $registry->get_adaptor( "human", 'Core', 'Transcript' );
my $transcript = $transcript_adaptor->fetch_by_stable_id($transcript_id);
#the rank of an exon is only defined on a specific transcript
my $rank = $exon->rank($transcript);
print "rank of $exon_id in $transcript_id is: $rank\n";
I hope that is what you are looking for. Note, however, that the rank of this exon is 2 in both protein-coding transcripts of SOD1. If you meant something else, please clarify what you need.
Chris
Hi Chris,
Thanks for the script. I'm looking to order these in a similar manner to GTEx. See Exon Expression on the GTEx portal. I believe the exon I provided should give 3 (based on the exon coding lengths).
Also, can this be done in Python rather than Perl?
Joshua
Hi Joshua,
I am not familiar with the GTEx portal, but it looks like they count all the exons of a gene, regardless of which transcript they come from. So forget the rank.
Ensembl does not provide a Python API comparable to the Perl API. There is a REST API that can be used from Python, but I don't know whether it can solve your problem.
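As a rough, untested sketch of what the REST route could look like from Python: the code below asks the public Ensembl REST server for a gene with its transcripts and exons expanded, then collects every distinct exon. The function names fetch_gene and unique_exons are made up for illustration, and the nesting of the response (a "Transcript" list whose entries each hold an "Exon" list) is an assumption that should be checked against a real response.

```python
import json
import urllib.request

REST_SERVER = "https://rest.ensembl.org"  # public Ensembl REST server


def fetch_gene(symbol, species="homo_sapiens"):
    """Fetch a gene record by symbol, with transcripts and exons expanded."""
    url = f"{REST_SERVER}/lookup/symbol/{species}/{symbol}?expand=1"
    request = urllib.request.Request(
        url, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def unique_exons(gene):
    """Collect every distinct exon of a gene across all of its transcripts.

    Assumption: exons are nested as gene["Transcript"][i]["Exon"][j] and
    carry "id", "start" and "end" keys.
    Returns a dict of exon_id -> (start, end).
    """
    exons = {}
    for transcript in gene.get("Transcript", []):
        for exon in transcript.get("Exon", []):
            # An exon shared by several transcripts is stored only once.
            exons[exon["id"]] = (exon["start"], exon["end"])
    return exons


# Example (needs network access):
#   exons = unique_exons(fetch_gene("SOD1"))
```

This only gathers the exons and their coordinates; turning them into GTEx-style numbers would still need a grouping and sorting step.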
If you want to give the Perl API a try, have a look at get_all_Exons() in the Gene class:
This would give you the following list of all the different exons with chromosomal start and end positions:
With some grouping (perhaps merging overlapping exons and giving them the same number) and sorting, you should get the numbers you want.
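The grouping and sorting idea could be sketched in Python roughly like this. It is only a guess at the procedure: whether GTEx numbers exons this way is not confirmed here, the function name number_exons is hypothetical, and the coordinates in the usage note are toy values, not SOD1 data.

```python
def number_exons(exons, strand=1):
    """Merge overlapping exons into groups and number the groups 1..n.

    exons:  dict of exon_id -> (start, end), inclusive coordinates
    strand: 1 numbers groups by ascending position, -1 by descending
    Returns a dict of exon_id -> group number.
    """
    # Sort exons by start (then end) so overlaps are always adjacent.
    ordered = sorted(exons.items(), key=lambda item: item[1])

    groups = []  # each entry: [group_start, group_end, [exon ids]]
    for exon_id, (start, end) in ordered:
        if groups and start <= groups[-1][1]:
            # Overlaps the current group: extend it and add the exon.
            groups[-1][1] = max(groups[-1][1], end)
            groups[-1][2].append(exon_id)
        else:
            # No overlap: start a new group.
            groups.append([start, end, [exon_id]])

    # On the reverse strand the first exon has the highest coordinate.
    if strand == -1:
        groups.reverse()

    return {
        exon_id: number
        for number, (_, _, ids) in enumerate(groups, start=1)
        for exon_id in ids
    }
```

With the toy input {"A": (100, 200), "B": (150, 250), "C": (400, 500)}, exons A and B overlap and both get number 1, while C gets number 2.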
Chris