Exon number from Ensembl ID
9.1 years ago
jmeier ▴ 10

Hello,

Given an Ensembl exon ID, how can I determine the exon number? For example, ENSE00003492976 should map to 3.

Exon(exon_id=ENSE00003492976, gene_name=SOD1, contig=21, start=31663790, end=31663886)

Thanks,
Joshua

Exons Transcripts Ensembl • 5.0k views
9.1 years ago
crisime ▴ 290

Hi Joshua,

I am not sure what you mean by "exon number", but I guess it is the rank: the position of the exon within a transcript, starting at 1 for the first exon, even if that exon is non-coding. Bear in mind that the rank is only defined relative to a transcript, because an exon can have different ranks in different transcripts.

Using the Ensembl Perl API, a solution could look like this:

use strict;
use warnings;
use Bio::EnsEMBL::Registry;
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org', # alternatively 'useastdb.ensembl.org'
    -user => 'anonymous'
);
#get the exon you are interested in:
my $exon_id = "ENSE00003492976";
my $exon_adaptor  = $registry->get_adaptor( "human", 'Core', 'Exon' );
my $exon = $exon_adaptor->fetch_by_stable_id($exon_id);
#get the transcript for which you want to know the exon's rank:
my $transcript_id = "ENST00000270142";
my $transcript_adaptor  = $registry->get_adaptor( "human", 'Core', 'Transcript' );
my $transcript = $transcript_adaptor->fetch_by_stable_id($transcript_id);
#the rank of an exon is only defined on a specific transcript
my $rank = $exon->rank($transcript);
print "rank of $exon_id in $transcript_id is: $rank\n";

I hope that is what you were looking for. Note, however, that the rank of this exon is 2 in both protein-coding transcripts of SOD1. If you meant something else, please clarify what you need.

Chris

Hi Chris,

Thanks for the script. I'm looking to order these in a similar manner to GTEx. See Exon Expression on the GTEx portal. I believe the exon I provided should give 3 (based on the exon coding lengths).

Also, can this be done in Python rather than Perl?

Joshua

Hi Joshua,

I am not familiar with the GTEx portal, but it looks like they number all the exons of a gene, regardless of which transcript they come from. So forget the rank.

Ensembl does not provide a Python API comparable to the Perl API. There is a REST API that can be used from Python, but I don't know whether it can solve your problem.
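For anyone who wants to try the REST route from Python, here is a minimal, untested-against-the-live-server sketch using only the standard library. It assumes the `lookup/id` endpoint with `expand=1` returns a JSON gene record containing a `Transcript` list, each entry holding an `Exon` list with `id`, `start` and `end` fields; check the REST documentation before relying on that shape. The function names `fetch_gene` and `unique_exons` are made up for this example.

```python
import json
import urllib.request

# Assumed endpoint: Ensembl REST "lookup/id" with expand=1 (includes transcripts and exons).
REST_URL = "https://rest.ensembl.org/lookup/id/{gene_id}?expand=1"


def fetch_gene(gene_id):
    """Download the expanded gene record as a dict (needs network access)."""
    request = urllib.request.Request(
        REST_URL.format(gene_id=gene_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def unique_exons(gene_record):
    """Collect {exon_id: (start, end)} over all transcripts of the gene.

    Assumes the record layout described above; exons shared between
    transcripts are stored only once because the dict is keyed by ID.
    """
    exons = {}
    for transcript in gene_record.get("Transcript", []):
        for exon in transcript.get("Exon", []):
            exons[exon["id"]] = (exon["start"], exon["end"])
    return exons


# Usage (requires network access, so it is left commented out):
# exons = unique_exons(fetch_gene("ENSG00000142168"))
```

The download and the parsing are kept in separate functions so the parsing step can be checked offline with a hand-written record.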

If you want to give the Perl API a try, have a look at get_all_Exons() in the Gene class:

use strict;
use warnings;
use Bio::EnsEMBL::Registry; # access to the Ensembl API
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org', # alternatively 'useastdb.ensembl.org'
    -user => 'anonymous'
);
#get the gene whose exons you want to list:
my $gene_id = "ENSG00000142168";
my $gene_adaptor  = $registry->get_adaptor( "human", 'Core', 'Gene' );
my $gene = $gene_adaptor->fetch_by_stable_id($gene_id);
#get all the exons of this gene
my @all_exons_of_gene = @{$gene->get_all_Exons()};
#print ENSE, start and end:
foreach my $one_exon (@all_exons_of_gene){
    print $one_exon->stable_id()."\t".$one_exon->start()."\t".$one_exon->end()."\n";
}

This would give you the following list of all the different exons with chromosomal start and end positions:

ENSE00003567152    31668471    31668931
ENSE00001850931    31661549    31661734
ENSE00001893924    31659693    31659841
ENSE00001898140    31659709    31660708
ENSE00001892860    31667258    31667341
ENSE00001507447    31659622    31659841
ENSE00003683357    31668471    31668931
ENSE00003562612    31663790    31663886
ENSE00003492976    31663790    31663886
ENSE00003624439    31666449    31666518
ENSE00001723323    31659666    31659784
ENSE00003551618    31667258    31667375
ENSE00003528357    31666449    31666518
ENSE00003555033    31667258    31667375

With some grouping (merging overlapping exons and assigning them the same number?) and sorting, you should get the numbers you want.
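A minimal Python sketch of that grouping idea, using the coordinates from the list above: sort the exons by start, merge any that overlap into one group, and number the groups in order. Whether this reproduces GTEx's numbering in general is an assumption, not something confirmed here. The function name `number_exons` and the `strand` parameter are made up for this example, and the default assumes the gene lies on the forward strand.

```python
# (exon_id, start, end) copied from the get_all_Exons() output above.
EXONS = [
    ("ENSE00003567152", 31668471, 31668931),
    ("ENSE00001850931", 31661549, 31661734),
    ("ENSE00001893924", 31659693, 31659841),
    ("ENSE00001898140", 31659709, 31660708),
    ("ENSE00001892860", 31667258, 31667341),
    ("ENSE00001507447", 31659622, 31659841),
    ("ENSE00003683357", 31668471, 31668931),
    ("ENSE00003562612", 31663790, 31663886),
    ("ENSE00003492976", 31663790, 31663886),
    ("ENSE00003624439", 31666449, 31666518),
    ("ENSE00001723323", 31659666, 31659784),
    ("ENSE00003551618", 31667258, 31667375),
    ("ENSE00003528357", 31666449, 31666518),
    ("ENSE00003555033", 31667258, 31667375),
]


def number_exons(exons, strand=1):
    """Map each exon ID to the number of the overlap group it falls in.

    Groups are numbered in ascending coordinate order for strand=1 and
    in descending order for strand=-1 (an assumption about how a
    reverse-strand gene should be handled).
    """
    ordered = sorted(exons, key=lambda exon: (exon[1], exon[2]))
    groups = []  # each group is [start, end, [exon_ids]]
    for exon_id, start, end in ordered:
        if groups and start <= groups[-1][1]:
            # Overlaps the current group: extend it and add the exon.
            groups[-1][1] = max(groups[-1][1], end)
            groups[-1][2].append(exon_id)
        else:
            groups.append([start, end, [exon_id]])
    if strand < 0:
        groups.reverse()
    return {
        exon_id: number
        for number, (_, _, ids) in enumerate(groups, start=1)
        for exon_id in ids
    }


numbers = number_exons(EXONS)
print(numbers["ENSE00003492976"])  # 3
```

For this list the 14 exons collapse into 6 groups, and ENSE00003492976 lands in group 3, which is the number Joshua expected.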

Chris
