Where Are Canonical Transcripts In Recent Ensembl Genes?
12.6 years ago
Ttnguyen ▴ 70

Any ideas why recent releases of Ensembl human genes (e.g. 66 & 67) do not provide canonical transcripts?

ensembl gene • 7.8k views

I haven't come across this myself! I don't mean to sound doubtful, but what makes you think that? Have you run some code that returned undef for the canonical transcripts?


I am just guessing that the concept of a canonical transcript is no longer considered helpful.


Canonical transcripts are just the name we give to the transcripts used to build the gene trees. As such, they still exist in Ensembl. It would help if you could describe where you used to find them. Are you talking about the Perl API, the FTP files, the website or BioMart?


I was using BioMart to find them. But from Steve's reply below, it seems I should get the canonical transcripts through the API.

12.6 years ago

I have run the following code using the release 67 Perl API (http://www.ensembl.org/info/docs/api/index.html):

#!/usr/bin/env perl

# Check EnsEMBL Transcripts
# Coded by Steve Moss (gawbul [at] gmail [dot] com)
# http://about.me/gawbul

# make things easier
use strict;
use warnings;

# import modules
use Bio::EnsEMBL::Registry;
use Data::Dumper;

# setup registry
my $registry = 'Bio::EnsEMBL::Registry';

# connect to EnsEMBL
$registry->load_registry_from_db(-host => "ensembldb.ensembl.org",
                -user => "anonymous");

# get gene adaptor object from registry for human core
my $gene_adaptor = $registry->get_adaptor("Human", "Core", "Gene");

# get list of gene stable IDs
my $gene_ids = $gene_adaptor->list_stable_ids();

# traverse gene IDs
my $count = 0;
my $defined_count = 0;
my $undefined_count = 0;
print "Processing " . scalar(@{$gene_ids}) . " gene IDs...\n";
while (my $gene_id = shift(@{$gene_ids})) {
    # show progress as [genes processed/genes remaining]; the list shrinks with each shift
    local $| = 1;
    print "[$count/" . scalar(@{$gene_ids}) . "]\r";

    # get gene object
    my $gene = $gene_adaptor->fetch_by_stable_id($gene_id);

    # get canonical transcript
    my $canonical_transcript = $gene->canonical_transcript();

    # check defined
    if (defined $canonical_transcript) {
        $defined_count++;
    }
    else {
        $undefined_count++;
    }
    $count++;

    # undef the transcript
    $canonical_transcript = undef;
}

# let the user know
print "$defined_count defined \& $undefined_count undefined in $count.\n";
print "...done!\n";

and I get the following output:

w232-244:Code stevemoss$ perl check_canonical_transcripts.pl 
Processing 56478 gene IDs...
[56478/0]
56478 defined & 0 undefined in 56478
...done!
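
The script above only counts how many genes have a canonical transcript. To see which transcript is flagged for a single gene, something along these lines should work. This is a minimal sketch, assuming the same public server and Perl API setup as above; the gene ID ENSG00000139618 (BRCA2) is only an illustrative choice and is not taken from the original question.

```perl
#!/usr/bin/env perl

use strict;
use warnings;

use Bio::EnsEMBL::Registry;

# connect to the public Ensembl database server, as in the script above
my $registry = 'Bio::EnsEMBL::Registry';
$registry->load_registry_from_db(
    -host => 'ensembldb.ensembl.org',
    -user => 'anonymous',
);

# gene adaptor for the human core database
my $gene_adaptor = $registry->get_adaptor('Human', 'Core', 'Gene');

# illustrative gene only (BRCA2); swap in any stable ID of interest
my $gene_id = 'ENSG00000139618';
my $gene    = $gene_adaptor->fetch_by_stable_id($gene_id)
    or die "No gene found for $gene_id\n";

# the transcript Ensembl has marked as canonical for this gene
my $canonical = $gene->canonical_transcript();
die "No canonical transcript defined for $gene_id\n"
    unless defined $canonical;

print join("\t", $gene->stable_id(), $canonical->stable_id(), $canonical->biotype()), "\n";

# list every transcript of the gene and mark the canonical one with '*'
foreach my $transcript (@{ $gene->get_all_Transcripts() }) {
    my $flag = $transcript->is_canonical() ? '*' : ' ';
    print "$flag ", $transcript->stable_id(), "\n";
}
```

The result depends on the Ensembl release the registry connects to, so the canonical transcript reported for a gene may differ between releases.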