Getting Only Orthologs From Ensembl Database
10.8 years ago
jaip217 ▴ 30

Hi all,

I'm trying to get the main transcript for every orthologue of the human MX1 gene. Essentially, I want the first transcript of each gene listed here: http://www.ensembl.org/Homo_sapiens/Gene/Compara_Ortholog?db=core;g=ENSG00000157601;r=21:42792442-42831141;t=ENST00000398600

Ideally, I'd like to exclude any non-vertebrates and write a FASTA file with headers in the format Species/Ensembl ID/Name. I'm pretty unfamiliar with Perl, but here's my code.

    #!/usr/bin/perl
    use warnings;
    use strict;
    use Bio::EnsEMBL::Registry;
    use Bio::SeqIO;
    use Data::Dumper;

    my $registry = 'Bio::EnsEMBL::Registry';
    print "Connecting to Ensembl...\n";
    $registry->load_registry_from_db(
        -host => 'useastdb.ensembl.org',
        -user => 'anonymous'
    );
    print "Successfully connected to Ensembl Database\n";

    # Look up the human gene in the Compara database
    my $gene_member_adaptor = $registry->get_adaptor('Multi', 'compara', 'GeneMember');
    my $gene_member = $gene_member_adaptor->fetch_by_source_stable_id('ENSEMBLGENE', 'ENSG00000157601');

    # $homologies is a reference to an array of Homology objects
    my $homology_adaptor = $registry->get_adaptor('Multi', 'compara', 'Homology');
    my $homologies = $homology_adaptor->fetch_all_by_Member($gene_member);

    my $outseq = Bio::SeqIO->new(-fh => \*STDOUT, -format => 'FASTA');

    #open (my $fh, ">", "homology_output_2");
    foreach my $homology (@{$homologies}) {
        foreach my $member (@{$homology->get_all_Members}) {

            #print "DESCRIPTION: ", $member->description(), "\n";
            #print "Stable ID = ", $member->get_Transcript()->stable_id(), "\n";
            #print "Name = ", $member->get_Transcript()->external_name(), "\n";
            #print $member->get_Transcript()->seq()->seq(), "\n";
            # write_seq prints the record itself; wrapping it in print
            # would also emit its return value into the FASTA output.
            $outseq->write_seq($member->get_Transcript()->seq());
        }
    }

    #close $fh;

The biggest problem is isolating the transcripts of just the orthologues; I don't need the paralogues. Any suggestions would be much appreciated!

ensembl perl api fasta • 3.2k views
10.8 years ago
Emily 24k

Hello. The thing is, the homology module just pulls out all members of the group, without referring back to the gene you used to fetch the homology in the first place. That means you can't tell the module itself to return only orthologues. What you could do is add a condition inside your loop that skips every member of the group from the same species as your query gene, e.g.

unless ($member->taxon->binomial eq 'Homo sapiens') { do some stuff }

You can exclude non-vertebrates with another condition that keeps just the vertebrates:

if ($member->taxon->classification =~ /Vertebrata/) { do some stuff }
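Putting the two checks together, the loop from the question could look roughly like the sketch below. It only uses method calls that already appear in this thread (`taxon->binomial`, `taxon->classification`, `get_Transcript`, `stable_id`, `external_name`, `seq`). The helper names `is_vertebrate_orthologue` and `write_orthologue_fasta`, the `NA` placeholder for a missing name, and the underscore in the species name are my own assumptions for illustration, not part of the Ensembl API.

```perl
use strict;
use warnings;

# Hypothetical helper: true for members that are vertebrate and not human.
sub is_vertebrate_orthologue {
    my ($member) = @_;
    my $taxon          = $member->taxon;
    my $binomial       = $taxon->binomial       // '';
    my $classification = $taxon->classification // '';

    # Skip the query species, which drops the human gene and its paralogues
    return 0 if $binomial eq 'Homo sapiens';

    # Keep only members whose lineage mentions Vertebrata
    return $classification =~ /Vertebrata/ ? 1 : 0;
}

# Hypothetical helper: writes one FASTA record per kept member, with a
# Species/Ensembl ID/Name header, and returns the number of records written.
sub write_orthologue_fasta {
    my ($homologies, $out_fh) = @_;
    my $written = 0;

    foreach my $homology (@{$homologies}) {
        foreach my $member (@{ $homology->get_all_Members }) {
            next unless is_vertebrate_orthologue($member);

            my $transcript = $member->get_Transcript;
            (my $species = $member->taxon->binomial) =~ tr/ /_/;

            # Assumption: fall back to 'NA' when the transcript has no name
            my $name   = $transcript->external_name // 'NA';
            my $header = join '/', $species, $transcript->stable_id, $name;

            print {$out_fh} ">$header\n", $transcript->seq->seq, "\n";
            $written++;
        }
    }
    return $written;
}
```

In the script from the question, this would replace the nested `foreach` loops with a single call such as `write_orthologue_fasta($homologies, \*STDOUT);` after `$homologies` has been fetched. The headers are printed by hand here rather than through Bio::SeqIO, purely to keep control over the Species/Ensembl ID/Name format.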

Perfect! Thanks Emily, that works great.
