How to retrieve mRNA split locations from a GenBank flatfile?
11.6 years ago
mluypaert ▴ 10

Hi all,

I am having some trouble parsing the GenBank flatfile format that NCBI uses for data export. I have a GenBank flatfile containing genomic regions with mRNA features, which I am parsing with Perl (and BioPerl). The mRNA features were retrieved with the get_SeqFeatures() function, and I can retrieve all the information about each mRNA using the get_all_tags() and get_tag_values() functions from BioPerl, but I also need the genomic location of each exon in the mRNA. For that I need to find the genomic location of the gene it belongs to (which doesn't seem to be in the flatfiles I downloaded), but more importantly, I need to be able to get the split locations for each exon from the mRNA line, like:

 mRNA            complement(join(4468..4717,4801..4940,6511..6767,
                     6933..7071,9260..9344,9478..9593))

How can I retrieve this bit of information (from the SeqFeature object I am using in BioPerl)?

genbank ncbi parsing perl bioperl
11.6 years ago
mluypaert ▴ 10

I found the answer myself after some browsing in the BioPerl manuals. The following chunk of Perl code solved my problem:

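        # location of the mRNA feature ($feat_object comes from get_SeqFeatures())
        my $location_obj = $feat_object->location();

        # retrieve split location: one sub-location per exon
        my @sub_locations;
        my $location_ref = ref($location_obj);
        if($location_ref eq 'Bio::Location::Simple'){
            # single-exon feature: the location itself is the only segment
            $sub_locations[0] = $location_obj;
        }elsif($location_ref eq 'Bio::Location::Split'){
            # join(...) feature: one sub-location per listed segment
            @sub_locations = $location_obj->sub_Location();
        }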
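
For anyone who wants the actual coordinates, here is a minimal sketch of how the sub-locations could be turned into start/end/strand values for every mRNA feature in a file. The helper name exon_segments and the file name in the usage comment are my own placeholders, not part of BioPerl; the sketch assumes BioPerl is installed and tests against the Bio::Location::SplitLocationI interface instead of comparing class names, so other split location classes would be handled too.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper: return one [start, end, strand] triple per
# segment of a BioPerl location object.
sub exon_segments {
    my ($loc) = @_;
    # Split locations (join(...)) expose their parts via sub_Location();
    # any other location is treated as a single segment.
    my @parts = $loc->isa('Bio::Location::SplitLocationI')
              ? $loc->sub_Location()
              : ($loc);
    return map { [ $_->start, $_->end, $_->strand ] } @parts;
}

# Hypothetical usage: perl exons.pl regions.gb
if (@ARGV) {
    my $in = Bio::SeqIO->new(-file => $ARGV[0], -format => 'genbank');
    while (my $seq = $in->next_seq) {
        for my $feat ($seq->get_SeqFeatures) {
            # only look at mRNA features
            next unless $feat->primary_tag eq 'mRNA';
            for my $seg (exon_segments($feat->location)) {
                my ($start, $end, $strand) = @$seg;
                # tab-separated: sequence ID, start, end, strand
                print join("\t", $seq->display_id, $start, $end, $strand // 0), "\n";
            }
        }
    }
}
```

The coordinates printed this way are relative to the sequence in the flatfile (the contig in my case), not to the chromosome.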

I made a separate post, "Genomic Region For Ncbi Transcript(/Gene) Accessions", about retrieving the genomic location instead of the contig locations (which are what is retrieved directly from the GenBank flatfiles in this case).
