How to skip a genebank file that has no dna sequence?
1
1
Entering edit mode
9.8 years ago
fr ▴ 220

Hi!

I'm parsing a series of gbff taken from NCBI. However, some of them don't have the DNA sequence inside (e.g.: see the .gbff.gz here).

What I want to do is to skip these files, but I dont know what to call on Bio::SeqIO. Could anyone just tell me what should I call?

A snippet of my code is:

my $in=Bio::SeqIO->new(-file=>"$mygbkff",-format=>'genbank',-alphabet=>"dna");

while (my $seq=$in->next_seq){
    for my $feat ($seq -> get_SeqFeatures){
        next unless (($feat->primary_tag eq "CDS") and (defined $feat->spliced_seq->seq)); #this was my attempt at finding only cds that had sequences annotated
        print "\>";
        print $feat->spliced_seq->seq,"\n";
    }
}
perl genome genbank • 2.1k views
ADD COMMENT
1
Entering edit mode

You found the right solution but I would recommend adding that as an answer so it can be marked as solved. Just for reference, the parentheses aren't needed and you may want to check if the seq is defined (flow control) rather than not being defined. For example, next unless defined $seq->seq or if (defined $seq->seq) { .... } are perhaps more expressive.

ADD REPLY
0
Entering edit mode

Thanks for the suggestion, I used it in the answer below. And thanks for the suggestion of the parentheses, it is cleaner

ADD REPLY
2
Entering edit mode
9.8 years ago
fr ▴ 220

Adding this as an answer as requested.

The solution was, rather than verifying if $feat is defined, to verify if $seq was defined:

my $in=Bio::SeqIO->new(-file=>"$mygbkff",-format=>'genbank',-alphabet=>"dna");

while (my $seq=$in->next_seq){
    next unless defined $seq->seq;#This ignores those sequences whose DNA was missing
    for my $feat ($seq -> get_SeqFeatures){
        next unless (($feat->primary_tag eq "CDS") and (defined $feat->spliced_seq->seq)); #this was my attempt at finding only cds that had sequences annotated
        print "\>";
        print $feat->spliced_seq->seq,"\n";
    }
}
ADD COMMENT

Login before adding your answer.

Traffic: 2278 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6