9.8 years ago
fr
Hi!
I'm parsing a series of GBFF files downloaded from NCBI. However, some of them don't contain the DNA sequence (e.g., see the .gbff.gz here).
What I want to do is skip these files, but I don't know which Bio::SeqIO method to call. Could anyone tell me what I should call?
A snippet of my code is:
my $in = Bio::SeqIO->new(-file => "$mygbkff", -format => 'genbank', -alphabet => "dna");
while (my $seq = $in->next_seq) {
    for my $feat ($seq->get_SeqFeatures) {
        # this was my attempt at finding only CDS that had sequences annotated
        next unless (($feat->primary_tag eq "CDS") and (defined $feat->spliced_seq->seq));
        print "\>";
        print $feat->spliced_seq->seq, "\n";
    }
}
You found the right solution but I would recommend adding that as an answer so it can be marked as solved. Just for reference, the parentheses aren't needed and you may want to check if the seq is defined (flow control) rather than not being defined. For example,
next unless defined $seq->seq
or
if (defined $seq->seq) { .... }
are perhaps more expressive.

Thanks for the suggestion, I used it in the answer below. And thanks for the suggestion about the parentheses, it is cleaner.
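
For anyone landing here later, this is a minimal sketch of how the check suggested above might fit into the loop from the question. It assumes that $seq->seq returns undef (or an empty string) for records that carry annotation but no sequence, as discussed in this thread. The helper names has_sequence and print_cds and the FASTA header layout are made up for illustration and are not part of BioPerl.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Hypothetical helper: true when a record actually carries sequence data.
sub has_sequence {
    my ($seq) = @_;
    my $residues = $seq->seq;
    return defined($residues) && length($residues) > 0;
}

# Hypothetical helper: print every CDS of one GenBank flat file as FASTA.
sub print_cds {
    my ($path) = @_;
    my $in = Bio::SeqIO->new(-file => $path, -format => 'genbank');
    while (my $seq = $in->next_seq) {
        # Skip records without sequence before touching any feature.
        next unless has_sequence($seq);
        for my $feat ($seq->get_SeqFeatures) {
            next unless $feat->primary_tag eq 'CDS';
            # Assumed header layout: record id plus feature coordinates.
            printf ">%s_%d_%d\n", $seq->display_id, $feat->start, $feat->end;
            print $feat->spliced_seq->seq, "\n";
        }
    }
}

# Process every file named on the command line.
print_cds($_) for @ARGV;
```

Doing the check once per record, rather than per feature, means spliced_seq is never called on a record that has nothing to splice from.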