Hello all,
I've a list of IDs, and I would like to extract those sequences from a fasta file, that are not included in my list. Anyone know how to do it with perl/bioperl? Anyway to modify this script in order to get the sequences NOT present in my list?
use Bio::DB::Fasta;
my $fastaFile = shift;
my $queryFile = shift;
my $db = Bio::DB::Fasta->new( $fastaFile );
open (IN, $queryFile);
while (<IN>){
chomp;
$seq = $_;
my $sequence = $db->seq($seq);
if (!defined( $sequence )) {
die "Sequence $seq not found. \n"
}
print ">$seq\n", "$sequence\n";
}
Thanks in advance
Thanks a lot RamRS. I would take a look. So, if I negate the sentence, I would get all the sequences of fasta file that are not included in my ID list, it's correct?
That is correct. It creates a hash of all IDs in the ID file and then, iterates through the FA file seq by seq. If the ID of the current sequence is not in the hash, it should print that seq out.
Hi ram,
I am trying to use your script but I am getting this message:
To have this module I used cpan. Second point, how its possible to negate the sentence in perl? many thanks in advance.
pd. I am just starting with perl
The syntax to use the script is
To negate the functionality, change the
if(exists(...))
at line 62 tounless(exists(...))
In Perl, you can negate
if($condition)
with eitherif( ! $condition)
orunless($condition)
Many thanks for your answer. Actually still I cannot run the script. I created a fake sequence.fasta:
list.id:
and the return is this:
I dont know what is going wrong. Thanks in advance
Hmmm. I was working on fixing a bug in this script, and now I see what it could be. For the moment, could you just comment out or remove lines 24-28 please? The script should work fine after that.