Hey everyone! I think I'm having a buffering issue. I need to read and parse big text files (created by the same script in earlier lines of the code) and then print the results to another file. At some point, after reading a file with 90855 lines, the script does not read a line of the next file completely. I counted the number of characters read until this happens: 233467, so I tried to flush the buffer and sleep before reading the next line of the file. It doesn't work. Any suggestions, please? Thanks a lot. The relevant part of the code follows:
for my $o (0..1){
    if ($o==0){
        @files = reverse <*_SITES_3utr>;
    }else{
        @files = reverse <*_SITES_cds>;
    }
    undef(%pita_sites_nu); undef(%pita_tot_score); my($comp_p); undef(%allowed_wobbles); #undef(%site_nu);
    foreach $i (@files){
        my $buff=0;
        print "Analyzing $i\n"; sleep(1);
        $program= $1 if $i=~ /(\w+)_SITES/;
        open(FIL, $i) or die "$!: $i\n";
        while(<FIL>){
            $buff += length($_);
            if ($buff >= 230000){$buff=0; sleep(1); select((select(FIL), $|=1)[0]);} #FLUSH THE BUFFER, NOT WORKING!!!
            undef($a);
            unless($.== 1){
                if ($o==0){
                    if (/^\d+\t(\S+)\t(\S+)\t(\d+)\t(\d+)\t(\S+)\t(\S+)\t(.*)/){
                        $mirna= $1; $target= $2; $start= $3; $end= $4; $site= $5; $comp_p= $6; $a= $7;
                        $j= "${mirna}_${target}_${start}_$end";
                        $site_nu{$j}= "$mirna\t$target\t$start\t$end\t$site\t$comp_p"; #Store each site in a hash
                    }else{die "$buff characters, in line $.:$_\n"} #DIES HERE!!!
                }else{
                    if (/^\d+\t(\S+)\t(\S+)\t(\d+)\t(\d+)\t(\S+)\t(.*)/){
                        $mirna= $1; $target= $2; $start= $3; $end= $4; $site= $5; $a= $6;
                        $j= "${mirna}_${target}_${start}_$end";
                        $site_nu{$j}= "$mirna\t$target\t$start\t$end\t$site"; #Store each site in a hash
                    }
                }
It dies at the "DIES HERE!!!" die, after reading 3413 characters of the second file. This happens because the regex doesn't match, since only half of the line is in $_. Help please! Thanks again.
Stupid question, but when you look at the line in the second file where your program is dying, does it have the right number of fields? Are they properly delimited? In my experience, Perl is able to handle files that have millions of lines without any special attention to buffering on my part, so I would be very skeptical that your issue lies there.
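For anyone who wants to run that check quickly, here is a rough sketch of how the field count could be verified per line. The helper name find_malformed_lines, the demo file name, and the expected count of 8 tab-separated fields (inferred from the 3utr regex in the question) are all assumptions for illustration, not something taken from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the line numbers that have fewer
# tab-separated fields than expected.
sub find_malformed_lines {
    my ($path, $expected_fields) = @_;
    open(my $in, '<', $path) or die "Cannot read $path: $!";
    my @bad;
    while (my $line = <$in>) {
        chomp $line;
        my @fields = split /\t/, $line, -1;    # -1 keeps trailing empty fields
        push @bad, $. if @fields < $expected_fields;
    }
    close($in);
    return @bad;
}

# Small demo with a made-up file: the second line is cut off on purpose.
my $demo = "demo_field_check.tmp";
open(my $out, '>', $demo) or die "Cannot write $demo: $!";
print {$out} join("\t", 1, "mirA", "geneA", 10, 20, "site", "0.5", "rest"), "\n";
print {$out} join("\t", 2, "mirB", "geneB", 30), "\n";
close($out) or die "Cannot close $demo: $!";

my @bad = find_malformed_lines($demo, 8);
print "Malformed lines: @bad\n";
unlink $demo;
```

If this reports lines that look complete when you open the file in an editor afterwards, the file was probably still being written when it was read.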
Hey Mitch. Thanks. Not a stupid question at all, it's well received. Yes, all the data is in the file. In the end, I had to flush the filehandle I was using to write the files before I started parsing them. Maybe I got a buffering problem because I had to open and write to many files earlier in the script. I'm new to Perl, so... Thanks so much.
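A minimal sketch of that fix, for later readers: output written with print sits in a buffer until the handle is flushed or closed, so a file read back too early can end in a half-written line. The helper names write_sites and count_lines and the file name are made up for this example; the original script's write code was not posted.

```perl
use strict;
use warnings;

# Hypothetical helper: write the lines and close the handle, so that
# everything still buffered reaches the disk before the file is read.
sub write_sites {
    my ($path, @lines) = @_;
    open(my $out, '>', $path) or die "Cannot write $path: $!";
    print {$out} "$_\n" for @lines;
    close($out) or die "Cannot close $path: $!";    # close flushes the buffer
    return;
}

# Hypothetical helper: count the lines that can be read back.
sub count_lines {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Cannot read $path: $!";
    my $count = 0;
    $count++ while <$in>;
    close($in);
    return $count;
}

# Demo with made-up data: write 1000 lines, then read them back.
my $path = "demo_SITES_3utr.tmp";
my @lines = map { join "\t", $_, "mir$_", "gene$_", 10, 20, "site", "0.5", "rest" } 1 .. 1000;
write_sites($path, @lines);
print "Read back ", count_lines($path), " lines\n";
unlink $path;
```

If the write handle has to stay open, loading IO::Handle and calling $out->flush before reading should have the same effect.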
My pleasure Danny. I've worked with Perl on and off quite a bit, so I'm happy to help when I can. The only other thing to think about is maybe closing all the files you had open earlier in the script? As for stupid questions, usually when I encounter a programming bug that looks like a fault with the language (e.g. no more buffer), the answer is more likely to be a mistake that I made than exposing the shortcomings of a programming language. In my experience, when it's the people who wrote a programming language and/or cosmic rays magically changing output from a program vs my own mistakes, my own mistakes are always the cause ;)