Hi! :)
I'm new here and just getting started with bioinformatics. I found this Perl script and I don't really understand all of its steps, particularly near the end. Could someone explain it to me, and maybe tell me why the programmers wrote the script this way rather than another?
I know this may not be a very good question, but I'd really like to understand this script and I don't know how to go about it...
Thank you very much, and I hope you can understand my English!
$file=shift;
print "Window length\n";
$kmer=<STDIN>;
chomp ($kmer);
print "Minimum quality score cut-off\n";
$cut_off=<STDIN>;
chomp ($cut_off);
open (MYFILE, ">R.txt");
if (open(FASTQ,$file))
{
    while($header1=<FASTQ>)
    {
        $dna=<FASTQ>;
        $header2=<FASTQ>;
        $qual=<FASTQ>;
        @dna=split ('', $dna);
        @qual=split ('', $qual);
        @num_value=();
        @scores=();
        foreach $qscore (@qual)
        {
            $num_value=ord($qscore)-33;
            push(@num_value, $num_value);
        }
        foreach $value (@num_value)
        {
            if ($value<$cut_off){
                pop(@qual);
            }else{
                last;
            }
        }
        $sub1=substr($dna,-$#qual);
        $sub11=substr($qual,-$#qual);
        @qscopy=reverse @num_value;
        foreach $value (@qscopy)
        {
            if ($value<$cut_off){
                pop(@qual);
            }else{
                last;
            }
        }
        $sub2=substr($sub1,0,$#qual);
        $sub22=substr($sub11,0,$#qual);
        for ($i=0;$i<=$#qual-($kmer-1);$i++)
        {
            @scores=@num_value[$i..$i+$kmer-1];
            $sum=0;
            foreach $score (@scores)
            {
                $sum+=$score;
            }
            if (($sum/$kmer)<$cut_off){
                last;
            }
        }
        $sub3=substr($sub2,0,($i-1));
        $sub33=substr($sub22,0,($i-1));
        print MYFILE "$header1$sub3\n$header2$sub33\n\n";
    }
}else{
    print "Error!\n";
}
close MYFILE;
For Perl, the reasons may vary: a stylistic decision, personal preference, performance, ease of understanding, or just because you can. One of Perl's mottos is "There's more than one way to do it."
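To make that motto concrete, here is a minimal sketch (mine, not the script author's) of two equivalent ways to turn a FASTQ quality string into numeric scores. It assumes the Phred+33 encoding that the script's ord($qscore)-33 implies, and that the trailing newline has already been removed. The sub names scores_with_loop and scores_with_map are made up for this example.

```perl
use strict;
use warnings;

# Way 1: an explicit foreach loop with push, the same style the script uses.
sub scores_with_loop {
    my ($qual) = @_;
    my @scores;
    foreach my $char (split //, $qual) {
        push @scores, ord($char) - 33;   # Phred+33: ASCII code minus 33
    }
    return @scores;
}

# Way 2: the same conversion written as a single map expression.
sub scores_with_map {
    my ($qual) = @_;
    return map { ord($_) - 33 } split //, $qual;
}

# Hypothetical quality string, used only for illustration.
my $qual = "II5#";
print join(",", scores_with_loop($qual)), "\n";   # prints 40,40,20,2
print join(",", scores_with_map($qual)),  "\n";   # prints 40,40,20,2
```

Both subs return the same list. Which one you pick is mostly a matter of taste: the loop is easier to step through when you are learning, while map is shorter once you are used to it.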
I wonder the same. Why didn't they use Python? < /just kidding >