Explanation of a Perl script
6.8 years ago
uuuiii647 • 0

Hi! :)

I'm new here and am just getting started with bioinformatics. I found this Perl script and I don't really understand every step, especially towards the end. Could someone explain it to me, maybe telling me why programmers used these ways to write the script and not others?

I know this may not be a very good question, but I'd really like to understand this script and I don't know how to go about it...

Thank you very much, and I hope you can understand my English!

    $file=shift;

    print "Window length\n";
    $kmer=<STDIN>;
    chomp ($kmer);

    print "Minimum quality score cut-off\n";
    $cut_off=<STDIN>;
    chomp ($cut_off);

    open (MYFILE, ">R.txt");

    if (open(FASTQ,$file))
    {
        while($header1=<FASTQ>)
        {
            $dna=<FASTQ>;
            $header2=<FASTQ>;
            $qual=<FASTQ>;
            @dna=split ('', $dna);
            @qual=split ('', $qual);
            @num_value=();
            @scores=();

            foreach $qscore (@qual)
            {
                $num_value=ord($qscore)-33;
                push(@num_value, $num_value);
            }

            foreach $value (@num_value)
            {
                if ($value<$cut_off){
                    pop(@qual);
                }else{
                    last;
                }
            }
            $sub1=substr($dna,-$#qual);
            $sub11=substr($qual,-$#qual);

            @qscopy=reverse @num_value;
            foreach $value (@qscopy)
            {
                if ($value<$cut_off){
                    pop(@qual);
                }else{
                    last;
                }
            }
            $sub2=substr($sub1,0,$#qual);
            $sub22=substr($sub11,0,$#qual);

            for ($i=0;$i<=$#qual-($kmer-1);$i++)
            {
                @scores=@num_value[$i..$i+$kmer-1];

                $sum=0;
                foreach $score (@scores)
                {
                    $sum+=$score;
                }
                if (($sum/$kmer)<$cut_off){
                    last;
                }
            }
            $sub3=substr($sub2,0,($i-1));
            $sub33=substr($sub22,0,($i-1));

            print MYFILE "$header1$sub3\n$header2$sub33\n\n";
        }
    }else{
        print "Error!\n";
    }

    close MYFILE;
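
Reading the script above, it appears to do three things to each read: convert the quality characters to numbers (ASCII code minus 33), drop low-quality bases from both ends, and then cut the read at the first window of `$kmer` bases whose mean quality falls below the cut-off. Note that the lines are never `chomp`ed, so the trailing newline is scored as well, and that `pop` always removes from the end of `@qual`, even in the loop that walks from the start. A minimal sketch of what those steps may have been meant to do is below. The helper name `trim_read`, its arguments and the test read are hypothetical, and keeping only the bases before the first failing window is one interpretation of the original, not a confirmed one.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper, not part of the original script.
# Returns the trimmed (sequence, quality) pair for one read.
sub trim_read {
    my ($dna, $qual, $window, $cut_off) = @_;

    # Phred+33: quality score = ASCII code of the character - 33.
    my @scores = map { ord($_) - 33 } split //, $qual;

    # Step 1: skip low-quality bases at the start and at the end.
    my ($start, $end) = (0, $#scores);
    $start++ while $start <= $end && $scores[$start] < $cut_off;
    $end-- while $end >= $start && $scores[$end] < $cut_off;
    return ('', '') if $start > $end;    # nothing passed the cut-off

    @scores = @scores[$start .. $end];
    $dna  = substr($dna,  $start, scalar @scores);
    $qual = substr($qual, $start, scalar @scores);

    # Step 2: slide a window along the read; cut where the mean drops.
    my $keep = @scores;
    for my $i (0 .. @scores - $window) {
        my $mean = sum(@scores[$i .. $i + $window - 1]) / $window;
        if ($mean < $cut_off) {
            $keep = $i;    # keep only the bases before this window
            last;
        }
    }
    return (substr($dna, 0, $keep), substr($qual, 0, $keep));
}

# Made-up read: window length 3, cut-off 20.
my ($seq, $q) = trim_read("ACGTACGTAC", "#5IIII##I#", 3, 20);
print "$seq\n$q\n";
```

With this made-up read, the first and last bases are dropped in step 1, and step 2 cuts at the first window that contains the two low-quality `#` positions.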
fastq • 1.7k views

why programmers used these ways to write the script and not others?

For Perl, the reasons may vary: stylistic decision, personal preference, performance, ease of understanding, or just because you can. One of Perl's mottos is "There's more than one way to do it."
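
To make the motto concrete, here is a small sketch of three equivalent ways to add up a window of quality scores, which is what the inner `foreach` loop of the script does. The values in `@scores` are made up for illustration; `List::Util` is a core module that ships with Perl.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical window of quality scores, made up for illustration.
my @scores = (30, 32, 28, 35);

# Way 1: an explicit loop, as in the script in the question.
my $sum_loop = 0;
foreach my $score (@scores) {
    $sum_loop += $score;
}

# Way 2: the sum function from the core List::Util module.
my $sum_util = sum(@scores);

# Way 3: the same loop written with a statement modifier.
my $sum_short = 0;
$sum_short += $_ for @scores;

print "$sum_loop $sum_util $sum_short\n";
```

All three give the same total; which one a programmer picks is mostly a matter of taste and readability.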


me why programmers used these ways to write the script and not others?

I wonder the same. Why didn't they use Python? </just kidding>

6.8 years ago

Hi! Since you're new to bioinformatics, I suppose this will be your first encounter with:

https://en.wikipedia.org/wiki/RTFM

Jokes aside, you can look up every function and command in the official Perl documentation (http://perldoc.perl.org/), or google the command together with the word "perl".

You will most likely find another post here on Biostars or on Stack Overflow (https://stackoverflow.com/) where people have already asked the same thing, and most of the time got roasted for it :)

What you're asking is very general and not really related to bioinformatics, but rather to Perl itself. These are things you can solve by searching for the answers on your own.

If, instead, you have a question about one command that is giving you trouble with biological data: this is the place!

You'll find nice and juicy tutorials at:


And the other thing the OP will need to know is the structure and contents of a FastQ file (Description here).
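
As a rough illustration of that structure: a FASTQ record is four lines, namely a header starting with `@`, the sequence, a separator line starting with `+`, and a quality string with one character per base. The script in the question subtracts 33 from each character code, which matches the Phred+33 encoding. The record below is made up, and the sketch only decodes it.

```perl
use strict;
use warnings;

# A made-up FASTQ record (four lines), for illustration only.
my $record = "\@read1\nACGTAC\n+\nII5+#!\n";

# Split the record into its four lines.
my ($header, $dna, $separator, $qual) = split /\n/, $record;

# One quality character per base; Phred+33 means score = ASCII code - 33.
my @scores = map { ord($_) - 33 } split //, $qual;

print "$header\n";
print "bases:  ", join(' ', split //, $dna), "\n";
print "scores: @scores\n";
```

Here `I` decodes to 40 (high quality) and `!` to 0 (the lowest possible score), which is why trimming scripts compare these numbers against a cut-off.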
