Perl script for discarding sequences shorter than 200 nucleotides before running CPC in RNA-seq analysis
1
0
Entering edit mode
6.2 years ago

I want a Perl script that discards sequences shorter than 200 nt from a FASTA file before running CPC.

RNA-Seq perl • 1.4k views

What have you tried so far?

6.2 years ago
karthic ▴ 130

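I hope the script below works. Save it and run it with the minimum length (an integer) as the first argument, followed by the FASTA file, for example: perl script.pl 200 input.fasta > filtered.fasta (script.pl and the file names are placeholders). The length must come first because the script takes it from the front of the argument list and then reads the remaining arguments as input files.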

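#!/usr/bin/perl
use strict;
use warnings;

# Usage: perl script.pl <minlen> <input.fasta> > filtered.fasta
my $minlen = shift or die "Error: `minlen` parameter not provided\n";
{
    # Read one FASTA record at a time by using ">" as the record separator.
    # "local" restores the default separator when this block ends.
    local $/ = ">";
    while (<>) {
        chomp;                      # strip the trailing ">" separator
        next unless /\w/;           # skip the empty chunk before the first header
        my @chunk = split /\n/;
        my $header = shift @chunk;  # first line is the header; the rest is sequence
        my $seqlen = length join "", @chunk;
        # Print the record unchanged if its sequence is long enough.
        print ">$_" if $seqlen >= $minlen;
    }
}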