Hi all,
I've written a Perl script that compares two VCF files and writes the output to a third file. It runs fine with small files, but when I give it large files (several GB), my system slows down and then hangs.
I'm using a dbSNP VCF file as input, which is around 9 GB.
Please suggest something that will make my Perl script run faster. This is my first Perl script and I'm new to Perl.
Any help would be appreciated!
Thank you!
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
use List::MoreUtils;
open (FILE, "<", "dbSNP_in.vcf") or die "failed to open: $!\n";
my @array=(<FILE>);
my @CHR;
my @location;
my @rs;
my @ref_n;
my @alt_n;
foreach (@array)
{
chomp;
my ($chrom, $pos, $id, $ref, $alt, $qual, $filter, $info) = split(/\t/, $_);
push @CHR, $chrom;
push @location, $pos;
push @rs, $id;
push @ref_n, $ref;
push @alt_n, $alt;
}
open (FILE1, "<", "trial_rs.vcf") or die "failed to open: $!\n";
my @array1=(<FILE1>);
open (OUT, ">", "trial_output.vcf") or die "failed to open: $!\n";
my @columns;
foreach (@array1)
{
chomp;
@columns=split(/\t/, $_);
my $i;
for ($i=0; $i<@array; $i++)
{
if (($columns[0] eq $CHR[$i]) and ($columns[1] eq $location[$i]) and ($columns[3] eq $ref_n[$i]) and ($columns[4] eq $alt_n[$i]))
{
$columns[2]=$rs[$i];
}
}
print OUT join("\t", @columns), "\n";
}
"Please suggest something that will make my Perl script run faster. This is my first Perl script and I'm new to Perl."
It seems obvious that you need to post your script for this purpose, doesn't it?
This is the script
... moved ...
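Looking at the posted script, two things stand out. It reads the whole 9 GB dbSNP file into @array and then copies its fields into five more arrays, so it needs far more memory than the file size. It also compares every line of trial_rs.vcf against every dbSNP line in the inner for loop. A common way around both is to key a hash on CHROM, POS, REF and ALT and read the big file line by line. Below is a minimal sketch of that idea, not a tested replacement. It assumes trial_rs.vcf is the smaller file and that its variants fit in memory; the names annotate and variant_key are made up for illustration, and it has not been run on real dbSNP data.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Build the lookup key from CHROM, POS, REF and ALT (fields 0, 1, 3, 4).
sub variant_key {
    my @f = @_;
    return join("\t", @f[0, 1, 3, 4]);
}

# annotate() is a hypothetical helper: it copies IDs from the big file
# into matching records of the small file and writes the result.
sub annotate {
    my ($dbsnp_path, $query_path, $out_path) = @_;

    # Pass 1: remember which variants the (assumed smaller) query file contains.
    my %rs_for;
    open(my $query, '<', $query_path) or die "failed to open $query_path: $!\n";
    while (my $line = <$query>) {
        chomp $line;
        next if $line =~ /^#/;            # skip VCF header lines
        my @f = split /\t/, $line;
        next if @f < 5;
        $rs_for{ variant_key(@f) } = undef;
    }
    close $query;

    # Pass 2: stream the big file once, one line in memory at a time.
    open(my $dbsnp, '<', $dbsnp_path) or die "failed to open $dbsnp_path: $!\n";
    while (my $line = <$dbsnp>) {
        next if $line =~ /^#/;
        chomp $line;
        my @f = split /\t/, $line, 6;     # only the first five fields are needed
        next if @f < 5;
        my $key = variant_key(@f);
        # As in the original loop, the last matching dbSNP record wins.
        $rs_for{$key} = $f[2] if exists $rs_for{$key};
    }
    close $dbsnp;

    # Pass 3: re-read the query file and fill in the IDs that were found.
    open(my $in,  '<', $query_path) or die "failed to open $query_path: $!\n";
    open(my $out, '>', $out_path)   or die "failed to open $out_path: $!\n";
    while (my $line = <$in>) {
        chomp $line;
        my @f = split /\t/, $line, -1;    # -1 keeps trailing empty fields
        if ($line !~ /^#/ && @f >= 5) {
            my $rs = $rs_for{ variant_key(@f) };
            $f[2] = $rs if defined $rs;
        }
        print {$out} join("\t", @f), "\n";
    }
    close $in;
    close $out or die "failed to close $out_path: $!\n";
}

# Usage: perl script.pl dbSNP_in.vcf trial_rs.vcf trial_output.vcf
annotate(@ARGV) if @ARGV == 3;
```

With this layout each file is read sequentially and every dbSNP line costs one hash lookup instead of a scan over a whole array, so run time should grow with the combined size of the two files rather than with their product. If trial_rs.vcf were itself too large for a hash, a sorted merge or an indexed tool would be the next thing to look at.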