Hi, I wrote a Perl script to compare two files on the basis of two IDs, but I could not get it to work. Can anyone help with this?
File 1:
chr7 151046672
chr7 151047369
chr3 127680920
chr3 127680920
File 2:
chr1 66953622 66953654
chr1 67200451 67200472
chr1 67200475 67200478
chr1 67058869 67058880
chr1 67058881 67058885
chr7 151046672 127680920
chr7 151047369 127680920
chr3 127680920 151046672
chr3 127680920 151047369
#!/usr/bin/perl -w
$pwd = `pwd`;
chomp($pwd);
$file=$ARGV[0];
$file1=$ARGV[1];
open(IN,$file);
while ($line=<IN>){
    chomp($line);
    @ary = split(/\t/,$line);
    chomp($ary[0]);chomp($ary[1]);
    open(SK,$file1);
    while($line1=<SK>)
    {
        chomp($line1);
        @any = split(/\t/,$line1);
        chomp($any[0]); chomp($any[0]);chomp($any[1]);chomp($any[2]);
        if (($ary[0] eq $any[0] and $ary[1] == $any[1]) or ($ary[0] eq $any[0] and $ary[1] == $any[2]))
        {
            print "$line\tE\n";
        }
        else
        { print "$line\tM\n";}
    }
}
This code only gives multiple lines with 'M' results. Then I tried another script:
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my $file1 = $ARGV[0];
open($infile1,$file1);
my $file2 = $ARGV[1];
open($infile2,$file2);
my %file2_hash;
while (my $line = <$infile1>)
{
    chomp $line; #so that output with E or M can be on same line
    next if $line =~ /^\s*$/; #skip blank lines (a common infile goof)
    my ($chr, $val1, $val2) = split /\s+/,$line;
}
close $infile1;
while (my $line = <$infile2>)
{
    chomp $line;
    next if $line =~ /^\s*$/; #skip blank lines (a common infile goof)
    # use better "names" I have no idea of what a chr col
    my ($key, $value1, $value2) = split /\s+/, $line;
    $file2_hash{"$key:$value1:$value2"} = 1;
    close $infile2;
    if (exists $file2_hash{"$chr:$val1:$val2"})
    {
        print "$line\tE\n"; # match exists with file 1
    }
    else
    {
        print "$line\tM\n"; # match does NOT exist with file 1
    }
}
But again I get the same error.
What would be a possible solution?
What are you trying to achieve exactly? If you want to compare two lists of positions to see what they have in common, the R package GenomicRanges has a lot of nice functions for that:
findOverlaps(file1, file2), countOverlaps(file1, file2), etc.
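If you would rather stay in Perl, here is a minimal sketch of a hash lookup. It rests on assumptions about what you want: a line of file 1 counts as a match ('E') when file 2 has a line with the same chromosome whose second or third column equals the position (the condition tested in your first script), and the fields are separated by whitespace, as they appear in the post. The sub names load_positions and tag_lines are made up for this example. Note that your first script prints inside the inner loop, so it emits one output line for every pair of lines; and if your files are space-separated, splitting on /\t/ may be why nothing ever matches.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read file 2 once and remember every "chr:position" pair,
# taking the position from both the second and the third column.
sub load_positions {
    my ($path) = @_;
    my %seen;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;                    # skip blank lines
        my ($chr, $start, $end) = split ' ', $line;  # split on any whitespace
        $seen{"$chr:$start"} = 1 if defined $start;
        $seen{"$chr:$end"}   = 1 if defined $end;
    }
    close $fh;
    return \%seen;
}

# Return each non-blank line of file 1 followed by a tab and
# E (pair found in file 2) or M (pair not found).
sub tag_lines {
    my ($path, $seen) = @_;
    my @out;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        chomp $line;
        next if $line =~ /^\s*$/;
        my ($chr, $pos) = split ' ', $line;
        my $tag = (defined $pos && $seen->{"$chr:$pos"}) ? 'E' : 'M';
        push @out, "$line\t$tag";
    }
    close $fh;
    return @out;
}

if (@ARGV == 2) {
    # Usage: perl script.pl file1 file2
    print "$_\n" for tag_lines($ARGV[0], load_positions($ARGV[1]));
}
else {
    warn "Usage: $0 file1 file2\n";
}
```

Each file is read only once and every line of file 1 produces exactly one output line. With the two sample files from the question, all four lines of file 1 come out tagged E, because each chromosome and position pair appears in the last four lines of file 2.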