Perl Script To Select SNP Assays
11.3 years ago

Hi, I have data like this, for example:

ASSAY MP Cm chr# Qty poly#
XY1723 408068 0.3 1 27 0
XY1727 463360 0.3 1 28 0
XY1708 880744 25.7 1 14 1
XY1709 900596 25.8 1 11 1
XY487 1174585 0 1 290 1
XY1641 1239146 27.3 1 5 0
XY1714 3057570 34.2 1 28 0
XY1721 3522443 34.2 1 13 1

This is biological data. My aim is to select assays on every chromosome (chr#); I have 12 chromosomes. I want to select assays based on the MP column and the Cm column: after selecting one assay, the next one should lie 100,000 (one hundred thousand, 100 kb) further along in the MP column and 5 cM away in the Cm column. For example, if I select XY1723 (MP 408068), I want the next selected assay to have an MP value of 508068 and a Cm difference of 5. Can anyone help me with Perl? Any help would be appreciated. Thanks and regards, Genetist

bioperl • 3.2k views

It's still not clear. Can you elaborate and give an example of the output you are expecting?


Dear Rm,

Thank you very much for your reply. My aim is to select an assay after every 100,000 difference in the MP column and every 5 centimorgans in the Cm column, going through all the assays up to the last chromosome (chromosome 12). For example, if I select assay XY487, I want the next selected assay to have an MP value of 1174585 + 100,000 = 1274585. At present I am doing this manually in Excel, which is very tedious. Regards,

11.2 years ago
jing ▴ 10

Hi, you may try this code and see if it works. It assumes an exact gap of 100,000 in MP and an exact gap of 5 in Cm.

use 5.16.0;
use warnings;


my %data;
my $MP_count = 100_000;
my $Cm_count = 5;
my $match = 0;
my @fields = qw/MP Cm chr Qty poly/;

while (<DATA>){
    next unless /XY\d+/;
    chomp;
    my $line =[split];
    my $key = shift @$line;
    $data{$key} = {map {$_ => shift @$line} @fields};
}

my $assay = $ARGV[0];
die "No argument given" unless $assay;
die "Assay name is not valid" unless $data{$assay};
my ($mp,$cm) = ($data{$assay}->{MP},$data{$assay}->{Cm});
for my $key (keys %data){
    if ($data{$key}->{MP} == $mp + $MP_count and abs($data{$key}->{Cm} - $cm) == $Cm_count){
        say "Match found: $key";
        $match++;
    }
}

say "Match not found" unless $match;


__DATA__
ASSAY MP Cm chr# Qty poly#
XY1723 408068 0.3 1 27 0
XY1727 463360 0.3 1 28 0
XY1708 880744 25.7 1 14 1
XY1709 900596 25.8 1 11 1
XY487 1174585 0 1 290 1
XY1641 1239146 27.3 1 5 0
XY1714 3057570 34.2 1 28 0
XY1721 3522443 34.2 1 13 1
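
Since an exact gap will rarely occur in real data, here is a rough sketch of a looser variant. It rests on my own reading of the question, not on anything confirmed in this thread: within each chromosome, walk the assays in MP order and keep one only if it lies at least 100,000 in MP and at least 5 in Cm from the last assay kept. The names select_assays, $MIN_MP_GAP and $MIN_CM_GAP are made up for illustration, and the chromosome column is assumed to be numeric.

```perl
use strict;
use warnings;

# Assumed thresholds, taken from the question: 100 kb in MP and 5 cM in Cm.
my $MIN_MP_GAP = 100_000;
my $MIN_CM_GAP = 5;

# Greedy selection (hypothetical helper): within each chromosome, walk the
# assays in MP order and keep one only if it is far enough, in both MP and
# Cm, from the last assay that was kept.
sub select_assays {
    my (@rows) = @_;    # each row: [ASSAY, MP, Cm, chr, Qty, poly]

    # Group the rows by chromosome (column index 3).
    my %by_chr;
    push @{ $by_chr{ $_->[3] } }, $_ for @rows;

    my @selected;
    for my $chr (sort { $a <=> $b } keys %by_chr) {
        my $last;       # last assay kept on this chromosome
        for my $row (sort { $a->[1] <=> $b->[1] } @{ $by_chr{$chr} }) {
            if (!defined $last
                or (    $row->[1] - $last->[1] >= $MIN_MP_GAP
                    and abs($row->[2] - $last->[2]) >= $MIN_CM_GAP))
            {
                push @selected, $row;
                $last = $row;
            }
        }
    }
    return @selected;
}

# Sample rows from the question; the header line is skipped by the grep.
my @rows = map { [ split ' ', $_ ] } grep { /^XY\d+/ } split /\n/, <<'END';
ASSAY MP Cm chr# Qty poly#
XY1723 408068 0.3 1 27 0
XY1727 463360 0.3 1 28 0
XY1708 880744 25.7 1 14 1
XY1709 900596 25.8 1 11 1
XY487 1174585 0 1 290 1
XY1641 1239146 27.3 1 5 0
XY1714 3057570 34.2 1 28 0
XY1721 3522443 34.2 1 13 1
END

# Print ASSAY, MP, Cm and chr for every assay that was kept.
print join("\t", @{$_}[0 .. 3]), "\n" for select_assays(@rows);
```

On the sample rows this keeps XY1723, XY1708, XY487 and XY1714. The first assay on each chromosome is always kept, which is a design choice of this sketch; if you need a different starting point, the loop would have to be seeded differently.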

11.2 years ago

Dear jing, thank you very much for your help with my problem. I tried to run your script and I get an error like "No argument given at line 21, <DATA> line 10", and after that the code does not run. What could be the reason for this? Thank you very much, with kind regards


Hi, if you save the script as tst.pl and you want to find an assay exactly 100,000 further along in MP and 5 away in Cm from XY1723, then the way to use the script is:

$ ./tst.pl XY1723
11.2 years ago

Dear jing,

Thanks a lot for your help and for spending your valuable time on this. I am using Perl version 5.16 and I am running the code like this: C:\users\desktop>perl snpselection.pl $ ./snpselection.pl XY1723. I saved the code as snpselection.pl and I am getting a message like "Assay name is not valid at snpselection.pl line 21, <DATA> line 11". I think there are still some issues that we have to fix. Thank you very much,

Regards


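Sorry, I didn't realise you were using Windows. The "$" in my example is the Unix shell prompt, not part of the command, so the script received "$" as the assay name and rejected it. You can run the program as: C:\users\desktop>perl snpselection.pl XY1723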
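
Both errors in this thread came from the command-line argument, so a friendlier check might help. The following is only a sketch: assay_from_args is a hypothetical helper, not part of the script above, and it assumes the known assays are passed in as a hash reference keyed by assay name (as %data is in the script).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the assay name from the argument list, or dies
# with a usage message (and, for an unknown name, the list of known assays).
sub assay_from_args {
    my ($known, @args) = @_;    # $known: hash ref keyed by assay name
    my $usage = "Usage: perl $0 ASSAY_NAME\n";
    die $usage unless @args;
    my $assay = $args[0];
    die "Unknown assay '$assay'. Known assays: @{[ sort keys %$known ]}\n$usage"
        unless exists $known->{$assay};
    return $assay;
}

# Small demonstration with a few assay names from the sample data.
my %known = map { $_ => 1 } qw(XY1723 XY1727 XY487);
print assay_from_args(\%known, 'XY1723'), "\n";

# A stray "$" (as in the mistyped Windows command) is rejected with a hint.
my $ok = eval { assay_from_args(\%known, '$'); 1 };
print "Rejected: $@" unless $ok;
```

In the real script the call would presumably be assay_from_args(\%data, @ARGV) in place of the two die lines.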
