Perl Script To Make Statistics Of miRNA Abundances For Many Samples

0

11.3 years ago

biolab ★ 1.4k

Hi everyone,

I am compiling statistics on miRNA abundances across many samples. Below is an example of the input file.

SAMPLE    MIR    ABUNDANCE
sample1   mir1   30
sample1   mir3   100
sample1   mir4   120
sample2   mir1   40
sample2   mir2   200
sample3   mir1   190

......

I need to reshape the table above so that each miRNA becomes a row and each sample becomes a column. The ideal output looks like this:

          sample1    sample2    sample3
mir1      30         40         190
mir2      0          200        0
mir3      100        0          0
mir4      120        0          0
......

I tried to solve this with a Perl hash of hashes (see below), but I am new to this relatively complex data structure. Could anyone offer suggestions? I believe it would be useful for other Perl beginners to learn from, too. Thank you very much!

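use strict;
use warnings;

open my $fh, '<', $ARGV[0] or die "open failed: $!";
my %h;
while (<$fh>) {
        chomp;
        next if $. == 1;    # skip the header line
        my ($sample, $mir, $abun) = split /\t/;
        $h{$sample}{$mir} = $abun;    # hash of hashes: sample -> miRNA -> abundance
}
close $fh;
foreach my $sample (sort keys %h) {
        foreach my $mir (sort keys %{ $h{$sample} }) {
                print "   ";     # I am stuck here. Need your help!
        }
}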

perl mirna • 4.4k views

ADD COMMENT • link updated 3.7 years ago by Ram 45k • written 11.3 years ago by biolab ★ 1.4k

5

May I ask, is Perl an absolute requirement? What you are looking for can be accomplished with a few commands using Pandas in Python, and I am happy to provide code if you would like.
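For readers who want a Python route without extra dependencies, here is a minimal sketch that uses only the standard library rather than Pandas. The function name `pivot_abundances` and the inline `example` data are hypothetical and only for illustration; the sketch assumes whitespace- or tab-delimited input with one header line, and it fills missing sample/miRNA pairs with 0 as in the desired output.

```python
from collections import defaultdict


def pivot_abundances(lines):
    """Turn 'SAMPLE MIR ABUNDANCE' rows into a miRNA-by-sample table."""
    table = defaultdict(dict)   # table[mir][sample] = abundance
    samples = set()
    rows = iter(lines)
    next(rows, None)            # skip the header line
    for line in rows:
        fields = line.split()
        if len(fields) != 3:
            continue            # ignore blank or malformed lines
        sample, mir, abundance = fields
        table[mir][sample] = int(abundance)
        samples.add(sample)
    columns = sorted(samples)   # plain string sort of the sample names
    out = ["\t".join(["MIR"] + columns)]
    for mir in sorted(table):
        # A sample that never reported this miRNA gets 0
        counts = [str(table[mir].get(s, 0)) for s in columns]
        out.append("\t".join([mir] + counts))
    return out


# Hypothetical inline copy of the example input from the question
example = """SAMPLE\tMIR\tABUNDANCE
sample1\tmir1\t30
sample1\tmir3\t100
sample1\tmir4\t120
sample2\tmir1\t40
sample2\tmir2\t200
sample3\tmir1\t190
""".splitlines()

print("\n".join(pivot_abundances(example)))
```

The nested dictionary plays the same role as the Perl hash of hashes in the question, only keyed by miRNA first so that each row can be printed in one pass.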

ADD REPLY • link 11.3 years ago by ericmajinglong ▴ 120

2

I agree with Eric. Although it can be done with Perl, it can be accomplished more easily by other means. In my experience, knowing how to work with R is very useful for numerical analyses. Have a look at the aggregate() function.

ADD REPLY • link 11.3 years ago by Irsan ★ 7.8k

0

Thanks Eric and Irsan, I agree. Learning R is really necessary.

ADD REPLY • link 11.3 years ago by biolab ★ 1.4k

7

11.3 years ago

Neilfws 49k

I realise that you asked for a Perl solution but, as others noted in the comments, sometimes it's good to know that the right tool for the job already exists.

With that in mind, here is acast() from the R/reshape2 package.

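library(reshape2)
# Read the table; the first line holds the column names
mi <- read.table("mi.txt", header = TRUE, stringsAsFactors = FALSE)
# Cast from long to wide: one row per miRNA, one column per sample
acast(mi, MIR ~ SAMPLE, value.var = "ABUNDANCE")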

#      sample1 sample2 sample3
# mir1      30      40     190
# mir2      NA     200      NA
# mir3     100      NA      NA
# mir4     120      NA      NA

ADD COMMENT • link 11.3 years ago by Neilfws 49k

0

Thanks a lot for all the solutions. They are really helpful!

ADD REPLY • link 11.3 years ago by biolab ★ 1.4k

4

11.3 years ago

umer.zeeshan.ijaz ★ 1.8k

I quickly wrote this Perl one-liner for you, assuming that your input file is tab-delimited (just redirect the output to another file):

	$ cat test.txt
	SAMPLE MIR ABUNDANCE
	sample1 mir1 30
	sample1 mir3 100
	sample1 mir4 120
	sample2 mir1 40
	sample2 mir2 200
	sample3 mir1 190
	sample3 mir1 400
	sample4 mir4 20
	sample5 mir1 19

	$ perl -ane 'if ($. > 1){$r{$F[0].":".$F[1]}=$F[2];unless($F[0]~~@s){push @s,$F[0];}unless($F[1]~~@m){push @m,$F[1];}}END{print "Samples\t".join("\t",@s)."\n";for($i=0;$i<@m;$i++){print $m[$i];for($j=0;$j<@s;$j++){(not defined $r{$s[$j].":".$m[$i]})?print "\t".0:print"\t".$r{$s[$j].":".$m[$i]};}print "\n";}}' test.txt
	Samples sample1 sample2 sample3 sample4 sample5
	mir1 30 40 400 0 19
	mir3 100 0 0 0 0
	mir4 120 0 0 20 0
	mir2 0 200 0 0 0
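Because the one-liner is dense, here is a rough, readable equivalent of its logic as a Python sketch. It is an illustration under assumptions and not the author's code: the name `reshape_first_seen` and the inline `test_lines` data are hypothetical. Like the one-liner, it skips the header, keeps samples and miRNAs in order of first appearance, lets a repeated sample/miRNA pair overwrite the earlier value (so `sample3 mir1` ends up as 400), and prints 0 for missing pairs. The list membership checks stand in for the Perl `~~` smartmatch tests, which newer Perl releases warn about as deprecated.

```python
def reshape_first_seen(lines):
    """Reshape 'SAMPLE MIR ABUNDANCE' rows, keeping first-seen order."""
    values = {}     # (sample, mir) -> abundance; a repeated pair keeps the last value
    samples = []    # column order = order of first appearance
    mirs = []       # row order = order of first appearance
    for number, line in enumerate(lines, start=1):
        if number == 1:
            continue                    # header line, like `$. > 1` in the one-liner
        fields = line.split()
        if len(fields) < 3:
            continue                    # ignore blank or short lines
        sample, mir, abundance = fields[:3]
        values[(sample, mir)] = abundance
        if sample not in samples:
            samples.append(sample)
        if mir not in mirs:
            mirs.append(mir)
    out = ["\t".join(["Samples"] + samples)]
    for mir in mirs:
        # Missing sample/miRNA combinations are reported as 0
        cells = [values.get((s, mir), "0") for s in samples]
        out.append("\t".join([mir] + cells))
    return out


# Hypothetical inline copy of test.txt from the answer
test_lines = """SAMPLE MIR ABUNDANCE
sample1 mir1 30
sample1 mir3 100
sample1 mir4 120
sample2 mir1 40
sample2 mir2 200
sample3 mir1 190
sample3 mir1 400
sample4 mir4 20
sample5 mir1 19
""".splitlines()

print("\n".join(reshape_first_seen(test_lines)))
```

If duplicates should be summed instead of overwritten, the assignment into `values` is the one place that would need to change.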

Best Wishes, Umer

ADD COMMENT • link updated 3.7 years ago by Ram 45k • written 11.3 years ago by umer.zeeshan.ijaz ★ 1.8k

3

This rather pushes the definition of "one liner" :)

ADD REPLY • link 11.3 years ago by Devon Ryan 105k

0

saves u from one "wget" though ;)

ADD REPLY • link 11.3 years ago by umer.zeeshan.ijaz ★ 1.8k

0

This answer doesn't make sense.

ADD REPLY • link 11.0 years ago by Alex Reynolds 36k

0

Looks like something got lost in the transition to the new biostars.

ADD REPLY • link 11.0 years ago by Devon Ryan 105k
