Hi everyone,
I am compiling statistics on miRNA abundances across many samples. Below is an example of the input file.
SAMPLE MIR ABUNDANCE
sample1 mir1 30
sample1 mir3 100
sample1 mir4 120
sample2 mir1 40
sample2 mir2 200
sample3 mir1 190
......
I need to reshape the table above so that each miRNA becomes a row and each sample becomes a column, with 0 wherever a sample has no value for a miRNA. The ideal output looks like this:
sample1 sample2 sample3
mir1 30 40 190
mir2 0 200 0
mir3 100 0 0
mir4 120 0 0
......
I tried to solve this with a Perl hash of hashes (see below). However, I am new to this relatively complex data structure. Could anyone offer suggestions? I believe it would be useful for other Perl beginners too. Thank you very much!
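use strict;
use warnings;

my (%h, %samples);
while (<>) {                              # <> reads the file named on the command line
    chomp;
    next if $. == 1 || !/\S/;             # skip the header line and any blank lines
    my ($sample, $mir, $abun) = split /\t/;   # assumes tab-separated columns
    $h{$mir}{$sample} = $abun;            # key by miRNA first: one output row per miRNA
    $samples{$sample} = 1;                # remember every sample seen (the output columns)
}
my @samples = sort keys %samples;
print join("\t", '', @samples), "\n";     # header row; first cell left empty above the miRNA names
foreach my $mir (sort keys %h) {
    # // (Perl 5.10+) substitutes 0 where a sample has no value for this miRNA
    print join("\t", $mir, map { $h{$mir}{$_} // 0 } @samples), "\n";
}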
May I ask, is Perl an absolute requirement? What you are looking for can be accomplished with a few commands using pandas in Python, and I am happy to provide code if you would like.
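For what it's worth, here is a minimal sketch of the pandas approach, assuming the input is tab-separated with the header row shown in the question (SAMPLE, MIR, ABUNDANCE). The function name to_wide and the inline sample data are only for illustration. Summing duplicate sample/miRNA pairs is my assumption, since the question does not say how duplicates should be handled.

```python
import io

import pandas as pd


def to_wide(long_df):
    """Pivot long (SAMPLE, MIR, ABUNDANCE) rows into a miRNA-by-sample table.

    Missing sample/miRNA combinations become 0. Duplicate pairs are summed,
    which is an assumption and not something stated in the question.
    """
    return long_df.pivot_table(
        index="MIR",
        columns="SAMPLE",
        values="ABUNDANCE",
        aggfunc="sum",
        fill_value=0,
    )


# Inline copy of the example rows from the question (illustration only).
# For a real file, pass its path to pd.read_csv instead of the StringIO object.
text = (
    "SAMPLE\tMIR\tABUNDANCE\n"
    "sample1\tmir1\t30\n"
    "sample1\tmir3\t100\n"
    "sample1\tmir4\t120\n"
    "sample2\tmir1\t40\n"
    "sample2\tmir2\t200\n"
    "sample3\tmir1\t190\n"
)
df = pd.read_csv(io.StringIO(text), sep="\t")
wide = to_wide(df)

# Write the wide table as tab-separated text to standard output.
print(wide.to_csv(sep="\t"))
```

The pivot_table call does the whole reshaping; rows and columns come out sorted by miRNA and sample name.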
I agree with Eric. Although it can be done in Perl, it can be accomplished more easily by other means. In my experience, knowing how to work with R is very useful for numerical analyses. Have a look at the aggregate() function.
Thanks Eric and Irsan, I agree. Learning R is really necessary.