Hi, I have a question similar to this one:
http://www.biostars.org/post/show/50142/any-modules-available-to-parse-this-file/#50156
I adapted my code from JC's answer in that post. Thanks, JC.
Here is an example of the data file I am opening and trying to read the columns of. The values are delimited by 4 spaces.
A bunch of junk up here. Paragraph before getting to table.
NO. RES DSC_SEC PROB_H PROB_E PROB_C
1 k C 0.047 0.240 0.713
2 l C 0.067 0.365 0.568
3 n C 0.067 0.365 0.568
4 f E 0.045 0.613 0.342
...
Here is the code I have tried, which doesn't print anything. I want to gather the values from the PROB_H, PROB_E, and PROB_C columns into separate lists so that I can do things like take their averages.
use strict;
use warnings;
open(FILE, "file_data.txt") or die "Cannot open file: $!";
my @data = <FILE>;
while (<FILE>) {
next if m/^No./;
chomp;
my ($NO, $RES, $DSC_SEC, $PROB_H, $PROB_E, $PROB_C) = split(/\s+/, @data);
print "$PROB_H";
}
close(FILE);
Why would I be downvoted?
Some people are harsh :) Someone probably thought this was a rather basic Perl programming question, as opposed to a bioinformatics research question.
Two obvious errors straight off: (1) you have not escaped the period in your regular expression, so it matches any single character rather than a literal "."; (2) your data contains a header line starting with NO (all upper-case), but your regular expression is looking for lines starting with No (lower-case "o").
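Beyond the regular expression, two other things in the posted script would also stop it from printing: `my @data = <FILE>;` reads the whole file before the loop starts, so `while (<FILE>)` has nothing left to read, and `split` is given `@data` rather than the current line. Here is a rough sketch of one way it could look with those points addressed. It is not the only way to do it, and a few things are my own assumptions: the sample is held in an in-memory string so the snippet runs as-is, data rows are recognised by a leading integer (which also skips the junk paragraph and the header, instead of `next if /^NO\./`), and `mean` is a hypothetical helper name.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Sample input kept in memory so the sketch is self-contained. For a real
# file, something like this should work instead:
#   open(my $fh, '<', 'file_data.txt') or die "Cannot open file: $!";
my $sample = <<'END';
A bunch of junk up here. Paragraph before getting to table.
NO.    RES    DSC_SEC    PROB_H    PROB_E    PROB_C
1    k    C    0.047    0.240    0.713
2    l    C    0.067    0.365    0.568
3    n    C    0.067    0.365    0.568
4    f    E    0.045    0.613    0.342
END

open(my $fh, '<', \$sample) or die "Cannot open in-memory data: $!";

my (@prob_h, @prob_e, @prob_c);

# Read one line at a time; the handle is not slurped beforehand.
while (my $line = <$fh>) {
    chomp $line;

    # Assumption: every data row starts with an integer row number.
    # This skips the leading paragraph and the "NO." header line.
    next unless $line =~ /^\s*\d+\s/;

    # split ' ' splits the current line on any run of whitespace.
    my ($no, $res, $dsc_sec, $h, $e, $c) = split ' ', $line;

    push @prob_h, $h;
    push @prob_e, $e;
    push @prob_c, $c;
}
close($fh);

# Hypothetical helper: arithmetic mean of a list (0 for an empty list).
sub mean { return @_ ? sum(@_) / @_ : 0 }

printf "PROB_H mean: %.3f\n", mean(@prob_h);
printf "PROB_E mean: %.3f\n", mean(@prob_e);
printf "PROB_C mean: %.3f\n", mean(@prob_c);
```

Keeping the three columns in separate arrays matches what the question asks for; if more per-column statistics are needed later, a hash of arrays keyed by column name might be easier to extend.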
Basically, you want to implement the Unix 'cut' command in Perl? Specifically, something like this?
tail -n +2 | cut -c18-26