How to create a tab delimited file?
6
2
5.7 years ago

Hi! I'm doing something wrong here. I have a long text file that looks like this:

AB Ana Biba 1029293.34341

And I want to print out the following to a new tab delimited file:

AB         Ana Biba        1029293.34341

Here's my script. Why doesn't it work?

    my $infile = $ARGV[0];

    open (my $infile, "<", "namconvmars.txt")
    or die "Can't read from $infile: $!";

    my (@group1, @group2, @group3);

    while (<$infile>){
        my @cols = split(/\t/);
        push @group1, @cols[0];
        push @group2, @cols[1];
        push @group3, @cols[2];
        print "@group1\t@group2\t@group3";
    }
    close $infile

Thanks in advance!

perl • 2.1k views
ADD COMMENT
0

How is this related to bioinformatics?

ADD REPLY
2
5.7 years ago

You want to split your input on space, not tab.

e.g.

       # my @cols = split(/\t/); # Change this
       my @cols = split(' ');  # To this.
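
To see what this does to the sample line from the question, here is a minimal sketch. The sub name split_fields is only for illustration and is not part of the original script. Note that split(' ') also breaks "Ana Biba" apart, so it yields four fields rather than three:

```perl
use strict;
use warnings;

# Hypothetical helper: split a line on runs of whitespace.
# split(' ') is Perl's special "awk mode": it skips leading whitespace
# and splits on any run of whitespace.
sub split_fields {
    my ($line) = @_;
    chomp $line;
    return split(' ', $line);
}

# Sample line taken from the question.
my @cols = split_fields("AB Ana Biba 1029293.34341\n");
print scalar(@cols), " fields: ", join("|", @cols), "\n";
# prints: 4 fields: AB|Ana|Biba|1029293.34341
```

If the middle name must stay in one column, the fields have to be rejoined after splitting.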
ADD COMMENT
1
5.7 years ago

BTW, you want:

tr " " "\t" < namconvmars.txt
ADD COMMENT
0

I haven't seen this command before. Awesome!

ADD REPLY
1
5.7 years ago
Bill Pearson ★ 1.0k

You do not need three "@group"s -- you either need three scalars ($field0, $field1, $field2) or one @group, which you could print with join("\t",@group);

A simpler solution is to:

while (my $line = <$infile>) {
  chomp($line);                                  # remove the trailing newline
  print join("\t", split(/\s+/, $line)), "\n";   # rejoin the fields with tabs
}

or

$line =~ s/\s+/\t/g;   # inside the same loop, after chomp; /g replaces every run of whitespace
print "$line\n";
ADD COMMENT
1
5.7 years ago
JC 13k

There are some Perl-ings you need to understand first:

my $infile = $ARGV[0];

This line reads the first command-line argument after your script name and assigns it to the variable $infile.

open (my $infile, "<", "namconvmars.txt")
or die "Can't read from $infile: $!";

Here you declare $infile again (that is what my does), and you also reuse the variable as a filehandle. So you don't need the first line, my $infile = $ARGV[0], because its value is never used.
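
A minimal sketch of keeping the two roles apart, assuming you do want the file name to come from the command line. The sub name open_input and the fallback to "namconvmars.txt" are my own choices for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: open a file for reading and return the filehandle.
# The file name and the filehandle live in separate variables, so the
# name is still available for the error message if open() fails.
sub open_input {
    my ($filename) = @_;
    open(my $fh, "<", $filename)
        or die "Can't read from '$filename': $!";
    return $fh;
}

# Take the file name from the command line, falling back to the name
# used in the question when no argument is given.
my $filename = defined $ARGV[0] ? $ARGV[0] : "namconvmars.txt";

# Only run the demo when the file exists, so the sketch can be tried
# without the real input file.
if (-e $filename) {
    my $infile = open_input($filename);
    print while <$infile>;   # echo the file unchanged
    close $infile;
}
```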

my (@group1, @group2, @group3);

while (<$infile>){
    my @cols = split(/\t/);
    push @group1, @cols[0];
    push @group2, @cols[1];
    push @group3, @cols[2];
    print "@group1\t@group2\t@group3";
}
close $infile

In this part I think you want to collect the values, but if your intention is simply to convert each line, you don't need the arrays: just read, modify, and print each line. The tricky part is that when you split the line on spaces, the second element is split too ("Ana Biba" -> ["Ana", "Biba"]). To avoid this you need to reconstruct that element. Something like:

#!/usr/bin/perl
use strict;
use warnings;
my $file = "namconvmars.txt";
open (my $infile, "<", $file)
or die "Can't read from $file: $!";

while (<$infile>){
    chomp;                        # remove the trailing newline
    my @cols = split(/\s+/, $_);  # break line using spaces
    my $first = shift(@cols);  # grab first element
    my $last  = pop(@cols); # grab last element
    my $mid   = join " ", @cols; # reconstruct middle element
    print join("\t", $first, $mid, $last), "\n";  # split drops the newline, so add it back
}
close $infile;
ADD COMMENT
0

Thank you so much!

Actually, my file has several rows that look the same:

XX Xxxxx_Xxxx YyyYy
XY Xyxyx_Xyxyx YxYx

So I need to go to the next row after each row. How do I do this?

ADD REPLY
1

It's complaining about or die "Can't read from $infile: $!";. That makes sense: the $infile declared with my inside open (my $infile, "<", "namconvmars.txt") is not visible until the next statement, so you can't use it in the error message of that same statement. (Even if you could, interpolating a filehandle prints something like GLOB(0x...), not the file name, which is probably not what you wanted.) You weren't seeing this error originally because you had declared $infile before the open statement, so the name already existed when the error message was compiled.

You probably want to do something like:

my $filename = "namconvmars.txt"; # Or set it via $ARGV
open (my $infile, "<", $filename) or die "Can't read from '$filename' !";

So if for some reason $filename is not readable, you'll see: Can't read from 'namconvmars.txt' ! at tabs.pl line 6.

ADD REPLY
0

True, I modified the code to read the file name from a separate variable.

ADD REPLY
0

The while (<$infile>) {} loop reads the file line by line, so it already moves on to the next row after each one.
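
A small sketch of that, using the two sample rows from the comment above. The sub name tabify is hypothetical, and the in-memory filehandle is only there so the example runs without a real input file:

```perl
use strict;
use warnings;

# Hypothetical helper: convert one space-separated line to tab-separated.
sub tabify {
    my ($line) = @_;
    chomp $line;
    return join("\t", split(' ', $line));
}

# Sample rows from the comment above, read through an in-memory
# filehandle instead of a file on disk.
my $sample = "XX Xxxxx_Xxxx YyyYy\nXY Xyxyx_Xyxyx YxYx\n";
open(my $in, "<", \$sample) or die "Can't open in-memory file: $!";

while (my $line = <$in>) {      # one iteration per row
    print tabify($line), "\n";
}
close $in;
```

With a real file you would open $filename instead of \$sample; the loop itself stays the same.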

ADD REPLY
1
5.7 years ago
5heikki 11k
awk 'BEGIN{FS=" ";OFS="\t"}{print $1,$2" "$3,$4}' in > out

Edit: a more general solution, where only the first and last spaces are replaced with tabs:

awk 'BEGIN{FS=" "}{L=$NF; NF--; sub(" ","\t",$0); print $0"\t"L}' in > out
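
For comparison with the Perl answers in this thread, the same idea can be sketched with a single regular expression. This is only an illustration, not a drop-in replacement for the awk command above; the sub name first_last_to_tabs is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: replace only the first and the last run of
# whitespace with a tab, leaving spaces inside the middle field alone.
sub first_last_to_tabs {
    my ($line) = @_;
    chomp $line;
    # $1 = first field, $2 = everything in between, $3 = last field
    if ($line =~ /^(\S+)\s+(.*\S)\s+(\S+)$/) {
        return "$1\t$2\t$3";
    }
    return $line;   # fewer than three fields: return unchanged
}

# Sample line taken from the question.
print first_last_to_tabs("AB Ana Biba 1029293.34341\n"), "\n";
```

Because the middle capture is greedy, it keeps any number of space-separated words together, so "Ana Biba" stays in one column.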
ADD COMMENT
1
5.7 years ago

with sed:

$ sed 's/\s\+/\t/g' test.txt          
AB  Ana Biba    1029293.34341
ADD COMMENT
