Dear all, I found this great script to replace sequence identifiers in a text file (Newick format) with strings from a tab-delimited file in this post:
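use strict;
use warnings;

# The last argument is the tree file; the one before it is the mapping file.
my $treeFile = pop;
# Build the lookup table (identifier => new name); skip lines that do not match.
my %taxonomy = map { /(\S+)\s+(.+)/ ? ( $1 => $2 ) : () } <>;
# Queue the tree file so that the next <> reads from it.
push @ARGV, $treeFile;
while ( my $line = <> ) {
    # Replace whole-word identifiers; \Q...\E keeps characters like '.' literal.
    $line =~ s/\b\Q$_\E\b/$taxonomy{$_}/g for keys %taxonomy;
    print $line;
}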
It works perfectly for one text file (Newick), but I have thousands (<20000) of such text files (Newick) and one tab-delimited file. How can I change this script to handle all of my text files? Any help would be great!
ibasan: If you have access to a cluster, you could use this script to fire off those jobs in parallel. Adjust the "do perl" line to include the relevant job scheduler commands (warning: test with a small set of files first, before you fire off 20000 incorrect jobs).
Hi genomax2, at the moment sridhar56's script is running. I tried your version like this:
But this did not work. Can you please explain what I have to do?
You either need to run that in the terminal using the line,
or copy the script into a shell file, e.g. tree.sh,
and run it in the terminal,
I tested it with 1000 files and it works! Thanks for helping me again!
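The per-file loop discussed above can be sketched as a small shell function. This is only a sketch under assumptions: the file names rename_ids.pl (the Perl script from the question) and taxonomy.tsv, the .nwk extension, and the renamed output directory are hypothetical and not taken from the thread.

```shell
#!/bin/sh
# Sketch: run the renaming script once per tree file.
# Assumed names (not from the thread): rename_ids.pl, taxonomy.tsv,
# the .nwk extension, and the "renamed" output directory.

rename_trees() {
    script=$1     # the Perl script from the question
    mapping=$2    # tab-delimited file: identifier <TAB> new name
    in_dir=$3     # directory holding the Newick files
    out_dir=$4    # directory for the renamed copies; must differ from in_dir

    mkdir -p "$out_dir" || return 1

    for tree in "$in_dir"/*.nwk; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$tree" ] || continue
        # The script expects the mapping file first and the tree file last.
        perl "$script" "$mapping" "$tree" > "$out_dir/$(basename "$tree")" || return 1
    done
}

# Example call (commented out so that sourcing this file has no side effects):
# rename_trees rename_ids.pl taxonomy.tsv . renamed
```

Writing the results into a separate directory leaves the original trees untouched, so a run with a wrong mapping file can simply be repeated. As suggested above, try it on a handful of files before running it on all of them.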