6.7 years ago
ArusjakGevorgyan
Hi everyone! I have a problem with my Perl script. I am trying to compare two files with each other. For example, in my tablefile.txt each gene is listed together with the species it occurs in:
> EOG090X0039 8 IE_sup1,IM_nor1,OE_aff1,IT_ras1,IT_ine1,OH_azt1,OD_pul1,OD_mag1
My other file, the gene file, is in FASTA format and contains species and their sequences. I am trying to check whether the species in my gene file (FASTA file) exist in tablefile.txt. I want to keep every species that tablefile.txt contains and remove the rest of the species from my FASTA file. I would be really grateful for any help.
#!/usr/bin/perl -w
use warnings;
use strict;
use 5.010;
my $tablefile = shift @ARGV;
if ( ! open ( FILE , "<" , $tablefile ) ) {
    die "Error can't find the file: $tablefile because $!";
}
if ( ! open ( FILE_GENE , "<" , @ARGV ) ) {
    die "Error can't find the file: @ARGV because $!";
}
####TABLEFILE.TXT#####
my $table_gene;
my @table_specie = ();
my %table_taxa;
####GENE_FILES########
my $fasta_taxa;
my @fasta_seq = ();
my %fasta_hash;
####COMPARISON########
my @matches = ();
#######################READ FIRST FILE(TABLE.TXT)########################
while(my $line = <FILE>){
    chomp $line;
    $line =~ s /\s+/:/ig;
    if ( $line =~ m /^(E\w+)\:/){
        $table_gene = $1;
    }
    if ($line =~ m /\w+\:\d+\:(\w.+)$/){
        @table_specie = $1;
    }
    $table_taxa{ $table_gene } = "@table_specie";
}
########################OPEN GENE FILES##################################
foreach my $genefile (@ARGV){
    if($genefile == $table_gene){
        while(my $gene_line = <FILE_GENE>){
            chomp $gene_line;
            if($gene_line =~ m /^\>(\w+)/){
                $fasta_taxa = $1;
            }else{
                $fasta_hash{$fasta_taxa} .= $gene_line;
            }
        }
        @matches = grep { exists $fasta_hash{$fasta_taxa} } @table_specie;
    }
}
You should just use a simple Linux pipeline with cut, sort, join, etc.
Please validate/close your previous questions:
C: Import files in newick format with python script ; C: Detecting error in Ubuntu environment ;
If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.
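If you would rather stay in Perl, the filtering step described in the question can be sketched roughly as below. A few things in the posted script look like they would get in the way: `$genefile == $table_gene` compares strings numerically (`eq` is the string comparison), `open` is handed the whole `@ARGV` array instead of one file name, `@table_specie = $1` stores the comma-separated list as a single element, and the `grep` block tests `$fasta_taxa` rather than `$_`. The sketch assumes each table line has the form `GENE COUNT sp1,sp2,...` and that each FASTA header starts with the species ID (for example `>IE_sup1`). The sub names `parse_table_line` and `filter_fasta` and the demo data are made up for illustration, not taken from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one table line of the assumed form "GENE COUNT sp1,sp2,...".
# Returns the gene ID and a hash ref whose keys are the species IDs.
sub parse_table_line {
    my ($line) = @_;
    my ($gene, $count, $species) = split ' ', $line;
    return unless defined $species;
    my %keep = map { $_ => 1 } split /,/, $species;
    return ($gene, \%keep);
}

# Read FASTA records from a filehandle and return the text of only those
# records whose header ID (first word after '>') is a key of %$keep.
sub filter_fasta {
    my ($fh, $keep) = @_;
    my $out      = '';
    my $printing = 0;
    while (my $line = <$fh>) {
        if ($line =~ /^>(\S+)/) {
            # A new record starts: keep it only if its species is listed.
            $printing = exists $keep->{$1};
        }
        $out .= $line if $printing;
    }
    return $out;
}

# Demo on made-up in-memory data (hypothetical species and sequences).
my ($gene, $keep) = parse_table_line('EOG090X0039 2 IE_sup1,IT_ras1');
my $fasta = ">IE_sup1\nACGT\n>XX_bad1\nGGGG\n>IT_ras1\nTTAA\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory FASTA: $!";
print filter_fasta($fh, $keep);   # prints the IE_sup1 and IT_ras1 records
close $fh;
```

For real files, you would open tablefile.txt and the FASTA file with the three-argument `open`, one file name at a time, and pass the FASTA filehandle to `filter_fasta` in the same way.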