11.1 years ago
biolab
Dear all, I have a file and want to know whether the word in column 2 contains the word in column 1. If so, output yes in column 3; otherwise output no. For example,
abc abcde
abc abddd
abc dabcc
I want output like the following:
abc abcde yes
abc abddd no
abc dabcc yes
My script below doesn't work. I am a Perl beginner; could you please briefly point out the errors to me? Any comments are welcome. THANKS!
my $file= @ARGV;
open IN, <$file>;
my @file=<IN>;
my $i=0;
if ($i < $#file){
if ($file[1]=~/.*$file[0].*/) {
print "$file[0]\t$file[1]\tyes\n"; }
else {print "$file[0]\t$file[1]\tno\n";}
$i +=1;
}
close IN;
What is the relevance to bioinformatics? Also, you could just do that with an awk one-liner:
cat foo.txt | awk '{if(match($2,$1)) {print $1,$2,"yes"} else { print $1,$2,"no"}}'
Your one-liner can be shortened to:
cat File.txt | awk '{print $1,$2,match($2,$1)?"yes":"no"}'
You can skip cat and shorten this to:
awk '{print $1,$2,match($2,$1)?"yes":"no"}' File.txt
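One caveat with match(): awk treats its second argument as a regular expression, so a column 1 value containing characters such as . or * would be matched as a pattern rather than literally. For plain nucleotide strings this should not matter, but a minimal sketch using index(), which does a literal substring search, could look like this (the function name contains_literal and the extra a.c test line are made up for illustration):

```shell
# contains_literal: hypothetical helper name, not from the thread.
# index(haystack, needle) returns 0 when needle is absent, so it works as a truth value.
contains_literal() {
    awk '{ print $1, $2, (index($2, $1) ? "yes" : "no") }' "$@"
}

# Sample input from the question, plus one line with a regex metacharacter.
printf 'abc abcde\nabc abddd\nabc dabcc\na.c abc\n' | contains_literal
```

With this version the a.c line is reported as no, whereas match() would report yes because . matches any character.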
awk is powerful. Actually, I am learning Perl now and am eager to pick up some rules for programming. Anyway, THANKS a lot!
Hi dpryan79, it is relevant to bioinformatics. I only gave a simplified example there. My column 1 lists mature miRNA sequences, while column 2 lists predicted miRNA precursor sequences. I need to find those mature miRNAs that are located precisely within the miRNA precursors. So you can see how this command works in bioinformatics. THANKS
In the future, you might want to state that in advance. Some editors tend to close questions like this upon reading them, due to lack of relevance.
Keep in mind that they can be encoded on the minus (-) strand.
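To make the minus-strand point concrete, here is a rough sketch that also tests the reverse complement of column 1. It assumes uppercase DNA letters (A, C, G, T) in both columns; sequences written with U, or in lowercase, would need converting first. The names strand_check and revcomp are invented for this example, and the third column (+, - or no) is just one possible way to report the result.

```shell
# strand_check: hypothetical helper, assumes uppercase A/C/G/T sequences.
strand_check() {
    awk '
    # Build the reverse complement by walking the string from its last base to its first.
    function revcomp(s,    i, c, out) {
        out = ""
        for (i = length(s); i >= 1; i--) {
            c = substr(s, i, 1)
            if (c == "A") c = "T"
            else if (c == "T") c = "A"
            else if (c == "C") c = "G"
            else if (c == "G") c = "C"
            out = out c
        }
        return out
    }
    {
        # Literal substring search on the given strand first, then on the reverse complement.
        if (index($2, $1)) strand = "+"
        else if (index($2, revcomp($1))) strand = "-"
        else strand = "no"
        print $1, $2, strand
    }' "$@"
}

# Made-up toy sequences, one case per outcome.
printf 'ACG TACGT\nACG TTCGTAA\nACG TTTT\n' | strand_check
```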
I used it in bioinformatics!! thank you!!
Stack Overflow is a question and answer site for professional and enthusiast programmers.
As others said: it is important to phrase your question in terms of a research problem in bioinformatics. Otherwise, it appears to be a "straight programming" question and we will direct you to StackOverflow. Although to be honest, this is a "straight Perl" problem even with the bioinformatics content.
Here's a Perl one-liner for the task:
perl -lane 'print "@F ",(index$F[1],$F[0])>-1?"yes":"no"' foo.txt
In addition to StackOverflow, PerlMonks is another site for Perl questions.
thanks for all answers!