Hi,
I have a bash script and it looks like this:
#!/bin/bash
for i in *dat.gz
do gunzip $i
echo uniprot_sprot_archaea.dat | perl -slane '$a=(split /\_/, $_)[2]; $a=~/(\w+).dat/; $b=$1; print "perl screen_complete_proteome_from_uniprot_division.pl \$i >> uniprot_".$b.".fasta"' -- -i=$i
done
I don't know coding, but I need to understand these perl commands. I don't understand anything from echo to the end of the command. Could someone please explain it?
Thanks a ton, and sorry for this silly request.
I have some doubts that this is working as intended. What is it you're trying to do?
For instance, while the bash script will unzip all dat.gz files, the perl line will repeatedly work on the same string, uniprot_sprot_archaea.dat. The split takes the third underscore-separated field, archaea.dat, and the regex then captures archaea from it, so the perl one-liner will print the following every time the bash script unzips a file:
perl screen_complete_proteome_from_uniprot_division.pl $i >> uniprot_archaea.fasta
Note the presence of $i in the output: the \ preceding $i tells perl not to interpret the $ sign as the start of a variable, so the file name passed in via -i=$i is never used.
I am very confused here. I think the main execution here is based on the perl script. I have provided it below.
The entire idea is to extract sequences with “Complete Proteome” in the Keyword field from the downloaded files (Swiss-Prot and TrEMBL). All I am trying to do is repeat some analyses from this paper: http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2002266#sec014 (Methods section, HGT analyses)
The script screen_complete_proteome_from_uniprot_division.pl is never executed when you run the bash script you posted; the one-liner only prints the command. If you want to execute it from within the perl one-liner, one option is to use the qx operator, i.e. replace print with qx, but that's not the only problem you have.
Thanks very much. I will replace the print with qx. Also, if you don't mind and have time to spend, is it possible to point out the other problems?
Thanks like an ocean!!
For every file that is unzipped, the bash script passes the string 'uniprot_sprot_archaea.dat' to perl, i.e. it's always printing the line: perl screen_complete_proteome_from_uniprot_division.pl $i >> uniprot_archaea.fasta
Maybe you want to run the script screen_complete_proteome_from_uniprot_division.pl on each unzipped file? Then try something along these lines:
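A minimal sketch of such a loop, in plain bash without the perl one-liner. It assumes the script takes the unzipped .dat file as its only argument and writes the selected sequences to stdout; neither is confirmed in this thread, so check the script before relying on it. The helper fasta_name_for is a made-up name for illustration.

```shell
#!/bin/bash
# Derive the output name from a .dat.gz file name, e.g.
#   uniprot_sprot_archaea.dat.gz -> uniprot_archaea.fasta
# Uses the part after the last underscore, which for names of the form
# uniprot_<db>_<division>.dat is the same third field the one-liner picked.
fasta_name_for() {
    division="${1%.gz}"            # uniprot_sprot_archaea.dat
    division="${division##*_}"     # archaea.dat
    division="${division%.dat}"    # archaea
    printf 'uniprot_%s.fasta\n' "$division"
}

for gz in *dat.gz; do
    [ -e "$gz" ] || continue       # nothing to do if the glob matched no files
    gunzip "$gz"                   # leaves the same file name without .gz
    # Assumption: the script reads the file named on its command line
    # and prints FASTA records to stdout.
    perl screen_complete_proteome_from_uniprot_division.pl "${gz%.gz}" >> "$(fasta_name_for "$gz")"
done
```

Because of the >>, Swiss-Prot and TrEMBL files for the same division are appended to the same uniprot_<division>.fasta, which matches the naming in the original one-liner.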
Great. Thank you! I will try these...
I'd recommend redirecting stdout of the bash script to a file and executing that file. Running the perl script from a loop will make debugging more difficult.
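One way to do that, as a rough sketch: have a function print one command per file instead of running it, save the output, read it, and only then execute it. The function name emit_commands and the file name commands.sh are made up for illustration, and the way the script is called rests on the same assumption as before (input file as argument, FASTA on stdout).

```shell
#!/bin/bash
# Print one command line per .dat file; nothing is executed here.
emit_commands() {
    for dat in "$@"; do
        division="${dat##*_}"          # archaea.dat
        division="${division%.dat}"    # archaea
        printf 'perl screen_complete_proteome_from_uniprot_division.pl "%s" >> "uniprot_%s.fasta"\n' "$dat" "$division"
    done
}

# Typical use, once the files are unzipped:
#   emit_commands *.dat > commands.sh
#   less commands.sh      # check that every line looks right
#   bash commands.sh      # then run them
```

If one file fails, you can rerun just that line from commands.sh instead of the whole loop.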