Using a txt file to pass file paths to pipelines
2.5 years ago
Hugo • 0

Hi everyone,

I don't have much background in bioinformatics (I'm studying it and getting some training in it at the same time). Sorry if this is a stupid question; I could not find an answer on my own.

I run several different pipelines repeatedly and would like to know how I could automate at least part of the job. The tools I'm using are muscle (via anvi'o) for multiple sequence alignment and seqkit for manipulating FASTA files.

I'm using the Terminal on macOS.

All the commands I'm using require an input file and an output file, which I've been trying to pass via a loop and cat.

For example:

for entrada saida in `cat name_path3.txt`; do muscle -in $entrada -out $saida ; done

name_path3.txt looks like this (with no blank lines between entries):

/User/hdc/.../file_input_1.fasta
/User/hdc/.../file_output_1.fasta
/User/hdc/.../file_input_2.fasta    
/User/hdc/.../file_output_2.fasta

The loop works fine, but it seems that the pipeline does not read the variables correctly, and it does not work. If I pass the paths one by one, it works fine.

  1. Is it correct to use cat? Should I use another command?
  2. Should I modify the txt file in some way?

By the way, I already did this by hand, but I don't want to do it that way if I have to do things like this in the future :)

Thanks so much!

muscle seqkit bash fasta • 1.6k views
 cat name_path3.txt | paste - - | while read entrada saida ; do muscle -in "${entrada}" -out "${saida}" ; done
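
For anyone unsure what paste - - does in the pipeline above: each - tells paste to take one line from standard input, so two dashes join every pair of consecutive lines into a single tab-separated line, which read then splits into two variables. A minimal sketch, using a made-up demo file in a temporary directory and echo in place of muscle:

```shell
# Demo only: the file names below are hypothetical stand-ins for real paths.
tmpdir=$(mktemp -d)
printf '%s\n' in_1.fasta out_1.fasta in_2.fasta out_2.fasta > "$tmpdir/paths.txt"

# paste - - joins line 1 with line 2, line 3 with line 4, and so on (tab-separated).
paired=$(paste - - < "$tmpdir/paths.txt")
printf '%s\n' "$paired"

# read splits each joined line on whitespace into the two variables.
commands=$(printf '%s\n' "$paired" | while read -r entrada saida; do
  echo "muscle -in ${entrada} -out ${saida}"
done)
printf '%s\n' "$commands"

rm -r "$tmpdir"
```

Because read splits on any whitespace here, paths that contain spaces would need extra care, for example setting IFS to a tab for the read.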

Thanks for the help, Pierre, but it did not work. I get the same message for every iteration (only the file path changes):

' errno=2 *** Cannot open ' /User/hdc/file_input_1.fasta

If I copy and paste the paths into muscle -in /User/hdc/file_input_1.fasta -out /User/hdc//file_output_1.fasta, it runs perfectly.

Do you have any idea what is going on? Do I need to set any permissions for this to run?

Thanks again!


If you're copy-pasting the error right, you have a leading white space before the / in /User, like so: " /User/...". Do this:

sed -Ee 's/^[ ]+//' name_path3.txt | paste - - | while read entrada saida ; do muscle -in "${entrada}" -out "${saida}" ; done
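
The sed command above only strips leading spaces. If the list was saved with Windows-style line endings, each path would also end in an invisible carriage return, and muscle would be asked to open a file name ending in that character. That is a guess, not something this thread confirms. A sketch that removes both, with a hypothetical demo file and echo standing in for muscle:

```shell
# Demo only: build a path list with CRLF line endings and a leading space.
tmpdir=$(mktemp -d)
printf ' in_1.fasta\r\nout_1.fasta\r\n' > "$tmpdir/paths_crlf.txt"

# tr -d '\r' deletes carriage returns; sed then strips leading spaces.
cleaned=$(tr -d '\r' < "$tmpdir/paths_crlf.txt" | sed -E 's/^[ ]+//' | paste - - |
  while read -r entrada saida; do
    echo "muscle -in ${entrada} -out ${saida}"
  done)
printf '%s\n' "$cleaned"

rm -r "$tmpdir"
```

Once the echoed commands look right, the echo line can be swapped for the real muscle call.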

Thanks for the reply Ram!

I did what you suggested but got the same error. So, following your point about spaces in the txt file, I went through the file to make sure it was right. The odd thing is that when I tried to move the cursor to the right of the last letter of a line, it did not move.

For example, in /User/.../file_input_1.fasta| (pretend | is the cursor), I was trying to move one position to the right using the keyboard. The cursor did not move; it was stuck. It should have gone to the beginning of the next line.

So I copied and pasted the first two lines, left the others unchanged, and ran the code again; only the modified lines worked. I ended up copying and pasting all the other lines and voilà, it worked. I compared the result to the output I had produced by hand, and it was the same.

Thanks again for all the help. I'll pay attention to this in future files.

By the way, I then tried all the options mentioned before, even what I did at the beginning, and they worked too!


Can you run file name_path3.txt and paste the output here? Also this:

head name_path3.txt | cat -A
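
Note that cat -A is a GNU option; the cat that ships with macOS does not accept it, and cat -vet gives equivalent output there. As a sketch of what to look for, assuming the culprit is a stray carriage return (not confirmed in this thread), a demo file with Windows-style line endings shows up like this:

```shell
# Demo only: a hypothetical path list saved with CRLF line endings.
tmpdir=$(mktemp -d)
printf 'in_1.fasta\r\nout_1.fasta\r\n' > "$tmpdir/paths_crlf.txt"

# -v shows control characters (a carriage return prints as ^M),
# -e marks each line end with $, -t shows tabs as ^I.
visible=$(head "$tmpdir/paths_crlf.txt" | cat -vet)
printf '%s\n' "$visible"

# Count lines that end in a carriage return; 0 means the file is clean.
cr=$(printf '\r')
cr_lines=$(grep -c "${cr}\$" "$tmpdir/paths_crlf.txt")
echo "lines ending in CR: ${cr_lines}"

rm -r "$tmpdir"
```

A ^M just before the $ marks a carriage return at the end of that line.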

Your path list is not well formatted.

I created a simulated example so you can see the difference between the two layouts.

Example 1: 1.txt:

a1  b1
a2  b2

Run script:

cat 1.txt | while read ina inb;
do echo "muscle -input ${ina} -output ${inb}";
done

Result:

muscle -input a1 -output b1
muscle -input a2 -output b2

Example 2: 2.txt:

a1
b1
a2
b2

Run script:

cat 2.txt | while read ina inb;
do echo "muscle -input ${ina} -output ${inb}";
done

Result:

muscle -input a1 -output 
muscle -input b1 -output 
muscle -input a2 -output 
muscle -input b2 -output

Do you see what happens? read fills ina and inb from the same line, so with one path per line inb is always empty.

Now you can modify your path list file and run the script. It will work.
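
If editing the list by hand is not practical, the layout of Example 2 can also be consumed as-is by calling read twice per iteration, so each pass takes one line for the input and the next for the output. A minimal sketch that recreates 2.txt in a temporary directory and, as in the examples above, only echoes the command:

```shell
# Demo only: recreate the one-path-per-line file from Example 2.
tmpdir=$(mktemp -d)
printf '%s\n' a1 b1 a2 b2 > "$tmpdir/2.txt"

# Each iteration reads two consecutive lines: the first into ina, the second into inb.
result=$(while IFS= read -r ina && IFS= read -r inb; do
  echo "muscle -input ${ina} -output ${inb}"
done < "$tmpdir/2.txt")
printf '%s\n' "$result"

rm -r "$tmpdir"
```

Since each read takes a whole line, this form also copes with paths that contain spaces.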


Again, your solution moves the paste step from Pierre's solution to a manual step. It doesn't add to Pierre's solution, it takes away from it. While your explanation of the mechanics is of value, the "modify the pathname file" part is not.

Edit your comment so it serves as an explainer of the paste, maybe.
