Hi everyone,
I am slowly learning Nextflow and I am trying to put together a very basic pipeline with a few processes (just a series of bash commands to manipulate a CSV file). It worked until I added the last process (see below):
process ReplaceI5 {
    input:
    path 'test4.csv'
    path 'test1.csv'

    output:
    path 'test5.csv'

    script:
    """
    awk -v FS=, -v OFS=, 'FNR==NR{hash[FNR]=$0; next}{$8 = hash[FNR]}1' test4.csv test1.csv > test5.csv
    """
}
This process requires two files: test4.csv, which comes from the previous process, and test1.csv, which is the output of the first process of the pipeline. If I run the pipeline like this, it gives me an error, and I think it may be because I cannot pass two files as input.
I am not familiar enough with Nextflow to understand exactly how to proceed or which options I have here. I was thinking of maybe redirecting test4 and test1 to the same directory when they are generated, and then collecting them from a shared folder in this process?
I would appreciate any help, suggestions, or examples you may have.
Best,
Giulia
Show us the error.
wrong
Unfortunately, this has not worked either. I posted the error below.
Best,
Giulia
Thank you for your thorough reply, dthorbur; you are right.
Here's the error
This pipeline only had one process at the beginning, and I followed one of the example pipelines downloaded from GitHub (process-collect.nf), but instead of using fq.gz files, I am using CSV files. Then I slowly worked my way through the other processes. I struggle to understand the example you reported, but I will read up on tuple and go through the documentation a bit more.
Thanks,
Giulia
'$' in awk must be escaped. See my code.
The tuple was just to show you how the logic is applied to other input types. The Nextflow learning curve can be pretty tough. Pierre has a point I forgot. The input declarations shouldn't be the file name, but a Nextflow variable that will be assigned the actual file when Nextflow stages in the .command.sh file. So if you use a variable named file in the input declaration, it should be referred to in the script block as ${file}. It so happens in your example that your hard-coded variable names also happened to be the actual file names, so it works, but it is bad practice.
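To put the two points above together (escape '$' for awk, and declare inputs as variables rather than file names), here is a minimal sketch of how the process might look. It assumes DSL2; the variable names donor and target are made up for illustration, and the Channel.fromPath lines are only stand-ins for the outputs of your upstream processes, which I can't see from the post.

```nextflow
process ReplaceI5 {
    input:
    path donor    // hypothetical name: file whose lines go into column 8 (test4.csv in the question)
    path target   // hypothetical name: file whose column 8 gets overwritten (test1.csv in the question)

    output:
    path 'test5.csv'

    script:
    // \$0 and \$8 are escaped so Nextflow leaves them alone and awk receives $0 and $8;
    // ${donor} and ${target} are Nextflow variables, replaced by the staged file names.
    """
    awk -v FS=, -v OFS=, 'FNR==NR{hash[FNR]=\$0; next}{\$8 = hash[FNR]}1' ${donor} ${target} > test5.csv
    """
}

workflow {
    // Stand-in channels (assumption): in the real pipeline these would be
    // the output channels of the previous process and of the first process.
    donor_ch  = Channel.fromPath('test4.csv')
    target_ch = Channel.fromPath('test1.csv')

    // A process can take several inputs; they are passed in declaration order.
    ReplaceI5(donor_ch, target_ch)
}
```

With this layout there should be no need to copy both files into a shared folder first: each input is staged into the task's work directory, and the script only ever refers to the variables.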