Is there any way to get a Unix parallel command to write to standard output?
4.1 years ago
curious ▴ 820

parallel bcftools stats chr{}.bcf | grep "number of records:" | cut -f 4 > chr{}.txt ::: {1..22}

This does not work for me. I have run into similar issues with other bcftools utilities and worked around them by specifying the output with -o instead of > where possible, but that obviously won't work after piping to grep. Is there any way to do what I want?

unix • 1.5k views

I'm not sure I understand what you're trying to do here. Maybe you need to export a function?

From the GNU Parallel manual:

  # Only works in Bash
  my_func() {
    echo in my_func $1
  }
  export -f my_func
  parallel my_func ::: 1 2 3
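
As a rough illustration of why the export matters, here is a minimal sketch that needs neither GNU Parallel nor bcftools. It assumes Bash, and it uses bash -c as a stand-in for the new shell in which each job would run; that stand-in is an assumption made for the demo, not a description of how GNU Parallel is implemented.

```shell
# Requires Bash: export -f is a Bash feature.
my_func() {
  echo "in my_func $1"
}

# Before exporting, a child Bash process does not know the function.
bash -c 'my_func 1' 2>/dev/null || echo "my_func is not visible to the child shell"

# After exporting, the child process inherits the function definition.
export -f my_func
bash -c 'my_func 1'
```

The first call falls through to the echo because the child shell cannot find my_func; the second call runs the function.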

You probably should quote your command:

parallel 'bcftools stats chr{}.bcf | grep "number of records:" | cut -f 4 > chr{}.txt' ::: {1..22}

This seems to make no difference in getting this to work, but thanks for taking the time.

4.1 years ago
ole.tange ★ 4.5k

@Pierre's and @Wouter's solutions should work for you.

The problem is that this:

parallel bcftools stats chr{}.bcf | grep "number of records:" | cut -f 4 > chr{}.txt ::: {1..22}

is interpreted as this:

parallel bcftools stats chr{}.bcf |
  grep "number of records:" |
  cut -f 4 > chr{}.txt ::: {1..22}
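
To see this split without GNU Parallel or bcftools installed, here is a small sketch. show_args is a hypothetical stand-in for cut that just prints its arguments, and the temporary directory is only there to keep the demo tidy.

```shell
# show_args is a hypothetical stand-in for cut: it prints each argument it receives.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

workdir=$(mktemp -d)

# Same shape as the tail of the failing pipeline. The shell itself handles the
# redirection, so the output file is literally named "chr{}.txt", and the
# ":::" arguments are handed to the last command instead of to parallel.
echo "ignored input" | show_args -f 4 > "$workdir"/chr{}.txt ::: 1 2 3

ls "$workdir"
cat "$workdir"/chr{}.txt
```

The directory ends up with a single file whose name contains the literal braces, and that file lists ::: and the numbers among the arguments the last command received.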

So you need a way to pass grep and cut to parallel. A function would work:

doit() {
   bcftools stats chr$1.bcf | grep "number of records:" | cut -f 4 > chr$1.txt
}
export -f doit
parallel doit ::: {1..22}

The neat thing about a function is that you can test it before parallelizing it:

doit 1
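
The same idea lets you check the extraction step on its own before involving any real files. The sketch below is only an illustration: count_records is a hypothetical helper, and the two input lines are made-up sample data that assume the summary lines are tab-separated with the value in the fourth column, which is what the cut -f 4 in the question implies.

```shell
# count_records is a hypothetical helper: it reads stats text on stdin and
# prints the fourth tab-separated field of the "number of records:" line.
count_records() {
  grep "number of records:" | cut -f 4
}

# Made-up sample input; real bcftools stats output contains many more lines.
printf 'SN\t0\tnumber of samples:\t3\nSN\t0\tnumber of records:\t42\n' | count_records
```

If the helper prints the value you expect for the sample, the same pipeline can go into doit unchanged.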

You can also quote the whole command:

parallel 'bcftools stats chr{}.bcf | grep "number of records:" | cut -f 4 > chr{}.txt' ::: {1..22}

This works here, but it can get tricky: if your command used both ' and ", you would need to escape them.
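
A minimal sketch of that escaping, using sh -c as a stand-in for the shell that runs each job (an assumption for the demo; the printf command is just a placeholder for a real pipeline):

```shell
# The '\'' sequence closes the single-quoted string, adds a literal ', and reopens it.
cmd='printf "%s\n" "it'\''s chr1"'

# sh -c is used here as a stand-in for the shell that would run each job.
sh -c "$cmd"
```

Wrapping the pipeline in a function avoids this extra layer of quoting altogether.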

The benefit of doing it this way is that --dryrun shows you what would be run:

parallel --dryrun 'bcftools stats chr{}.bcf | grep "number of records:" | cut -f 4 > chr{}.txt' ::: {1..22}

You can then test each of the commands that would be run, and check that they work as expected.
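
If GNU Parallel is not at hand, the substitution can be imitated roughly as follows. This is only a sketch: render is a hypothetical helper that replaces {} with sed, and its output approximates, rather than reproduces, what a dry run lists.

```shell
# render is a hypothetical helper: it replaces every {} in a template with one
# value, roughly imitating the command lines a dry run would list.
render() {
  printf '%s\n' "$1" | sed "s/{}/$2/g"
}

template='bcftools stats chr{}.bcf | grep "number of records:" | cut -f 4 > chr{}.txt'

# Print the command line for the first three chromosomes.
for i in 1 2 3; do
  render "$template" "$i"
done
```

Each printed line is a complete command that can be pasted into a shell and tested by hand.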

If neither of these works, something is unusual about your installation of GNU Parallel. Check the installed version with parallel --version and try upgrading.

4.1 years ago
2nelly ▴ 350

Why don't you try xargs?

find *.bcf | sed 's/\.bcf$//' | xargs -P10 -I{} sh -c 'bcftools stats "$1.bcf" | grep "number of records:" | cut -f 4 > "$1.txt"' _ {}

-P sets the maximum number of processes to run in parallel.
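
Here is a self-contained sketch of the same pattern that can be tried without bcftools. The two empty .bcf files and the printf job are placeholders invented for the demo, and it assumes an xargs that supports -P (GNU and BSD versions do).

```shell
workdir=$(mktemp -d)
touch "$workdir/chr1.bcf" "$workdir/chr2.bcf"

(
  cd "$workdir" || exit 1
  # printf stands in for the real bcftools pipeline; each job writes its own file.
  # The file stem is passed as $1 so the inner shell never re-parses the name.
  printf '%s\n' *.bcf | sed 's/\.bcf$//' |
    xargs -P 2 -I{} sh -c 'printf "processed %s\n" "$1" > "$1.txt"' _ {}
)

cat "$workdir/chr1.txt" "$workdir/chr2.txt"
```

Because the redirection sits inside the string given to sh -c, every job creates its own output file, which is the behavior the original command was missing.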

I don't know exactly what your files are named, so excuse any mistakes. In any case, I think you get the point.
