Question

Running BWA parallel on 1000 samples

3

Entering edit mode

9.6 years ago

ebrown1955 ▴ 320

So I have 1000 samples (about 2000 fastq files) that I need to do "bwa mem" on. I'm trying to do this is the most time-efficient way possible. Is there a way to parallelize this command?

I figure that I can use GNU/Parallel, however what do I do with the header definition (i.e. -R)

For example, my current command reads:

parallell bwa mem -R "@RG\tID:{}\tPL:ILLUMINA\tLB:lib1" $REF {}_1.fastq {}_2.fastq > {}.sam

Does this make sense?

alignment BWA • 5.1k views

ADD COMMENT • link updated 2.7 years ago by Ram 45k • written 9.6 years ago by ebrown1955 ▴ 320

2

Entering edit mode

What arguments are you supplying to parallel? Sample names/IDs, full files, what? Parallel should be able to add the value that you want to the RG header. What is the exact issue you're having?

Karl is correct, there is a trade off between parallelization and IO. Run too many at once and you'll clog up the IO and the overall runtime will be equal to that of a smaller number of simultaneous jobs.

ADD REPLY • link 9.6 years ago by pld 5.1k

0

Entering edit mode

Equal is not the lower bound. It can get much slower than sequential operation if the operating system tries to run them all together and has to swap out memory, or load/unload/reload data as different subtasks take over the limited cpu resources.

ADD REPLY • link updated 2.7 years ago by Ram 45k • written 9.6 years ago by karl.stamm 4.1k

0

Entering edit mode

Yep, you can slow it down to worse than serial.

ADD REPLY • link 9.6 years ago by pld 5.1k

0

Entering edit mode

I was hoping to pass sample IDs into parallel. For example, if an id is XX0001, then the fwd fastq is called XX0001_1.fastq and the rev is XX0001_2.fastq.

I think there would be some variant of parallel bwa mem -R "@RG\tID:{}\tPL:ILLUMINA\tLB:lib1" $REF {}_1.fastq {}_2.fastq > {}.sam

What command would actually accomplish this. Should I create a listfile with the sample IDs?

ADD REPLY • link updated 2.7 years ago by Ram 45k • written 9.6 years ago by ebrown1955 ▴ 320

0

Entering edit mode

You should check out the documentation on GNU parallel some more, you have to feed parallel something to use and there are tons of ways you can do that. You should try out some of the examples and play around seeing how it behaves using echo.

You could do something like this (although parsing ls is bad):

Say my files are in the format of (sampleID1_1.fq, sampleID1_2.fq) and so on:

parallel -j 4 bwa mem -t 3 -R "@RG\tID:{}\tPL:ILLUMINA\tLB:lib1" <ref genome> /input/dir/{}_1.fastq /input/dir/{}_2.fastq ">" /output/dir/{}.sam ::: `ls /input/dir/*.fastq | cut -d '_' -f1 | uniq`

Where input and output are wherever your input and outputs are/go. Again, tune -j and -t to whatever your system uses. Unless you're running these on a large number of nodes (which won't work with the above example), I wouldn't run 1000 of them at a time.

ADD REPLY • link updated 2.7 years ago by Ram 45k • written 9.6 years ago by pld 5.1k

score 2 · Answer 1 · 2015-10-14

2

Entering edit mode

9.6 years ago

Pierre Lindenbaum 166k

I use a make or SGE+qmake

see also: https://github.com/lindenb/ngsxml "generating workflows with XML and XSLT"

ADD COMMENT • link 9.6 years ago by Pierre Lindenbaum 166k

1

Entering edit mode

Could you share an example qmake script for a simple bwa/samtools alignment?

ADD REPLY • link 9.6 years ago by dfornika ★ 1.1k

Ram · Answer 2 · 2015-10-14

1

Entering edit mode

9.6 years ago

karl.stamm 4.1k

For my computer, the bwa's built in parallelism (-t argument) can use up all the CPU and become bottlenecked by harddrive speed. How fast it can read in and write out files, or it will have used up all the CPU on a single sample.

That means you'll get as good or better speed by doing each sample sequentially.

Just use a bash for loop to run bwa mem with -t set to your cpu count. GNU Parallel can do something like that with the --jobs parameter to limit the number of simultaneous tasks.

If you have multiple compute nodes on a grid, then my answer is wrong and you should distribute the workload.

ADD COMMENT • link updated 2.7 years ago by Ram 45k • written 9.6 years ago by karl.stamm 4.1k

0

Entering edit mode

-t limits the number of cores each BWA instance can use. -j (parallel) limits the number of simultaneous instances.

So parallel -j 4 bwa mem -t 4 {etc} would be 4 BWA instances each using 4 cores, for a total of 16 cores.

ADD REPLY • link 9.6 years ago by pld 5.1k