I would like to parallelize bwa mem on multiple cores across multiple CPUs, on our high-performance computing cluster. This was previously discussed seven years ago in this thread: Parallelizing Bwa On Multiple Cpus. With that in mind, I'm now considering that the best way to do this is either:
1) Use Parallel BWA (pBWA).
2) Split the large input fastq files into multiple smaller fastq files, map each using its own instance of bwa mem on its own core, then merge them all together.
However, pBWA has not been updated since October 2012. Further, I'd prefer option 2, as our cluster restricts the size of input files. That said, as I'm using paired-end data (with two input fastq files), I'm not sure how best to split those up and ensure that reads in all the resulting files are still paired.
Does anyone have any insight on this, and -- given the time that's elapsed -- might there now be a better approach?
Thanks!
Thanks for this. My understanding is that -t only sets the number of threads within a single bwa mem process, whereas I need to parallelize over multiple physical servers and CPUs. Thus, I decided to go with brute-forcing as suggested in '3.'
For anyone else who stumbles across this, I chose to use bbmap to verify that paired FASTQ files (downloaded from published datasets) are properly paired. I then used Trimmomatic to remove low-quality reads.
I then split the resulting paired files (in which both reads survived the quality check), also as described in 3., before mapping.
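For anyone wanting to script the splitting step, here is a minimal Python sketch of one way to do it. It is an illustration, not a tested pipeline, and it assumes uncompressed FASTQ with exactly four lines per record and with both files listing reads in the same order. The function names, the chunk file naming, and the `pairs_per_chunk` parameter are all made up for this example. Both files are read in lockstep, so each R1 chunk contains exactly the mates of the matching R2 chunk.

```python
import itertools
from pathlib import Path


def read_fastq_records(handle):
    """Yield FASTQ records as 4-line tuples (assumes 4 lines per record)."""
    while True:
        record = tuple(itertools.islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def split_paired_fastq(r1_path, r2_path, out_dir, pairs_per_chunk):
    """Split two paired FASTQ files into chunk files that stay in sync.

    Returns a list of (r1_chunk_path, r2_chunk_path) tuples.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    out1 = out2 = None
    with open(r1_path) as in1, open(r2_path) as in2:
        # zip_longest pads the shorter file with None, which exposes
        # inputs that do not contain the same number of reads.
        pairs = itertools.zip_longest(
            read_fastq_records(in1), read_fastq_records(in2)
        )
        try:
            for i, (rec1, rec2) in enumerate(pairs):
                if rec1 is None or rec2 is None:
                    raise ValueError("inputs have different numbers of reads")
                if i % pairs_per_chunk == 0:
                    # Start a new pair of chunk files.
                    if out1 is not None:
                        out1.close()
                        out2.close()
                    idx = len(chunks)
                    p1 = out_dir / f"chunk{idx:04d}_R1.fastq"
                    p2 = out_dir / f"chunk{idx:04d}_R2.fastq"
                    out1, out2 = open(p1, "w"), open(p2, "w")
                    chunks.append((p1, p2))
                out1.writelines(rec1)
                out2.writelines(rec2)
        finally:
            if out1 is not None:
                out1.close()
                out2.close()
    return chunks
```

Each returned (R1, R2) chunk pair can then be submitted as its own bwa mem job, and the per-chunk alignments merged afterwards (for example with samtools merge). The sketch only checks that the two files have the same number of records; it does not compare read names, so it is still worth verifying the pairing beforehand as described above.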
Just a suggestion. If you were using bbmap then you could have done everything in that suite: reformat.sh to split (not really needed, but if you wish), bbduk.sh to scan/trim, and bbmap.sh to align. Both bbduk.sh and bbmap.sh are multi-threaded and can use all the cores you can afford to throw at them.