I've returned to the bioinformatics world after an absence of several years, so the following question may seem naive:
I've inherited an alignment pipeline from a former colleague. Most of it is straightforward, but I'm unsure about one of the preliminary pre-processing steps.
Namely, there's a perl script that splits the fastq files into subsets of reads, with the subset size set by an input argument. If the alignments (using bwa) were parallelized, I could see the reason for doing this, but in the absence of parallelization, what is gained computationally by splitting the fastq files?
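For concreteness, the splitting step amounts to something like the following (a sketch using GNU `split` rather than the inherited perl script; `reads.fastq` and the chunk size are placeholders):

```bash
# Each FASTQ record is exactly 4 lines, so the per-piece line count must be
# a multiple of 4; 4,000,000 lines = 1,000,000 reads per piece.
split -l 4000000 -d --additional-suffix=.fastq reads.fastq chunk_
```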
Could you clarify what is meant by "brute-force parallelization" (as opposed to parallelization proper)?
It means starting multiple independent alignment jobs, one per fastq piece, rather than relying on a single multi-threaded aligner process. You can then merge the resulting BAM files to produce a single final file.
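A minimal sketch of that approach, assuming single-end reads already split into `chunk_*.fastq` pieces, a reference indexed with `bwa index ref.fa`, and samtools on the PATH (all filenames are placeholders):

```bash
# Align each piece as an independent background job, then merge.
for fq in chunk_*.fastq; do
    bwa mem ref.fa "$fq" | samtools sort -o "${fq%.fastq}.bam" - &
done
wait                                    # block until every alignment job finishes
samtools merge merged.bam chunk_*.bam   # combine per-piece BAMs into one
samtools index merged.bam
```

In practice you would throttle the number of simultaneous jobs (e.g. via a cluster scheduler or `xargs -P`) rather than launching them all at once, but the point stands: each piece aligns independently and nothing in bwa needs to know about the other jobs, which is why it's "brute force".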