Dear all,
Has anyone done any benchmarking on speeding up long-read alignment algorithms?
I mainly use minimap2, but its runtime varies by a factor of 10 across our cluster. I've been trying mm2-fast https://github.com/bwa-mem2/mm2-fast, the partially accelerated version, but without much success so far.
Is PAF output, for example, faster than SAM?
Have others worked out how to scale minimap2 for PromethION-scale datasets? Judging from their presented results, I expect LRA https://github.com/ChaissonLab/LRA has a similar runtime, and others seem slower still (ngmlr etc.).
Thanks
You are already using multiple threads and are asking for an additional speedup? As a sledgehammer solution, you could split the data files up and start multiple jobs in parallel. I have never worked with PromethION-sized data, so I don't have direct insight there.
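A rough sketch of the splitting step, in case it helps. It assumes uncompressed FASTQ with strict 4-line records, and the names `split_fastq` and `minimap2_command` are made up for illustration. The `map-ont` preset is also an assumption (Nanopore reads); swap in whatever preset you normally use. Records are dealt out round-robin so each chunk gets a similar mix of reads.

```python
from itertools import islice
from pathlib import Path


def split_fastq(fastq_path, n_chunks, out_dir):
    """Distribute 4-line FASTQ records round-robin into n_chunks files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = [out_dir / f"chunk_{i}.fastq" for i in range(n_chunks)]
    handles = [open(p, "w") for p in chunk_paths]
    try:
        with open(fastq_path) as fh:
            record_index = 0
            while True:
                # One FASTQ record = header, sequence, '+', qualities.
                record = list(islice(fh, 4))
                if not record:
                    break
                handles[record_index % n_chunks].writelines(record)
                record_index += 1
    finally:
        for handle in handles:
            handle.close()
    return chunk_paths


def minimap2_command(ref, reads, threads, out_path):
    """Build one minimap2 call per chunk (PAF output; preset is an assumption)."""
    return [
        "minimap2", "-x", "map-ont",
        "-t", str(threads),
        "-o", str(out_path),
        str(ref), str(reads),
    ]
```

Each command list can then be submitted as its own cluster job (or passed to `subprocess.run`), and the per-chunk PAF files concatenated afterwards. Whether this beats a single run with more threads will depend on your I/O and how the index is loaded, so it is worth benchmarking both.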
Splitting the input is an interesting idea, though it comes at the expense of using far more CPU resources. I'll give it a try, but I'll also benchmark some other options.
Edit: I am using 24 threads, as I found that to be fastest on my infrastructure when benchmarking with hyperfine.
Thanks