Hi folks,
I'm using BBMap (bbsplit.sh) to filter out contaminant reads by mapping to the human, mouse, and phiX genomes. The run time varies widely, from very slow to fast, for roughly the same number of reads.
Here's the command I used:
bbsplit.sh in1=reads.trim.1.fq in2=reads.trim.2.fq \
ref=ref-genomes/phiX174.fa,ref-genomes/GRCm39.fa,ref-genomes/GRCh38.fa \
outu1=dedupe_reads/dedupe_reads.1.fq.gz outu2=dedupe_reads/dedupe_reads.2.fq.gz threads=40 \
overwrite=true
Here are the speed stats for the run with ~28 million reads:
Mapping Mode: normal
Reads Used: 28695864 (2590370926 bases)
Mapping: 3695.342 seconds.
Reads/sec: 7765.42
Here are the speed stats for the run with ~36 million reads:
Reads Used: 35647394 (4046619973 bases)
Mapping: 186.885 seconds.
Reads/sec: 190745.37
kBases/sec: 21653.03
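To put the two runs side by side, the reported rates can be re-derived from the read counts and mapping times in the logs. This is only a quick arithmetic sketch in Python; the dictionary keys and the `throughput` helper are names made up for illustration and are not part of BBMap.

```python
# Figures copied from the two BBSplit logs above.
runs = {
    "28M-read run": {"reads": 28695864, "bases": 2590370926, "seconds": 3695.342},
    "36M-read run": {"reads": 35647394, "bases": 4046619973, "seconds": 186.885},
}


def throughput(run):
    """Return (reads/sec, kBases/sec) for one run (hypothetical helper)."""
    reads_per_sec = run["reads"] / run["seconds"]
    kbases_per_sec = run["bases"] / run["seconds"] / 1000
    return reads_per_sec, kbases_per_sec


rates = {name: throughput(run) for name, run in runs.items()}
for name, (rps, kbps) in rates.items():
    print(f"{name}: {rps:.2f} reads/sec, {kbps:.2f} kBases/sec")

# How many times faster the second run processed reads than the first.
slowdown = rates["36M-read run"][0] / rates["28M-read run"][0]
print(f"Per-read slowdown of the first run: {slowdown:.1f}x")
```

By this arithmetic the first run processed reads roughly 25 times more slowly than the second, even though it had fewer reads and fewer bases.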
The same settings were used for both runs. Any idea what I can do to make everything faster?
GenoMax @Brian Bushnell