I am currently using BWA-MEM to map metagenomic reads to a small (~12 kb) virus reference genome for a large number of samples. Each sample has a lot of reads: upwards of 10 million.
The vast majority of the reads (~99%) do not map to my reference, and I am not interested in analysing them. However, the .sam file that BWA-MEM produces also stores the unmapped reads, so I end up with huge files that take up disk space. I do convert each .sam file into a sorted .bam file and delete the .sam file, but the .bam files are still huge.
Is there any way to stop BWA-MEM from writing the unmapped reads to the .sam file, so that only the mapped reads are kept?
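To make concrete what I mean by "only keep the mapped reads": in the SAM format, an unmapped read has bit 0x4 set in its FLAG column (the second tab-separated field). Below is a minimal sketch of the filter I have in mind, assuming it sits between BWA-MEM and the sorting step and reads SAM text on standard input. The function name `keep_mapped` is my own placeholder, not part of BWA or samtools.

```python
import sys

UNMAPPED = 0x4  # SAM FLAG bit meaning "segment unmapped"


def keep_mapped(lines):
    """Yield SAM header lines and alignment records that are mapped.

    A record is kept when bit 0x4 is NOT set in its FLAG field
    (the second tab-separated column).
    """
    for line in lines:
        # Header lines start with '@' and are passed through unchanged.
        if line.startswith("@"):
            yield line
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            # Skip blank or malformed lines rather than crash.
            continue
        flag = int(fields[1])
        if not flag & UNMAPPED:
            yield line


if __name__ == "__main__":
    # Stream stdin to stdout so nothing large is held in memory.
    sys.stdout.writelines(keep_mapped(sys.stdin))
```

Saved under a hypothetical name such as `keep_mapped.py`, this could be used as `bwa mem ref.fa reads.fq | python keep_mapped.py | samtools sort -o mapped.bam -` (the file names are placeholders). As far as I understand, `samtools view -F 4` applies the same FLAG-based exclusion, so piping through `samtools view -b -F 4 -` may be the simpler route if BWA-MEM itself has no such option.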
Thank you for letting me know.