Hi, I am working on this pipeline for grad school, and I noticed something I can't really explain.
I simulated two fastq samples, ran them through an alignment pipeline (fastp to bwa mem), and got alignment stats from qualimap (pictures below). Then I combined the two samples into one using cat and ran the result through the same pipeline. When I did that, none of the alignment stats added up. For example, the combined sample had fewer unmapped reads than the two samples had in total when run on their own. I have no idea why.
Here are the qualimap outputs for the three runs. I can give any other details if needed. Thanks!
bwa mem is not fully deterministic; see this thread from the developer: BWA mem output inconsistent on same but re-ordered FASTQ input
I can't come up with an explanation, but two things you can try for debugging are: 1) look at the fastp stats: do they add up and make sense? 2) scan the bam files for reads that changed their status between runs. Also, please include the fastp and bwa commands you used.
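For step 2, here is a minimal sketch of how the comparison could be done in Python. It assumes each bam has been dumped to text first (e.g. with samtools view), that read names are unique across the two simulated samples (otherwise reads from different samples would collide after cat), and that only primary alignments matter. The function names read_status and status_changes are made up for illustration.

```python
from collections import Counter

def read_status(sam_lines):
    """Map (read name, mate number) -> 'mapped' or 'unmapped'.

    Header lines and secondary/supplementary records are skipped, so each
    read end is counted once.
    """
    status = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or SAM header
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & 0x100 or flag & 0x800:
            continue  # secondary or supplementary alignment
        # 0x40 = first in pair, 0x80 = second in pair, 0 = single-end
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        status[(name, mate)] = "unmapped" if flag & 0x4 else "mapped"
    return status

def status_changes(separate, combined):
    """Count reads whose status differs between the separate and combined runs.

    Reads absent from the combined run (e.g. filtered out upstream) are
    reported with the status 'missing'.
    """
    changes = Counter()
    for key, before in separate.items():
        after = combined.get(key, "missing")
        if after != before:
            changes[(before, after)] += 1
    return changes
```

To use it, build one status dict from the two per-sample runs (read both SAM dumps and merge the dicts) and one from the combined run, then pass both to status_changes. A large ('unmapped', 'mapped') count would point at the aligner, while many 'missing' entries would point at the trimming step instead.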
It may be caused by fastp; see https://github.com/OpenGene/fastp/issues/506