I have been trying to align a few bacterial species with the human genome to check the alignment rate, using BWA-MEM and Bowtie2. However, the two tools give very different alignment rates. With Bowtie2, all species have an overall alignment rate below 1%, whereas with BWA-MEM the alignment rates are much higher (10% to 30%). Does anyone know why this is happening?
What exactly does this mean? What form is the query data in? Fastq reads or fasta genomes?
I am aligning FASTQ reads from the bacterial species against a human genome reference.
I am curious about the reason for doing this. Aligning data to a non-homologous reference is going to produce alignments that arise simply from chance sequence similarity in short reads.
You also need to pay attention to the default parameters, which allow a certain number of mismatches (likely different for the two programs). This will affect the results, as you are observing.
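One default that differs between the two tools: BWA-MEM performs local alignment and can soft-clip read ends, while Bowtie2 defaults to end-to-end mode, so a read with only a short matching stretch can count as "aligned" in BWA-MEM but not in Bowtie2. To see how much of an alignment rate comes from clipped, partial hits, something like the sketch below could be run on the SAM output of either tool. It is only an illustration: the function name `alignment_summary` is made up here, and it assumes plain-text SAM lines (for example, read from the output of `samtools view -h`).

```python
import re

# Matches one CIGAR operation, e.g. "20S" or "30M".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def alignment_summary(sam_lines):
    """Return (total, mapped, clipped) counts over primary SAM records.

    total   - primary records seen (secondary/supplementary are skipped)
    mapped  - records without the 'unmapped' flag (0x4)
    clipped - mapped records whose CIGAR contains soft (S) or hard (H) clipping
    """
    total = mapped = clipped = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x100 or flag & 0x800:
            continue  # skip secondary and supplementary alignments
        total += 1
        if flag & 0x4:
            continue  # read is unmapped
        mapped += 1
        ops = {op for _, op in CIGAR_OP.findall(fields[5])}
        if ops & {"S", "H"}:
            clipped += 1
    return total, mapped, clipped
```

If most of the mapped reads from BWA-MEM turn out to be clipped, that would be consistent with the higher rate coming from short local matches rather than real human sequence.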
Actually, I am trying to find out whether my samples have any human contamination. That's why I thought of aligning them to a non-homologous reference. As for the parameters, I have used the defaults for both tools.
A better option then would be to try to bin the reads using the human genome. You may want to look at removehuman from the BBMap suite.
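To illustrate what binning reads against the human genome could look like, here is a rough sketch that sorts reads into "human" and "other" based on an existing SAM alignment. This is not how removehuman works internally; it is just a stand-in for the idea. The function names and the thresholds (`min_mapq=30`, `min_fraction=0.9`) are assumptions picked for illustration and would need tuning for real data.

```python
import re

# Matches one CIGAR operation, e.g. "20S" or "30M".
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def aligned_fraction(cigar):
    """Fraction of the original read covered by aligned bases (M, =, X)."""
    ops = CIGAR_OP.findall(cigar)
    aligned = sum(int(n) for n, op in ops if op in "M=X")
    # Operations that account for bases of the original read,
    # including clipped (S, H) and inserted (I) bases.
    read_len = sum(int(n) for n, op in ops if op in "MIS=XH")
    return aligned / read_len if read_len else 0.0

def bin_reads(sam_lines, min_mapq=30, min_fraction=0.9):
    """Split primary SAM records into (human, other) lists of read names.

    A read is called 'human' only if it is mapped, has MAPQ >= min_mapq,
    and at least min_fraction of its length is aligned. Both thresholds
    are hypothetical defaults, not recommendations from any tool.
    """
    human, other = [], []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & 0x100 or flag & 0x800:
            continue  # skip secondary and supplementary alignments
        mapq, cigar = int(fields[4]), fields[5]
        is_human = (
            not flag & 0x4
            and cigar != "*"
            and mapq >= min_mapq
            and aligned_fraction(cigar) >= min_fraction
        )
        (human if is_human else other).append(name)
    return human, other
```

With strict criteria like these, chance matches from bacterial reads (low MAPQ, heavily clipped) should mostly fall into the "other" bin, while genuine human contamination should align confidently over most of the read length.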