Paired-end FASTQ reads for a bacterial strain can be aligned to a reference genome with BWA. With bedtools, the coverage of the reads on the reference genome can then be computed using the 'coverageBed' command. As I understand it, coverage means how many reads were aligned to each position of the reference genome. I have a new question about the coverage analysis: how do we know which reads were not aligned to the reference genome? Because the sequenced genome may be larger than the reference genome, or may contain rearrangements or duplicated regions, some reads may have no corresponding region in the reference. Are there any tools that can find the unaligned reads? Thanks!
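For what it's worth, here is a minimal sketch of one way to spot unaligned reads, assuming the BWA output is in SAM text format. In SAM, the second column is the FLAG field, and bit 0x4 is set when the read itself is unmapped; `samtools view -f 4` filters on the same bit. The function name `unmapped_read_names` and the example records below are made up for illustration, not taken from any real dataset.

```python
UNMAPPED = 0x4  # SAM FLAG bit: this read itself is unmapped


def unmapped_read_names(sam_lines):
    """Return the names of reads whose FLAG has the 0x4 (unmapped) bit set."""
    names = []
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & UNMAPPED:
            names.append(qname)
    return names


# Hypothetical SAM records: read1 is a properly mapped pair,
# read2 is a pair in which neither mate aligned.
example = [
    "@SQ\tSN:ref\tLN:1000",
    "read1\t99\tref\t10\t60\t5M\t=\t50\t45\tACGTA\tIIIII",
    "read1\t147\tref\t50\t60\t5M\t=\t10\t-45\tTTGCA\tIIIII",
    "read2\t77\t*\t0\t0\t*\t*\t0\t0\tGGGCC\tIIIII",
    "read2\t141\t*\t0\t0\t*\t*\t0\t0\tCCATG\tIIIII",
]

print(unmapped_read_names(example))  # ['read2', 'read2']
```

On a real alignment the same function could be given an open file handle instead of the list, since it only iterates over lines. Each mate of a pair is reported separately, so a name appears twice when both mates failed to align.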
Great, that is it! Thank you!