I would like to understand how the algorithm behind BWA-MEM works. I have read the BWA-MEM papers on NCBI and looked at the posters about it, but I still don't fully understand how it works. Can someone explain it to me in simple terms?
At a high level, BWA first builds an index of the reference genome (FASTA file): running bwa index produces 5 index files. The index makes it possible to check whether, and how often, a sequence w taken from the input reads (FASTQ format) occurs in the reference in O(|w|) time. That is linear in the length of the query, not constant, but it does not depend on the size of the reference. Indexing is based on the Burrows-Wheeler transform (there is a Google video explaining the BWT, with an interview with Mike Burrows) and the FM-index: https://en.wikipedia.org/wiki/FM-index.
After watching the video and reading the wiki article, you will be ready for the final treat, the BWA-MEM paper by its author, Heng Li: http://arxiv.org/abs/1303.3997
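To make the index-then-search idea in this answer more concrete, here is a minimal Python sketch of an FM-index with backward search. It is a toy illustration only, not BWA's actual implementation: the function names bwt_index and backward_search are made up for this example, the suffix array is built naively, and the occurrence table is stored uncompressed, whereas a real aligner uses sampled and compressed versions of these structures. It also covers exact matching only. The BWA-MEM steps that find maximal exact matches as seeds and then extend them with Smith-Waterman are not shown.

```python
from collections import Counter


def bwt_index(ref):
    """Build a toy FM-index (suffix array, C table, Occ table) for ref."""
    text = ref + "$"  # '$' is a sentinel that sorts before A, C, G, T
    # Naive suffix array: fine for a demo, far too slow for a real genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT = the character that precedes each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    # C[c] = number of characters in text that are strictly smaller than c.
    counts = Counter(text)
    C, total = {}, 0
    for c in sorted(counts):
        C[c] = total
        total += counts[c]
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (ch == c)
    return sa, C, occ


def backward_search(w, sa, C, occ):
    """Return sorted start positions of exact matches of w in the reference."""
    lo, hi = 0, len(sa)  # half-open interval of suffix-array rows
    for c in reversed(w):  # one step per character of w
        if c not in C:
            return []
        lo = C[c] + occ[c][lo]
        hi = C[c] + occ[c][hi]
        if lo >= hi:
            return []  # no suffix starts with the part of w read so far
    return sorted(sa[i] for i in range(lo, hi))


ref = "ACGTACGTTAGC"
sa, C, occ = bwt_index(ref)
print(backward_search("ACG", sa, C, occ))  # [0, 4]
```

The loop in backward_search runs once per character of w, and each step is two table lookups, so the search cost depends on the read length rather than on the reference length. Reporting the actual positions at the end costs extra time proportional to the number of hits.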
Thank you for your answer. I do understand the basic concept of BWA. But what makes BWA-MEM so special, given that nowadays everyone is using it? And how does it work? Can someone explain it to me in simple terms?
How what works, exactly? The Burrows-Wheeler transformation? Mapping in general? BWA? The MEM algorithm?
Thank you for your answer. Yes, I would like to know how it works in general: BWA mapping in general, and the MEM algorithm in particular.
Regards