Hi everyone,
I'm sorry if this has been answered elsewhere. I have raw DNA-seq paired-end reads from different libraries, with different read lengths for different samples: 50-90, 50-101, and 50-150 bp (after quality trimming). In the FastQC check I noticed that about 10% of reads are shorter than 70 bp in each sample.
BWA-MEM is pretty mature now, but the BWA docs say that bwa mem is preferable for longer reads (> 70 bp) and that bwa aln is good for shorter reads. Most of the reads in my libraries are not that short, though.
I want to use one mapping algorithm for consistency. Can I still use bwa mem and ignore the small fraction of shorter reads, or should I use bwa aln?
Thanks
sohail
bwa-mem can be used for short reads as well, but this blog suggests that bwa-aln performs better for 50 bp reads, while overall mem has good accuracy at different mismatch levels. Would it be useful to align the 10% of short reads with both mem and aln and compare the stats to see if it makes a big difference?
Yes, I think I should try to compare.
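For anyone wanting to try that comparison, here is a minimal sketch of how the short read pairs could be pulled out of a paired-end FASTQ. It is only an illustration: the function names, the output file naming, and the rule that a pair counts as short when either mate is under 70 bp are my assumptions, not something from the BWA docs. It also assumes standard 4-line FASTQ records with R1 and R2 in the same order.

```python
import gzip
from contextlib import ExitStack
from itertools import islice, zip_longest


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def read_fastq(handle):
    """Yield 4-line FASTQ records as tuples of stripped lines."""
    while True:
        record = tuple(line.rstrip("\r\n") for line in islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def split_pairs_by_length(r1_path, r2_path, out_prefix, min_len=70):
    """Split a paired FASTQ into 'short' and 'long' subsets.

    Assumption: a pair is 'short' if either mate is shorter than min_len.
    Returns (n_short_pairs, n_long_pairs).
    """
    counts = {"short": 0, "long": 0}
    with ExitStack() as stack:
        r1 = stack.enter_context(open_text(r1_path))
        r2 = stack.enter_context(open_text(r2_path))
        # Hypothetical naming scheme: <prefix>.<short|long>_R<1|2>.fastq
        out = {
            (label, mate): stack.enter_context(
                open(f"{out_prefix}.{label}_R{mate}.fastq", "w")
            )
            for label in ("short", "long")
            for mate in (1, 2)
        }
        for rec1, rec2 in zip_longest(read_fastq(r1), read_fastq(r2)):
            if rec1 is None or rec2 is None:
                raise ValueError("R1 and R2 have different numbers of records")
            # rec[1] is the sequence line; classify by the shorter mate.
            shortest = min(len(rec1[1]), len(rec2[1]))
            label = "short" if shortest < min_len else "long"
            counts[label] += 1
            # Write both mates to the same subset so the pairs stay in sync.
            out[(label, 1)].write("\n".join(rec1) + "\n")
            out[(label, 2)].write("\n".join(rec2) + "\n")
    return counts["short"], counts["long"]
```

The short subset could then be aligned once with bwa mem and once with bwa aln followed by bwa sampe, and the two results compared with something like samtools flagstat. The returned counts also give the fraction of short pairs, which is a quick check against the roughly 10% seen in FastQC.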