Question

Alignment to contigs with Illumina PE reads: bwa or minimap2?

0


6.8 years ago

likoo27 • 0

Hello :)

I'm pretty new to this stuff, and after reading some papers and posts I'm still not sure what I should do, so maybe somebody with more experience could shed some light.

After I got my contigs with Canu, I would like to use Pilon to polish them with Illumina reads. After trimming and quality control, the reads are 20-301 bp long. The Pilon docs mention using BWA MEM, but then there's minimap2, which is faster (kind of important for me) but then again not so good with short reads. What would you recommend? What about filtering the reads to keep only those 100-301 bp long (that would be around 95% of all reads, I think) and then going with minimap2? I would appreciate any information :D

alignment bwa minimap2 canu pilon • 5.7k views

ADD COMMENT • link updated 6.8 years ago by h.mon 35k • written 6.8 years ago by likoo27 • 0

1


I would start with correction using long reads. Depending on the complexity of your organism, there are going to be a bunch of small misassemblies that can be fixed using long reads but would be a nightmare for Pilon to try to correct. I think long-read polishing will also help get more unique alignments for reads in low-complexity regions, increasing the number of errors you can correct and reducing the errors you introduce into repetitive sequences.

Also, BWA seemed pretty fast for me and wasn't really the bottleneck in my pipeline. Minimap2 is crazy fast with long reads, though. One long-read polishing option is minimap2/racon (https://genome.cshlp.org/content/27/5/737). It's pretty fast compared to blasr and arrow/quiver, but I haven't compared the quality of the sequence returned.

ADD REPLY • link 6.8 years ago by EarlyEvol ▴ 30

score 2 · Answer 1 · 2018-10-11

This blog post should help you decide:

Minimap2 and the future of BWA

As you don't provide much information about your reads besides that they are short, it is hard to give a good recommendation. According to the paper, bwa mem is just slightly better than minimap2, but 2-3x slower. So if your reads are good quality (low error rate), go for minimap2; if they aren't, go for bwa mem.

As the blog post doesn't really say how high an error rate makes minimap2 slower than bwa mem, you may want to align a couple of your datasets with both tools to check which one performs better.
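A rough way to set up that comparison, sketched in Python. The helper only builds the two alignment commands (`bwa mem` and `minimap2 -ax sr`, which is minimap2's short-read preset) and times whichever command it is given. The file names, the thread count and the function names are placeholders of mine, and the sketch assumes `bwa index` has already been run on the contigs.

```python
import shlex
import subprocess
import time


def aligner_commands(ref, r1, r2, threads=4):
    """Return {name: argv} for aligning paired-end reads to ref with each tool."""
    t = str(threads)
    return {
        "bwa_mem": ["bwa", "mem", "-t", t, ref, r1, r2],
        "minimap2_sr": ["minimap2", "-ax", "sr", "-t", t, ref, r1, r2],
    }


def time_command(argv, sam_path):
    """Run argv, write its stdout (SAM) to sam_path, return elapsed seconds."""
    start = time.perf_counter()
    with open(sam_path, "w") as out:
        subprocess.run(argv, stdout=out, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder file names: a subsample of the reads is enough for a timing test.
    cmds = aligner_commands("contigs.fasta", "sub_R1.fastq", "sub_R2.fastq")
    for name, argv in cmds.items():
        print(name, shlex.join(argv))
        # Uncomment once both tools are installed and the bwa index exists:
        # print(name, f"{time_command(argv, name + '.sam'):.1f} s")
```

Wall-clock time is only half of the comparison; running something like `samtools flagstat` on each SAM file afterwards gives the mapping rates to set against it.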

Edit: 20 bp is really short, and bwa mem is not even the best performer for such small reads: bwa aln beats bwa mem in this case. But those reads are too short in my opinion; in general, I filter out reads shorter than 50-65 bp.
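If the trimmer was not already run with a minimum-length setting, a length filter that keeps pairs in sync could look roughly like the sketch below. It assumes plain 4-line FASTQ records, R1 and R2 files in the same order, and a cutoff of 50 bp picked from the range above; the function names are made up for illustration.

```python
import gzip
from itertools import islice


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, mode) if str(path).endswith(".gz") else open(path, mode)


def fastq_records(handle):
    """Yield FASTQ records as lists of 4 lines (assumes unwrapped records)."""
    while True:
        rec = list(islice(handle, 4))
        if not rec:
            return
        if len(rec) < 4:
            raise ValueError("truncated FASTQ record")
        yield rec


def filter_pairs(r1_in, r2_in, r1_out, r2_out, min_len=50):
    """Keep a pair only if both mates are >= min_len bases; return (kept, total)."""
    kept = total = 0
    with open_text(r1_in) as f1, open_text(r2_in) as f2, \
            open_text(r1_out, "wt") as o1, open_text(r2_out, "wt") as o2:
        # zip stops at the shorter file, so the inputs are assumed to be properly paired.
        for rec1, rec2 in zip(fastq_records(f1), fastq_records(f2)):
            total += 1
            if len(rec1[1].strip()) >= min_len and len(rec2[1].strip()) >= min_len:
                o1.writelines(rec1)
                o2.writelines(rec2)
                kept += 1
    return kept, total
```

Dropping the whole pair when either mate is too short keeps the two output files aligned record for record, which paired-end mappers expect.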