Hi there,
I have amplicon data that I want to align with BWA. I set the soft-clipping penalty very high (`-L 50`) to force bwa to align end-to-end rather than soft-clipping in the middle of the read.
Now I have a read pair that is identical in R1 and R2, but I cannot make the aligner output the same alignments for both:
M06706:14:000000000-CN3FL:1:1110:20686:16364 99 xxx 12 43 72M22D2M20D36M = 12 72 ATGATGGGCTCTTCCCCCGCGTGCAGCAGTGGAGCCACCAGCAGCGGGTGGGCGACCTCTTCCAGAAGCTGGCAATTCGGCCATCACGGCTCTCCTCCAGTGGGACTCCC
M06706:14:000000000-CN3FL:1:1110:20686:16364 147 xxx 12 45 72M38S = 12 -72 ATGATGGGCTCTTCCCCCGCGTGCAGCAGTGGAGCCACCAGCAGCGGGTGGGCGACCTCTTCCAGAAGCTGGCAATTCGGCCATCACGGCTCTCCTCCAGTGGGACTCCC
There should be two long deletions in the read, as shown in the first alignment. But in the second, bwa prefers to soft-clip no matter what I do. I have tried everything: raising the soft-clip penalty even higher, and changing the match, gap open, and gap extension penalties. But no matter what I do, I cannot change the `38S` to `22D2M20D36M`.
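To make the difference between the two CIGAR strings concrete, here is a minimal Python sketch that counts how many read bases and reference bases each one consumes. The helper name `cigar_lengths` is made up for this illustration; the operator rules follow the SAM specification (M, I, S, = and X consume the read; M, D, N, = and X consume the reference).

```python
import re

# One CIGAR element: a count followed by a single operator letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def cigar_lengths(cigar):
    """Return (read_bases, ref_bases) consumed by a CIGAR string.

    Hypothetical helper for illustration only, not part of bwa or samtools.
    """
    read_len = 0
    ref_len = 0
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in "MIS=X":   # operators that consume read bases
            read_len += n
        if op in "MDN=X":   # operators that consume reference bases
            ref_len += n
    return read_len, ref_len

for cigar in ("72M22D2M20D36M", "72M38S"):
    print(cigar, cigar_lengths(cigar))
```

Both strings account for all 110 bases of the read, but the first spans 152 reference bases while the second spans only 72, so the soft-clipped alignment leaves the last 38 bases of the read unplaced.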
I assume it has to do with the deletions being closer to the start of the second read, but shouldn't the high soft-clip penalty force an alignment?
Any help how I can properly align from both directions would be very much appreciated!
Please use the formatting bar (especially the `code` option) to present your post better. You can use backticks for inline code (`` `text` `` becomes `text`), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.
If you have long deletions, bwa might not be the right tool.
bwa is a short read aligner. Perhaps try minimap2 or another aligner.