Recently I've ported bwa over to AVX2 instructions by simply converting the SSE2 instructions to their AVX2 counterparts.
https://github.com/IvantheDugtrio/bwa
So far it compiles successfully, but I'm still debugging it: when I run it I get a "double free or corruption (!prev): " error. This looks like a memory corruption problem, which makes me suspect lines 81-86 in ksw.c:
81 q = (kswq_t*)malloc(sizeof(kswq_t) + 256 + 16 * slen * (m + 4)); // a single block of memory
82 #ifdef __AVX2__
83 q->qp = (__m256i*)(((size_t)q + sizeof(kswq_t) + 15) >> 4 << 4); // align memory
84 #else
85 q->qp = (__m128i*)(((size_t)q + sizeof(kswq_t) + 15) >> 4 << 4); // align memory
86 #endif
Could someone explain what is happening here with regard to the struct kswq_t, and how this code aligns memory?
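
To make the question concrete, here is a minimal standalone sketch of how I currently read the rounding expression on lines 83 and 85. The helper names round_up_16, align_up and is_aligned are hypothetical and not part of bwa. As far as I understand, `(x + 15) >> 4 << 4` rounds an address up to the next multiple of 16, whereas aligned 256-bit loads and stores such as _mm256_load_si256 expect 32-byte alignment, so my assumption is that the AVX2 branch would need to round up to 32 instead.

```c
#include <assert.h>
#include <stdint.h>

/* Same arithmetic as lines 83/85 of ksw.c: add 15, then clear the low
   4 bits by shifting right and left again. The result is the smallest
   multiple of 16 that is >= addr. */
static uintptr_t round_up_16(uintptr_t addr)
{
    return (addr + 15) >> 4 << 4;
}

/* Hypothetical generalisation (not in bwa): round addr up to the next
   multiple of align, where align must be a nonzero power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Returns 1 if addr is a multiple of align (a power of two), else 0. */
static int is_aligned(uintptr_t addr, uintptr_t align)
{
    return (addr & (align - 1)) == 0;
}
```

With these helpers, round_up_16(48) stays at 48, which is 16-byte aligned but not 32-byte aligned, while align_up(48, 32) moves to 64. If my reading is right, a pointer produced by line 83 can therefore land on an address that is only suitable for 128-bit vectors, and the 16 * slen * (m + 4) sizing on line 81 would likewise still be counting 16-byte vectors.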