I have some paired-end short-read sequencing data. I'd like to find the region where the alignments of the two reads in a pair overlap, and trim that region from one of the reads so the same bases aren't processed twice. For my specific downstream purposes, it has to be done in C.
My current idea is to use the MC tag, which contains the CIGAR string of the mate read, and pass it to bam_cigar2rlen(). Together with core.mpos, I should be able to determine the end position of the mate read.
However, bam_cigar2rlen() takes int n_cigar and const uint32_t *cigar as its arguments, and it is unclear to me how to get these parameters from the MC tag. I am also not sure if parsing the CIGAR string like this is the most efficient way.
I'm new to C and would greatly appreciate any help. Thanks!
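For reference, here is a minimal sketch of one way to approach this without building the binary uint32_t CIGAR array that bam_cigar2rlen() expects. It walks the text CIGAR from the MC tag directly and sums the operations that consume the reference (M, D, N, =, X). The helper names cigar_str_rlen() and overlap_len() are hypothetical, not part of htslib, and the htslib calls appear only in a comment, on the assumption that the record is a bam1_t *b with both mates aligned to the same reference.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Reference length consumed by a text CIGAR such as "76M2D24M".
 * Returns -1 for a missing ("*") or malformed CIGAR.
 * Hypothetical helper, not an htslib function. */
int64_t cigar_str_rlen(const char *cigar)
{
    int64_t rlen = 0;
    const char *p = cigar;

    if (p == NULL || *p == '\0' || (p[0] == '*' && p[1] == '\0'))
        return -1;

    while (*p) {
        char *end;
        long n;

        if (!isdigit((unsigned char)*p))
            return -1;                  /* every op must start with a count */
        n = strtol(p, &end, 10);

        switch (*end) {
        case 'M': case 'D': case 'N': case '=': case 'X':
            rlen += n;                  /* these ops consume the reference */
            break;
        case 'I': case 'S': case 'H': case 'P':
            break;                      /* these ops do not */
        default:
            return -1;                  /* unknown op or truncated string */
        }
        p = end + 1;
    }
    return rlen;
}

/* Overlap length of two half-open reference intervals [s1,e1) and [s2,e2). */
int64_t overlap_len(int64_t s1, int64_t e1, int64_t s2, int64_t e2)
{
    int64_t lo = s1 > s2 ? s1 : s2;
    int64_t hi = e1 < e2 ? e1 : e2;
    return hi > lo ? hi - lo : 0;
}

/* Assumed usage with htslib (not compiled here):
 *
 *   uint8_t *mc = bam_aux_get(b, "MC");        // NULL if the tag is absent
 *   if (mc && b->core.tid == b->core.mtid) {
 *       int64_t mlen = cigar_str_rlen(bam_aux2Z(mc));
 *       if (mlen >= 0) {
 *           int64_t mate_end = b->core.mpos + mlen;
 *           int64_t ov = overlap_len(b->core.pos, bam_endpos(b),
 *                                    b->core.mpos, mate_end);
 *       }
 *   }
 */
```

Parsing the text form is a single pass over a short string, so it should be cheap compared with the rest of the per-read work. The sketch treats positions as 0-based half-open intervals, which matches core.pos, core.mpos and bam_endpos() in htslib.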
Thank you so much, it worked wonderfully! I wasn't aware of the bam_aux2Z() function and was messing around with bam_aux_get_str(), which caused a few unexpected behaviors.