Is there a library or piece of code that I can use to parse the absolute reference positions of large indels ( +100 bp ) from a cigar string like one shown below? I looked in Picard but I couldn't find it, maybe I missed it? It would be nice to have the code in java.
I could try to write something myself but there is a ton of tools that use bam files and they all must be able to parse a bam file and the cigar string. But I can't find a library or piece of code that I can reuse to parse absolute locations of indels from the cigar string.
The cigar string shown below is from the actual alignment of a bac contig with bwa-mem against a reference. I would like to use bac alignments like one to extract a truth set of large indels. To see how good my NGS data based SV calling is.
bac_contig1 2064 1 13884324 60 1330M4D814M4D2891M1D82M136I973M6I2277M1D3M5D2M4D8M4I12M124D5M30D5M53D6M84D4M156D4M12D11M206D10M435D5M16D6M27D3M72D6M19D4M73D5M254D3M103D5M29D7M898D23M6D202M24D37M2D1341M16I2454M3D1293M11I8M1D9M30D2M3D24M7D5M18D3M8D10M4D4M1D19M1I5M7D10M9D4M2D4M7D13M5I5M13D7M2D5M10I5M31D14M27D12M15D6M11D12M4D7M3D3M1I1M3D9M2D6M1D8M2I10M7D3M10D6M4I21M2I2M3D4M5I6M9I7M1D6M6D11M5D3M8D6M9D7M20D4M1I16M68D5M61D9M25D7M6D5M2D2M2I3M1D7M7D4M14D5M14D4M1I6M18D3M8D9M31D11M3D5M53D3M12D7M22D3M5D8M17D6M1I4M1I3M5D9M129D4M1I5M73D888M6I1589M2I1156M2I4702M1D649M2I2957M12I1730M1D3647M6D2438M1I1614M28D220M8I4424M882I5215M1D1867M1I457M9I3621M2I59M4D996M3D852M1I372M136921H