7.7 years ago
abascalfederico
Hi,
There are some MD tags I do not understand. I have no problem with most of them, but in some cases the NM tag (number of mismatches) says there is 1 mismatch and the MD tag is simply "74" (instead of, for example, something like 56A17, 3T71, etc).
How is it possible that there is a mismatch but the MD tag has no information about the mismatch? BTW, I see no mismatch with IGV.
The data were aligned with bwa sampe.
Thanks, Federico
I vaguely remember that the NM and MD tags are determined in different processes, which is probably why samtools has long had a calmd subcommand.
Thanks Devon. It's strange... Looking at the CIGAR column and the MD tag, I have realised that all the cases I found unexpected involve insertions in the read. For instance, the MD=74 mentioned above comes with CIGAR=50M1I24M. In contrast, deletions are described in the MD field: CIGAR=50M1D25M, MD=50^T25. Running samtools calmd does not change this.
I'll have to consider both the CIGAR string and the MD tag (if I want to spare myself from decoding the real alignment)...
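For anyone landing here later: as far as I understand the SAM spec, NM is the edit distance (mismatches plus inserted and deleted bases), while MD only describes the reference bases, so an inserted base counts towards NM but leaves no trace in MD. Below is a minimal sketch of combining the two fields, checked against the two examples above. The function name `edit_distance_from_tags` is made up for illustration, and it assumes well-formed CIGAR and MD strings.

```python
import re

def edit_distance_from_tags(cigar: str, md: str) -> int:
    """Rough reconstruction of NM from a CIGAR string and an MD tag.

    Hypothetical helper, not part of samtools or any library.
    """
    # Inserted (I) and deleted (D) bases are counted from the CIGAR string.
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    indel_bases = sum(int(length) for length, op in ops if op in "ID")

    # In MD, deleted reference bases appear as ^ACGT...; drop them so that
    # the only letters left are mismatched reference bases.
    md_without_deletions = re.sub(r"\^[A-Za-z]+", " ", md)
    mismatches = len(re.findall(r"[A-Za-z]", md_without_deletions))

    return indel_bases + mismatches

# The two alignments discussed in this thread:
print(edit_distance_from_tags("50M1I24M", "74"))      # insertion: MD has no letters
print(edit_distance_from_tags("50M1D25M", "50^T25"))  # deletion: shown as ^T in MD
```

Both calls give 1, which would match NM=1 even though MD is just "74" in the insertion case.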