If I take a random read from a BAM file and want to know what the reference base is at position 1 of that read, is there a way to do this? I was thinking I could take the POS column of the BAM record and add the offset of the base I am looking at within the read, to get its position on the reference sequence.
Maybe a different way to say this: is there a way to get the nth base of a specific chromosome in a reference sequence (FASTA)?
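To make the idea concrete, here is a minimal Python sketch of both halves. One caveat on the POS arithmetic: POS plus offset is only right if the CIGAR has no soft clips, insertions or deletions before the base in question, so the sketch walks the CIGAR instead. The function names `read_to_ref_pos` and `fasta_base` are made up for illustration, and the FASTA lookup is a plain linear scan; for real data an indexed lookup (for example `samtools faidx ref.fa chr1:100-100`) would be the more usual route.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def read_to_ref_pos(pos, cigar, read_index):
    """Map a 0-based index into the read's SEQ to a 1-based reference position.

    `pos` is the 1-based POS column. Returns None when the read base is
    soft-clipped or inserted, because no reference base lines up with it.
    """
    ref = pos    # 1-based reference position of the next aligned base
    query = 0    # 0-based index of the next base in SEQ
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":        # consumes both read and reference
            if read_index < query + length:
                return ref + (read_index - query)
            query += length
            ref += length
        elif op in "IS":       # consumes read only
            if read_index < query + length:
                return None
            query += length
        elif op in "DN":       # consumes reference only
            ref += length
        # H and P consume neither read nor reference
    raise IndexError("read_index is beyond the end of the read")


def fasta_base(path, chrom, position):
    """Return the base at 1-based `position` of `chrom` by scanning a FASTA file."""
    if position < 1:
        raise ValueError("position is 1-based and must be >= 1")
    seen = 0            # bases of the target sequence read so far
    in_target = False
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if in_target:
                    break   # reached the next record without finding the base
                name = line[1:].split()
                in_target = bool(name) and name[0] == chrom
            elif in_target:
                if position <= seen + len(line):
                    return line[position - 1 - seen]
                seen += len(line)
    raise ValueError(f"{chrom}:{position} not found in {path}")
```

So for a read with POS 100 and CIGAR `5S10M`, the first base of SEQ is soft-clipped and has no reference base, while the sixth base (`read_index=5`) sits at reference position 100, which could then be passed to `fasta_base`.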
Thanks for the explanation of the MD tag and the detailed examples, Matt :)
It blows my mind that the MD tag doesn't include a way to encode insertions... but I can't say it surprises me :P
I think your parser is the way to go, if his data contains the MD tag!
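For anyone following along, here is a rough Python sketch of the general idea of using MD together with the CIGAR to recover the reference bases under a read, with no FASTA needed. This is not the parser mentioned above, just an illustration; `reference_from_md` is a made-up name. It relies on MD listing run lengths of matches, the reference base at each mismatch, and `^` followed by deleted reference bases. Since insertions are not in MD, the CIGAR is used to drop inserted and soft-clipped read bases first. Skipped regions (`N` in the CIGAR) are not handled: the result simply omits them.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# MD tokens: a run of matches, a deletion (^ plus bases), or one mismatched base.
MD_RE = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")


def reference_from_md(seq, cigar, md):
    """Rebuild the reference bases covered by an alignment from SEQ, CIGAR and MD."""
    # Step 1: keep only the read bases that are aligned to the reference.
    aligned = []
    i = 0
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=X":
            aligned.append(seq[i:i + length])
            i += length
        elif op in "IS":       # inserted or soft-clipped: in SEQ, not in MD
            i += length
    aligned = "".join(aligned)

    # Step 2: walk the MD string, swapping in reference bases where they differ.
    ref = []
    j = 0
    for match_len, deleted, mismatch in MD_RE.findall(md):
        if match_len:
            n = int(match_len)
            ref.append(aligned[j:j + n])   # read equals reference here
            j += n
        elif deleted:
            ref.append(deleted.upper())    # reference bases absent from the read
        else:
            ref.append(mismatch.upper())   # reference base at a mismatch
            j += 1
    if j != len(aligned):
        raise ValueError("MD tag does not match the aligned length from the CIGAR")
    return "".join(ref)
```

The returned string starts at POS, so its first character is the reference base at POS, its second at POS + 1, and so on.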