gi|110640213|ref|NC_008253.1|_4832863_4833322_0:0:0_0:0:0_11 163 gi|110640213|ref|NC_008253.1| 4832863 23 70M = 4833253 460 TACCGCAATGTGCTTATTGAAGATGACCAGGGAACGCATTTCCGGCTGGTTATCCGCAATGCCGGAGGGC 2222222222222222222222222222222222222222222222222222222222222222222222 XT:A:U NM:i:0 SM:i:23 AM:i:0 X0:i:1 X1:i:1 XM:i:0 XO:i:0 XG:i:0 MD:Z:70 XA:Z:gi|110640213|ref|NC_008253.1|,+4019608,70M,1;
Suppose I have the alignment shown above. As you can see, the XA tag lists alternative alignments in the form (rname, pos, cigar, NM), where NM is the edit distance. I have noticed that the pos field can carry either a positive or a negative sign.
Where do positional values beginning with a positive or negative sign start in the sequence specified by rname?
The XA tag is not part of the SAM specification; it's a custom extension specific to the aligner. What program are you using?
BWA.
I'd like to assume that it refers to the positive and negative strand, but I can't find anything confirming that.
Since this is a non-standard field, I think one needs to do a bit of reverse engineering to find out exactly how it is created. It makes sense that the sign would be the strand, since none of the other fields carry that information, yet it is important to know.
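For anyone who wants to inspect these entries while checking this, here is a minimal Python sketch that splits an XA value into its parts. It assumes the semicolon-separated (rname, pos, cigar, NM) layout seen in the record above. It also treats the sign on pos as the strand, which is only the working assumption from this thread. The names `AltHit` and `parse_xa` are made up for illustration and are not part of BWA or any library.

```python
from typing import List, NamedTuple


class AltHit(NamedTuple):
    rname: str
    strand: str  # '+' or '-'; ASSUMED to mean forward/reverse strand
    pos: int     # position as written in the tag, with the sign removed
    cigar: str
    nm: int      # edit distance


def parse_xa(xa_field: str) -> List[AltHit]:
    """Split an XA value (with or without the 'XA:Z:' prefix) into hits."""
    prefix = "XA:Z:"
    value = xa_field[len(prefix):] if xa_field.startswith(prefix) else xa_field
    hits = []
    for entry in value.split(";"):
        if not entry:
            continue  # skip the empty piece after the trailing ';'
        # Split from the right so the last three commas delimit pos, cigar, NM.
        rname, signed_pos, cigar, nm = entry.rsplit(",", 3)
        strand = "-" if signed_pos.startswith("-") else "+"
        hits.append(AltHit(rname, strand, abs(int(signed_pos)), cigar, int(nm)))
    return hits


if __name__ == "__main__":
    xa = "XA:Z:gi|110640213|ref|NC_008253.1|,+4019608,70M,1;"
    for hit in parse_xa(xa):
        print(hit)
```

With the sign pulled out into its own field, it is easy to compare each alternative hit against the reference sequence and check whether a '-' entry really matches the reverse complement of the read.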