I am new to pysam and I'm trying to figure out the output of each column of pysam.AlignmentFile. My code is as follow:
for b in pysam.AlignmentFile(bam,'rb'):
print (b)
The output I see is this:
HWI-ST1285:282:C8MPEACXX:6:2211:11377:16289 163 0 10281 12 31S29M40S 0 10369 29 CCCTCACCCTCACCCTCACCCTCACCCTCACCCCTACCCCCAACCCCAACCCCAACCCCACACCCCACCCCAAACCCACCACCCACACCTCCCCCCACCC array('B', [34, 34, 31, 37, 37, 37, 35, 37, 30, 35, 39, 39, 35, 40, 38, 31, 37, 34, 38, 38, 40, 36, 38, 35, 37, 39, 40, 40, 8, 34, 38, 33, 35, 31, 31, 30, 8, 23, 33, 37, 39, 33, 33, 37, 14, 20, 31, 26, 31, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]) [('NM', 1), ('MD', '5A23'), ('AS', 24), ('XS', 35), ('RG', '6_AD016')]
HWI-ST1285:282:C8MPEACXX:6:2211:11377:16289 83 0 10369 25 58M1I12M1D29M 0 10281 100 CACCCTCACCCTAACCCTAACCCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTA array('B', [2, 2, 33, 35, 32, 27, 27, 18, 33, 34, 30, 32, 23, 7, 33, 35, 33, 33, 31, 28, 33, 32, 32, 27, 24, 18, 6, 25, 33, 34, 32, 26, 29, 21, 32, 33, 30, 30, 26, 13, 36, 39, 34, 35, 28, 22, 32, 40, 38, 34, 40, 37, 40, 38, 37, 28, 30, 21, 41, 39, 38, 35, 39, 37, 33, 38, 39, 39, 35, 37, 36, 34, 38, 39, 36, 34, 34, 39, 37, 38, 36, 36, 36, 40, 40, 38, 38, 32, 35, 39, 37, 37, 26, 33, 32, 37, 35, 27, 30, 31]) [('NM', 5), ('MD', '0A5A63^C28C0'), ('AS', 78), ('XS', 78), ('RG', '6_AD016')]
However, when I use samtools view to view bam, I got this:
HWI-ST1285:282:C8MPEACXX:6:2211:11377:16289 163 chr1 10282 12 31S29M40S = 10370 188 CCCTCACCCTCACCCTCACCCTCACCCTCACCCCTACCCCCAACCCCAACCCCAACCCCACACCCCACCCCAAACCCACCACCCACACCTCCCCCCACCC CC@FFFDF?DHHDIG@FCGGIEGDFHII)CGBD@@?)8BFHBBF/5@;@################################################### NM:i:1 MD:Z:5A23 AS:i:24 XS:i:35 RG:Z:6_AD016
HWI-ST1285:282:C8MPEACXX:6:2211:11377:16289 83 chr1 10370 25 58M1I12M1D29M = 10282 -188 CACCCTCACCCTAACCCTAACCCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTA ##BDA<<3BC?A8(BDBB@=BAA<93':BCA;>6AB??;.EHCD=7AIGCIFIGF=?6JHGDHFBGHHDFECGHECCHFGEEEIIGGADHFF;BAFD<?@ NM:i:5 MD:Z:0A5A63^C28C0 AS:i:78 XS:i:78 RG:Z:6_AD016
I notice that the 9th column is different (29 and 100 from pysam and 188 and -188 from samtools view). Why is this so? Also, from the samtools view output, I can see that the difference between 10370 - 10282 is 88, but why is 100 being added to the 9th column (template length) to yield 188?
Thanks for your help!