Hi, I am new to sequencing.
I am confused about the FLAG field in the SAM format.
I know 0 stands for mapping to the forward strand and 16 stands for mapping to the reverse strand, and 4 stands for unmapped.
But what do the other flags mean? I never see any of them in my files.
I am a computer science student, so I do not know much biology.
As you're a CS student, you understand they are bitwise flags?
That means they can be combined: you can | (OR) together the flags, which are powers of 2, to convey multiple pieces of information in a single number.
This Python script prints each flag value in decimal, hex, and binary:

    from functools import reduce

    def asbin(n):
        """Convert a number to its binary representation (padded with 0's)."""
        return bin(n)[2:].zfill(17)

    print("value\thex\tbinary")
    for exp in range(17):
        val = 2 ** exp
        print("%-5d\t%-4x\t%s" % (val, val, asbin(val)))

    # set all flags by OR-ing every power of two together
    all_ones = reduce(lambda x, y: x | 2**y, range(17), 0)
    print("\nall flags set:", asbin(all_ones))
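To see the combining in action, here is a minimal sketch of setting and testing bits. The constant names and the has_flag helper are made up for this example; only the numeric values (0x1, 0x4, 0x10) come from the SAM flag definitions.

```python
# Bit values from the SAM flag definitions; the names are hypothetical.
PAIRED = 0x1
UNMAPPED = 0x4
REVERSE = 0x10

# Combine with | : a paired read whose SEQ is reverse complemented.
flag = PAIRED | REVERSE
print(flag)  # 17

def has_flag(flag, bit):
    """Return True if the given bit is set in flag (hypothetical helper)."""
    return (flag & bit) != 0

print(has_flag(flag, REVERSE))   # True
print(has_flag(flag, UNMAPPED))  # False
```

So a flag of 17 is just 1 + 16, and you recover each piece of information by AND-ing with the bit you care about.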
0x1 template having multiple segments in sequencing
0x2 each segment properly aligned according to the aligner
0x4 segment unmapped
0x8 next segment in the template unmapped
0x10 SEQ being reverse complemented
0x20 SEQ of the next segment in the template being reversed
0x40 the first segment in the template
0x80 the last segment in the template
0x100 secondary alignment
0x200 not passing quality controls
0x400 PCR or optical duplicate
The numbers in the second column of the SAM file are these hexadecimal values written in decimal (and added together when more than one flag is set).
For example, decimal 16 is 0x10 in hexadecimal, which means "SEQ being reverse complemented", as you already knew.
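If it helps, here is a rough sketch of decoding a decimal flag back into the descriptions listed above. FLAG_BITS and decode_flag are names I made up for this example, and the value 99 is just an illustrative input, not something taken from your data.

```python
# (bit, description) pairs copied from the flag list above.
FLAG_BITS = [
    (0x1, "template having multiple segments in sequencing"),
    (0x2, "each segment properly aligned according to the aligner"),
    (0x4, "segment unmapped"),
    (0x8, "next segment in the template unmapped"),
    (0x10, "SEQ being reverse complemented"),
    (0x20, "SEQ of the next segment in the template being reversed"),
    (0x40, "the first segment in the template"),
    (0x80, "the last segment in the template"),
    (0x100, "secondary alignment"),
    (0x200, "not passing quality controls"),
    (0x400, "PCR or optical duplicate"),
]

def decode_flag(flag):
    """Return the descriptions of every bit set in flag (hypothetical helper)."""
    return [desc for bit, desc in FLAG_BITS if flag & bit]

# 99 = 64 + 32 + 2 + 1, so four bits are set.
for desc in decode_flag(99):
    print(desc)
```

A flag of 0 decodes to an empty list, meaning none of the conditions apply. If your reads are single-end, the pair-related bits (0x1, 0x2, 0x8, 0x20, 0x40, 0x80) would normally never be set, which is probably why you only see 0, 4 and 16.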
This is a very useful website!