3.6 years ago
ja4123
Hey! I am using a pysam iterator like this:
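import pysam

alignments = pysam.AlignmentFile("file.bam", "rb")
# until_eof=True iterates over every read in file order, including unmapped ones
for line in alignments.fetch(until_eof=True):
    print(line)
    break  # stop after the first read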
Output looks like this:
HISEQ:157:HAM0GADXX:1:1101:1635:2143 16 15 73530482 42 102M -1 -1 102 TGGTGGGAAGGTTTGCTCTTCACCAATTAACGAAGGATGGGTAAGGAAGTTAGTTGGTGGTTGGACTCTGCTCTCAGATTCAACCCTCCCTAGCCTTCTATT array('B', [22, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 37, 40, 40, 37, 37, 27, 37, 33, 33, 33, 27, 37, 37, 33, 37, 40, 40, 40, 40, 40, 40, 37, 40, 40, 40, 37, 33, 40, 40, 40, 40, 40, 40, 40, 37, 40, 40, 40, 40, 37, 37, 40, 37, 40, 37, 37, 27, 37, 37, 33, 37, 37, 33, 27, 37, 37, 37, 37, 37, 37, 37, 33, 37, 37, 37, 37, 33, 33, 33, 37, 37, 37, 37, 40, 40, 37, 33, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 33, 33, 33]) [('AS', 0), ('XN', 0), ('XM', 0), ('XO', 0), ('XG', 0), ('NM', 0), ('MD', '102'), ('YT', 'UU')]
I thought that the chromosome number is in the third position of the line, which in this example is 15, but after further analysis I think I am wrong. Does anyone know? Kindly help.
Check if this gives you the result you expected.
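A minimal sketch of that kind of check, assuming each record exposes pysam's `reference_name` attribute (the sequence name from the BAM header, or None when the read has no reference assigned, which usually means it is unmapped). The helper name `count_reads_per_reference` and the stand-in `FakeRead` records are hypothetical and only there so the example runs without a BAM file.

```python
from collections import Counter, namedtuple


def count_reads_per_reference(reads):
    """Tally reads by reference_name; reads without a reference land under None."""
    counts = Counter()
    for read in reads:
        counts[read.reference_name] += 1
    return counts


# With pysam (assumed installed, "file.bam" as in the question):
#   import pysam
#   with pysam.AlignmentFile("file.bam", "rb") as bam:
#       print(count_reads_per_reference(bam.fetch(until_eof=True)))

# Stand-in records (hypothetical, not real data) so the sketch runs on its own
FakeRead = namedtuple("FakeRead", "reference_name")
demo = [FakeRead("15"), FakeRead("15"), FakeRead(None)]
print(count_reads_per_reference(demo))
```

Printing the keys of the resulting counter lists every reference name that occurs in the file, with None collecting the reads that have no reference.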
Your code gives me all chromosomes from 1 to 23, X and Y, and some None. What does None mean, that the read is not mapped? I also got some others like those in the header of the SAM file, for example chrUn_gl000234 and chr1_gl000191_random, and a few similar ones. Do you know what they mean? Thanks for the answer.
Please refer to the headers of the reference FASTA file used in the alignment (the one that produced the BAM).
But in general I have got what I wanted. Thanks!
Yes, it is 15.
What is your "analysis"?
Later I saw something like "chr66", for example, and then I noticed that this position contains numbers above 46. I am working on a human sample.
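One possible explanation, offered as a sketch rather than a confirmed diagnosis: in the pysam version that produced the output above, the third printed field appears to be the 0-based `reference_id`, an index into the BAM header's sequence list, rather than a chromosome name. A header that also lists unplaced contigs such as chrUn_gl000234 has far more than 24 entries, which would account for values above 46. The helper `reference_name_for` and the three-entry header below are hypothetical; with a real file the tuple would come from `pysam.AlignmentFile.references`.

```python
def reference_name_for(references, reference_id):
    """Return the header name for a 0-based reference_id, or None if it is negative."""
    if reference_id < 0:
        # pysam reports -1 when a read has no reference assigned
        return None
    return references[reference_id]


# With pysam (assumed installed, "file.bam" as in the question):
#   import pysam
#   with pysam.AlignmentFile("file.bam", "rb") as bam:
#       for read in bam.fetch(until_eof=True):
#           print(read.reference_id, reference_name_for(bam.references, read.reference_id))
#           break

# Hypothetical header order for illustration; a real file's order comes from its @SQ lines
example_references = ("chr1", "chr2", "chr3")
print(reference_name_for(example_references, 1))
print(reference_name_for(example_references, -1))
```

Because the index is 0-based and follows the header order, the number itself should not be read as a chromosome number; looking it up in the header (or reading `read.reference_name` directly) gives the actual sequence name.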