How to extract the chromosome name from a BAM file using pysam?
3.6 years ago
ja4123 ▴ 30

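Hey! I am iterating over a BAM file with pysam like this:

import pysam

alignments = pysam.AlignmentFile("file.bam", "rb")
# until_eof=True reads the whole file in order, including unmapped reads
for read in alignments.fetch(until_eof=True):
    print(read)
    break  # only show the first record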

Output looks like this:

HISEQ:157:HAM0GADXX:1:1101:1635:2143    16  15  73530482    42  102M    -1  -1  102 TGGTGGGAAGGTTTGCTCTTCACCAATTAACGAAGGATGGGTAAGGAAGTTAGTTGGTGGTTGGACTCTGCTCTCAGATTCAACCCTCCCTAGCCTTCTATT  array('B', [22, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 37, 40, 40, 37, 37, 27, 37, 33, 33, 33, 27, 37, 37, 33, 37, 40, 40, 40, 40, 40, 40, 37, 40, 40, 40, 37, 33, 40, 40, 40, 40, 40, 40, 40, 37, 40, 40, 40, 40, 37, 37, 40, 37, 40, 37, 37, 27, 37, 37, 33, 37, 37, 33, 27, 37, 37, 37, 37, 37, 37, 37, 33, 37, 37, 37, 37, 33, 33, 33, 37, 37, 37, 37, 40, 40, 37, 33, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37, 33, 33, 33])    [('AS', 0), ('XN', 0), ('XM', 0), ('XO', 0), ('XG', 0), ('NM', 0), ('MD', '102'), ('YT', 'UU')]

I thought that the chromosome number is in the third position of the line, which in this example is 15, but after further analysis I think I am wrong. Does anyone know where to find it? Kindly help.

pysam bam • 2.5k views
import pysam

samfile = pysam.AlignmentFile("test.bam", "rb")
# Print the reference (chromosome) name of every record
for read in samfile:
    print(read.reference_name)

Check if this gives you the result you expected.


Your code gives me all chromosomes from 1 to 23, X and Y, and some None. What does None mean? That the read is not mapped? I also got a few other names that appear in the header of the SAM file, such as chrUn_gl000234 and chr1_gl000191_random, but not many. Do you know what they mean? Thanks for the answer.
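On the None values: as far as I know, pysam returns None for reference_name when a record has no reference assigned (reference_id of -1), which is typical for unmapped reads without a mapped mate. Below is a minimal sketch for tallying records per reference name. The helper names `tally_reference_names` and `tally_bam` are hypothetical, and the demo records are stand-in objects, not real pysam records.

```python
from collections import Counter
from types import SimpleNamespace


def tally_reference_names(reads):
    """Count records per reference name; None is reported as '(no reference)'."""
    counts = Counter()
    for read in reads:
        name = read.reference_name
        counts[name if name is not None else "(no reference)"] += 1
    return counts


def tally_bam(path):
    """Same tally for a real BAM file (requires pysam to be installed)."""
    import pysam  # imported lazily so the demo below runs without pysam

    with pysam.AlignmentFile(path, "rb") as bam:
        return tally_reference_names(bam.fetch(until_eof=True))


# Stand-in records for illustration only; real ones come from pysam.
demo_reads = [
    SimpleNamespace(reference_name="chr15"),
    SimpleNamespace(reference_name="chr15"),
    SimpleNamespace(reference_name="chrUn_gl000234"),
    SimpleNamespace(reference_name=None),
]
demo_counts = tally_reference_names(demo_reads)
print(demo_counts)
```

With a real file, `tally_bam("file.bam")` should return the same kind of counter, which makes it easy to see how many records fall on each contig and how many have no reference at all.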


Please refer to the headers of the reference FASTA file that was used for the alignment that produced the BAM file.
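Names such as chrUn_gl000234 or chr1_gl000191_random look like the unplaced and unlocalized contigs of the UCSC hg19 naming scheme, so they would be expected in the reference if that assembly was used. A small sketch for listing the sequence names in a FASTA file follows. `fasta_sequence_names` is a hypothetical helper, and the in-memory FASTA is made up for the demo.

```python
import io


def fasta_sequence_names(handle):
    """Yield the sequence name (first word after '>') of each FASTA record."""
    for line in handle:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


# Tiny in-memory FASTA for illustration; the sequences are made up.
demo_fasta = io.StringIO(
    ">chr1 example description\n"
    "ACGTACGT\n"
    ">chr1_gl000191_random\n"
    "GGCC\n"
    ">chrUn_gl000234\n"
    "TTAA\n"
)
demo_names = list(fasta_sequence_names(demo_fasta))
print(demo_names)
```

For a real reference, the same function can be given an open file handle, for example `with open("reference.fa") as fh:` where the path is a placeholder.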


But in general I have got what I wanted. Thanks!


Yes, it is 15.

"but after further analysis I think I am wrong."

What is your "analysis"?


Later I saw something like "chr66", for example, and then I noticed that this position contains numbers above 46. I am working on a human sample.
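One possible explanation, assuming the printed third field is the record's reference_id: that value is a 0-based index into the reference sequences listed in the BAM header, not a chromosome number. A header that also lists unplaced or random contigs has many more entries than there are chromosomes, so large values are expected. Here is a sketch of the lookup. `name_for_reference_id` and `first_read_reference` are hypothetical helpers, and the header order in the demo is assumed for illustration.

```python
def name_for_reference_id(references, reference_id):
    """Map a 0-based reference_id to its name; negative IDs mean no reference."""
    if reference_id < 0:
        return None
    return references[reference_id]


def first_read_reference(path):
    """Return (reference_id, reference_name) of the first record in a BAM file."""
    import pysam  # imported lazily so the demo below runs without pysam

    with pysam.AlignmentFile(path, "rb") as bam:
        for read in bam.fetch(until_eof=True):
            # bam.references is the tuple of names from the header @SQ lines
            return read.reference_id, name_for_reference_id(
                bam.references, read.reference_id
            )
    return None


# Assumed header order (chr1..chr22, chrX, chrY); a real header may differ.
demo_references = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]
demo_name = name_for_reference_id(demo_references, 15)
print(demo_name)
```

In this assumed ordering, index 15 maps to chr16 rather than chr15, which is why reading `read.reference_name` (or calling `bam.get_reference_name(read.reference_id)`) is safer than interpreting the number directly.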
