Getting genomic locations from CIGAR string
7.0 years ago
KVC_bioinfo ▴ 600

Hello all,

I want to get all genomic locations (start and end) where an alignment occurred. For this, I am trying to write a Python script. I plan to use the CIGAR string from the SAM file to find the number of matches, together with the starting position of the alignment.

I have multiple lists of (i, j) tuples:

[(0, 117), (3, 29773), (0, 253), (2, 1325), (0, 145)]

[(0, 116), (2, 1), (3, 3419), (0, 327), (3, 21529), (0, 286), (2, 1)]

[(0, 117), (3, 25275), (0, 180), (1, 1), (0, 1), (3, 5895), (0, 145)]

I also have a list of starting positions, one for each tuple list:

[66905968, 66906104, 66905996]

Desired output:

For each number in my list, I want to add up the values (j) from the tuples where i = 0 or 2. There is one condition: every time i is 3, the current interval should end, and the position reached after adding that j becomes the next starting point.

For example, for:

[(0, 117), (3, 29773), (0, 253), (2, 1325), (0, 145)]

and

66905968

I want:

66905968, 66905968+117

66905968+117+29773, 66905968+117+29773+253+1325+145
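
The rule above can be sketched as a small function that works directly on the tuple lists. This is a minimal sketch, not a definitive implementation: the name `aligned_blocks` is hypothetical, and it assumes the tuples use the usual BAM operation codes (0 = match, 1 = insertion, 2 = deletion, 3 = skipped region), so any code other than 0, 2 and 3 is ignored.

```python
def aligned_blocks(start, cigar):
    """Return (start, end) pairs, one per block between skipped regions.

    `aligned_blocks` is a hypothetical helper; `cigar` is a list of
    (operation, length) tuples as shown in the question.
    """
    blocks = []
    block_start = start
    pos = start
    for op, length in cigar:
        if op in (0, 2):
            # Match or deletion: extend the current block.
            pos += length
        elif op == 3:
            # Skipped region: close the block, then jump past the skip.
            blocks.append((block_start, pos))
            pos += length
            block_start = pos
        # Assumption: every other operation (e.g. 1) is ignored.
    blocks.append((block_start, pos))
    return blocks


cigars = [
    [(0, 117), (3, 29773), (0, 253), (2, 1325), (0, 145)],
    [(0, 116), (2, 1), (3, 3419), (0, 327), (3, 21529), (0, 286), (2, 1)],
    [(0, 117), (3, 25275), (0, 180), (1, 1), (0, 1), (3, 5895), (0, 145)],
]
starts = [66905968, 66906104, 66905996]

# Pair each starting position with its tuple list.
for start, cigar in zip(starts, cigars):
    print(start, aligned_blocks(start, cigar))
```

For the first list and 66905968 this gives (66905968, 66906085) and (66935858, 66937581), which matches the two lines of the example.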

I have the following code so far:

import pysam
import sys


pos = []

new = []
reffile = pysam.Fastafile("ref.fasta")
pure_bam = pysam.AlignmentFile('sample.bam', "rb")

for read in pure_bam:
    pos.append(read.pos)
    for pp_sam in pos:
        for i, j in read.cigar:
            while i == 0 or i == 2:
                new.append(j + pp_sam)

This is definitely not giving the desired output. Could someone help me? Thank you very much in advance.


Unless this is a learning exercise, others have done this already:
Going From Cigar String In Sam To Genomic Coordinates?
Python Cigar String - Finding Indels Break Points Positions
and possibly others.


Could anyone tell me what I am doing wrong here?
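
A few things in the posted loop stand in the way: `while i == 0 or i == 2` never exits because `i` does not change inside it; `pos` keeps growing, so each read's CIGAR is combined with every start position collected so far; and nothing keeps a running position or reacts to `i == 3`. Below is a minimal sketch of one way to restructure it, not a definitive implementation. The names `read_blocks` and `blocks_from_bam` are hypothetical; `reference_start`, `cigartuples`, `is_unmapped` and `query_name` are pysam `AlignedSegment` attributes, and `sample.bam` is the file name from the question. The pysam import is deferred into `blocks_from_bam` only so that the stand-in demo at the bottom runs without a BAM file.

```python
from types import SimpleNamespace


def read_blocks(read):
    """Return (start, end) pairs for one read, split at skipped regions."""
    block_start = pos = read.reference_start
    blocks = []
    # cigartuples is None for reads without an alignment.
    for op, length in read.cigartuples or []:
        if op in (0, 2):      # match or deletion: stay in the block
            pos += length
        elif op == 3:         # skipped region: end block, start a new one
            blocks.append((block_start, pos))
            pos += length
            block_start = pos
        # Assumption: other operations do not move the position.
    blocks.append((block_start, pos))
    return blocks


def blocks_from_bam(path):
    """Yield (read name, blocks) for every mapped read in a BAM file."""
    import pysam  # deferred so the demo below works without pysam

    with pysam.AlignmentFile(path, "rb") as bam:
        for read in bam:      # a single loop over the reads
            if read.is_unmapped:
                continue
            yield read.query_name, read_blocks(read)


# Stand-in for a pysam read, using the third example from the question.
fake_read = SimpleNamespace(
    reference_start=66905996,
    cigartuples=[(0, 117), (3, 25275), (0, 180), (1, 1), (0, 1),
                 (3, 5895), (0, 145)],
)
print(read_blocks(fake_read))

# With a real file (not run here):
# for name, blocks in blocks_from_bam("sample.bam"):
#     print(name, blocks)
```

In pysam, `reference_start` is 0-based, so the pairs come out as 0-based, end-exclusive intervals; add 1 to each start if 1-based coordinates are needed.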
