pysam: filtering reads based on flag
1
0
Entering edit mode
9.3 years ago

I would like to return reads with a flag of 0x02 similar to this

$ samtools view -f 0x02 in.bam 1:100-200

without using system calls in pysam (which are slower to my knowledge)

#!/usr/bin/python

import pysam
pos = "1:100-200"

# open bam
bam = pysam.AlignmentFile(bam_fh,"rb")

# I would like to do something like this, syntax is not correct I imagine
reads =  bam.fetch(region=pos,until_eof=True)
good_reads = reads.flag("0x02")
print len(good_reads)

or is it impossible to subset reads using a flag in pysam without iterating? In that case I'll default to the csamtools commands.

python samtools pysam hts • 7.9k views
ADD COMMENT
2
Entering edit mode

Could you use named pipes instead?

Filtering Stream Bam Output Idealy From Pysam/Python

Basically what I'm thinking is that you could create the named pipe programmatically (using Python's subprocess library, for example) but outside the context of pysam. Then you can open that named pipe using the pysam API, just like any other BAM file.

ADD REPLY
0
Entering edit mode

I ended up using the pipes

ADD REPLY
2
Entering edit mode

Even samtools will do the subsetting by iterating over the alignments, there's no other way to do it.

ADD REPLY
3
Entering edit mode
9.3 years ago

I haven't benchmarked it formally, but you might try using my simplesam module. The BAM decompression is offloaded to samtools in a subprocess, and iteration is simply about as fast as Python's readline will support. If you're checking only for flag 0x02, you can use the Sam.paired attribute. This reminds me that I need to actually write documentation for this module :)

ADD COMMENT
1
Entering edit mode

What's the advantage to pysam? I would love to take your module out for a spin

ADD REPLY
1
Entering edit mode

I'll have to give an explanation on the project page, but in addition to not requiring a C compiler (and the added underlying complexity of calling a foreign function in Python) there is not much advantage. In fact, pysam has much more functionality. The perceived advantage for me is that "simplesam" is pure Python, and so easier to test, inherit from, and extend.

ADD REPLY

Login before adding your answer.

Traffic: 2661 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6