Pysam under pypy?
9.9 years ago
John 13k

Because pysam is a compiled C extension, I've never been able to run it under pypy.

This is a shame, since for most bioinformatics operations (string manipulation, typical data structures, etc.) pypy is considerably faster than python2.x.

I think it's probably worth getting pysam to run under pypy if possible, but before I start down that road, has anyone already figured out how to get it to work? :)

Thanks so much!

- John

pysam python pypy • 3.5k views

If pysam spends most of its time in zlib or the samtools/htslib C code, pypy won't help.


Yes, but most Python programs that use pysam do something with the read data.

String manipulation, moving things around in memory, comparisons, etc.

So I appreciate we're not going to be able to read BAM files quicker, but our scripts overall would be a lot faster :)
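To make the point concrete, here is a minimal sketch of the kind of per-read work meant here, written in plain Python 3 with no C extensions. The function names and the `(name, sequence)` input shape are hypothetical, chosen only for illustration; they are not part of pysam. No speed-up is claimed for this exact code; it only shows the sort of tight loop over strings that stays in the interpreter rather than in htslib.

```python
# Illustrative per-read work in pure Python (names are hypothetical).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Return the fraction of G/C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    gc = sum(1 for base in seq if base in "GCgc")
    return gc / len(seq)


def summarize(reads):
    """Yield (name, reverse complement, GC fraction) for each (name, seq) pair."""
    for name, seq in reads:
        yield name, reverse_complement(seq), gc_fraction(seq)
```

For example, `list(summarize([("r1", "AACG")]))` walks each read once and does all of its work on Python strings.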

8.4 years ago

Because this problem comes up quite frequently for me, I reimplemented a lot of the pysam functionality in pure Python so that it can run under pypy. Feel free to fork it and clean it up. https://github.com/nijibabulu/pypysam/


Good job Rob! I wish you had mentioned it a long time ago, before I tried the same thing here. My code only works for BAMs, but it looks like yours handles FASTA and everything else in htslib too! Very nice work :)

8.4 years ago
Eric T. ★ 2.8k

hts-python is another Python wrapper for htslib, the C library underlying pysam and samtools. It uses CFFI instead of Cython and is compatible with PyPy. It is a less mature project, but the author (brentp) has shown some promising performance benchmarks.


I also highly recommend hts-python, for what it's worth. Actually anything made by Brent. It's probably the most stable thing right now if you want to read BAMs on pypy.

Having said that, the pure-Python methods like what Robert and I posted are probably better ideas going forward than trying to hook htslib, assuming all you want to do is read a BAM file. There's no C to compile, it's just as fast if not faster, and there are no dependencies on other Python or C projects that users may or may not have.
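As a rough illustration of why a pure-Python reader is feasible: a BAM file is a series of gzip-compatible (BGZF) blocks, so the standard library's `gzip` and `struct` modules are enough to get at the header. The sketch below is not the code from either project mentioned in this thread; `read_bam_header` is a hypothetical helper, and it assumes the header layout from the SAM/BAM specification (magic `BAM\1`, header text, then a list of reference names and lengths, all integers 32-bit little-endian). It stops before the alignment records.

```python
import gzip
import struct


def read_bam_header(fileobj):
    """Return (header_text, [(ref_name, ref_length), ...]) from a binary BAM stream.

    Hypothetical helper for illustration; reads only the header section.
    """
    # GzipFile transparently reads concatenated gzip members, which is what BGZF is.
    with gzip.GzipFile(fileobj=fileobj, mode="rb") as bam:

        def read_exact(n):
            data = bam.read(n)
            if len(data) != n:
                raise ValueError("truncated BAM header")
            return data

        def read_uint32():
            return struct.unpack("<I", read_exact(4))[0]

        if read_exact(4) != b"BAM\x01":
            raise ValueError("not a BAM file (bad magic)")

        # Plain-text SAM header; it may or may not be NUL-terminated.
        text = read_exact(read_uint32()).rstrip(b"\x00").decode("utf-8")

        refs = []
        for _ in range(read_uint32()):
            # Reference names are stored NUL-terminated.
            name = read_exact(read_uint32()).rstrip(b"\x00").decode("utf-8")
            refs.append((name, read_uint32()))

    return text, refs
```

Usage would look like `with open("example.bam", "rb") as fh: text, refs = read_bam_header(fh)`, where `example.bam` is a placeholder path. Decoding the alignment records that follow takes more `struct` work of the same kind, and decompression still happens in whatever zlib implementation the interpreter ships with.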
