Question

Recommendations for Python VCF Parser/Writer?

11

13.0 years ago

Reece ▴ 310

I'm looking for a VCF 4.1 parser and writer. I'm aware of these:

Do you know of other options or have recommendations to share?

vcf python variant variation • 22k views

ADD COMMENT • link updated 5.1 years ago by Dataman ▴ 380 • written 13.0 years ago by Reece ▴ 310

score 14 · Answer 1 · 2012-01-02

13.0 years ago

brentp 24k

I've looked at the ones you mention and any others I could find. This one seems to be the most complete and easiest to use: https://github.com/jdoughertyii/PyVCF

usage is like:

from vcf import VCFReader  # early PyVCF name; later releases use vcf.Reader

for rec in VCFReader(open('some.vcf')):
    print(rec.CHROM, rec.POS, rec.filter, rec.info["AF"])

It does not have a writer class, though.
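
To make concrete what a reader like this hands back per record, here is a minimal stdlib-only sketch. It is not PyVCF's implementation: `parse_vcf_line` and `parse_info` are hypothetical helpers, the sample line is adapted from the example in the VCF specification, and values are left as strings rather than typed from the header as a real parser would do.

```python
# Stdlib-only sketch; these helpers are illustrative, not part of PyVCF.
FIXED_FIELDS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_info(text):
    """Turn 'DP=14;AF=0.5;DB' into {'DP': '14', 'AF': '0.5', 'DB': True}."""
    info = {}
    if text == ".":  # "." means the INFO column is empty
        return info
    for item in text.split(";"):
        key, sep, value = item.partition("=")
        info[key] = value if sep else True  # flag entries carry no value
    return info


def parse_vcf_line(line):
    """Split one tab-delimited VCF data line into a dict of the fixed columns."""
    cols = line.rstrip("\n").split("\t")
    rec = dict(zip(FIXED_FIELDS, cols))  # sample columns, if any, are ignored
    rec["POS"] = int(rec["POS"])
    rec["INFO"] = parse_info(rec["INFO"])
    return rec


rec = parse_vcf_line("20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB\n")
print(rec["CHROM"], rec["POS"], rec["FILTER"], rec["INFO"]["AF"])
```

A real library additionally reads the `##INFO` header lines to convert values to the declared types and handles the per-sample genotype columns, which is the main reason to prefer one over hand-rolled splitting.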

EDIT:

This has become the official fork, and it has a writer class.

ADD COMMENT • link 12.9 years ago by brentp 24k

1

I am using that library as well (with a couple of minor mods) for another project. Works okay for me.

ADD REPLY • link 13.0 years ago by Aaronquinlan 12k

1

The idea behind the UPPERCASE names was to distinguish native (upper) fields from derived (lower) attributes/methods. For better or worse...

ADD REPLY • link 12.9 years ago by Aaronquinlan 12k

0

Thanks. Any idea why the field names are UPPERCASE?

ADD REPLY • link 12.9 years ago by Haibao Tang 3.0k

0

Not sure other than that's how they appear in the VCF filter. You could file a bug at https://github.com/jamescasbon/PyVCF

ADD REPLY • link 12.9 years ago by brentp 24k

0

PyVCF is too slow. Is there anything else in Python that uses a C++ backend?

ADD REPLY • link 8.5 years ago by sacha ★ 2.4k

1

CyVCF2 https://github.com/brentp/cyvcf2

ADD REPLY • link 7.0 years ago by Eli Korvigo ▴ 230

score 2 · Answer 2 · 2012-02-09

For C++, I've written vcflib. It includes utilities for haplotype-based file comparison (for accurate indel comparisons), filtering, and statistical summarization. It can operate on uncompressed VCF files as well as compressed, tabix-indexed ones. Mostly, I've used it as a reader/writer class for other projects.

score 1 · Answer 3 · 2019-11-11

I know this question is rather old and already has an answer, but it is still relevant. A more recent alternative for parsing VCF files in Python (both versions 2 and 3) is cyvcf2, which is made by two well-known bioinformaticians: Brent Pedersen and Aaron Quinlan.
Documentation: http://brentp.github.io/cyvcf2/
GitHub: https://github.com/brentp/cyvcf2
Journal article: https://academic.oup.com/bioinformatics/article/33/12/1867/2971439
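
Since the original question asked for both a parser and a writer, here is a rough sketch of how reading and writing might look with cyvcf2. It assumes cyvcf2 is installed; `copy_passing_variants` and the file names in the usage line are hypothetical, and the call names should be checked against the cyvcf2 documentation for the version you have.

```python
def copy_passing_variants(src_path, dst_path):
    """Copy records whose FILTER is PASS or unset from src_path to dst_path."""
    # Imported lazily so this sketch can be loaded where cyvcf2 is not installed.
    from cyvcf2 import VCF, Writer

    reader = VCF(src_path)
    writer = Writer(dst_path, reader)  # reuse the input header as the template
    kept = 0
    for variant in reader:
        # cyvcf2 reports FILTER as None when the record passed or has no filter
        if variant.FILTER is None:
            writer.write_record(variant)
            kept += 1
    writer.close()
    reader.close()
    return kept
```

Usage would be something like `copy_passing_variants("in.vcf.gz", "passing.vcf")`, which returns the number of records written.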