Question

Is bcf just a bgzipped vcf?

0

7.7 years ago

endrebak ▴ 980

I'm 99% sure, but could not find the answer on Google. Creating this question for other BCF newbies.

bcf vcf • 6.7k views

ADD COMMENT • link updated 3.3 years ago by Pierre Lindenbaum 166k • written 7.7 years ago by endrebak ▴ 980

0

VCF AND BCF

ADD REPLY • link 7.7 years ago by venu 7.1k

score 9 · Accepted Answer · 2017-09-20

9

7.7 years ago

Paul ★ 1.5k

BCF is the binary counterpart of the VCF format. Please look here for a description. The relationship is similar to SAM vs BAM.

ADD COMMENT • link 7.7 years ago by Paul ★ 1.5k

4

And, sadly, BCF is not always bgzipped: there is also raw (uncompressed) BCF.

There is also “raw BCF” which bcftools outputs with -Ou. Historically HTSJDK has only supported raw BCF not real BCF https://t.co/EqDIAxtsWH
— John Marshall (@jomarnz) September 15, 2017
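One way to tell the two apart is to look at the first bytes of the file. The following is a minimal sketch, not how bcftools or HTSJDK do it: the function name `classify_header` and the enum are hypothetical, and it relies only on the "BCF" magic string that starts an uncompressed BCF file and the gzip magic bytes 0x1f 0x8b that start any gzip/BGZF stream.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for what the first bytes of a file suggest. */
enum vcf_kind { KIND_UNKNOWN, KIND_RAW_BCF, KIND_GZIP_OR_BGZF };

/* Classify a file from its first `len` bytes (hypothetical helper). */
static enum vcf_kind classify_header(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    /* Raw (uncompressed) BCF starts directly with the magic "BCF". */
    if (len >= 3 && memcmp(buf, "BCF", 3) == 0)
        return KIND_RAW_BCF;

    /* gzip and BGZF streams both start with the bytes 0x1f 0x8b. */
    if (len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b)
        return KIND_GZIP_OR_BGZF;

    /* Anything else, e.g. plain-text VCF beginning with "##fileformat". */
    return KIND_UNKNOWN;
}
```

Note that a gzip/BGZF result does not distinguish compressed BCF from vcf.gz; for that, the first block would have to be decompressed and checked for the "BCF" magic.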

ADD REPLY • link 7.7 years ago by Pierre Lindenbaum 166k

0

"Historically"? The linked github issue is still open. And e.g. bcftools still internally converts all formats to uncompressed BCF for processing.

Can somebody please explain to me why I should use BCF? I have only read that it should be faster to parse, without further explanation. The same source also says I should use VCF for data sharing and for processing by custom scripts (DOI: 10.1093/gigascience/giab008).

Why is BCF faster to parse? Why not just use indexed and (b)gzipped VCF? That would have the benefits of being small, fast, and portable. What am I missing?

ADD REPLY • link 3.3 years ago by jena ▴ 320

0

Can somebody please explain to me why I should use BCF? I have only read that it should be faster to parse, without further explanation.

VCF is a text format that can be compressed (vcf.gz).

BCF is not a text format but a BINARY format, which is faster to read: values are stored in their machine representation and do not have to be parsed character by character.

E.g., how to read an integer in C from text (yeah, I could use fscanf too):

int n = 0, c;
while ((c = fgetc(in)) != EOF && isdigit(c)) { n = n * 10 + (c - '0'); }

In C from binary:

fread(&n, sizeof(int), 1, in);
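The same contrast can be sketched on in-memory buffers, which makes the two routes easy to compare side by side. This is only an illustration: `parse_text_int` and `read_le_int32` are hypothetical helpers, and the binary route assumes a 4-byte little-endian layout rather than reproducing BCF's actual typed encoding.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Text route (VCF-like): walk the digits one by one and accumulate.
   Handles non-negative decimal numbers only; no overflow checking. */
static int32_t parse_text_int(const char *s)
{
    assert(s != NULL);
    int32_t n = 0;
    while (isdigit((unsigned char)*s)) {
        n = n * 10 + (*s - '0');
        s++;
    }
    return n;
}

/* Binary route (BCF-like): the value is already stored as 4 bytes.
   Assumption: little-endian byte order; assemble the bytes explicitly
   so the result does not depend on the host's endianness. */
static int32_t read_le_int32(const unsigned char *buf)
{
    assert(buf != NULL);
    uint32_t u = (uint32_t)buf[0]
               | ((uint32_t)buf[1] << 8)
               | ((uint32_t)buf[2] << 16)
               | ((uint32_t)buf[3] << 24);
    int32_t n;
    memcpy(&n, &u, sizeof n);  /* reinterpret the bits as a signed value */
    return n;
}
```

The text route needs one multiply-and-add per digit plus a character test, while the binary route is a fixed handful of operations regardless of how large the number is.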

ADD REPLY • link 3.3 years ago by Pierre Lindenbaum 166k