I'm 99% sure, but I could not find the answer on Google. Creating this question for other BCF newbies.
- VCF AND BCF
and, sadly, BCF is not always bgzipped... (raw bcf)
There is also “raw BCF” which bcftools outputs with -Ou. Historically HTSJDK has only supported raw BCF not real BCF https://t.co/EqDIAxtsWH
— John Marshall (@jomarnz) September 15, 2017
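For anyone unsure which flavour they have on disk, here is a minimal sketch that classifies a file by its first bytes. It assumes the usual magic numbers: gzip/BGZF streams start with 0x1f 0x8b, raw BCF starts with the ASCII bytes "BCF", and text VCF starts with "##fileformat". The enum and the function `sniff_variant_format` are hypothetical names made up for this illustration; they are not part of htslib or bcftools.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch (not an htslib type). */
typedef enum {
    FMT_UNKNOWN,
    FMT_GZIP_OR_BGZF, /* vcf.gz or compressed BCF: must decompress to tell which */
    FMT_RAW_BCF,      /* uncompressed BCF, e.g. written by bcftools -Ou */
    FMT_TEXT_VCF      /* plain text VCF */
} variant_fmt;

/* Classify a buffer holding the first bytes of a file. */
variant_fmt sniff_variant_format(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len >= 2 && buf[0] == 0x1f && buf[1] == 0x8b)
        return FMT_GZIP_OR_BGZF;
    if (len >= 3 && memcmp(buf, "BCF", 3) == 0)
        return FMT_RAW_BCF;
    if (len >= 12 && memcmp(buf, "##fileformat", 12) == 0)
        return FMT_TEXT_VCF;
    return FMT_UNKNOWN;
}
```

Note that a compressed file only tells you "gzip or BGZF" at this level; the first decompressed bytes would have to be checked the same way to separate vcf.gz from compressed BCF.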
"Historically"? The linked github issue is still open. And e.g. bcftools still internally converts all formats to uncompressed BCF for processing.
Can somebody please explain to me why I should use BCF? I have only read that it should be faster to parse, without further explanation. The same source also says I should use VCF for data sharing and processing by custom scripts (DOI: 10.1093/gigascience/giab008).
Why is BCF faster to parse? Why not just use indexed and bgzipped VCF? That would have the benefits of being small, fast, and portable. What am I missing?
Can somebody please explain to me why I should use BCF? I have only read that it should be faster to parse, without further explanation.
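VCF is a text format that can be compressed (vcf.gz).
BCF is not a text format but a binary format, which is generally faster to read: values are stored in fixed-size binary form, so the reader can copy them directly instead of parsing them character by character.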
e.g. how to read an integer from text in C (yes, I could use fscanf too):
/* needs <stdio.h> and <ctype.h>; `in` is an open FILE* */
int n = 0, c;
while ((c = fgetc(in)) != EOF && isdigit(c)) { n = n * 10 + (c - '0'); }
and from binary in C:
fread(&n, sizeof(int), 1, in); /* destination pointer first, stream last */
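To make the comparison concrete, the two fragments above can be wrapped into a small round trip. This is only a sketch under simplifying assumptions: the value is non-negative, the binary copy uses the machine's native `int` layout (real BCF uses little-endian typed values of explicit width), and `read_int_text`, `read_int_binary` and `roundtrip` are hypothetical helper names.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Text: one call and one multiply-add per digit; also consumes the delimiter. */
int read_int_text(FILE *in)
{
    assert(in != NULL);
    int n = 0, c;
    while ((c = fgetc(in)) != EOF && isdigit(c))
        n = n * 10 + (c - '0');
    return n;
}

/* Binary: a single fixed-size copy. Returns 1 on success, 0 otherwise. */
int read_int_binary(FILE *in, int *out)
{
    assert(in != NULL && out != NULL);
    return fread(out, sizeof *out, 1, in) == 1;
}

/* Write `value` as text and then as binary into a temp file, read both back. */
int roundtrip(int value, int *from_text, int *from_binary)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return 0;
    fprintf(f, "%d\n", value);          /* text form, newline as delimiter */
    fwrite(&value, sizeof value, 1, f); /* binary form, native layout */
    rewind(f);
    *from_text = read_int_text(f);      /* stops after eating the newline */
    int ok = read_int_binary(f, from_binary);
    fclose(f);
    return ok;
}
```

Calling `roundtrip(12345, &t, &b)` should leave the same value in both `t` and `b`; the difference is that the text path touches every digit, while the binary path is one `fread` of `sizeof(int)` bytes.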