I've seen a similar example in other places. It's piping pileup output to bcftools. This made sense before. However, newer versions of samtools (1.0+) have a --VCF flag. Why is it still necessary to pipe to bcftools? Wasn't bcftools there just to convert to VCF format?
Apologies for a late response, but I have recently been looking at samtools mpileup. I found that the following works very well (here, I use multiple samples over which variants are called):
Because bcftools call is what actually does the variant calling. mpileup only outputs genotype likelihoods (in VCF or BCF format), which bcftools then uses to call variants (again in VCF or BCF format). Does that make sense?
You could try running samtools mpileup with the --VCF argument and see what the "VCF" output looks like. I'm telling you this because some time ago I had the same doubt, and after doing that check the answer was clear to me :-)
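To make the two steps concrete, here is a rough sketch of what I mean, not a definitive pipeline. The file names (ref.fa, sample1.bam, sample2.bam, calls.vcf) are made up for illustration, and it assumes a samtools 1.x release whose mpileup still accepts -g (BCF) and -v/--VCF; I believe later releases moved this functionality to bcftools mpileup, so check your version.

```shell
# Hypothetical inputs: replace with your own reference and BAM files.
REF="ref.fa"
BAM1="sample1.bam"
BAM2="sample2.bam"

# Step 1 (inspection only): the VCF written by mpileup holds per-site
# genotype likelihoods, not called variants. Print the first few
# non-metadata lines to see this for yourself.
show_likelihoods() {
    samtools mpileup -u -v -f "$REF" "$BAM1" "$BAM2" | grep -v '^##' | head -n 5
}

# Step 2: stream the likelihoods as uncompressed BCF into bcftools call,
# which performs the actual calling (-m multiallelic caller, -v variant
# sites only, -O v plain VCF output).
call_variants() {
    samtools mpileup -u -g -f "$REF" "$BAM1" "$BAM2" |
        bcftools call -m -v -O v -o calls.vcf
}

# Only run when the tools and the (assumed) input files are present.
if command -v samtools >/dev/null 2>&1 &&
   command -v bcftools >/dev/null 2>&1 &&
   [ -f "$REF" ] && [ -f "$BAM1" ] && [ -f "$BAM2" ]; then
    show_likelihoods
    call_variants
else
    echo "samtools/bcftools or input files not found; nothing was run" >&2
fi
```

Comparing the output of step 1 with calls.vcf from step 2 should show the difference: the first lists likelihoods for every covered site, the second only the sites bcftools decided are variant.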
Yes, that is what is happening. But why pipe to bcftools instead of just using the --VCF option and saving the extra step?