Empty VCF file with bcftools call
1
0
Entering edit mode
8.6 years ago
AP ▴ 100

Hello,

I am trying to produce a vcf file using bcftools call but it produces an empty vcf file containing only the header. In short, here is what I do:

  1. Alignment with BWA
  2. With samtools, make sorted.bam files
  3. With samtools, index the sorted.bam files
  4. Run samtools mpileup in the following way: samtools mpileup -C 50 -E -t SP -t DP -u -I –f /genome/refgenome.fa -b bam_list.txt > output.bcf
  5. Run bcftools call: bcftools call -v -c output.bcf > output.vcf

I am using versions 1.3.1 of samtools, bcftools and htslib. I tried reinstalling these programs but it did not change the issue. I also tried with versions 1.2. Same problem. As far as I know, the bcf file seems fine, it contains lots of data and is 20GB.

I tried producing a basic vcf file using bcftools view: bcftools view output.cf > output.vcf and it works. The vcf file seems completely normal.

Could anyone help me with this? Why would bcftools call produce an empty output?

Thanks

bcftools call • 6.1k views
ADD COMMENT
0
Entering edit mode

How big is your vcf file? whats is the last line of your vcf file? Have you tried to run bcftools on Galaxy with your vcf file?

ADD REPLY
0
Entering edit mode

Thanks for your answer. My BAM files varie in size from 150MB to 2GB. My bcf file is ~ 22GB. I am running my analysis on a HPC and have plenty of power. The last line of my "empty" vcf file is just the end of the header with CHROM POS ID ...etc. I did not try to run bcftools on Galaxy but I am not sure that it will help. I suspect an issue with mpileup or bcftools call.

Please let me know if you have other suggestions.

ADD REPLY
0
Entering edit mode

Ok, I guess the bam_list.txt might be the issue. If you're running your analysis on cluster make a flat file containing your bam locations. Then include the following to your pipeline:

cat BAM_Path | while read path sample; do
samtools mpileup -C 50 -E -t SP -t DP -u -I –f /genome/refgenome.fa -b "$location" > output.bcf...

Hope this helps!

ADD REPLY
3
Entering edit mode
8.6 years ago
AP ▴ 100

For some reasons, an empty line was added at the bottom of the index genome file. Removing it solved the problem…

ADD COMMENT
1
Entering edit mode

well-done on sharing this.

ADD REPLY

Login before adding your answer.

Traffic: 2551 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6