I'm using samtools multisample SNP calling to call SNPs from different samples. The command I'm using is,
samtools mpileup -uf ref.fa *.bam | bcftools view -vcg - > out.vcf
Then, I'm using vcf-to-tab tool to convert vcf to tab file.
The problem is, No calls (missed in sequencing) are called as homozygous reference calls. How to differentiate No calls from homozygous reference calls.
I appreciate any sort of help.
Edit: My vcf file looks like this:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT samp1.bam samp2.bam 000000F 634700 . T C 3.55 . DP=1;AF1=1;AC1=4;DP4=0,0,0,1;MQ=60;FQ=-27.4 GT:PL:GQ 0/1:31,3,0:5 0/1:0,0,0:3
Generally, you'd do that by depth. If the depth for that sample is below your criteria for safely considering something homozygous-ref at a specific position, then consider it a no-call.
Thanks, Brain. That's a very good idea. But, raw read depth is not printed for individual samples, how to force bcftools to print raw read depth?
Ah, that's kind of unfortunate. I wonder if there's an option for that somewhere? I'm not all that familiar with bcftools, but for some reason depth and allele fraction do not seem to be default fields for each sample. Maybe you're supposed to guess from the PL field ("phred-scaled genotype likelihoods"); since they are all zero for the second sample, the implication is that the depth is zero, even though it was called 0/1 (het). Well, that doesn't make much sense!
One alternative is to use a different tool to calculate coverage at that location for that sample. Bedtools probably allows that, and BBMap's pileup.sh does as well. Not a convenient option, though, for sure.