bcftools not calling variants that appear high quality
1
0
Entering edit mode
5 weeks ago
bertb ▴ 20

Hello,

I am trying to run a variant call using bcftools on a yeast (haploid) DNAseq .bam file. Using the following command I am able to generate a list of variants, but scanning through the .bam file on IGV, I can see variants that appear to be of high quality (100+ read coverage, good cigar scores on individual reads in the read track, etc.) that are not being called using this command.

bcftools mpileup -Ou -f sacCer3.fa -q 10 -d 250 myfile.bam |   bcftools call -Ou -mv --ploidy 1 |   bcftools filter -i 'DP>=50' |   bcftools norm -f sacCer3.fa -Oz -o myfile.vcf.gz

Reading other posts, I have loosened the quality criteria (-q 10) and these variants are still not being called.

Is there a way to interrogate the .bam file to ensure that it is meeting calling criteria at a particular position for bcftools? Any other suggestions?

bcftools 1.19

Thanks, Ben

bcftools DNAseq snps • 451 views
ADD COMMENT
0
Entering edit mode
5 weeks ago

remove `bcftools filter -i 'DP>=50' , remove the '-v' option of "bcftools cal"l , run bcftools mpileup in a small region known to contain a variant and show us the line of the resulting VCF.

ADD COMMENT
0
Entering edit mode

Thank you for your response. Here is the adjusted code I ran:

$bcftools mpileup -Ou -f sacCer3.fa -q 10 -d 250 A7_sorted.bam |   bcftools call -Ou -m --ploidy 1 |   bcftools norm -f sacCer3.fa -Oz -o A7pl.vcf.gz
$bcftools view -H C2pl.vcf.gz chrI:26973
chrI    26973   .       A       G       18.4764 .       DP=32;VDB=1.56112e-05;SGB=-0.69312;MQ0F=0;AC=1;AN=1;DP4=0,0,32,0;MQ=11  GT:PL   1:48,0

It appears the variant is now being called with the revised code - thank you! However, the issue I am having is that I am trying to compare two bam files to identify unique variants. When I run the following command to compare variants:

$bcftools isec -n-1 -c none C2n.vcf.gz A7n.vcf.gz -Oz -p var/

many of the sites on this list appear to be calling differences where there don't appear to be any (as viewed in IGV). It seems this is because one of the vcf files will generate two calls. As an example, for position chrI:1469, calling file 1:

$bcftools view -H A7pl.vcf.gz chrI:1469
chrI    1469    .       C       .       284.59  .       DP=202;MQ0F=0;AN=1;DP4=107,91,0,0;MQ=26 GT      0

whereas calling file 2:

$bcftools view -H C2pl.vcf.gz chrI:1469
chrI    1469    .       C       .       284.59  .       DP=185;MQ0F=0;AN=1;DP4=101,81,0,0;MQ=25 GT      0
chrI    1469    .       C       .       29.597  .       INDEL;IDV=12;IMF=0.0648649;DP=185;SGB=-0.379885;RPBZ=2.5594;MQBZ=-5.13732;MQSBZ=-2.0529;BQBZ=-3.44995;SCBZ=1.04845;MQ0F=0;AN=1;DP4=99,71,0,1;MQ=26    GT      0

Given many more reads support the reference (DP4=99,71,0,1), why does bcftools generate this potential variant? Is there a way to screen these lower quality calls out before running my comparison?

Thank you

ADD REPLY
0
Entering edit mode

However, the issue I am having is that I am trying to compare two bam files to identify unique variants. When I run the following command to compare variants:

this is another question.

ADD REPLY
0
Entering edit mode

DP4=0,0,32,0

you have a bias with the forward/reverse strand

ADD REPLY

Login before adding your answer.

Traffic: 1242 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6