I performed whole-exome analysis using the GATK pipeline. After annotating the variants with ANNOVAR, I performed these steps:
Kept only the variants that passed all filters.
Using gnomAD_exome_ALL, kept the variants with an allele frequency below 0.01.
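The two filtering steps above can be sketched in Python as follows. This is only a minimal illustration, not the exact commands used in the analysis: the `FILTER` column name, the helper `keep_rare_pass`, and the example rows are all assumptions made up for the sketch, and treating a missing gnomAD value (`.`) as "rare" is a choice you may not want.

```python
import csv
import io


def keep_rare_pass(rows, af_column="gnomAD_exome_ALL",
                   filter_column="FILTER", max_af=0.01):
    """Yield rows that passed all filters and have a population AF below max_af."""
    for row in rows:
        # Step 1: keep only variants that passed all filters.
        if row[filter_column] != "PASS":
            continue
        raw_af = row[af_column]
        # Assumption: "." (or empty) means the variant is absent from gnomAD,
        # so it is treated as rare and kept.
        af = 0.0 if raw_af in (".", "") else float(raw_af)
        # Step 2: keep only variants below the frequency cutoff.
        if af < max_af:
            yield row


# Made-up annotated table, tab-delimited; not real data.
example = (
    "Chr\tStart\tRef\tAlt\tFILTER\tgnomAD_exome_ALL\n"
    "chr1\t1000\tA\tG\tPASS\t0.0004\n"
    "chr1\t2000\tC\tT\tPASS\t0.25\n"
    "chr2\t3000\tG\tA\tLowQual\t0.001\n"
    "chr3\t4000\tT\tC\tPASS\t.\n"
)

rows = list(csv.DictReader(io.StringIO(example), delimiter="\t"))
kept = list(keep_rare_pass(rows))
print([r["Start"] for r in kept])
```

With the made-up table, only the rows at positions 1000 and 4000 survive both filters.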
I then tried to confirm in IGV that these variants are also present in the BAM files. Some of the variants that appeared heterozygous in the BAM file were called homozygous in the joint-called VCF file.
Has anyone come across such findings?
Thank you
What are the read depth and allele depths of your variant? It would be hard to comment if there are only 5 reads.
But the best way to confirm your variants is to validate them with another (orthogonal) method such as Sanger sequencing or PCR.
Depth was comparable in the BAM and the VCF. For example, where there are 210 reads in the BAM, there are 220 reads in the VCF, but the genotype is not the same.
What are the allele depths? Could you post example alignments and the sample column from the VCF? You can omit the positions so you wouldn't be posting any identifiable information.
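In case it helps with pulling those numbers out, here is a rough Python sketch that reads the genotype (GT) and allele depths (AD) from a FORMAT/sample column pair and flags cases where the called genotype does not match the allele fraction. The function names, the 0.2/0.8 cutoffs, and the example record are all invented for illustration; it assumes a biallelic site with `GT` and `AD` present.

```python
def parse_sample(format_field, sample_field):
    """Pair FORMAT keys (e.g. 'GT:AD:DP') with one sample's values."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def allele_balance(format_field, sample_field):
    """Return (genotype, ref_depth, alt_depth, alt_fraction) for a biallelic site."""
    fields = parse_sample(format_field, sample_field)
    ref_depth, alt_depth = (int(x) for x in fields["AD"].split(",")[:2])
    total = ref_depth + alt_depth
    alt_fraction = alt_depth / total if total else float("nan")
    return fields["GT"], ref_depth, alt_depth, alt_fraction


def looks_inconsistent(genotype, alt_fraction, low=0.2, high=0.8):
    """Flag genotypes whose allele fraction falls outside arbitrary cutoffs.

    The cutoffs are illustrative assumptions, not GATK defaults.
    """
    alleles = sorted(genotype.replace("|", "/").split("/"))
    if alleles == ["1", "1"]:
        return alt_fraction < high
    if alleles == ["0", "1"]:
        return not (low <= alt_fraction <= high)
    return False


# Made-up record: called homozygous alt, but about half the reads are reference.
gt, ref_depth, alt_depth, frac = allele_balance("GT:AD:DP", "1/1:100,120:220")
print(gt, ref_depth, alt_depth, round(frac, 3), looks_inconsistent(gt, frac))
```

A record flagged this way is only a candidate for a closer look in IGV, not proof that the call is wrong.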
GATK may be treating the reads that support the reference allele as consistently lower quality across all samples, so they are more likely to be considered artefactual and not factor into the genotyping. But I agree: if a variant is that rare, you wouldn't expect someone to have two copies.
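To illustrate how low-quality reference reads could stop counting towards the genotype, here is a toy per-read genotype likelihood calculation in Python. This is a deliberately simplified textbook-style model and not GATK's actual HaplotypeCaller algorithm; the function names and the read counts and qualities are made up for the example.

```python
import math


def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def genotype_log10_likelihoods(reads):
    """reads: list of (allele, base_quality), allele being 'ref' or 'alt'.

    Returns {alt_copies: log10 likelihood} for 0, 1 and 2 copies of the alt allele.
    Qualities must be above 0, otherwise a probability of zero can occur.
    """
    result = {}
    for alt_copies in (0, 1, 2):
        alt_frac = alt_copies / 2
        total = 0.0
        for allele, q in reads:
            e = phred_to_error(q)
            # Probability that a read shows the alt allele under this genotype.
            p_alt = alt_frac * (1 - e) + (1 - alt_frac) * e
            p = p_alt if allele == "alt" else 1 - p_alt
            total += math.log10(p)
        result[alt_copies] = total
    return result


def best_genotype(reads):
    """Return the number of alt copies with the highest likelihood."""
    likelihoods = genotype_log10_likelihoods(reads)
    return max(likelihoods, key=likelihoods.get)


# 10 ref + 10 alt reads, all high quality: heterozygous is most likely.
balanced = [("ref", 30)] * 10 + [("alt", 30)] * 10
# Same counts, but the ref reads have very low quality: homozygous alt wins.
skewed = [("ref", 3)] * 10 + [("alt", 30)] * 10

print(best_genotype(balanced), best_genotype(skewed))
```

In this toy model the read counts look heterozygous in both cases, yet the second set is scored as homozygous alt because the low-quality reference reads carry almost no weight.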
I checked other variants, and they have the same genotype in the BAM file and the VCF file. The discrepancy occurs in only some regions. Is it possible that misaligned reads make the site look heterozygous in the BAM file, while the joint variant calling step confidently calls the variant as homozygous? I agree that not all called variants are true variants; there could be false positives. Is there any other explanation?