How do we deal with `MIXED` type variants in the VQSR mode of variant filtering?
3.9 years ago
DareDevil ★ 4.3k

I have read how to filter variants with VQSR and with hard filters here.

My understanding of VQSR is that we do not want to combine SNPs and INDELs, whereas they are combined in hard filtering.

  1. In VQSR, we run VariantRecalibrator once with mode SNP and once with mode INDEL, which gives a .recal file for each.

  2. Next, we run ApplyVQSR with mode INDEL and the indels.recal file to generate indel.recalibrated.vcf.

  3. Then we run ApplyVQSR with mode SNP, using indel.recalibrated.vcf as the input VCF and the .recal file generated by VariantRecalibrator in SNP mode.

  4. This step generates snp.recalibrated.vcf.gz, which contains both SNPs and INDELs with filters applied and is the final filtered output of VQSR.
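
Steps 2 to 4 could be sketched roughly as below. This is only an illustration, not a tested pipeline: it assumes step 1 has already produced `indels.recal`/`indels.tranches` and `snps.recal`/`snps.tranches`, and the input name `cohort.raw.vcf.gz`, the `run` and `apply_vqsr` helpers, and the filter levels 99.0 and 99.7 are placeholders chosen for the example. By default it only prints the commands; set `DRY_RUN=0` to actually call GATK.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical input name; replace with your own joint-called VCF.
RAW="${RAW:-cohort.raw.vcf.gz}"

# Print each command by default; set DRY_RUN=0 to really run GATK.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# apply_vqsr MODE INPUT_VCF OUTPUT_VCF FILTER_LEVEL
# Assumes <mode>s.recal and <mode>s.tranches exist (e.g. indels.recal).
apply_vqsr() {
  local mode="$1" in_vcf="$2" out_vcf="$3" level="$4"
  local prefix
  prefix="$(echo "$mode" | tr '[:upper:]' '[:lower:]')s"
  run gatk ApplyVQSR \
    -V "$in_vcf" \
    --recal-file "${prefix}.recal" \
    --tranches-file "${prefix}.tranches" \
    --truth-sensitivity-filter-level "$level" \
    -mode "$mode" \
    -O "$out_vcf"
}

# Pass 1: filter INDELs in the raw VCF (filter level is an example value).
apply_vqsr INDEL "$RAW" indel.recalibrated.vcf.gz 99.0

# Pass 2: filter SNPs on top of the INDEL-filtered VCF.
apply_vqsr SNP indel.recalibrated.vcf.gz snp.recalibrated.vcf.gz 99.7
```

The second pass takes the output of the first as its input, which is why the final file carries filter annotations for both variant types.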

Is my understanding of variant filtering correct?

If so, how do we deal with the MIXED type?

VQSR GATK VariantRecalibrator • 1.8k views

If you look at this tutorial:

Note that mixed records are treated as indels.
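
To get a feel for how many records this affects, something like the following could list the mixed sites in a VCF. It is a rough sketch with a simplified rule, not GATK's own classification: a record is called MIXED when its ALT alleles include both an allele of the same length as REF (SNP-like, which here also covers MNPs) and one of a different length (indel-like). The function name `classify_records` is made up for the example.

```shell
# Label each VCF record as SNP, INDEL, MIXED or OTHER from REF/ALT lengths.
# Simplification: equal-length alleles count as SNP-like (MNPs included).
classify_records() {
  awk -F'\t' '
    /^#/ { next }                            # skip header lines
    {
      ref = $4
      n = split($5, alts, ",")
      snp = 0; indel = 0
      for (i = 1; i <= n; i++) {
        a = alts[i]
        if (a == "*" || a == "." || a ~ /^</) continue   # skip spanning, missing, symbolic alleles
        if (length(a) == length(ref)) snp = 1; else indel = 1
      }
      type = (snp && indel) ? "MIXED" : (indel ? "INDEL" : (snp ? "SNP" : "OTHER"))
      print $1 "\t" $2 "\t" $4 "\t" $5 "\t" type
    }'
}
```

For an uncompressed VCF, `classify_records < input.vcf | grep -c MIXED` would then give the number of mixed records.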

2.7 years ago
DareDevil ★ 4.3k

It is better to split multi-allelic sites into bi-allelic sites with bcftools:

bcftools norm \
  -m-any \
  --check-ref w -f /path/to/reference/hg38.fasta \
  input.vcf -o output.vcf
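
As a quick sanity check after splitting, one could count how many records still list more than one ALT allele. A minimal sketch, assuming an uncompressed VCF (the helper name `count_multiallelic` is made up here):

```shell
# Count non-header records whose ALT column (field 5) holds more than one allele.
count_multiallelic() {
  awk -F'\t' '!/^#/ && $5 ~ /,/ { n++ } END { print n + 0 }'
}
```

Running `count_multiallelic < output.vcf` should print 0 if every multi-allelic site was split, so a former mixed record such as `A -> G,AT` now appears as one SNP record and one INDEL record.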