Hey, has anyone worked with the two steps of VQSR (i.e. the successive application of VariantRecalibrator and ApplyRecalibration) on non-human genomes?
You know, the human genome has good truth and known resource datasets (HapMap, Omni, 1000G, dbSNP) that are already available for VQSR analysis. However, I have no idea how to do it with the mouse genome, because I would need to generate my own resource set. Does anyone have experience with this?
I did hard SNP filtering using GATK's VariantFiltration, since my variant file was produced by UnifiedGenotyper. Is VQSR unnecessary, and can I skip this step, since it looks a little challenging to me?
Thank you,
Skip it - even Broad is not doing VQSR regularly
Hello Jeremy, I had the same question as Tonyzeng and was wondering how I could filter my variants.
However, despite working on a non-model organism, I do have a good-quality list of variants for the species I am working on. So I was wondering if I could maybe make .tranches and .recal files with it.
Regarding your advice to skip any filtering, I am a bit confused. GATK recommends at least some level of hard filtering in the article linked below. Do you think it would reduce the complexity of the dataset? https://gatk.broadinstitute.org/hc/en-us/articles/360035532412-Can-t-use-VQSR-on-non-model-organism-or-small-dataset
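To make the hard-filtering idea concrete, here is a minimal Python sketch of the logic as I understand it: each SNP is checked against fixed cutoffs on its INFO annotations and is tagged with every filter it fails. The cutoffs are an assumption on my part (the commonly cited GATK starting points for SNPs), and the names `SNP_HARD_FILTERS`, `failed_filters` and `filter_field` are hypothetical helpers, not part of GATK. Skipping missing annotations is a choice of this sketch, and the thresholds would likely need tuning for a non-model organism.

```python
# Assumed SNP cutoffs (commonly cited GATK starting points, not tuned
# for any particular species). A site fails a filter when the
# comparison "annotation <op> cutoff" is true.
SNP_HARD_FILTERS = {
    "QD2": ("QD", "<", 2.0),
    "FS60": ("FS", ">", 60.0),
    "MQ40": ("MQ", "<", 40.0),
    "MQRankSum-12.5": ("MQRankSum", "<", -12.5),
    "ReadPosRankSum-8": ("ReadPosRankSum", "<", -8.0),
}


def failed_filters(info):
    """Return the names of the filters a SNP fails.

    `info` is a dict of INFO annotations for one site, e.g.
    {"QD": 12.0, "FS": 3.1, "MQ": 59.0}.
    """
    failed = []
    for name, (key, op, cutoff) in SNP_HARD_FILTERS.items():
        value = info.get(key)
        if value is None:
            # Sketch-level choice: a missing annotation is skipped
            # rather than counted as a failure.
            continue
        if (op == "<" and value < cutoff) or (op == ">" and value > cutoff):
            failed.append(name)
    return failed


def filter_field(info):
    """Build a VCF-style FILTER value: PASS, or failed names joined by ';'."""
    failed = failed_filters(info)
    return ";".join(failed) if failed else "PASS"
```

For example, `filter_field({"QD": 1.5, "FS": 75.0, "MQ": 59.0})` gives `"QD2;FS60"`, while a site with all annotations inside the cutoffs gives `"PASS"`.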