Hi everyone,
I am using family trio data to run GATK's SNP-calling pipeline. HaplotypeCaller produced the father's SNP file, which I then recalibrated with VariantRecalibrator. However, the final VCF file still has 3 million “PASS” records. As far as I know, a single person should not carry that many SNPs.
Any advice for adjusting the parameters?
The parameters I used follow the reference below:
Roazen, D., Thibault, J., Banks, E., Garimella, K., Altshuler, D., Gabriel, S. and DePristo, M. (2013). From FastQ Data to High-Confidence Variant Calls: The Genome Analysis Toolkit Best Practices Pipeline. Current Protocols in Bioinformatics, pp.11.10.1-11.10.33.
Thanks.
Why do you think so? What is the expected number of SNPs for one human subject on average?
As far as I know, a little more than 2 million would be the expected number.
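Before adjusting any parameters, it may be worth checking how many of the “PASS” records are actually single-base substitutions, since HaplotypeCaller emits indels as well as SNPs. Below is a minimal sketch, assuming a plain-text, tab-separated VCF with the standard column order (REF, ALT, and FILTER in columns 4, 5, and 7). The function name `count_pass_variants` and the example records are made up for illustration and are not part of GATK.

```python
from collections import Counter


def count_pass_variants(lines):
    """Count PASS records in VCF text lines, split into 'snp' and 'other'."""
    counts = Counter()
    for line in lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 7:
            continue  # not a complete VCF record
        ref, alt, filt = fields[3], fields[4], fields[6]
        if filt != "PASS":
            continue
        # Ignore symbolic placeholders such as the spanning-deletion allele.
        alts = [a for a in alt.split(",") if a not in ("*", "<NON_REF>", ".")]
        # A record counts as a SNP only if REF and every ALT are one base long.
        is_snp = bool(alts) and len(ref) == 1 and all(len(a) == 1 for a in alts)
        counts["snp" if is_snp else "other"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    example = [
        "##fileformat=VCFv4.2",
        "\t".join(["chr1", "100", ".", "A", "G", "50", "PASS", "."]),
        "\t".join(["chr1", "200", ".", "AT", "A", "50", "PASS", "."]),
        "\t".join(["chr1", "300", ".", "C", "T", "50", "LowQual", "."]),
        "\t".join(["chr1", "400", ".", "G", "A,T", "50", "PASS", "."]),
    ]
    print(dict(count_pass_variants(example)))
```

For a real file, the same function can be fed an open file handle (or `gzip.open(path, "rt")` for a compressed VCF), because it only iterates over text lines.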