I have whole-genome sequencing data from different sources, and the sequencing depth varies greatly among samples. I joint-called SNPs across all samples. When filtering SNPs after calling, I found that genotype quality (GQ) is influenced by sequencing depth. For example, I set GQ >= 30 as my threshold, and any genotype below it is set to missing. After the subsequent max-missing filter, only very few SNPs are retained, which cannot meet the requirements of the downstream analyses. Even after lowering the threshold to 20, the remaining SNPs are still inadequate. So I calculated, for each individual, the proportion of genotypes that meet the threshold, and found that the higher an individual's sequencing depth, the higher its proportion of passing genotypes. How can I fix this problem? Can I just skip the GQ filtering? Would that affect later analyses, like population structure analysis, demographic analysis, or selective sweep identification?
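In case it helps anyone reproduce the per-individual check described above, here is a minimal sketch in plain Python that computes the fraction of genotypes passing a GQ threshold per sample. The embedded mini-VCF and the sample names (`lowDP`, `highDP`) are made-up illustration data, not real calls; on a real file you could get the same numbers by parsing the output of `bcftools query -f '[%SAMPLE\t%GQ\n]'` instead.

```python
# Per-sample proportion of genotypes with GQ >= threshold.
# The tiny inline VCF below is hypothetical example data.

GQ_THRESHOLD = 30

vcf_text = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tlowDP\thighDP
chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:GQ\t0/1:12\t0/1:99
chr1\t200\t.\tC\tT\t60\tPASS\t.\tGT:GQ\t0/0:25\t1/1:80
chr1\t300\t.\tG\tA\t70\tPASS\t.\tGT:GQ\t0/1:35\t0/1:45
"""

def gq_pass_rate(vcf, threshold):
    """Return {sample: fraction of its genotypes with GQ >= threshold}."""
    samples, total, passed = [], {}, {}
    for line in vcf.splitlines():
        if line.startswith("##"):
            continue  # skip meta-information header lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]           # sample columns start at index 9
            total = {s: 0 for s in samples}
            passed = {s: 0 for s in samples}
            continue
        gq_idx = fields[8].split(":").index("GQ")  # position of GQ in FORMAT
        for sample, call in zip(samples, fields[9:]):
            gq = int(call.split(":")[gq_idx])
            total[sample] += 1
            if gq >= threshold:
                passed[sample] += 1
    return {s: passed[s] / total[s] for s in samples}

rates = gq_pass_rate(vcf_text, GQ_THRESHOLD)
for sample, rate in rates.items():
    print(f"{sample}\t{rate:.2f}")  # lowDP passes 1/3 sites, highDP 3/3
```

With these toy numbers the low-depth sample passes at 0.33 and the high-depth sample at 1.00, mirroring the depth-dependent pattern described in the question.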
See VQSR (https://gatk.broadinstitute.org/hc/en-us/articles/360035531612-Variant-Quality-Score-Recalibration-VQSR) and hard filtering (https://gatk.broadinstitute.org/hc/en-us/articles/360035890471-Hard-filtering-germline-short-variants).
The species I study is not a model species, so VQSR cannot be applied here. I have already used GATK hard filtering. What I mean is: after hard filtering, is genotype-level GQ filtering still necessary?
I have the same question!