Best practice for running GATK VQSR on X chromosome

3.3 years ago

samuelandjw ▴ 260

According to the GATK Best Practices, separate VQSR models should be built for SNPs and INDELs, because the annotation distributions of high-quality SNPs and INDELs are systematically different (if I understand correctly). Annotations for good variants on the autosomes could likewise differ from those on the X chromosome. For example, in a cohort with many male samples, DP for good variants on the X chromosome could be substantially lower than on the autosomes, since males carry only one copy of X outside the pseudoautosomal regions. It therefore seems reasonable to build VQSR models separately for the X chromosome and the autosomes. However, the GATK website offers no such advice.
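
To make the DP argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that read depth scales with copy number, that males are hemizygous on X outside the pseudoautosomal regions, and that every sample is sequenced to the same autosomal depth. The function name and the cohort numbers are illustrative only and do not come from any GATK tool.

```python
def expected_x_to_autosome_dp_ratio(n_male: int, n_female: int) -> float:
    """Expected ratio of cohort-level DP on X (non-PAR) to DP on autosomes.

    Assumption: depth is proportional to copy number, so each female
    contributes 2 copies of X and each male 1, versus 2 autosomal copies each.
    """
    n_total = n_male + n_female
    if n_total == 0:
        raise ValueError("cohort must contain at least one sample")
    x_copies = 2 * n_female + 1 * n_male
    autosome_copies = 2 * n_total
    return x_copies / autosome_copies


# Hypothetical cohort: 600 males and 400 females.
ratio = expected_x_to_autosome_dp_ratio(n_male=600, n_female=400)
print(f"Expected X/autosome DP ratio: {ratio:.2f}")
```

Under these assumptions, a cohort that is 60% male would have site-level DP on X at about 70% of the autosomal value, which is the kind of shift that might matter if DP is used as a VQSR annotation.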

My question is:

Should we build VQSR models separately for the autosomes and the X chromosome? If yes, we would have four VQSR models: autosomal SNPs, autosomal INDELs, X SNPs and X INDELs.
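
If the answer is yes, the four models could be enumerated along these lines. This is only a sketch: it builds partial `VariantRecalibrator` argument lists that restrict each run with `-L` and `-mode`, and it deliberately leaves out the `--resource` and `-an` arguments that a real run needs. The contig naming (`chr1`..`chr22`, `chrX`), the input name `cohort.vcf.gz` and the output prefixes are assumptions made for illustration.

```python
from itertools import product

# Assumption: UCSC-style contig names; b37-style references use "1".."22", "X".
AUTOSOMES = [f"chr{i}" for i in range(1, 23)]
REGIONS = {"autosomes": AUTOSOMES, "chrX": ["chrX"]}
MODES = ["SNP", "INDEL"]


def build_vqsr_args(region_name: str, mode: str, vcf: str = "cohort.vcf.gz") -> list:
    """Return a partial VariantRecalibrator argument list for one model.

    Training resources (--resource) and annotations (-an) are omitted on
    purpose; they must be added before the command can actually run.
    """
    args = ["gatk", "VariantRecalibrator", "-V", vcf, "-mode", mode]
    for contig in REGIONS[region_name]:
        args += ["-L", contig]  # restrict the model to this region
    prefix = f"{region_name}.{mode.lower()}"  # hypothetical output prefix
    args += ["-O", f"{prefix}.recal", "--tranches-file", f"{prefix}.tranches"]
    return args


# One argument list per (region, mode) pair: four models in total.
commands = [build_vqsr_args(r, m) for r, m in product(REGIONS, MODES)]
for cmd in commands:
    print(" ".join(cmd))
```

Whether the X-only models have enough training variants to fit well is a separate question that this sketch does not address.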

WGS GATK WES • 1.7k views

ADD COMMENT • link 3.0 years ago by samuelandjw ▴ 260

Is it for WES or WGS? If it's for WES, it's not advised to use DP in VQSR.

ADD REPLY • link 3.3 years ago by Nicolas Rosewick 11k

WGS. I guess DP is not the only annotation that differs between the X chromosome and the autosomes.

ADD REPLY • link 3.3 years ago by samuelandjw ▴ 260

But I doubt that the X/Y chromosomes alone are sufficient to train the model. Maybe we should use hard filtering on the sex chromosomes?

ADD REPLY • link 3.0 years ago by MatthewP ★ 1.4k

For the Y chromosome, VQSR may be impossible: at least for hg19, there are no reference true-positive variants on the Y chromosome. For the X chromosome, it is a matter of sample size. But if you are doing hard filtering for the X/Y chromosomes, what thresholds would you suggest? The same as for the autosomes?

ADD REPLY • link 3.0 years ago by samuelandjw ▴ 260
