I have a sequenced locus on chromosome 9 that I am attempting to phase with SHAPEIT. My original VCF file has 1617 variants. After phasing with SHAPEIT I am left with only 513 variants. I checked the quality of my VCF with checkVCF and found no duplicates or reference mismatches.
I use the following code to run SHAPEIT:
shapeit4.2 --input my_data.vcf.gz --map chr9.b37.gmap.gz --region 9 --reference 1000g_phase3_nomulti_allelic/ALL.chr9.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz --thread 8 --log shapeit_chr9.log --output my_data_phased_SHAPEIT.vcf --mcmc-iterations 10b,1p,1b,1p,1b,1p,1b,1p,10m --pbwt-depth 8 &
Is there a way to minimize the loss of variants?
It's likely because the missing variants aren't in your reference panel. Can you check how many of your 1617 variants are present in the reference panel?
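One rough way to run that check with standard Unix tools is sketched below. It assumes that "present in the reference panel" means an exact match on CHROM, POS, REF and ALT, and that both files name the chromosome the same way ("9" rather than "chr9"). The function names vcf_keys and count_overlap are made up for this example; a dedicated tool such as bcftools isec would be the more usual choice if it is available.

```shell
# Print one CHROM:POS:REF:ALT key per record of a plain or gzipped VCF.
# (vcf_keys and count_overlap are hypothetical helper names for this sketch.)
vcf_keys() {
    gzip -cdf "$1" | awk -F'\t' '!/^#/ { print $1":"$2":"$4":"$5 }' | LC_ALL=C sort -u
}

# Report how many distinct study variants exist, and how many of them
# also appear (same position and alleles) in the reference panel.
count_overlap() {
    study_keys=$(mktemp)
    ref_keys=$(mktemp)
    vcf_keys "$1" > "$study_keys"
    vcf_keys "$2" > "$ref_keys"
    total=$(wc -l < "$study_keys" | tr -d ' ')
    shared=$(LC_ALL=C comm -12 "$study_keys" "$ref_keys" | wc -l | tr -d ' ')
    echo "study=$total shared=$shared"
    rm -f "$study_keys" "$ref_keys"
}

# Example call, using the file names from the command in the question:
# count_overlap my_data.vcf.gz \
#     1000g_phase3_nomulti_allelic/ALL.chr9.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz
```

If the shared count comes out close to 513, that would be consistent with the variants being dropped because they are absent from the panel. If it is much higher, the cause is probably something else, for example a difference in allele representation or chromosome naming.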