I called variants on 200 WGS samples. Each sample yielded around 4 million variants; however, most were unique, and only about 1 million variants were shared between most individuals.
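For what it's worth, the overlap I mean could be counted roughly like this. This is only a sketch: it assumes each sample's calls are reduced to (CHROM, POS, REF, ALT) keys, and the function names and toy records are hypothetical, made up purely for illustration.

```python
from collections import Counter

def site_keys(vcf_lines):
    """Return the set of (CHROM, POS, REF, ALT) keys found in VCF text lines."""
    keys = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip header and blank lines
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):  # treat each ALT of a multi-allelic record separately
            keys.add((chrom, int(pos), ref, allele))
    return keys

def sharing_counts(samples):
    """Map each site key to the number of samples in which it was called."""
    counts = Counter()
    for lines in samples.values():
        counts.update(site_keys(lines))
    return counts

# Toy records (hypothetical, not real calls)
header = "#CHROM\tPOS\tID\tREF\tALT"
toy = {
    "s1": [header, "chr1\t100\t.\tA\tG", "chr1\t200\t.\tC\tT"],
    "s2": [header, "chr1\t100\t.\tA\tG"],
    "s3": [header, "chr1\t100\t.\tA\tG", "chr1\t300\t.\tG\tA,C"],
}

counts = sharing_counts(toy)
shared_by_all = sorted(k for k, n in counts.items() if n == len(toy))
private = sorted(k for k, n in counts.items() if n == 1)
```

Sites missing from a sample's file are simply not counted for that sample, which is exactly the gap I am asking about.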
I suppose it is normal behaviour that GATK does not output homozygous-reference sites, right? So the missing genotypes should simply be filled in with the reference notation "0/0". Or shouldn't they?
Is there a correct way to fix the low overlap? Otherwise it greatly complicates rare variant analysis.
Would GATK output 0/0 if the reads contained only the reference allele? Or is that the expected behaviour only in GVCF mode?
If I called them as regular .vcf files, is there a way to fix this and re-call in GVCF mode, short of redoing the whole thing from scratch?