Hi everyone
I have been playing with Beagle to try and phase and impute my samples.
Basically, I have a VCF of a family containing parents and siblings, and possibly other family members, run on an NGS panel.
My initial wish was to use a PED file to phase the genotyped regions in the samples and, for samples where a genotype is missing, to impute it from the family members.
The first problem is that PED file support was dropped after BEAGLE 4.0, so I tried using that older version to run my VCF file:
java -Xmx30720m -jar bin/beagle.r1399.jar \
nthreads={threads} \
map=plink.all.map \
gtgl=input.vcf \
out=output \
ped=input.ped
tabix -p vcf output.vcf.gz
However, it became abundantly clear that this process changes the genotype calls of the input samples. This is even more obvious on the sex chromosomes, where male samples start getting het calls at positions where they had clear homozygous calls (which they must have, since a male sample has only a single X and a single Y chromosome).
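To put a number on how many calls are being changed, something like the following could be used. It is only a minimal sketch: the function names are made up for illustration, it assumes both files carry a GT field and the same sample names, and it compares genotypes while ignoring phase, so 0/1 and 1|0 count as the same call.

```python
import gzip


def parse_genotypes(lines):
    """Map (chrom, pos, ref, alt) -> {sample: sorted allele tuple} from VCF text lines."""
    samples, calls = [], {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample columns start after FORMAT
            continue
        gt_idx = fields[8].split(":").index("GT")  # assumes GT is present
        site = (fields[0], fields[1], fields[3], fields[4])
        calls[site] = {}
        for name, entry in zip(samples, fields[9:]):
            gt = entry.split(":")[gt_idx]
            # Drop phase information so 0|1 and 1/0 compare equal.
            calls[site][name] = tuple(sorted(gt.replace("|", "/").split("/")))
    return calls


def count_changed_calls(input_lines, output_lines):
    """Count, per sample, non-missing input genotypes that differ in the output."""
    before = parse_genotypes(input_lines)
    after = parse_genotypes(output_lines)
    changed = {}
    for site, sample_calls in before.items():
        for name, alleles in sample_calls.items():
            if "." in alleles:
                continue  # missing in the input, so imputation is expected
            new = after.get(site, {}).get(name)
            if new is not None and new != alleles:
                changed[name] = changed.get(name, 0) + 1
    return changed


# Possible usage with the files from the command above:
# with open("input.vcf") as a, gzip.open("output.vcf.gz", "rt") as b:
#     print(count_changed_calls(a, b))
```

Restricting the input lines to the X and Y records before calling it would show how many of the changes in the male samples fall on the sex chromosomes.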
I also considered using an up-to-date version of BEAGLE. But in that case the impute option seems to impute into my input VCF the variants present in a reference panel, which is not what I need, and the pedigree information (which I assume is valuable in this case) is completely ignored.
Can someone tell me what I am missing? Is there a way to prevent BEAGLE from changing the genotypes of positions that are already well genotyped in the samples of the input VCF?
Many thanks
Hi LChart
Thank you for your answer.
I coded my male sample as biallelic homozygous, as I was aware of the limitations for allosomes. I was hoping this was enough.
But it is not only doing this on the sex chromosomes...
I will try your idea of assigning a high GL to the genotypes I do not want changed. Thanks for the suggestion.
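For reference, this is roughly how I understand the suggestion, as a minimal sketch. In the VCF specification GL holds log10-scaled genotype likelihoods, ordered so that genotype j/k (with j <= k) sits at index k(k+1)/2 + j. The helper name, the biallelic-only restriction and the penalty of -10 are my own assumptions, and whether BEAGLE 4.0 then leaves those calls untouched still needs to be checked.

```python
def confident_gl(gt, penalty=-10.0):
    """Build a GL string for a biallelic site that strongly favours the called genotype.

    `gt` is a diploid GT value such as "0/0", "0|1" or "1/1".
    `penalty` is the log10 likelihood given to the two other genotypes
    (an arbitrary illustrative value, not a BEAGLE recommendation).
    """
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return "."  # leave missing genotypes without likelihoods
    j, k = sorted(int(a) for a in alleles)
    if k > 1:
        raise ValueError("this sketch only handles biallelic sites")
    called = k * (k + 1) // 2 + j  # VCF genotype ordering: 0/0, 0/1, 1/1
    values = [penalty] * 3
    values[called] = 0.0  # log10(1) for the genotype to be kept
    return ",".join(f"{v:g}" for v in values)
```

With this, a confident 0/0 call would get GL 0,-10,-10 and a 1/1 call would get -10,-10,0, while missing genotypes are left for imputation.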