5.5 years ago
amitgourav.ghosh12
Hello,
I have a VCF file of about 565 individuals. I want to replace the missing SNP genotypes of one of them (an ancient sample) with the reference alleles.
I was thinking about trying the following:
$ bcftools +fixploidy phasedVCF-short02.vcf.gz -- -f 2 | bcftools +missing2ref - -- -p > phasedVCF-short03.vcf
But it will probably replace the missing sites in all the individuals.
I am a bit confused about whether vcftools or bcftools has any function that would apply this operation to only one individual instead of all of them.
Does it make sense to fix it for just one sample?
Good question, I am not so sure. Let me find out how the PCA comes out.
Meanwhile, I have figured out a possible way to do it. I converted my LD-pruned bed, bim and fam files to a VCF file in plink. It only has the genotypes, without the quality parameters. It would probably be much easier to convert the "./." to "0/0" for my target sample using awk.
The reason for doing this is that the sample, being an ancient individual, has many missing genotypes, which was messing up my final PCA output.
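For anyone wanting to try the awk route, here is a rough sketch of what I have in mind. It assumes an uncompressed, tab-delimited VCF whose sample columns contain only GT (as in the plink export described above), and the function name `fill_missing_ref` is just a made-up helper, not part of bcftools or vcftools. It finds the target sample's column from the `#CHROM` header line and rewrites only that column.

```shell
# Hypothetical helper (not a bcftools/vcftools command):
# set missing genotypes to homozygous reference for ONE sample only.
# Assumes a GT-only, tab-delimited, uncompressed VCF on stdin.
# Usage: fill_missing_ref SAMPLE_ID < in.vcf > out.vcf
fill_missing_ref() {
  awk -v sample="$1" 'BEGIN { FS = OFS = "\t" }
    /^##/ { print; next }                 # meta-information lines pass through unchanged
    /^#CHROM/ {
      # sample columns start at field 10; locate the target sample
      for (i = 10; i <= NF; i++) if ($i == sample) col = i
      if (!col) { print "sample not found: " sample > "/dev/stderr"; exit 1 }
      print; next
    }
    {
      if ($col == "./.")      $col = "0/0"   # unphased missing genotype
      else if ($col == ".|.") $col = "0|0"   # phased missing genotype
      print
    }'
}
```

It could then be run as `fill_missing_ref ANCIENT_ID < pruned.vcf > pruned.filled.vcf`, where `ANCIENT_ID` and the file names are placeholders for the real sample ID and files. If the FORMAT column held more than GT, the exact string comparison would no longer match and the genotype would need to be split on ":" first.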