5.3 years ago
Sam
Dear Biostars
I ran VCFtools with the command below to calculate the inbreeding coefficient (F) for a subpopulation, but the number of sites (N_SITES) in the output file is much lower than the number of sites in the VCF file. The VCF file contains about 47,000 sites, but the VCFtools output reports about 15,000 or fewer per individual.
vcftools --vcf test.vcf --het --out test_het
Output file:
INDV    O(HOM)  E(HOM)   N_SITES  F
401     3418    9376.7   15488    -0.97504
401.2   3397    9560.7   15850    -0.98005
A.02    7090    7906.6   12952    -0.16186
A.18    4969    9473.1   15652    -0.72896
A.25    5602    9312.6   15403    -0.60926
A.27    7218    9373.3   15524    -0.3504
Would you have any idea why?
Thanks
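As a sanity check on how N_SITES feeds into F, the posted rows look consistent with F = (O(HOM) - E(HOM)) / (N_SITES - E(HOM)). That formula is an assumption inferred from the numbers in the table above, not something stated in the post, and `recompute_f` and `het_rows.txt` are hypothetical names used only for this sketch:

```shell
# Two rows copied from the posted table; het_rows.txt is a hypothetical file name.
cat > het_rows.txt <<'EOF'
INDV O(HOM) E(HOM) N_SITES F
401 3418 9376.7 15488 -0.97504
A.02 7090 7906.6 12952 -0.16186
EOF

# Assumed formula: F = (O - E) / (N - E), with O = O(HOM), E = E(HOM), N = N_SITES.
# Prints: sample, recomputed F, reported F (tab-separated), skipping the header row.
recompute_f() {
  awk 'NR > 1 { printf "%s\t%.4f\t%s\n", $1, ($2 - $3) / ($4 - $3), $5 }' "$1"
}

recompute_f het_rows.txt
```

If the recomputed and reported values agree, then N_SITES is the per-individual denominator, so any site dropped for a given sample changes that sample's F directly.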
What percentage of the genotypes in your VCF file are actually called? What I mean is that perhaps VCFtools can only calculate F if all samples have a genotype called for a given SNP/INDEL (i.e., no missing genotypes).
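One way to check genotype completeness is to count, per sample, how many sites have a called genotype. The following is a minimal sketch, assuming a tab-delimited VCF with GT as the first FORMAT field. `demo.vcf` is a made-up three-site file standing in for test.vcf, and `count_called` is a hypothetical helper name:

```shell
# Hypothetical 3-site, 2-sample VCF used only for illustration.
printf '##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n1\t100\t.\tA\tG\t.\t.\t.\tGT\t0/1\t./.\n1\t200\t.\tC\tT\t.\t.\t.\tGT\t0/0\t1/1\n1\t300\t.\tG\tA\t.\t.\t.\tGT\t./.\t./.\n' > demo.vcf

# Prints: sample name, sites with a called genotype, total sites (tab-separated).
count_called() {
  awk -F'\t' '
    /^#CHROM/ { for (i = 10; i <= NF; i++) name[i] = $i; n = NF; next }
    /^#/      { next }                      # skip meta-information lines
    {
      sites++
      for (i = 10; i <= n; i++) {
        split($i, f, ":")                   # assumes GT is the first FORMAT field
        if (f[1] !~ /\./) called[i]++       # any "." allele is treated as missing
      }
    }
    END { for (i = 10; i <= n; i++) printf "%s\t%d\t%d\n", name[i], called[i], sites }
  ' "$1"
}

count_called demo.vcf
```

If the per-sample called counts on the real file land near the N_SITES values, missing genotypes would be a plausible explanation. VCFtools also has a `--missing-indv` option that reports per-individual missingness, which may be the more direct route on the real data.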