Different window and SNP counts for the same VCF file when calculating genetic diversity with vcftools
3.0 years ago
reza ▴ 300

I have a multi-sample VCF file (20 individuals) and I want to calculate Pi (nucleotide diversity) in each population to detect signatures of selection. I use the following commands:

vcftools --gzvcf Whole.vcf --keep pop1_list --window-pi 40000 --window-pi-step 20000 --out pop1.pi

vcftools --gzvcf Whole.vcf --keep pop2_list --window-pi 40000 --window-pi-step 20000 --out pop2.pi

These commands produced two files with different numbers of windows (86415 vs. 86430) and different SNP counts in the same windows, for example:

pop1

CHROM        BIN_START  BIN_END  N_VARIANTS  PI
NC_044511.1  1          40000    49          0.000265416
NC_044511.1  20001      60000    24          0.000146456
NC_044511.1  40001      80000    38          0.000386449
NC_044511.1  60001      100000   68          0.000650799
NC_044511.1  80001      120000   96          0.000888518

pop2

CHROM        BIN_START  BIN_END  N_VARIANTS  PI
NC_044511.1  1          40000    39          0.00030515
NC_044511.1  20001      60000    7           2.97E-05
NC_044511.1  40001      80000    39          0.000375541
NC_044511.1  60001      100000   78          0.000694135
NC_044511.1  80001      120000   102         0.000900462
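
To see exactly which windows are present in one output but absent from the other, the two tables can be compared on their first three columns. The following is a minimal sketch, not part of vcftools: `missing_windows` is a hypothetical helper name, and the file names in the usage comment assume that vcftools appended `.windowed.pi` to the `--out` prefixes used above.

```shell
# Hypothetical helper (not a vcftools feature): print the windows
# (CHROM BIN_START BIN_END) that occur in the first windowed-pi table
# but not in the second one.
missing_windows() {
  # awk reads the second table first and remembers its windows,
  # then reports every window of the first table it has not seen.
  awk 'FNR == 1 { next }                       # skip each header line
       NR == FNR { seen[$1, $2, $3] = 1; next } # first file read by awk
       !(($1, $2, $3) in seen) { print $1, $2, $3 }' "$2" "$1"
}

# Assumed usage with the outputs of the two commands above:
#   missing_windows pop1.pi.windowed.pi pop2.pi.windowed.pi
#   missing_windows pop2.pi.windowed.pi pop1.pi.windowed.pi
```

If the windows listed this way are regions where one population has no variable sites, that would point to windows being skipped when they contain nothing to report, but that is an assumption to check rather than a confirmed behaviour.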

However, when I run the following command, I get 60 SNPs:

bcftools stats -r NC_044511.1:1-40000

Why don't these results agree?
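
One thing that can be checked directly is how many of the sites in the first window are actually variable within each subset of individuals: a site that is polymorphic across all 20 samples can be fixed within one population, whereas `bcftools stats` run without sample selection counts records across all samples. The sketch below is a rough check under stated assumptions, not a description of how vcftools counts `N_VARIANTS`: `count_segregating` is a hypothetical helper, it expects an uncompressed VCF on standard input, a list file with one sample name per line (as used with `--keep`), and `GT` as the first FORMAT field.

```shell
# Hypothetical helper (not part of vcftools or bcftools).
# Usage: count_segregating SAMPLE_LIST CHROM START END < file.vcf
# Counts sites in CHROM:START-END where the listed samples carry
# more than one distinct, non-missing allele.
count_segregating() {
  awk -v list="$1" -v chrom="$2" -v start="$3" -v end="$4" '
    BEGIN {
      FS = "\t"
      # Read the sample names to keep, one per line.
      while ((getline name < list) > 0) keep[name] = 1
    }
    /^##/ { next }                         # skip meta-information lines
    /^#CHROM/ {
      # Remember which columns belong to the kept samples.
      for (i = 10; i <= NF; i++) if ($i in keep) cols[i] = 1
      next
    }
    $1 == chrom && $2 >= start && $2 <= end {
      split("", seen); distinct = 0        # reset per site
      for (i in cols) {
        split($i, fields, ":")             # genotype is the first subfield
        n = split(fields[1], alleles, "[/|]")
        for (j = 1; j <= n; j++) {
          a = alleles[j]
          if (a != "." && !(a in seen)) { seen[a] = 1; distinct++ }
        }
      }
      if (distinct > 1) count++
    }
    END { print count + 0 }
  '
}

# Assumed usage (if the file is gzip-compressed, decompress it on the fly
# with "gzip -dc Whole.vcf |" in front instead of the redirection):
#   count_segregating pop1_list NC_044511.1 1 40000 < Whole.vcf
#   count_segregating pop2_list NC_044511.1 1 40000 < Whole.vcf
```

If these per-population counts come out close to the 49 and 39 reported above while the whole-file count is 60, the mismatch would be explained by sites that do not vary within a given population; if they do not, the cause lies elsewhere.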
