Question

Tajima's D calculation for specific genes in multi-sample VCF

4.0 years ago

Muhammad • 0

Hi, I have a VCF called with GATK's short variant discovery pipeline. The data cover 398 genes in 401 samples. This is targeted sequencing data, not WGS. I want to calculate Tajima's D for each gene. I know vcftools can calculate Tajima's D in bins, but because the genes vary in length and have gaps within them, fixed-size bins can't be applied here. The pegas and PopGenome packages in R also calculate Tajima's D either for the entire dataset or in bins. How can I calculate Tajima's D for each gene separately?

neutrality genetics tajimad • 1.2k views

score 0 · Answer 1 · 2024-07-29

12 months ago

xoaib • 0

You can use a tool called vcf2tajima: https://github.com/xoaib4/vcf2tajima
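If you would rather compute it yourself, here is a minimal Python sketch of the per-gene calculation using Tajima's (1989) formula. It is not taken from vcf2tajima. The function names (`tajimas_d`, `per_gene_d`), the gene coordinates, and the example counts are all hypothetical. It assumes biallelic SNPs, no missing genotypes, and that you have already extracted the ALT allele count per site from the VCF (for example from genotypes or the `AC` INFO field); `n` is the number of chromosomes, so 802 for 401 diploid samples.

```python
from math import sqrt

def tajimas_d(alt_counts, n):
    """Tajima's D from per-site ALT allele counts among n chromosomes.

    Assumes biallelic sites and the same n (no missing data) at every site.
    Returns None when D is undefined (no segregating sites, or n < 4).
    """
    # Keep only segregating sites: ALT present but not fixed.
    seg = [k for k in alt_counts if 0 < k < n]
    S = len(seg)
    if S == 0 or n < 4:
        return None
    # Mean pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in seg)
    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

def per_gene_d(sites, genes, n):
    """Group sites by gene interval and compute D for each gene.

    sites: list of (chrom, pos, alt_count)
    genes: dict of gene name -> (chrom, start, end), 1-based inclusive
    """
    result = {}
    for name, (chrom, start, end) in genes.items():
        counts = [k for c, pos, k in sites if c == chrom and start <= pos <= end]
        result[name] = tajimas_d(counts, n)
    return result

# Hypothetical toy data: 10 chromosomes, two made-up genes.
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 500, 900)}
sites = [
    ("chr1", 110, 1), ("chr1", 150, 1), ("chr1", 190, 1),  # rare variants
    ("chr1", 520, 5), ("chr1", 700, 5), ("chr1", 880, 5),  # intermediate frequency
]
print(per_gene_d(sites, genes, n=10))
```

Because D uses only the number of segregating sites and the pairwise diversity summed over whatever sites fall inside each gene, varying gene lengths and gaps between targets do not require binning. Sites with missing genotypes would need a per-site `n`, which this sketch does not handle.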
