3.4 years ago
Muhammad
Hi, I have variant data in a VCF file, called with GATK's short variant discovery pipeline. It covers 398 genes in 401 samples. This is targeted sequencing data, not WGS. I want to calculate Tajima's D for each gene. I know vcftools can calculate Tajima's D in bins, but because the genes vary in length and contain gaps, fixed-size bins can't be applied here. The pegas and PopGenome packages in R also calculate Tajima's D, either for the entire dataset or in bins. How can I calculate Tajima's D for each gene separately?
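To make the goal concrete, here is a minimal sketch of what "per-gene" Tajima's D could look like in plain Python. It is not taken from vcftools, pegas, or PopGenome. It assumes biallelic sites, no missing genotypes, diploid samples (so n = 2 × 401 = 802 chromosomes), and that alternate-allele counts per site have already been extracted from the VCF. The function names, the `variants` tuples, and the `genes` coordinates are all hypothetical and made up for illustration.

```python
from math import sqrt


def tajimas_d(alt_counts, n):
    """Tajima's D from alternate-allele counts at biallelic sites.

    alt_counts: one count per site (copies of the alt allele among n chromosomes).
    n: number of sampled chromosomes (assumed constant across sites, i.e. no missing data).
    Returns None when D is undefined (no segregating sites, or n < 4).
    """
    # Keep only segregating sites; fixed or absent alleles carry no information.
    counts = [k for k in alt_counts if 0 < k < n]
    s = len(counts)
    if s == 0 or n < 4:
        return None

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    # Mean pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)

    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    return (pi - s / a1) / sqrt(variance)


def tajimas_d_per_gene(variants, genes, n):
    """Group sites by gene interval and compute D for each gene.

    variants: iterable of (chrom, pos, alt_count) tuples.
    genes: dict mapping gene name -> (chrom, start, end), 1-based inclusive.
    """
    variants = list(variants)
    result = {}
    for name, (chrom, start, end) in genes.items():
        counts = [ac for c, pos, ac in variants if c == chrom and start <= pos <= end]
        result[name] = tajimas_d(counts, n)
    return result


if __name__ == "__main__":
    # Hypothetical toy input; real counts would come from the VCF.
    n_chromosomes = 2 * 401
    genes = {"GENE_A": ("chr1", 100, 500), "GENE_B": ("chr2", 1000, 1800)}
    variants = [
        ("chr1", 120, 1), ("chr1", 340, 2), ("chr1", 410, 1),
        ("chr2", 1200, 400), ("chr2", 1500, 350),
    ]
    for gene, d in tajimas_d_per_gene(variants, genes, n_chromosomes).items():
        print(gene, d)
```

Because D depends only on the number of segregating sites and their allele counts, gene length and gaps between targeted regions do not enter the calculation; only the assignment of sites to genes matters. With missing genotypes, n would differ per site and this sketch would need adjusting.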