Hi everyone, I'm an undergrad researching possible selection on ADAM33, a gene associated with asthma through various SNPs. Using Tajima's D, I hope to tell whether the increase in asthma cases is a product of our post-industrial environment (in which case I would expect a pattern consistent with neutral evolution, since that environment is too recent for selection to have left a signature) or whether these variants have always affected people (in which case I would expect some form of negative, i.e. purifying, selection on the SNP alleles). I know how to get the VCF data from the 1000 Genomes Project (they gave me a URL for my region of interest) and load it into RStudio by "importing dataset from URL".
From there, I am unsure how to calculate Tajima's D in R from this VCF data. My knowledge of R is extremely limited compared to many of the people on here, but I was wondering if anyone has done anything similar or has any advice to give? I'm assuming that because VCF data is really just SNP data, it should be possible to calculate Tajima's D from it. Hope to hear from you all, thank you!
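
For reference, the statistic itself only needs, for each variant site, how many of the sampled chromosomes carry the alternate allele. Below is a minimal sketch of that arithmetic, written in Python rather than R so it stays free of package dependencies. The function names `alt_count_from_vcf_line` and `tajimas_d` are hypothetical helpers, not part of any library. The sketch assumes biallelic SNPs, no missing genotypes (so every site has the same sample size n), and standard tab-separated VCF data lines with GT as the first FORMAT subfield.

```python
import math
import re


def alt_count_from_vcf_line(line):
    """Return (alt_allele_count, total_alleles) for one VCF data line.

    Hypothetical helper: reads the GT subfield of each sample column
    (column 10 onward) and counts every non-reference allele as "alt".
    """
    fields = line.rstrip("\n").split("\t")
    alleles = []
    for sample in fields[9:]:
        gt = sample.split(":")[0]               # GT is the first subfield
        alleles.extend(a for a in re.split(r"[|/]", gt) if a != ".")
    return sum(a != "0" for a in alleles), len(alleles)


def tajimas_d(alt_counts, n):
    """Tajima's D from per-site alternate-allele counts among n chromosomes.

    Returns None when D is undefined (fewer than 4 chromosomes or no
    segregating sites). Assumes the same n at every site.
    """
    seg = [k for k in alt_counts if 0 < k < n]  # segregating sites only
    s = len(seg)
    if n < 4 or s == 0:
        return None

    # Constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Mean pairwise differences, summed over sites
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in seg)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    return (pi - s / a1) / math.sqrt(variance)
```

An excess of rare alleles (for example, all singletons) gives a negative D, while an excess of intermediate-frequency alleles gives a positive D. In R, the pegas package provides `tajima.test`, which computes the same statistic from aligned sequences, so that may be a more natural route than reimplementing the formula by hand.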