I am trying to replicate a study published in 2004 in Journal of Genetics called "Mitochondrial DNA sequence variation in the Anatolian Peninsula (Turkey)" by Mergen, et al. In the 2nd part of this study, a neighbor-joining tree of Turkish, Turkic Central Asian, and European populations is built using the HVS-I of their mtDNA. In the tree, the Turkish, Central Asian (Kazakh, Kyrgyz, Uighur), British, and Finnish populations form one pole and the other European populations (Bulgarian, French, German, Greek) form the other. The Turkish are found to be closest to the British of all Europeans, and in the tree and in Nei's genetic distances and identities, the British and Turkic populations appear to cluster together away from the other Europeans.
Needless to say, this is an intriguing result, and I've tried to find citations. There were 4, all of which appeared in studies that were aggregates of other studies. I could not find any substantial commentary or criticism on the finding. In fact, I don't even see any further studies that test the mtDNA of the populations that were tested in this study. It appears that Central Asian populations are not sampled very often, and it's similar for the Turkish populations, so there doesn't seem to be much data. So I want to try to replicate the findings myself.
Unfortunately, I'm a newbie. Though I understand the basic concepts of neighbor-joining trees, which compares the differences in the genomes in the population and tries to find how similar populations are overall, it's difficult to go about doing. I'm wanting to look at data from 1000 Genomes, which though it doesn't have data from any Turkic population, seems like a good basic resource to start accumulating data. I'm looking at using a program like Phylip or Mega7, but I'm having difficulty getting the data into a format that can be useful for those programs. The 1000 Genomes project only provides data in BAM or VCF format which don't appear to be used by programs that build neighbor-joining trees.
Can y'all provide any help for this newbie or am I in way over my head?
Turkish genomes have been published (https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-15-963). I am not sure how you access them, but they have been sequenced.