Hello,
I have 30 VCF files from 30 individuals that were created from RNA-Seq alignments to the mitochondrial genome of the hard clam (Mercenaria mercenaria). I am trying to identify SNPs and create haplogroups to look at the maternal relationships among all 30 individuals.
My questions are:
Do I merge all my VCF files together first (bcftools merge)? If so, how do I know which sample had which SNPs at which location?
Or do I create haplotypes for each VCF/individual first, and then combine the samples into one big dataset?
I have been using pegas in R, but there are so many commands that I don't really know where to start.
In the end I want a phylogenetic tree that shows the relationships among my samples/individuals based on the mitochondrial SNPs. What program or package should I use here? What does the structure of my data need to look like to create a tree?
I have access to R, an HPC system (but installing new software can be a pain), and MEGA. I am also working on a Mac.
Hi
Once you have merged the VCF files using BCFtools, you can simply use the online program VCF2PopTree (https://github.com/sansubs/vcf2pop), published in PeerJ (https://peerj.com/articles/8213/). It is pretty straightforward and very simple to use: you select a few options and submit.
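On the question of which sample had which SNPs at which location: in a merged VCF, each sample gets its own genotype column after the FORMAT column, so the information is kept per individual. Below is a minimal Python sketch of how you could read that table yourself and also write the variable sites out as FASTA, which MEGA can open. It is only an illustration under some assumptions: the sample names (clam01 etc.) and the tiny demo VCF are made up, only simple A/C/G/T SNPs are kept, GT is the first FORMAT field, and any missing or mixed genotype (e.g. "." or "0/1") is written as "N". Note that after bcftools merge, a missing genotype can mean either "reference" or "no coverage", so treat those N calls with care.

```python
# Sketch: per-sample SNP table and SNP-only FASTA from a merged VCF.
# Sample names and the demo records below are hypothetical.

BASES = {"A", "C", "G", "T"}


def parse_merged_vcf(lines):
    """Return (samples, sites); each site is (pos, ref, calls_per_sample)."""
    samples, sites = [], []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        fields = line.split("\t")
        if fields[0] == "#CHROM":
            samples = fields[9:]  # sample columns start after FORMAT
            continue
        pos, ref, alts = int(fields[1]), fields[3], fields[4].split(",")
        alleles = [ref] + alts
        if not all(a in BASES for a in alleles):
            continue  # assumption: keep simple SNPs only, drop indels etc.
        gt_idx = fields[8].split(":").index("GT")
        calls = []
        for entry in fields[9:]:
            idx = entry.split(":")[gt_idx].replace("|", "/").split("/")
            # assumption: missing or mixed (possibly heteroplasmic) calls -> N
            if all(i.isdigit() for i in idx) and len(set(idx)) == 1:
                calls.append(alleles[int(idx[0])])
            else:
                calls.append("N")
        sites.append((pos, ref, calls))
    return samples, sites


def snps_per_sample(samples, sites):
    """Map each sample to its list of (position, non-reference base)."""
    table = {s: [] for s in samples}
    for pos, ref, calls in sites:
        for s, base in zip(samples, calls):
            if base not in (ref, "N"):
                table[s].append((pos, base))
    return table


def to_fasta(samples, sites):
    """Concatenate each sample's base at every kept site into one sequence."""
    records = []
    for i, s in enumerate(samples):
        seq = "".join(calls[i] for _, _, calls in sites)
        records.append(f">{s}\n{seq}")
    return "\n".join(records) + "\n"


# Tiny made-up merged VCF, built with explicit tabs.
demo_rows = [
    ["##fileformat=VCFv4.2"],
    ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT",
     "clam01", "clam02", "clam03"],
    ["MT", "120", ".", "A", "G", "50", "PASS", ".", "GT", "1", "0", "."],
    ["MT", "455", ".", "C", "T", "60", "PASS", ".", "GT:DP",
     "0/0:12", "1/1:9", "0/1:7"],
    ["MT", "900", ".", "G", "GA", "40", "PASS", ".", "GT", "1", "0", "0"],
]
demo_lines = ["\t".join(row) for row in demo_rows]

samples, sites = parse_merged_vcf(demo_lines)
snp_table = snps_per_sample(samples, sites)
fasta_text = to_fasta(samples, sites)

for name, snps in snp_table.items():
    print(name, snps)
print(fasta_text)
```

To use it on real data you would pass an open file handle for the merged VCF instead of demo_lines and write fasta_text to a file. The FASTA could then be loaded into MEGA, or into R with ape's read.dna(format = "fasta") followed by pegas' haplotype(), if you prefer to stay with pegas.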