I’m working with SNP data generated from a de novo assembly of 93 samples (from a ddRADseq pipeline using STACKS). The SNPs were called into a VCF file, and I’m interested in performing lineage analysis to understand evolutionary relationships or genetic clustering among individuals in my dataset.
What I’ve Done: Because my VCF file contains thousands of "chromosomes" (really de novo loci), I renamed them to contig1, contig2, etc., and converted the data to PLINK binary format. I then calculated KING kinship coefficients using --make-king-table, which worked fine.
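The renaming step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the exact command that was run: the function name rename_contigs and the example records (locus IDs 1001 and 2057) are hypothetical, and the input is assumed to be an uncompressed, tab-delimited VCF whose first column is CHROM.

```python
import re

# Matches header lines such as "##contig=<ID=1001,length=150>".
CONTIG_HEADER = re.compile(r"^(##contig=<ID=)([^,>]+)")


def rename_contigs(lines):
    """Yield VCF lines with each distinct CHROM value renamed to
    contig1, contig2, ... in order of first appearance."""
    mapping = {}

    def new_name(old):
        # Assign the next sequential name the first time a contig is seen.
        if old not in mapping:
            mapping[old] = f"contig{len(mapping) + 1}"
        return mapping[old]

    for line in lines:
        header = CONTIG_HEADER.match(line)
        if header:
            # Keep any ##contig header lines consistent with the records.
            yield header.group(1) + new_name(header.group(2)) + line[header.end():]
        elif line.startswith("#") or "\t" not in line:
            # Other header lines and blank lines pass through unchanged.
            yield line
        else:
            chrom, rest = line.split("\t", 1)
            yield new_name(chrom) + "\t" + rest


# Hypothetical example records, for illustration only.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1001\t12\t.\tA\tG\t.\tPASS\t.\n",
    "1001\t40\t.\tC\tT\t.\tPASS\t.\n",
    "2057\t7\t.\tG\tA\t.\tPASS\t.\n",
]
renamed = list(rename_contigs(example))
print("".join(renamed), end="")
```

For a real file, the same generator could be fed an open file handle and its output written line by line to a new VCF, so the whole file never has to be held in memory.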
What I Need: I now want to perform a lineage analysis, ideally grouping individuals into lineages or genetic clusters based on SNP data.
My Questions:
What tools or workflows would you recommend for lineage analysis using SNP data from a de novo assembly? Are there tools that handle VCF files with non-standard chromosome names (e.g., contig1) directly? If specific formats are required, what are the best ways to convert my data for these tools?
I’m particularly looking for tools that work directly with VCF files or offer simple conversion workflows, and that are designed for SNP-based lineage analysis (e.g., clustering or ancestry analysis) rather than whole-genome assembly. Example workflows, tools, or scripts would be highly appreciated. Thank you in advance for your help!