I have genomes from clinical bacterial infections. I'm trying to find out whether any of the isolates have infected multiple people.
As such, I want to find out how many SNPs separate each pair of genomes in the dataset.
There aren't many good-quality reference genomes available, so mapping each isolate to a reference is hard, and I'm relying on the draft assemblies I've produced.
Are there any good tools/pipelines to achieve this? I've used MEGA's pairwise distance matrix, but I'm dubious about the results. I'm not sure whether the quality of the assemblies will have a large impact on the % distance detected (i.e. if genome A's assembly is 2.541 Mb and genome B's assembly is 2.543 Mb, will it score that extra 0.002 Mb as distance even though it's likely an error)?
Do you have a VCF file? Look into vcftools.
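If you do end up with a multi-sample VCF, here is a minimal sketch of how pairwise SNP differences could be counted from it in plain Python. It is not what vcftools does internally; the function name `pairwise_snp_distances` is made up for illustration, and it assumes the genotype (GT) is the first FORMAT field and that the calls are haploid or homozygous. Sites where either sample has a missing call are skipped for that pair, and indels are ignored, so a stretch that is present in one assembly but absent in the other is not counted as SNP distance.

```python
from itertools import combinations

BASES = {"A", "C", "G", "T"}

def pairwise_snp_distances(vcf_lines):
    """Count SNP sites at which two samples carry different called genotypes.

    vcf_lines: iterable of text lines from a multi-sample VCF.
    Returns a dict mapping (sample1, sample2) -> number of differing sites.
    """
    samples = []
    dist = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the 9 fixed columns
            dist = {pair: 0 for pair in combinations(samples, 2)}
            continue
        ref, alts = fields[3], fields[4].split(",")
        # Keep only SNP sites: single-base REF and single-base ALT alleles.
        if ref not in BASES or any(a not in BASES for a in alts):
            continue
        # Assumption: GT is the first colon-separated field of each sample.
        gts = [f.split(":")[0].replace("|", "/") for f in fields[9:]]
        calls = dict(zip(samples, gts))
        for a, b in dist:
            ga, gb = calls[a], calls[b]
            if "." in ga or "." in gb:
                continue  # missing call: do not count it as a difference
            if ga != gb:
                dist[(a, b)] += 1
    return dist

if __name__ == "__main__":
    demo = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tA\tB\tC",
        "chr\t10\t.\tA\tG\t.\tPASS\t.\tGT\t0\t1\t1",
        "chr\t20\t.\tC\tT\t.\tPASS\t.\tGT\t0\t.\t1",
        "chr\t30\t.\tG\tGA\t.\tPASS\t.\tGT\t0\t1\t0",
    ]
    for pair, n in pairwise_snp_distances(demo).items():
        print(pair, n)
```

Note that the number of sites that could actually be compared will differ from pair to pair when calls are missing, so it is worth keeping that in mind before comparing raw counts between pairs.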