Hi
I would like to use fineSTRUCTURE to assess the population structure of a bacterial species, so I will be inputting SNP data.
However, I don't understand how to create the 'phased' data format that fineSTRUCTURE requires. The fineSTRUCTURE manual lists several programmes that can help with phasing, such as PHASE, Beagle, SHAPEIT and IMPUTE2, but I don't know where to even start with these.
For example, PHASE requires me to input my data in the following format:
NumberOfIndividuals
NumberOfLoci
P Position(1) Position(2) Position(NumberOfLoci)
LocusType(1) LocusType(2) ... LocusType(NumberOfLoci)
ID(1)
Genotype(1)
ID(2)
Genotype(2)
.
.
.
ID(NumberOfIndividuals)
Genotype(NumberOfIndividuals)
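To make the layout above concrete, here is a minimal Python sketch of how I understand a file in that shape would be assembled. Everything in it is an assumption on my part rather than anything taken from PHASE itself: the function name `format_phase_input` is hypothetical, the locus types are written as one unseparated string of `S` (for SNP) characters, and each individual's genotype is written as one row of space-separated alleles per haplotype. It only builds the text and does not do any phasing.

```python
def format_phase_input(ids, positions, allele_rows, locus_type="S"):
    """Return text laid out like the PHASE input format quoted above.

    ids         -- list of individual IDs (strings)
    positions   -- list of SNP positions (integers)
    allele_rows -- dict mapping each ID to a list of rows, where each row
                   is a list of alleles, one per locus (assumed layout)
    locus_type  -- single character used for every locus (assumed "S")
    """
    n_loci = len(positions)
    lines = [str(len(ids)), str(n_loci)]
    # Position line: the letter P followed by every locus position.
    lines.append("P " + " ".join(str(p) for p in positions))
    # Locus type line: one character per locus (assumption: no separators).
    lines.append(locus_type * n_loci)
    for ind in ids:
        lines.append(ind)
        for row in allele_rows[ind]:
            if len(row) != n_loci:
                raise ValueError(f"{ind}: expected {n_loci} alleles, got {len(row)}")
            lines.append(" ".join(row))
    return "\n".join(lines) + "\n"


# Made-up example data: two individuals, three SNPs.
example = format_phase_input(
    ["s1", "s2"],
    [10, 25, 40],
    {
        "s1": [["A", "C", "G"], ["A", "T", "G"]],
        "s2": [["A", "C", "G"], ["A", "C", "G"]],
    },
)
print(example)
```

The IDs, positions and alleles here are invented purely for illustration; in practice they would have to come from the VCF or the SNP alignment.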
But how do I get this?
As it stands, I have the core genome alignment, the SNP alignment and a VCF of my data. How do I use these formats to phase my data? Can anyone help point me in the right direction?
Many many thanks!!!
https://people.maths.bris.ac.uk/~madjl/finestructure/fs_4.0.1.zip