mach1 has an option "-h" to specify reference haplotypes for phasing, but I am having trouble finding a suitable reference haplotype file. I would like to use the reference haplotypes from the 1000 Genomes Project, and I have tried
"-h mydir/ALL.chr20.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz", but mach1 does not recognize it. The full command is as follows:
mach1 --compact -h ALL.chr20.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz --hapmapFormat -r 100 -d my.dat -p my.ped --prefix my.phased --phase
This produces the warning "WARNING -- Since no legend file was provided, haplotype file will be ignored".
Am I using the right 1000 Genomes haplotype file (the full file name is shown above), or do I have to supply a "-s snpList.txt"? If the latter, what is the format of snpList.txt, and which SNPs should it include?
Thanks!