Hello Biostars,
I have a cohort of whole genomes that I am about to phase.
Before doing so, I am seeking recommendations on how to do this optimally.
My samples are mostly ASW genomes, with a few Caucasians, and I am planning to use BEAGLE. My current plan is:
- Download the CGI-sequenced CEU, YRI, and ASW genomes from 1kG
- Phase these control genomes first (as far as I know, the CGI genomes are not phased, and these are the ones I intend to use).
- Using these as a reference, phase the case ASW genomes
For the last step, I plan to run something like the following:
java -jar b4.r1274.jar ref=1kG.vcf gt=case.vcf out=out.gt
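To spell out the full two-step plan, here is a rough sketch of how I imagine chaining the runs. The file names and output prefixes are placeholders I made up, and I am assuming BEAGLE 4 writes its result to `<out>.vcf.gz` and phases without a panel when `ref=` is omitted. The script only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch only: every file name below is a placeholder, not a real file.
BEAGLE_JAR="b4.r1274.jar"      # same jar as in the command above
REF_UNPHASED="1kG_cgi.vcf"     # hypothetical: unphased CGI CEU/YRI/ASW genomes
REF_OUT="1kG_cgi.phased"       # hypothetical output prefix for the phased panel
CASE_VCF="case.vcf"            # case genomes to be phased
CASE_OUT="case.phased"         # hypothetical output prefix for the phased cases

# Step 1: phase the control genomes on their own (no ref= argument).
phase_ref_cmd="java -jar ${BEAGLE_JAR} gt=${REF_UNPHASED} out=${REF_OUT}"

# Step 2: phase the cases against the panel produced in step 1.
# Assumption: step 1 wrote its output to ${REF_OUT}.vcf.gz.
phase_case_cmd="java -jar ${BEAGLE_JAR} ref=${REF_OUT}.vcf.gz gt=${CASE_VCF} out=${CASE_OUT}"

# Dry run: print the commands instead of executing them.
echo "${phase_ref_cmd}"
echo "${phase_case_cmd}"
```

Once the paths are real, the printed lines could be run directly, one chromosome at a time if memory becomes a problem.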
Is this a bad plan? Does anyone have recommendations for a better procedure or a better program? Is there a good place to read up on how many phased reference genomes, and of what type, it is helpful to have before phasing your own?
Thank you
Thank you very much. I've seen your comments elsewhere, and I appreciate you responding here as well.
Local ancestry estimates, here I come!