Hi everyone,
I am new to haplotype phasing. I have WGS data for 14 individuals from one human population, and I want to perform haplotype phasing using SHAPEIT2. The supplementary information of the 1000 Genomes paper mentions that phasing was performed in two steps:
- Creation of Haplotype scaffolds from microarray genotypes
- Joint phasing of biallelic SNPs, Indels and high-confidence deletions onto the haplotype scaffold.
My questions are:
As a starting point, I have VCF files containing both SNPs and indels called by the standard GATK pipeline. I don't have genotype array data for the sequenced individuals. Which recent haplotype reference panel should I use to perform phasing?
Is haplotype phasing population-specific (i.e., can individuals from different populations have different haplotype structures)? In other words, can variant sets from individuals of different populations be phased together?
Is there a comprehensive tutorial available online that walks through using SHAPEIT2 step by step, starting from VCF files?
I would be very happy to read any suggestions on how to get started with haplotype phasing using SHAPEIT.
Thanks in advance!
I am also interested in your question 2, about whether populations should be phased together or separately. This is the information I found (see Figure 4):
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3415548/
Note: I haven't confirmed whether these findings apply to other phasing algorithms.