7.5 years ago
Krisr
Hello,
I am interested in obtaining haplotypes (and their frequencies) for a region of the human genome in a particular population from the 1000 Genomes Phase 3 data. I have downloaded the corresponding chromosome (ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/) and used tabix and VCFtools to extract the genotype data for the region of interest from the subset of subjects corresponding to CEU. Does anyone know of a workflow that uses data of this sort to infer haplotypes and their frequencies across the region covered by the subsetted VCF?
Thanks,
Does this prior post help you: "Haplotype frequencies from 1000 genomes"?
Thanks for the reply; it is a solution I may try. I was wondering whether there is a workflow that uses pre-existing tools that are already coded and available.
Not that I can see from a quick search of the net, but you are right to post it on a site like this because people have almost certainly done it.
I would:
1) download the phased 1000 Genomes VCF file
2) trim it to just the SNPs I wanted
3) convert it into my desired input format for Haploview (https://www.broadinstitute.org/haploview/input-file-formats)
4) offload the haplotype frequency calculation to Haploview (it is easy to code yourself, but Haploview gives you everything else in its GUI).
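For the "easy to code" route, here is a minimal sketch in plain Python of counting haplotypes directly from an already phased, already subsetted VCF. It is an illustration, not a tested pipeline. The function name `haplotype_frequencies` is my own, and it assumes diploid genotypes that are all phased (written with `|`, as in the Phase 3 autosomal files), an uncompressed VCF, and that every site in the input belongs to the region of interest. It only counts the haplotypes already present in the file and does no statistical phasing or inference.

```python
from collections import Counter


def haplotype_frequencies(vcf_lines):
    """Return {haplotype: frequency} from the lines of a phased VCF.

    Hypothetical helper for illustration. Assumes diploid, phased
    genotypes (e.g. "0|1") and that all sites should be combined
    into a single haplotype per chromosome copy.
    """
    haplotypes = None  # one allele list per chromosome copy (2 per sample)
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines and the header
        fields = line.split("\t")
        alleles = [fields[3]] + fields[4].split(",")  # REF, then ALT alleles
        gt_index = fields[8].split(":").index("GT")
        genotypes = [sample.split(":")[gt_index] for sample in fields[9:]]
        if haplotypes is None:
            haplotypes = [[] for _ in range(2 * len(genotypes))]
        for i, gt in enumerate(genotypes):
            if gt.count("|") != 1:
                raise ValueError("expected a phased diploid genotype, got " + gt)
            for copy, code in enumerate(gt.split("|")):
                # "." marks a missing allele call; keep it as "."
                allele = "." if code == "." else alleles[int(code)]
                haplotypes[2 * i + copy].append(allele)
    if not haplotypes:
        return {}
    counts = Counter("-".join(h) for h in haplotypes)
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()}


if __name__ == "__main__":
    import sys

    # Usage (assumed): python hapfreq.py subset.vcf
    with open(sys.argv[1]) as handle:
        freqs = haplotype_frequencies(handle)
    for hap, freq in sorted(freqs.items(), key=lambda kv: -kv[1]):
        print(hap, round(freq, 4), sep="\t")
```

Each haplotype is reported as the alleles at consecutive sites joined by `-`, so the output is only readable for a modest number of SNPs. For long regions, Haploview's block-based view is probably the more useful summary.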