18 months ago
Lynne-95
Hello,
I have filtered and processed phased BCF files from WGS. I would like to extract the haplotype data per sample into a tab-delimited file that looks like this:
Sample | Chr | Pos | hap1 | hap2 |
---|---|---|---|---|
AW23 | chr1 | 1234 | A | C |
AW45 | chr1 | 1245 | G | T |
So far I've tried using bcftools as below to extract the genotype data, but that only gives the output as allele indices (0s and 1s), not the alleles themselves. I was also considering reading the data into PLINK 2 and converting it to Oxford .gen format.
bcftools view -c1 -s ${sample} ${in_file}.bcf \
| bcftools query -f '%CHROM\t%POS[\t%GT]\n' \
| sed 's:|:\t:g' - > ${sample}_hap.txt
Any ideas on how best to go about this?
Thanks in advance for any help, and apologies if I've missed this in the manual somewhere.
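One possible approach, sketched here under some assumptions: `bcftools query` has a `%TGT` format tag that prints the translated genotype (e.g. `A|C`) instead of allele indices, and `%SAMPLE` prints the sample name when used inside the per-sample brackets. The helper names `split_haps` and `extract_haps` below are hypothetical, and the sketch assumes every genotype of interest is diploid and phased (separated by `|`); anything else is skipped.

```shell
#!/usr/bin/env bash
# Sketch only: build a per-sample haplotype table from a phased BCF.
# Assumes the installed bcftools supports the %TGT and %SAMPLE tags.

# Split a phased genotype column such as "A|C" into two tab-separated columns.
# Input:  Sample<TAB>Chr<TAB>Pos<TAB>A|C
# Output: Sample<TAB>Chr<TAB>Pos<TAB>A<TAB>C
# Rows whose genotype does not split into exactly two alleles on "|"
# (e.g. unphased "A/C" or haploid calls) are dropped.
split_haps() {
    awk 'BEGIN { FS = OFS = "\t" }
         {
             n = split($4, h, "[|]")
             if (n == 2) print $1, $2, $3, h[1], h[2]
         }'
}

# Hypothetical wrapper. Usage: extract_haps SAMPLE FILE.bcf > SAMPLE_hap.txt
# Keeps the -c1 filter and -s sample selection from the original command.
extract_haps() {
    bcftools view -c1 -s "$1" "$2" \
        | bcftools query -f '[%SAMPLE\t%CHROM\t%POS\t%TGT\n]' \
        | split_haps
}
```

For example, `extract_haps "${sample}" "${in_file}.bcf" > "${sample}_hap.txt"` would write one row per site in the Sample, Chr, Pos, hap1, hap2 layout described above. Doing the split in `awk` on the fourth column only, rather than a global `sed` substitution, keeps any `|` characters elsewhere on the line untouched.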
Thanks, that worked perfectly!