Entering edit mode
3.0 years ago
michael.flower.14
▴
200
I've got paired end sequencing data from a ~500 bp amplicon. I've aligned the data and called variants using gatk to phase the variants, as follows. The phasing information is now under the PGT
tag.
gatk HaplotypeCaller -R $REF -I "$BAM" -O "$DIR"/variants/${SN}_HaplotypeCallerPGT.vcf -ERC GVCF
I now want to output the two phased consensus sequences. I know I can output a phased fasta file like so:
bcftools consensus -p "$PREFIX" -f "$REF" "$PREFIX".vcf.gz > "$DIR"/consensus/${SN}_bwa_consensus.fa
And I can also output the first and second allele calls using the following, but that ignores phasing:
bcftools consensus -p "$PREFIX" -f "$REF" -H 1 "$PREFIX".vcf.gz > "$DIR"/consensus/${SN}_bwa_consensus.a1.fa
bcftools consensus -p "$PREFIX" -f "$REF" -H 2 "$PREFIX".vcf.gz > "$DIR"/consensus/${SN}_bwa_consensus.a2.fa
But how do I get it to pay attention to the phasing in the PGT field?