Consensus sequence for phased variant calls
3.0 years ago

I've got paired-end sequencing data from a ~500 bp amplicon. I've aligned the reads and called variants with GATK HaplotypeCaller, which also phases the variants, as follows. The phasing information is now under the PGT tag.

gatk HaplotypeCaller -R "$REF" -I "$BAM" -O "$DIR/variants/${SN}_HaplotypeCallerPGT.vcf" -ERC GVCF

I now want to output the two phased consensus sequences. I know I can output a phased fasta file like so:

bcftools consensus -p "$PREFIX" -f "$REF" "$PREFIX".vcf.gz > "$DIR"/consensus/${SN}_bwa_consensus.fa

And I can also output the first and second allele calls using the following, but that ignores phasing:

bcftools consensus -p "$PREFIX" -f "$REF" -H 1 "$PREFIX".vcf.gz > "$DIR"/consensus/${SN}_bwa_consensus.a1.fa
bcftools consensus -p "$PREFIX" -f "$REF" -H 2 "$PREFIX".vcf.gz > "$DIR"/consensus/${SN}_bwa_consensus.a2.fa

But how do I get it to pay attention to the phasing in the PGT field?
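
One possible workaround, as far as I can tell: bcftools consensus only reads FORMAT/GT, so the PGT phase would have to be copied into GT first. Below is a minimal sketch of that, not a tested pipeline. The function name pgt_to_gt is made up for illustration, and it assumes a single-sample, uncompressed VCF on stdin, and that only unphased biallelic het calls (GT 0/1) should be rewritten, since GATK documents PGT as always heterozygous and not as a description of the called alleles. It also ignores PID, so it is only meaningful if all phased sites in the amplicon share one phase set.

```shell
# Hypothetical helper: copy GATK's PGT into GT for unphased 0/1 calls.
# Assumes a single-sample VCF (sample values in column 10) on stdin.
pgt_to_gt() {
  awk 'BEGIN { FS = OFS = "\t" }
    /^#/ { print; next }                  # pass header lines through unchanged
    {
      n = split($9, key, ":")             # FORMAT keys, e.g. GT:AD:DP:GQ:PGT:PID:PL
      m = split($10, val, ":")            # sample values, same order as the keys
      gt = 0; pgt = 0
      for (i = 1; i <= n; i++) {
        if (key[i] == "GT")  gt = i
        if (key[i] == "PGT") pgt = i
      }
      # Rewrite only unphased biallelic hets that carry a usable PGT value
      if (gt && pgt && pgt <= m && val[gt] == "0/1" && val[pgt] ~ /^[01][|][01]$/) {
        val[gt] = val[pgt]
        s = val[1]
        for (i = 2; i <= m; i++) s = s ":" val[i]
        $10 = s                           # rebuild the sample column
      }
      print
    }'
}
```

If that assumption holds, something like `bcftools view "$PREFIX".vcf.gz | pgt_to_gt | bgzip > "$PREFIX".phased.vcf.gz` followed by `bcftools index "$PREFIX".phased.vcf.gz` should give a VCF where `-H 1pIu` and `-H 2pIu` use the first or second allele for phased genotypes and fall back to IUPAC codes for anything still unphased. Depending on the GATK version, GT may already be written with `|`, in which case this step is unnecessary.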

gatk phasing bcftools consensus • 630 views