Hi,
I have called somatic variants in tumor-only mode with Mutect2. The column header line and the first record of the output VCF file look like this:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 156390
chr1 17961 . TG T . PASS . GT:AD:AF:DP:F1R2:F2R1:SB 0/1:178,59:0.251:237:96,32:72,24:37,141,10,49
Here "156390" is the sample ID. For a downstream tool, I need the variant allele fraction (AF) to be either in column 8 (INFO) or in a new column 11, formatted as AF=x.
I have tried to solve this with awk:
awk -F '\t' 'BEGIN{OFS="\t"} {split($10, arr, ":"); $11="AF="arr[3]; print}' test_156390.vcf > test.vcf
But this is clearly a suboptimal solution: the values in the new column 11 look right, but AF= also ends up as the column name and is appended to every line in the header.
Is there any better way of doing this?
Thank you
Christian
Thank you for pointing me in the right direction! This is how I solved it and applied it to every VCF in the directory. I could not quite get bcftools annotate to work, likely due to a local server issue, but vcf-annotate works fine as well.
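
For anyone who would rather stay with awk, here is a minimal sketch of the approach from the question. It is separate from the vcf-annotate solution mentioned above. It leaves header lines untouched and writes the value into column 8 (INFO) instead of adding a column 11. The function name add_af_to_info is hypothetical. The sketch assumes a single-sample, tab-delimited VCF, and it assumes the header does not already define an INFO AF field, so it adds a matching ##INFO line before the #CHROM line.

```shell
# Hypothetical helper: copy FORMAT/AF of the single sample into INFO as AF=x.
# Assumes a tab-delimited, single-sample VCF (sample values in column 10).
add_af_to_info() {
  awk -F '\t' 'BEGIN { OFS = "\t" }
    # Declare the new INFO field just before the column header line
    # (assumption: no INFO AF definition exists in the header yet).
    /^#CHROM/ { print "##INFO=<ID=AF,Number=A,Type=Float,Description=\"Allele fraction copied from FORMAT/AF\">" }
    # All header lines pass through unchanged.
    /^#/ { print; next }
    {
      n = split($9, keys, ":")     # FORMAT keys, e.g. GT:AD:AF:DP
      split($10, vals, ":")        # sample values in the same order
      for (i = 1; i <= n; i++) {
        if (keys[i] == "AF") {
          # Replace an empty INFO field, otherwise append to it.
          $8 = ($8 == "." || $8 == "") ? ("AF=" vals[i]) : ($8 ";AF=" vals[i])
          break
        }
      }
      print
    }' "$@"
}
```

It would be called as add_af_to_info test_156390.vcf > test.vcf. Looking up AF by its key in the FORMAT column, rather than by a fixed position, keeps the sketch working if the order of the FORMAT fields differs between records.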