19 months ago
jason
I have a VCF and need to calculate the variant allele frequency for each variant at each position.
My understanding is that variant allele frequency is AD / DP.
There are multiple samples at each position (NA0001, NA0002, NA0003). Do I take the average across them, since each sample has its own AD?
Example VCF:
##fileformat=VCFv4.1
##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
##INFO=<ID=AD,Number=1,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA0001 NA0002 NA0003
chr1 808922 . G A 249 PASS . GT:AD:DP:GQ:PL 0/0:52,0:52:99:0,120,1800 0/0:54,0:54:99:0,120,1800 0/0:50,0:50:99:0,120,1800
it's not.
https://samtools.github.io/hts-specs/VCFv4.1.pdf
Right, but AF is not in the INFO column, so how do I calculate the allele frequency for each variant in the VCF?
Use the bcftools +fill-tags plugin: https://samtools.github.io/bcftools/howtos/plugin.fill-tags.html
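On the per-sample part of the question: the AF that fill-tags writes to INFO is derived from the called genotypes (AC/AN), which is not the same thing as a read-based variant allele frequency. If what you want is the fraction of reads supporting the alt allele, that is a per-sample value, so it is usually reported for each sample rather than averaged. Below is a minimal Python sketch of that calculation for one record. It assumes AD is a FORMAT field as in the example line above, and it divides by the sum of the AD values rather than by DP (my choice here; DP can differ from the AD sum when reads are filtered). The function name `per_sample_vaf` is made up for illustration.

```python
def per_sample_vaf(vcf_line, sample_names):
    """Return {sample: [VAF for each ALT allele]} for one VCF data line.

    Assumes the FORMAT column contains AD (raises ValueError otherwise).
    Splits on any whitespace so that it also works on lines pasted with
    spaces instead of tabs.
    """
    fields = vcf_line.split()
    format_keys = fields[8].split(":")      # e.g. ["GT", "AD", "DP", "GQ", "PL"]
    ad_index = format_keys.index("AD")

    vafs = {}
    for name, sample_field in zip(sample_names, fields[9:]):
        values = sample_field.split(":")
        ad = values[ad_index] if ad_index < len(values) else "."
        counts = ad.split(",")              # ref count first, then one per ALT
        if "." in counts:                   # missing AD for this sample
            vafs[name] = None
            continue
        depths = [int(c) for c in counts]
        total = sum(depths)                 # assumption: denominator is sum(AD)
        vafs[name] = [alt / total for alt in depths[1:]] if total > 0 else None
    return vafs


# The record from the question
line = ("chr1 808922 . G A 249 PASS . GT:AD:DP:GQ:PL "
        "0/0:52,0:52:99:0,120,1800 0/0:54,0:54:99:0,120,1800 "
        "0/0:50,0:50:99:0,120,1800")
samples = ["NA0001", "NA0002", "NA0003"]

print(per_sample_vaf(line, samples))
# {'NA0001': [0.0], 'NA0002': [0.0], 'NA0003': [0.0]}
```

All three samples in the example are 0/0 with no alt reads, so each gets a VAF of 0. For a whole file you would skip the lines starting with `#` and take the sample names from the `#CHROM` header line.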