3 months ago
Whirlingdaf
I am attempting to annotate a phased vcf file with ancestral alleles (AA), but it is not working.
My original vcf file (with the full header omitted) looks like this:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1 Sample2 ...
chr1 2602 . C A . . AC=0;AF=0.0155039;CM=0;AN=22 GT 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
chr1 2614 . G A . . AC=0;AF=0.0193798;CM=1.2e-05;AN=22 GT 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
My annotation file with the AAs I would like to bring over into the original vcf is:
#CHROM POS ID REF ALT QUAL FILTER INFO
chr1 2602 . . . . . AA=C
chr1 2614 . . . . . AA=G
chr1 2649 . . . . . AA=C
chr1 2830 . . . . . AA=G
Here is the bcftools command:
bcftools annotate -a ancestral_alleles.vcf.gz -c CHROM,POS,INFO/AA -h hdr.txt -o annotated_ancestral.vcf original.vcf.gz
The output vcf file header includes:
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##fileDate=07/08/2024 - 01:30:48
##source=shapeit4.1.3
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency">
##INFO=<ID=AC,Number=1,Type=Integer,Description="Allele count">
##INFO=<ID=CM,Number=A,Type=Float,Description="Interpolated cM position">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Phased genotypes">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">
##bcftools_annotateVersion=1.19+htslib-1.19.1
While the file looks like:
chr1 2602 . C A . . AC=0;AF=0.0155039;CM=0;AN=22 GT 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
chr1 2614 . G A . . AC=0;AF=0.0193798;CM=1.2e-05;AN=22 GT 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
But I would like it to look something like:
chr1 2602 . C A . . AC=0;AF=0.0155039;CM=0;AN=22;AA=C GT 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
chr1 2614 . G A . . AC=0;AF=0.0193798;CM=1.2e-05;AN=22;AA=G GT 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0 0|0
I suppose it's not working because, when you use a vcf as the annotation source, the REF (and ALT?) allele is required?
Just try again, but add the REF/ALT for the two variants above, to check whether that works.
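For what it's worth, a rough sketch of that check. It assumes the REF/ALT values are copied from the two records in the original vcf, and the file name ancestral_alleles_refalt.vcf is made up for illustration. With a vcf as the annotation source, `-c INFO/AA` should be enough and `-h` should not be needed, since the AA tag is defined in the annotation header. The bcftools step only runs if the tools and original.vcf.gz are present.

```shell
# Write a small annotation VCF whose REF/ALT match the records to be annotated.
# (File name is hypothetical; REF/ALT are taken from the original vcf shown above.)
{
  printf '%s\n' '##fileformat=VCFv4.2'
  printf '%s\n' '##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t2602\t.\tC\tA\t.\t.\tAA=C\n'
  printf 'chr1\t2614\t.\tG\tA\t.\t.\tAA=G\n'
} > ancestral_alleles_refalt.vcf

# Compress, index and annotate, but only when the tools and the input are available.
if command -v bcftools >/dev/null 2>&1 && command -v bgzip >/dev/null 2>&1 \
   && command -v tabix >/dev/null 2>&1 && [ -f original.vcf.gz ]; then
  bgzip -c ancestral_alleles_refalt.vcf > ancestral_alleles_refalt.vcf.gz
  tabix -f -p vcf ancestral_alleles_refalt.vcf.gz
  bcftools annotate -a ancestral_alleles_refalt.vcf.gz -c INFO/AA \
    -o annotated_ancestral.vcf original.vcf.gz
fi
```

If AA=C and AA=G now show up in the INFO column for these two sites, the missing REF/ALT in the annotation vcf was the likely culprit.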
Ah! Maybe you are on to something. Would you expect this to also be the case for a tab-separated (bgzipped and indexed) file? E.g.:
Yes, a tab-delimited file instead of a vcf should work.
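A minimal sketch of the tab-delimited route, under a few assumptions: the table has three columns (CHROM, POS, AA) built from the annotation records above, the file name ancestral_alleles.tsv is hypothetical, and hdr.txt is assumed to hold the AA INFO definition that appears in the output header. The `-c CHROM,POS,INFO/AA` column list is the one from the original command. The bcftools step is skipped if the tools or original.vcf.gz are missing.

```shell
# Tab-separated annotation table: CHROM, POS, ancestral allele (no header line).
printf 'chr1\t2602\tC\nchr1\t2614\tG\nchr1\t2649\tC\nchr1\t2830\tG\n' > ancestral_alleles.tsv

# Header line describing the new INFO tag, passed to bcftools with -h.
# (Assumed content of hdr.txt; it matches the AA line in the output header above.)
printf '%s\n' '##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">' > hdr.txt

# Compress, index by chromosome/position, and annotate when the tools are available.
if command -v bcftools >/dev/null 2>&1 && command -v bgzip >/dev/null 2>&1 \
   && command -v tabix >/dev/null 2>&1 && [ -f original.vcf.gz ]; then
  bgzip -c ancestral_alleles.tsv > ancestral_alleles.tsv.gz
  # -s1: sequence name in column 1; -b2 -e2: start and end position both in column 2
  tabix -f -s1 -b2 -e2 ancestral_alleles.tsv.gz
  bcftools annotate -a ancestral_alleles.tsv.gz -c CHROM,POS,INFO/AA -h hdr.txt \
    -o annotated_ancestral.vcf original.vcf.gz
fi
```

Since only CHROM and POS are listed, records are matched by position alone here, which is why no REF/ALT columns are needed in the table.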
Thank you so much, yes, this worked!