2.4 years ago
bdolin
I'm somewhat new to SnpSift, looking to annotate a VCF file with gnomAD population frequencies.
My VCF file has an existing INFO.AF field:
##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
and the gnomAD VCF also defines an INFO.AF field (with a different meaning):
##INFO=<ID=AF,Number=A,Type=Float,Description="Alternate allele frequency in samples">
It looks like SnpSift's default behavior is to overwrite an existing INFO.AF wherever a record matches gnomAD, leaving me with a mix of INFO.AF values that mean two different things.
Do I need to remove the existing INFO.AF field first, or is there a way, for instance, to rename the gnomAD annotation as I pull it in?
This is the command I'm running:
java -jar SnpSift.jar annotate gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf.bgz -info "AF" myVCF.vcf > myAnnotatedVCF.vcf
Thanks
From my point of view, they have the very same meaning:
https://samtools.github.io/hts-specs/VCFv4.2.pdf
Unfortunately not. In one case, AF is the sample read frequency for the allele, and in the gnomAD case, AF is the population frequency for the allele.
I see what you are saying: both have the same definition. But here's the actual output. This record was NOT found in gnomAD, so the original INFO.AF is retained as the sample read frequency:
Whereas this record WAS found in gnomAD, so INFO.AF was overwritten by SnpSift and now holds the population frequency:
Ah, I see! SnpSift is not quite doing the right thing here. You could use
bcftools annotate --rename-annots
to rename the AF field before or after running SnpSift.
Thank you!
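For anyone landing here later, the rename-first approach suggested above could look roughly like the sketch below. It assumes bcftools' documented `--rename-annots` option, which reads a map file of `TYPE/old_name new_name` pairs. The new tag name `SAMPLE_AF`, the map file name `rename_af.txt`, the intermediate file name, and the helper function `annotate_with_gnomad` are all hypothetical, chosen only for illustration. The SnpSift call mirrors the command in the question, with the `-info` option placed before the database file.

```shell
# Map file for `bcftools annotate --rename-annots`: one "TYPE/old new" pair per line.
# SAMPLE_AF is a hypothetical name for the original (sample read frequency) AF tag.
printf 'INFO/AF\tSAMPLE_AF\n' > rename_af.txt

# Hypothetical helper: rename the existing INFO/AF, then pull in gnomAD's AF.
annotate_with_gnomad() {
    in_vcf="$1"       # your VCF, e.g. myVCF.vcf
    gnomad_vcf="$2"   # gnomAD sites VCF used as the SnpSift database
    out_vcf="$3"      # annotated output
    renamed_vcf="${out_vcf%.vcf}.renamed.vcf"   # intermediate file (assumed name)

    # Step 1: move the existing INFO/AF out of the way so it cannot be overwritten.
    bcftools annotate --rename-annots rename_af.txt -O v -o "$renamed_vcf" "$in_vcf"

    # Step 2: annotate with gnomAD; INFO/AF in the output now comes only from gnomAD.
    java -jar SnpSift.jar annotate -info AF "$gnomad_vcf" "$renamed_vcf" > "$out_vcf"
}
```

It would then be called as `annotate_with_gnomad myVCF.vcf gnomad.exomes.r2.1.1.sites.liftover_grch38.vcf.bgz myAnnotatedVCF.vcf`. After that, records found in gnomAD should carry both `SAMPLE_AF` and `AF`, while records not found should carry only `SAMPLE_AF`, so the two meanings are no longer mixed under one tag. Note that the renamed header line keeps its original Description text.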