Hi, I'm new to variant calling, and I am following the Somatic short variant discovery (SNVs + Indels) pipeline with mouse data. So far I have not run into any trouble. However, according to the GetPileupSummaries documentation, the tool requires a VCF of common germline variant sites, e.g. derived from the gnomAD resource, with population allele frequencies (AF) in the INFO field. I have acquired a dbSNP VCF from Sanger's Mouse Genomes Project for the FVB/N strain, GRCm38_68 assembly (FTP link to mgp.v5.merged.snps_all.dbSNP142.vcf.gz, listed under current_snps), but it does not contain the AF annotation in the INFO field. In addition, the tool requires ONLY biallelic sites, which I do not know how to extract. Is there any way to make the VCF meet these requirements?
Thanks in advance for your response. I appreciate any support or guidance on the topic :) Have a nice day.