Hello, I am new to the GATK pipeline and have some questions about the resource files used in the somatic variant filtering step.
I call each sample separately with Mutect2, using 1000g_pon.hg38.vcf.gz as the panel of normals and af-only-gnomad.hg38.vcf.gz as the germline resource. Both files come from the GATK storage bucket (available at https://console.cloud.google.com/storage/browser/gatk-best-practices/somatic-hg38;tab=objects?prefix).
For the downstream filtering of the variants, I see two approaches: running FilterMutectCalls directly as a single command, or running three steps in order: GetPileupSummaries, CalculateContamination, and then FilterMutectCalls. Which method should I choose? (https://gatk.broadinstitute.org/hc/en-us/articles/360035531132--How-to-Call-somatic-mutations-using-GATK4-Mutect2)
If I opt for the three-step process, I understand that GetPileupSummaries requires both a -V file and a -L file. The -V file should be a biallelic VCF, and the -L file can be a .bed or .interval_list file. Many forum posts suggest passing af-only-gnomad.hg38.vcf.gz or somatic-hg38_small_exac_common_3.hg38.vcf.gz as both -V and -L.
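For reference, this is roughly how I understand the three-step route would be run. It is only a sketch: every file name below is a placeholder, the `run` helper and the `DRY_RUN` variable are my own hypothetical additions so that the commands are printed instead of executed by default, and passing the same VCF to both -V and -L is simply what the forum posts suggest, not something I have confirmed is correct.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder inputs (assumptions, not my real paths).
TUMOR_BAM="tumor.bam"
REF="reference.fasta"
UNFILTERED_VCF="tumor.unfiltered.vcf.gz"   # output of Mutect2
# Common-variant sites file; used for both -V and -L as suggested on forums.
COMMON_SITES="somatic-hg38_small_exac_common_3.hg38.vcf.gz"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: summarize read support at common variant sites.
run gatk GetPileupSummaries \
  -I "$TUMOR_BAM" \
  -V "$COMMON_SITES" \
  -L "$COMMON_SITES" \
  -O tumor.pileups.table

# Step 2: estimate contamination from the pileup summary.
run gatk CalculateContamination \
  -I tumor.pileups.table \
  -O tumor.contamination.table

# Step 3: filter the Mutect2 calls using the contamination estimate.
run gatk FilterMutectCalls \
  -R "$REF" \
  -V "$UNFILTERED_VCF" \
  --contamination-table tumor.contamination.table \
  -O tumor.filtered.vcf.gz
```

Running the script as written only prints the three commands; setting `DRY_RUN=0` would actually invoke gatk. The single-command alternative would be the last step alone, without `--contamination-table`.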
I am confused about which files to use and whether a BED file is necessary. If it is, how can I create a BED file?