Dear all, I am quite new to the GATK tools. I would like to obtain a VCF file of known SNPs to use in the base quality score recalibration (BQSR) step of GATK. However, looking at Ensembl, I have found only 'gallus_gallus.gvf.gz', which is not a VCF file as GATK requires. I have also found variants on the European Variation Archive (https://www.ebi.ac.uk/eva/?Home) that I could export as a VCF file. Actually, all the papers I have seen retrieve the known-variants file from dbSNP. Also, there is no chicken genome available in the GATK bundle. Could anybody advise on how best to retrieve the VCF file needed by GATK?
Thanks
What are the errors when you try the Ensembl VCF with GATK?
GATK BQSR takes known-sites files, typically one of known SNPs and one of known INDELs.
Maybe you should split the Ensembl
gallus_gallus.vcf.gz
file into SNPs and INDELs and try it that way?

Thanks, @barslmn. Actually, Ensembl has only a GVF file (not a VCF file), so I think it cannot be used in GATK base quality score recalibration.
Ensembl has VCF files here: https://ftp.ensembl.org/pub/release-108/variation/vcf/ and you can also convert GVF to VCF format.
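
Once a VCF has been downloaded, the split into SNPs and INDELs suggested earlier in the thread could look roughly like the Python sketch below. It is only an illustration using the standard library, not a GATK or Ensembl tool: the function names (`open_text`, `classify`, `split_vcf`) are made up here, and the classification rule (single-base REF and ALT alleles count as a SNP, any length difference counts as an INDEL, everything else is skipped) is a simplifying assumption.

```python
import gzip
from pathlib import Path


def open_text(path, mode="rt"):
    """Open a plain or gzip-compressed text file, chosen by its extension."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, mode)
    return open(path, mode)


def classify(ref, alts):
    """Return 'snp', 'indel', or 'other' for one record's REF and ALT columns."""
    # Ignore missing ('.') and spanning-deletion ('*') alleles.
    alleles = [a for a in alts.split(",") if a not in (".", "*")]
    if not alleles:
        return "other"
    # Symbolic alleles (<DEL>) and breakends are left out of both outputs.
    if any(a.startswith("<") or "[" in a or "]" in a for a in alleles):
        return "other"
    if len(ref) == 1 and all(len(a) == 1 for a in alleles):
        return "snp"
    if any(len(a) != len(ref) for a in alleles):
        return "indel"
    return "other"  # same-length multi-base substitutions (MNPs)


def split_vcf(in_path, snp_path, indel_path):
    """Copy header lines to both outputs and route each record by its type."""
    counts = {"snp": 0, "indel": 0, "other": 0}
    with open_text(in_path) as src, \
            open_text(snp_path, "wt") as snps, \
            open_text(indel_path, "wt") as indels:
        for line in src:
            if line.startswith("#"):
                # Header and meta lines belong in both files.
                snps.write(line)
                indels.write(line)
                continue
            fields = line.rstrip("\n").split("\t")
            kind = classify(fields[3], fields[4])  # REF and ALT columns
            counts[kind] += 1
            if kind == "snp":
                snps.write(line)
            elif kind == "indel":
                indels.write(line)
    return counts
```

For example, `split_vcf("gallus_gallus.vcf.gz", "known_snps.vcf", "known_indels.vcf")` would write two uncompressed VCFs (the output names are placeholders). Each output would still need to be indexed before GATK can use it as a known-sites file, and its contig names need to match those of the reference used for alignment.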