I have used snpEff to annotate the 1000 Genomes phase 3 VCF files. However, I am getting a large number of warnings that read 'WARNING_REF_DOES_NOT_MATCH_GENOME'. A large number of these warnings indicates a mismatch between the VCF file and the database snpEff is using.
I have been using the hg38 database and get this warning for almost 25% of my variants. This surprises me, as I am using data directly from 1000 Genomes and the latest hg38 database. I was hoping someone could shed some light on why this is.
N.B. This is more a matter of interest than anything else, as I have downloaded the GENCODE-annotated VCF files that the 1000 Genomes website provides and can use them instead.
Check to make sure that chromosome names match in your VCF files and the genome you are using.
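One way to run that check is to collect the chromosome names from each file and compare the two sets. The following is only a minimal sketch, assuming you have the reference sequence available as a FASTA file; the function names and the file names in the usage note are made up for illustration and are not part of snpEff.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def vcf_chromosomes(lines):
    """Collect the CHROM column (first field) of every VCF data line."""
    names = set()
    for line in lines:
        # Skip meta-information lines, the header line, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0].strip())
    return names


def fasta_chromosomes(lines):
    """Collect sequence names from FASTA header lines ('>name description')."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            parts = line[1:].split()
            if parts:
                names.add(parts[0])
    return names


def compare_names(vcf_names, genome_names):
    """Report which names are shared and which appear in only one source."""
    return {
        "shared": sorted(vcf_names & genome_names),
        "only_in_vcf": sorted(vcf_names - genome_names),
        "only_in_genome": sorted(genome_names - vcf_names),
    }
```

With hypothetical file names, you could call it as `compare_names(vcf_chromosomes(open_text("variants.vcf.gz")), fasta_chromosomes(open_text("genome.fa")))`. An empty "shared" list, for example `1` in the VCF against `chr1` in the genome, would point to a naming mismatch. Note that this only compares names; it does not tell you whether the two files are on the same genome build.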