Hello, I have some PLINK data from Affymetrix 5.0 arrays and I am converting them to VCFs for imputation. However, I am running into a problem where my VCF files do not match my reference files. The references I have are the GRCh37 FASTA and the GRCh38 primary assembly. Is there a way for me to find out which reference build my data is on? I am getting a 75.5% mismatch when I run the bcftools fixref plugin.
Can you look at some of the mismatching sites to see what may be happening? Also, which array annotation files did you download?
Array probe sequences should be mostly independent of genome build, as they refer only to sequence; however, the genomic base positions annotated for these probes will change from one build to another.
Annotation files for SNP 5.0 are available here: http://www.affymetrix.com/support/technical/byproduct.affx?product=genomewidesnp_5
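Since the positions, rather than the probe sequences, are what differ between builds, one rough way to guess the build is to check, for each candidate reference, how often the reference base at a site's position is one of that site's two alleles. The snippet below is only a minimal sketch of that idea: `match_rate`, the site tuples, and the toy sequences are all hypothetical stand-ins, and it assumes alleles are reported on the forward strand, which may not hold for array data.

```python
def match_rate(sites, reference):
    """Fraction of sites whose reference base is one of the two alleles.

    sites: iterable of (chrom, pos, allele1, allele2), pos is 1-based.
    reference: dict mapping chrom name -> sequence string (toy stand-in
    for a real FASTA lookup). Strand flips are NOT handled here.
    """
    checked = 0
    matched = 0
    for chrom, pos, a1, a2 in sites:
        seq = reference.get(chrom)
        if seq is None or pos < 1 or pos > len(seq):
            continue  # site cannot be looked up in this reference
        checked += 1
        base = seq[pos - 1].upper()
        if base in (a1.upper(), a2.upper()):
            matched += 1
    return matched / checked if checked else 0.0


# Toy data (made up for illustration, not real genome sequence).
sites = [("1", 1, "A", "G"), ("1", 2, "C", "T"),
         ("1", 3, "G", "A"), ("1", 4, "T", "C")]
build_a = {"1": "ACGTACGTAC"}
build_b = {"1": "TTTTTTTTTT"}

rates = {"build_a": match_rate(sites, build_a),
         "build_b": match_rate(sites, build_b)}
best_guess = max(rates, key=rates.get)
```

With real data, the dictionaries would be replaced by lookups into the GRCh37 and GRCh38 FASTA files (for example through an indexed FASTA reader such as pysam's `FastaFile`), and the build with the clearly higher rate is the likelier one. A wrong build should sit near chance level, roughly half of biallelic sites, while strand differences can lower the rate even on the correct build.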