Hello,
I have a VCF file that I want to upload to the Sanger Imputation Server. The following error occurred:
--- Aborted Job ---
The input file sanity check failed, "bcftools norm -ce" exited with the following message:
Reference allele mismatch at X:3155141 .. REF_SEQ:'T' vs VCF:'G'
As suggested on the Sanger website, I tried to solve this issue with the bcftools +fixref command.
All my SNPs have dbSNP IDs, so I downloaded the following file for reordering alleles: ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh37p13/VCF/All_20180423.vcf.gz
When I now run the command
bcftools +fixref broken.vcf -O z -o fixref.vcf -- -d -f /path/to/reference.fasta -i All_20151104.vcf.gz
the following error appears:
[E::bgzf_uncompress] Inflate operation failed: invalid distance too far back
[E::bgzf_read_block] Invalid BGZF header at offset 15203091877
It seems that the All_20151104.vcf.gz file is corrupted. I am also not able to index it with bcftools. However, another operation (subsetting it to regions) works...
Does anyone know how to solve this problem?
Best,
Andreas
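One quick way to test the suspicion that the download is corrupted is to let gzip verify the whole compressed stream. The sketch below is only an illustration: `check_gz` is a hypothetical helper name, and it assumes a standard `gzip` is on the PATH. A BGZF file is a series of concatenated gzip members, so `gzip -t` should be able to read it, but a file truncated exactly at a block boundary might still pass this test.

```shell
# Hypothetical helper: report whether a gzip/BGZF file decompresses cleanly.
# Usage: check_gz All_20151104.vcf.gz
check_gz() {
    # gzip -t reads the entire file and verifies it without writing output
    if gzip -t "$1" 2>/dev/null; then
        echo "OK: $1"
    else
        echo "CORRUPT: $1"
        return 1
    fi
}
```

If this reports CORRUPT, downloading the file again (and comparing it against the checksum on the FTP site, if one is provided) is probably the simplest fix. That subsetting to some regions works would be consistent with damage that only affects a later part of the file.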
hg19: chrX:3155141 is T
hg18: chrX:3155141 is G
Aren't you mixing hg* builds?
I think/hope not... everything should be hg19... This might be a stupid question, but where can I quickly check this for some SNPs?
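One way to check a few SNPs is to look up the base in the reference FASTA at each position and compare it with the REF column of the VCF. The sketch below is a minimal illustration, not the server's own check: `ref_base` is a hypothetical helper, it assumes an uncompressed FASTA, and it assumes the chromosome name matches the FASTA header exactly (e.g. `X` versus `chrX`).

```shell
# Hypothetical helper: print the base at a 1-based position of one sequence
# in a plain (uncompressed) FASTA file.
# Usage: ref_base /path/to/reference.fasta X 3155141
ref_base() {
    awk -v chrom="$2" -v pos="$3" '
        /^>/ {
            if (found) exit          # reached the next sequence: position not found
            name = substr($1, 2)     # header name without ">" and description
            found = (name == chrom)
            offset = 0               # bases of this sequence seen so far
            next
        }
        found {
            len = length($0)
            if (pos <= offset + len) {
                print toupper(substr($0, pos - offset, 1))
                exit
            }
            offset += len
        }
    ' "$1"
}
```

For the position in the error message this would be called as `ref_base /path/to/reference.fasta X 3155141`. If samtools is installed, `samtools faidx /path/to/reference.fasta X:3155141-3155141` returns the same base much faster on an indexed file. If most of the checked positions disagree with the VCF, that would point to a build mismatch rather than a few swapped alleles.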