I am trying to run the fixref plugin to correct for snpflip errors in my TOPMed imputation. I found this code here: https://samtools.github.io/bcftools/howtos/plugin.fixref.html
for i in {1..22}; do
  bcftools norm --check-ref e -f "$OUTDIR/DAC14_send_to_topmed/Homo_sapiens_assembly38_withchrfa.fa" "$OUTDIR/DAC14_send_to_topmed/DAC14_chr${i}_hg38_nonduplicates.vcf.gz" -Ou -o /dev/null
done
As my build is hg38 and I need to keep the chr prefix in my reference file, I decided to use the GATK hg38 build, Homo_sapiens_assembly38.fasta, found here: https://console.cloud.google.com/storage/browser/genomics-public-data/resources/broad/hg38/v0;tab=objects?prefix=&forceOnObjectsSortingFiltering=false
I keep receiving this error message:
Failed to load the fai index: /sc/arion/projects/psychgen2/MAP2_dac/data/imputation/DAC14_send_to_topmed/Homo_sapiens_assembly38_withchr.fasta [E::fai_build_core] Format error, unexpected "<" at line 2
I cannot seem to find the solution to this error.
Interesting... perhaps I downloaded it incorrectly?
I used this to download... perhaps this cannot be done from a cloud console?
Also, I used this file because I simply need an hg38 reference genome with the chr prefix for running the fixref plugin. Supposedly this is the main one available from GATK? In this same bucket, it was the only FASTA file available; the others were vcf.gz files, which would not work as the reference for this fixref script.
You downloaded the web page, not the FASTA file, which would explain the unexpected "<" in the error.
From cloud.google.com, I think you need to download it through the browser.
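A quick way to confirm this diagnosis is to look at the first non-empty line of the downloaded file: an uncompressed FASTA starts with a ">" header, while a saved web page starts with "<". Below is a minimal sketch using only standard shell tools. The function name looks_like_fasta and the path in the usage example are placeholders, not anything from bcftools or GATK.

```shell
# Hypothetical helper: succeed if the first non-empty line of the file starts
# with ">" (a FASTA header), fail otherwise (e.g. an HTML page starting "<").
looks_like_fasta() {
    first_line=$(awk 'NF { print; exit }' "$1")
    case "$first_line" in
        ">"*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Example usage; the path is a placeholder for wherever the file was saved.
ref="Homo_sapiens_assembly38.fasta"
if [ -f "$ref" ]; then
    if looks_like_fasta "$ref"; then
        echo "$ref starts with a '>' header and looks like FASTA"
    else
        echo "$ref does not look like FASTA; first lines are:"
        head -n 2 "$ref"
    fi
fi
```

This only applies to an uncompressed FASTA; a gzip- or bgzip-compressed file would need to be decompressed before the check.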
Yes, you're right.
So that any other rookies don't make this mistake, use:
^ This one worked for some reason...
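After re-downloading, one way to check that the reference is usable for the loop over chromosomes 1 to 22 is to index it and confirm that the chr-prefixed autosomes appear in the index. The sketch below assumes samtools is installed (samtools faidx writes a .fai file next to the FASTA, with the sequence name in the first tab-separated column). The function name missing_autosomes and the path are placeholders.

```shell
# Hypothetical helper: print each of chr1..chr22 that is absent from the
# first column of a .fai index (name, length, offset, linebases, linewidth).
missing_autosomes() {
    fai="$1"
    i=1
    while [ "$i" -le 22 ]; do
        if ! awk -F '\t' -v c="chr$i" '$1 == c { found = 1 } END { exit !found }' "$fai"; then
            echo "chr$i"
        fi
        i=$((i + 1))
    done
}

# Example usage; the path is a placeholder. Runs only if samtools is available.
ref="Homo_sapiens_assembly38.fasta"
if [ -f "$ref" ] && command -v samtools >/dev/null 2>&1; then
    samtools faidx "$ref" && missing_autosomes "$ref.fai"
fi
```

No output from missing_autosomes means all 22 chr-prefixed autosomes are present in the index.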