Hello,
I made the following reference fasta file (amplicons.fa) with only two sequences:
>BALV3loop
GTACAAGACCCAACAACAATACAAGAAAAAGTATAAATATAGGACCAGGCAGAGCATTTTATACAACAGGAGAAATAATAGGAGATATAAGACAAGCACATTGTAACCTTAGTAGAGCAAAATGGAATGACACTTTAAATAAGATAGTTATAAAATTAAGAGAACAATTTGGGAATAAAACAATAGTCTTTAAGCACTCCTCAGGAGGGGACCCAGAAATTG
>NL4-3V3loop
GTACAAGACCCAACAACAATACAAGAAAAAGTATCCGTATCCAGAGGGGACCAGGGAGAGCATTTGTTACAATAGGAAAAATAGGAAATATGAGACAAGCACATTGTAACATTAGTAGAGCAAAATGGAATGCCACTTTAAAACAGATAGCTAGCAAATTAAGAGAACAATTTGGAAATAATAAAACAATAATCTTTAAGCAATCCTCAGGAGGGGACCCAGAAATTG
I am trying to index the reference to run "alfred qc" on my aligned reads with the following command:
samtools faidx amplicons.fa
I get the following error:
[E::fai_build_core] Format error, unexpected character at line 1
[faidx] Could not build fai index amplicons.fai
I am not sure what could be wrong since, as far as I know, the reference headers are named correctly. Any help would be appreciated.
Sara
You're not the first person who has somehow ended up with a UTF-8 byte order mark in their input files. The Unicode standard actually recommends against using one in UTF-8, but clearly some Windows tool is "helpfully" creating text files with these invisible headers.
In case we see this again, it would be useful to know which tool you used to create these files, so we can recommend that people don't use it in the future!
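For anyone who wants to check a file for this, here is a minimal Python sketch. The UTF-8 byte order mark is the three bytes EF BB BF, so when it is present the first line no longer starts with ">", which fits the "unexpected character at line 1" message. The function names are my own, and the sketch assumes the file is small enough to read into memory, so treat it as a starting point rather than a definitive fix.

```python
import codecs
from pathlib import Path


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark (EF BB BF)."""
    with open(path, "rb") as handle:
        return handle.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def strip_utf8_bom(path):
    """Rewrite the file in place without a leading BOM.

    Returns True if a BOM was found and removed, False if the file was left alone.
    Assumption: the whole file fits comfortably in memory.
    """
    target = Path(path)
    data = target.read_bytes()
    if not data.startswith(codecs.BOM_UTF8):
        return False
    target.write_bytes(data[len(codecs.BOM_UTF8):])
    return True
```

Calling strip_utf8_bom("amplicons.fa") and then rerunning "samtools faidx amplicons.fa" should work if the byte order mark was the only problem.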
I used LibreOffice Writer, as I work on Linux. I saved the file as a txt file and then renamed it to a FASTA file using "mv". I will probably use vim next time.
I work on Linux too (Pop!_OS), and yeah, I recommend not using a word processor like LibreOffice Writer to write code or scripts. If you want a GUI/IDE, I recommend VS Code, which is what I currently use.