I have read various threads here suggesting that the "Parse error reading exome.fasta.amb" error is due to whitespace in my reference file hg19.fa.
Basically, I run:
$ bwa mem ref.fa read1.fq read2.fq > aln-pe.sam
and get this BWA error:
[bns_restore_core] Parse error reading ref.fa.amb
I think I have removed the whitespace using:
tr -d ' ' < in.fa >out.fa ## remove all spaces
I also tried, without success:
sed 's/ *$//g' in.fasta > out.fasta ## remove trailing spaces
I am trying to align/map exome.fastq to an indexed hg19 (ref.fa). I have read that bwa mem has issues with whitespace, but I am stuck here. Has anyone had the same issue and solved it? Is there a workaround? Thanks.
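For what it's worth, here is a rough sketch of how one might check where the whitespace actually sits before stripping it. Note that `tr -d ' '` also deletes spaces inside the `>` header lines, which glues the description onto the sequence name. The demo file and its contents below are made up for illustration; the idea is to cut each header at the first whitespace and to remove whitespace only from sequence lines. Since bwa mem reads the index files rather than the FASTA itself, the index would need to be rebuilt after any change to the FASTA.

```shell
# Hypothetical demo FASTA; substitute your own file.
tmp=$(mktemp -d)
printf '>chr1 some description\nACGT \nNNAC\n>chr2\nGG TT\n' > "$tmp/in.fa"

# Count sequence (non-header) lines that contain any whitespace,
# including tabs and carriage returns.
bad_seq=$(grep -v '^>' "$tmp/in.fa" | grep -c '[[:space:]]')
echo "sequence lines with whitespace: $bad_seq"

# Headers: keep only the first word (the sequence name).
# Sequence lines: delete every whitespace character.
awk '/^>/ { print $1; next } { gsub(/[[:space:]]/, ""); print }' \
    "$tmp/in.fa" > "$tmp/out.fa"

cat "$tmp/out.fa"
```

If the count is already zero on the real hg19.fa, whitespace is probably not the cause and the index itself is the next thing to look at.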
You should download hg19.fa again and re-index it. I never had to remove spaces from hg19.fa ...
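A minimal sketch of what re-indexing could look like, assuming the reference is called ref.fa (a placeholder) and bwa is on the PATH. `bwa index` writes five files next to the reference (.amb, .ann, .bwt, .pac, .sa); one plausible cause of a parse error on the .amb file is a stale or truncated index, so the sketch first lists missing or empty index files and then rebuilds everything from scratch. The helper names `missing_index_files` and `reindex` are made up for this example.

```shell
# Print the index files that are missing or empty for a given reference.
missing_index_files() {
    ref=$1
    for ext in amb ann bwt pac sa; do
        [ -s "$ref.$ext" ] || echo "$ref.$ext"
    done
}

# Delete any old index files and rebuild the index.
reindex() {
    ref=$1
    rm -f "$ref.amb" "$ref.ann" "$ref.bwt" "$ref.pac" "$ref.sa"
    bwa index "$ref"
}

# Usage (not run automatically):
#   missing_index_files ref.fa    # prints nothing if all five files exist
#   reindex ref.fa
```

Indexing a whole human genome takes a while, so it is worth letting it finish and confirming all five files are present before running bwa mem again.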
I downloaded the hg19 tar file from UCSC, untarred it, and concatenated all the FASTA files into one file (ref_hg19.fa). Do you think concatenating every file could cause the problem? Should I concatenate only chr1-22, X, Y, M?
Or should I download a pre-indexed hg19 from iGenomes?
Where did you obtain your hg19, and how did you index it?
Thanks.
Should I concatenate only chr1-22, X, Y, M?
What did you concatenate besides those? Of course you should concatenate only those! How did you concatenate them, by the way?
I got my hg19 from UCSC by downloading the reference chromosome by chromosome and concatenating the files.
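As a sketch of the chromosome-by-chromosome approach: a plain `cat *.fa` over the untarred directory would pick up every FASTA file there, in alphabetical order. The function below (the name `concat_primary` is made up) instead loops over an explicit list of chr1-22, X, Y, M. It assumes the files are named chr1.fa, chr2.fa, and so on, which may need adjusting for your download.

```shell
# Concatenate only chr1-22, X, Y, M, in that order, into one FASTA file.
# Stops with an error if an expected per-chromosome file is missing.
concat_primary() {
    dir=$1
    out=$2
    : > "$out"    # start from an empty output file
    for c in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y M; do
        f="$dir/chr$c.fa"
        if [ -f "$f" ]; then
            cat "$f" >> "$out"
        else
            echo "missing: $f" >&2
            return 1
        fi
    done
}

# Usage (not run automatically; paths are placeholders):
#   concat_primary ./chromFa ref_hg19.fa
#   grep -c '^>' ref_hg19.fa    # should report 25 sequences
```

The concatenated file then needs to be indexed with `bwa index` before running bwa mem against it.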