Hello
I'm using BWA to build the index for aligning some RNA-seq FASTQ files.
The first thing I did was download hg38.fa.align.gz from UCSC.
Then I ran:
gzip -d hg38.fa.align.gz
sudo apt-get install bwa
Here comes the problem. The BWA instructions recommend the bwtsw algorithm, but when I use it, it crashes:
bwa index -p ref_hum -a bwtsw hg38.fa.align
[bwa_index] Pack FASTA... 7.36 sec
[bwa_index] Construct BWT for the packed sequence...
Floating point exception (core dumped)
When I don't specify the algorithm, it finishes, but suspiciously fast:
bwa index -p ref_hum hg38.fa.align
[bwa_index] Pack FASTA... 7.42 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.00 seconds elapse.
[bwa_index] Update BWT... 0.00 sec
[bwa_index] Pack forward-only FASTA... 7.33 sec
[bwa_index] Construct SA from BWT and Occ... 0.00 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index -p ref_hum hg38.fa.align
[main] Real time: 14.780 sec; CPU: 14.747 sec
I'm worried I might be losing information, since the bwa instructions say:
bwa index [-p prefix] [-a algoType] <in.db.fasta>
-a STR Algorithm for constructing BWT index. Available options are:
is IS linear-time algorithm for constructing suffix array. It requires 5.37N memory where N is the size of the database. IS is moderately fast, but does not work with database larger than 2GB. IS is the default algorithm due to its simplicity. The current codes for IS algorithm are reimplemented by Yuta Mori.
bwtsw Algorithm implemented in BWT-SW. This method works with the whole human genome
Thanks for the help
Without -a I might be using "is" instead of "bwtsw". I don't feel this is a solution.
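I changed the genome file from hg38.fa.align.gz to hg38.fa.gz and it worked. The .align file is RepeatMasker alignment output, not the genome sequence in FASTA format.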
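For anyone who hits the same crash, a quick sanity check before running bwa index is to confirm that the file really starts with a FASTA header line (a line beginning with ">"). Below is a minimal sketch, assuming a POSIX shell with gzip and head available. looks_like_fasta is a made-up helper name, not part of BWA, and it only inspects the first line, so it is a rough check rather than a validator.

```shell
# Hypothetical helper (not part of BWA): succeed only if the first line
# of a plain or gzip-compressed file starts with '>', like a FASTA header.
looks_like_fasta() {
    # gzip -cdf decompresses gzip input and copies plain text through unchanged
    first_line=$(gzip -cdf "$1" | head -n 1)
    case "$first_line" in
        '>'*) return 0 ;;   # looks like a FASTA header
        *)    return 1 ;;   # anything else, including an empty file
    esac
}
```

It could be used as: looks_like_fasta hg38.fa.gz && bwa index -p ref_hum -a bwtsw hg38.fa (after decompressing), so that indexing only starts when the check passes.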
BWA is not meant for RNA-seq, because it is not a splice-aware aligner. Don't ignore that.