Question

[E::bwa_idx_load_from_disk] fail to locate the index files

2.7 years ago

melissachua90 ▴ 70

I want to use BWA to align my paired-end dataset.

First, I indexed the reference genome:

bwa index -p refseq -a is refseq.fa

Next, I ran bwa mem:

bwa mem -aHMP -t 20 refseq.fa corrected_data.tar.gz

I also tried:

cd corrected_data/
for f in `ls -1 *_1.fq.gz | sed 's/_1.fq.gz//'`;
do bwa mem -aHMP -t 20 refseq.fa $f\_1.fq.gz $f\_2.fq.gz;
done

Error message:

[E::bwa_idx_load_from_disk] fail to locate the index files

alignment BWA • 1.5k views

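I think your problem is the prefix -p refseq: with it, bwa index names the index files refseq.* (refseq.bwt, refseq.sa, and so on), not refseq.fa.*. I imagine you should use bwa mem -aHMP -t 20 refseq corrected_data.tar.gz instead of refseq.fa.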
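
To see the mismatch quickly, a small shell check can confirm which prefix actually has index files next to it. This is only a sketch: bwa_index_exists is a hypothetical helper name (not part of BWA), and it assumes the five standard files that bwa index writes for a prefix (.amb, .ann, .bwt, .pac, .sa).

```shell
# Sketch: succeed only if all five BWA index files exist for the given prefix.
# bwa_index_exists is a made-up helper name, not a BWA command.
bwa_index_exists() {
    for _ext in amb ann bwt pac sa; do
        if [ ! -f "$1.$_ext" ]; then
            echo "missing index file: $1.$_ext" >&2
            return 1
        fi
    done
    return 0
}
```

With the index built using -p refseq, bwa_index_exists refseq should succeed while bwa_index_exists refseq.fa should fail, which matches the error message in the question.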

Reply • 2.7 years ago by Pierre Lindenbaum 164k

Tarballs with bwa-mem, where did you get that from? I doubt that works, or is this some edge case documentation I missed all these years?

Reply • 2.7 years ago by ATpoint 86k

Just tried that, and it even works, but each individual file in the tarball is treated as a single-end technical replicate, so for paired-end data one would need two tarballs?! My advice would be to use the standard syntax of feeding the individual fastq files.
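
If the reads only exist as a tarball, one option is to unpack it first and then pass the mate files by name. A minimal sketch, assuming a gzipped tar archive that contains *.fq.gz files; unpack_reads and the output directory are hypothetical names chosen for illustration.

```shell
# Sketch: unpack a gzipped tarball of reads so each FASTQ file can be
# passed to bwa mem by name. unpack_reads is a made-up helper name.
# The body runs in a subshell, so its variables do not leak to the caller.
unpack_reads() (
    tarball=$1
    out_dir=$2
    mkdir -p "$out_dir" || exit 1
    tar -xzf "$tarball" -C "$out_dir" || exit 1
    # List the unpacked read files, one per line, as a sanity check.
    find "$out_dir" -type f -name '*.fq.gz' | sort
)
```

For example, unpack_reads corrected_data.tar.gz corrected_data would print the unpacked FASTQ paths, which can then be given to bwa mem as separate _1 and _2 arguments.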

Reply • 2.7 years ago by ATpoint 86k

for f in `ls -1 *_1.fq.gz | sed 's/_1.fq.gz//'`;

You should use a Makefile or, better, a workflow manager like Snakemake or Nextflow.
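
Short of a full workflow manager, the loop can at least avoid parsing ls output by letting the shell glob the _1 files and deriving each mate's name from them. The following is a sketch under assumptions: align_pairs and the BWA variable are made up for illustration, the bwa mem flags are copied from the question as-is, and the output is written as an uncompressed SAM file next to the reads.

```shell
# Sketch: align every *_1.fq.gz / *_2.fq.gz pair found in a directory.
# align_pairs and the BWA variable are hypothetical; set BWA=echo for a
# dry run that writes the command line instead of an alignment.
# The body runs in a subshell, so its variables do not leak to the caller.
align_pairs() (
    index_prefix=$1
    reads_dir=$2
    for r1 in "$reads_dir"/*_1.fq.gz; do
        [ -e "$r1" ] || continue          # the glob matched nothing
        sample=${r1%_1.fq.gz}             # strip the mate-1 suffix
        r2=${sample}_2.fq.gz
        if [ ! -e "$r2" ]; then
            echo "skipping $r1: mate file $r2 not found" >&2
            continue
        fi
        "${BWA:-bwa}" mem -aHMP -t 20 "$index_prefix" "$r1" "$r2" > "${sample}.sam"
    done
)
```

Called as align_pairs refseq corrected_data, this passes the index prefix (refseq, not refseq.fa) and both mate files explicitly, and skips any _1 file whose _2 partner is missing.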

Reply • 2.7 years ago by Pierre Lindenbaum 164k

Thank you for the suggestion! I'm new to Nextflow (and bash scripts), but this is my attempt.

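nextflow.enable.dsl=2

// Script parameters
params.query = "../corrected_data/*_{1,2}.fq.gz"   // paired FASTQ files
params.db = "/../refseq.*"                          // index files written by bwa index -p refseq

process alignment {
        input:
                tuple val(sample), path(reads)   // sample name plus its two mate files
                path db                          // index files, staged into the work directory
        output:
                path "${sample}.aligned.sam.gz"

        script:
        """
        bwa mem -aHMP -t 20 refseq ${reads[0]} ${reads[1]} | gzip -3 > ${sample}.aligned.sam.gz
        """
}

workflow {
        query = Channel.fromFilePairs(params.query)   // emits [sample, [mate1, mate2]]
        db = Channel.fromPath(params.db).collect()    // one list, reused for every sample
        alignment(query, db)
}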

Call it:

nextflow nextflow_corrected_data

There are plenty of bugs and I'm still attempting to correct it. But if you can take a look at the script, that would be excellent!

Reply • 2.7 years ago by melissachua90 ▴ 70