Nextflow: [E::bwa_idx_load_from_disk] fail to locate the index files
20 months ago
Eliveri ▴ 350

I have a Nextflow workflow that fails with the following error.
(I checked the path /path/genomes/human.fa, and the bwa index files exist there.)

Command error:
  [E::bwa_idx_load_from_disk] fail to locate the index files

I think the error may be due to the way the path variable is written in nextflow.config or main.nf. The nextflow.config looks like this:

params {
    refdir = "./genomes"
    ref = "${params.refdir}/human.fa"
}

And main.nf looks like this for the bwa process:

ref = file(params.ref)
refdir = file(params.refdir)

... 
process bwa_align {

    tag "align ${pair_id}"

    publishDir "${params.outdir}/$pair_id"

    input:
    tuple val(pair_id), path(paired_reads), path(unpaired_reads)

    output:
    tuple val(pair_id), path("${pair_id}.sam")

    script:

    """
    bwa mem -t ${params.max_threads} -M $ref ${paired_reads} > ${pair_id}.sam
    """
}
sge bwa nextflow • 1.7k views

It looks like the bwa index files are not visible inside the task environment. If you are running in a container, are you sure the bind path is appropriate? You may need to set export SINGULARITY_BIND="/path/to/project/directory" in the beforeScript Nextflow directive. Additionally, it may not be strictly necessary to add value channels to your input declaration, but I always declare all required files and paths there, including the index files. It makes checking for errors logically easier. So, if I needed the ref.dict and ref.fasta.fai index files, I would do something like this:

// Resolve the reference and its companion files up front; fail early if any is missing
ref_genome = file( params.RefGen, checkIfExists: true )
ref_dir    = ref_genome.getParent()
ref_name   = ref_genome.getBaseName()
ref_dict   = file( "${ref_dir}/${ref_name}.dict", checkIfExists: true )
// Glob alternatives are comma-separated inside braces, not separated by '|'
ref_index  = file( "${ref_dir}/${ref_name}.{fasta,fna,fa}.fai", checkIfExists: true )

// Pair each BAM with its index, keyed by the file name with .bam/.bai stripped
Channel
  .fromFilePairs("${params.ProcBamDir}/*{bam,bai}") { f -> f.name.replaceAll(/\.bam|\.bai$/, '') }
  .ifEmpty { error "No bams found in ${params.ProcBamDir}" }
  .map { ID, files -> tuple(ID, files[0], files[1]) }
  .set { processed_bams }

process Some_Process {
  input:
  // DSL1-style input ('from'); 'tuple' replaces the deprecated 'set' qualifier
  tuple val(SampleID), path(bam), path(bai) from processed_bams
  path ref_genome
  path ref_index
  path ref_dict

  [rest of process...]
}

What do you specify in your apptainer profile? I'm guessing something may be missing from it. Here is an example from nf-core/configs for a specific HPC that uses Singularity; it might give you some ideas: https://github.com/nf-core/configs/blob/550f4745b61449cd2a57ac7ee5f232a87dd6450a/conf/abims.config#L10-L11


Hi, the apptainer profile in nextflow.config looks like this:

apptainer {
    conda.enabled           = false
    apptainer.enabled       = true
    apptainer.autoMounts    = true
    docker.enabled          = false
    process.container       = 'file://image.sif'
}
20 months ago
refdir = "./genomes"

Use a full (absolute) path for refdir.
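
One way to apply this in nextflow.config, as a minimal sketch: anchor refdir to Nextflow's implicit projectDir variable (the directory containing main.nf) so the value no longer depends on where the run is launched. This assumes the genomes directory sits next to main.nf; if it lives elsewhere, write out the real absolute path instead.

```groovy
params {
    // Assumption: genomes/ lives alongside main.nf. projectDir is a Nextflow
    // implicit variable, so the resulting path is absolute.
    refdir = "${projectDir}/genomes"
    ref    = "${params.refdir}/human.fa"
}
```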


I ended up passing in refdir as an input into my processes. Making sure full paths are used resolved most of the issues I encountered! Thank you.
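
For later readers, a rough sketch of what passing refdir as a process input might look like in DSL2. The workflow name ALIGN and the channel name trimmed_reads are hypothetical placeholders, and human.fa is taken from the config in the question; this is not necessarily the exact code that was used.

```groovy
process bwa_align {
    tag "align ${pair_id}"

    input:
    tuple val(pair_id), path(paired_reads), path(unpaired_reads)
    path refdir   // declared as an input, so Nextflow stages it into the task work directory

    output:
    tuple val(pair_id), path("${pair_id}.sam")

    script:
    """
    bwa mem -t ${params.max_threads} -M ${refdir}/human.fa ${paired_reads} > ${pair_id}.sam
    """
}

// Hypothetical wrapper workflow; trimmed_reads stands for whatever channel
// emits (pair_id, paired_reads, unpaired_reads) tuples upstream.
workflow ALIGN {
    take:
    trimmed_reads

    main:
    // file() resolves params.refdir to an absolute path and fails early if it is missing
    bwa_align(trimmed_reads, file(params.refdir, checkIfExists: true))

    emit:
    sam = bwa_align.out
}
```

Because the directory is a declared input rather than a global variable referenced inside the script block, the index files travel with the task, which is also what apptainer.autoMounts relies on when deciding which host paths to mount.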
