Nextflow script for RNA data analysis
2.0 years ago
Srinka ▴ 20

This is the code I have written up to the mapping step. It works fine through trimming, since that step takes a single input, but it throws an error at mapping. Please help me find where I went wrong.

//rna_script.nf
nextflow.enable.dsl=2

/*
 * pipeline input parameters
 */
params.reads = "/home/user2/rnatoy/data/ggal/*.fq"
params.genome = "/home/user2/rnatoy/data/ggal/ggal_1_genome.fa"
params.outdir = 'results'

log.info """\
         R N A  P I P E L I N E    
         =============================
         genome: ${params.genome}
         reads : ${params.reads}
         outdir: ${params.outdir}
         """
         .stripIndent()


/*
 * Step 1. QUALITY CHECK
 */
process QUALITY {

    input :
    path( params.reads)



        script:
        """
        fastqc  ${params.reads} -o /home/user2/Dock/results

        """
}
/*
 * Step 2.TRIM
*/

process TRIM {

    input :
    path params.reads

    output :
    path '*trimmed.fq', emit : trimmed

        script:
        """
        trim_galore  ${params.reads} -o /home/user2/Dock/results/TRIMMED/trimmed.fq
        """

}

/*
 * Step 3. Builds the genome index required by the mapping process
 */
process buildIndex {
    input:
    path params.genome

    output:
    path 'genome.index*', emit : index

    """
    bowtie2-build --threads $task.cpus ${params.genome}  genome.index
    """
}

/*
 * Step 3. ALIGNMENT
 */
process mapping {     
    input:
    path TRIM.out
    path buildIndex.out 

    output:
    path "accepted_hits.sam"

    """
    bowtie2 -x buildIndex.out ${trimmed} -S accepted_hits.sam
    """
}

workflow {

    buildIndex(Channel.fromPath( params.genome ) )
    QUALITY(Channel.fromPath( params.reads ) )  
    TRIM(Channel.fromPath( params.reads ) )
    mapping(Channel.fromPath(TRIM.out, buildIndex.out ))


}

[Screenshot: command error]

    trim_galore  ${params.reads} -o /home/user2/Dock/results/TRIMMED/trimmed.fq

This line is wrong.

trim_galore should read the file declared in the process's own input block rather than ${params.reads}, and the output location shouldn't be an absolute path.
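A minimal sketch of what that could look like for the TRIM process from the question. It assumes single-end reads and relies on trim_galore's default behaviour of writing `<basename>_trimmed.fq` into the current directory, which inside a Nextflow task is the work directory. The input variable name `reads` is an illustrative choice, not something from the original script.

```nextflow
process TRIM {
    input:
    path reads                      // staged into the task work directory by Nextflow

    output:
    path '*_trimmed.fq', emit: trimmed

    script:
    """
    # no -o option: trim_galore writes next to where it runs,
    # so Nextflow can pick the result up from the work directory
    trim_galore ${reads}
    """
}
```

Because the output is declared relative to the work directory, downstream processes can receive it through a channel without knowing where it lives on disk.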


With TRIM.trimmed the error is as follows (screenshot not shown):

2.0 years ago

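TRIM does have an "out", but the output channel inside it is named "trimmed":

path '*trimmed.fq', emit : trimmed

So you want something like TRIM.out.trimmed, passed directly to the next process rather than wrapped in Channel.fromPath().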
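As a hedged sketch, the workflow block from the question could be wired like this. It assumes `mapping` declares two plain `path` inputs (reads first, index second) and that the other processes keep the `emit` names shown in the question; the variable names `reads_ch` and `genome_ch` are illustrative.

```nextflow
workflow {
    reads_ch  = Channel.fromPath(params.reads)
    genome_ch = Channel.fromPath(params.genome)

    buildIndex(genome_ch)
    QUALITY(reads_ch)
    TRIM(reads_ch)

    // Process outputs are already channels, so they are passed as-is.
    // collect() turns the index output into a value channel, so the same
    // index files are reused for every trimmed read file.
    mapping(TRIM.out.trimmed, buildIndex.out.index.collect())
}
```

`Channel.fromPath()` is only for turning file paths or glob patterns into a channel; it is not needed for the output of another process.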

2.0 years ago
Ming Tommy Tang ★ 4.5k

Unless you want to have some custom code, I suggest you start with https://nf-co.re/rnaseq


I encourage people to write their own workflows and understand Nextflow logic. nf-core pipelines can be valuable, but by just running them you won't learn a single piece of Nextflow.


Yes, you are right. If you want to learn it, you should write it yourself. If it is a one-off thing that you have to do quickly (as in industry), going with the dockerized workflow is the right choice.

ADD REPLY
0
Entering edit mode
2.0 years ago

Paths in Nextflow

I see you are trying to write Nextflow output to a hard-coded path when trimming. Don't do this; it will cause major problems. Let Nextflow handle input and output directories and paths for you as much as possible.

    trim_galore  ${params.reads} -o /home/user2/Dock/results/TRIMMED/trimmed.fq

So just do something like this (trim_galore's -o option takes an output directory, not a file name, so leaving it out writes the trimmed reads into the task's work directory):

    trim_galore  ${params.reads}

I would also learn how to put your reads in an input channel instead of passing ${params.reads} around everywhere. See great examples here:

https://nextflow-io.github.io/patterns/
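A small sketch of the input-channel idea, using the QUALITY step from the question. The variable names `reads` and `reads_ch` are illustrative, and using `publishDir` to copy results into `params.outdir` is one possible way to replace the hard-coded results path, not the only one.

```nextflow
nextflow.enable.dsl=2

params.reads  = "/home/user2/rnatoy/data/ggal/*.fq"
params.outdir = 'results'

process QUALITY {
    // Nextflow copies the declared outputs here after the task finishes
    publishDir params.outdir, mode: 'copy'

    input:
    path reads

    output:
    path '*_fastqc.html'
    path '*_fastqc.zip'

    script:
    """
    # write reports into the task work directory
    fastqc ${reads} -o .
    """
}

workflow {
    // one channel item (and so one QUALITY task) per file matching the glob
    reads_ch = Channel.fromPath(params.reads, checkIfExists: true)
    QUALITY(reads_ch)
}
```

The process body only ever refers to `reads`, the file Nextflow staged for that task, so the same process works unchanged if the reads later come from another process instead of `params.reads`.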


Thank you so much for this help but another error is thrown as shown in the picture -

The workflow was given as -

workflow {

mapping(Channel.fromPath('TRIM.out', 'buildIndex.out'))

}

The process is written as -

process mapping {     
    input:
    path 'TRIM.out'
    path 'buildIndex.out'


    """
    bowtie2 -x $buildIndex.out $TRIM.out -S accepted_hits.sam
    """

}

[Screenshot: error message]


Have a look at the complete Nextflow examples for RNA-seq - these deal with multiple inputs and outputs, which is the problem here.

https://github.com/CRG-CNAG/CalliNGS-NF/blob/master/modules.nf

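workflow {

// pass the process outputs directly; collect() lets the index be reused for every read file
mapping(TRIM.out.trimmed, buildIndex.out.index.collect())

}

// I write my processes like this:

process mapping {     
    input:
    path trimmed_R1  // just try with Read1 until you get the hang of how it works, Read2 will complicate the pipeline
    path index       // a bowtie2 index is mostly 6+ files; they are all staged here when the channel emits them as a list

    output:
    path "accepted_hits.sam"

    """
    # -x takes the index basename used in bowtie2-build, not the individual index files
    bowtie2 -x genome.index -U $trimmed_R1 -S accepted_hits.sam
    """

}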


Thank you. Also, can you help me control which command runs first? I am not able to find any option that does that.


Sorry, I'm not sure what you mean. I think it's worth working through a lot of entry-level Nextflow material: pass one input file and one output file through a workflow using Nextflow DSL2. Nothing fancy, just to gain experience. Or take an RNA-seq pipeline like the one I suggested, break it, and repair it, googling the error messages to get more experience. The Nextflow docs are pretty good and provide lots of examples.
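A toy version of that exercise might look like the sketch below: one file goes in, one file comes out of each step. The process names, the `params.input` parameter and the file `sample.txt` are all hypothetical. It also illustrates how execution order is decided in Nextflow: there is no command to set it, since a process runs once its input channel has data, so COUNT_LINES waits for UPPERCASE because it consumes UPPERCASE's output.

```nextflow
nextflow.enable.dsl=2

params.input = 'sample.txt'   // hypothetical input file

process UPPERCASE {
    input:
    path infile

    output:
    path 'upper.txt'

    script:
    """
    tr '[:lower:]' '[:upper:]' < ${infile} > upper.txt
    """
}

process COUNT_LINES {
    input:
    path infile

    output:
    path 'count.txt'

    script:
    """
    wc -l < ${infile} > count.txt
    """
}

workflow {
    UPPERCASE(Channel.fromPath(params.input))
    // consuming UPPERCASE.out is what makes this step run second
    COUNT_LINES(UPPERCASE.out)
    COUNT_LINES.out.view()
}
```

Swapping the two calls inside the workflow block would be an error rather than a reordering, because `UPPERCASE.out` only exists after UPPERCASE has been invoked.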


Here is a YouTube video of an RNA-seq pipeline being built in Nextflow -
