Question

I want to assemble all the reads at once in Linux

4.7 years ago

Bioinfo ▴ 20

Hello, I'm new to bioinformatics and I need some help with Linux, please.

I have a folder containing 109 pairs of forward and reverse reads (218 files), and I want to assemble each pair using SPAdes. The read files are named like this: SRR....R.fastq.gz (reverse) and SRR...F.fastq.gz (forward).

I need to use a for loop to assemble all the reads at once, because I don't want to repeat the same command 109 times. For example, if I have four read files, SRR1.F.fastq.gz, SRR2.F.fastq.gz, SRR1.R.fastq.gz and SRR2.R.fastq.gz, I need to pair SRR1.F with SRR1.R, and so on.

Can you help me, please?

Thank you in advance

assembly sequence genome • 913 views

updated 4.7 years ago by mrmrwinter ▴ 30 • written 4.7 years ago by Bioinfo ▴ 20

What you need is a pipeline manager such as Nextflow, WDL, or Snakemake. Take a look at https://github.com/nf-core/mag, for instance.

4.7 years ago by Asaf 10k

Thank you for your response, but I couldn't find the page (Page not found).

4.7 years ago by Bioinfo ▴ 20

This question has been asked a lot on the forum. Take a look at the "Similar posts" list on the right-hand side, or search the forum via the search bar or Google.

4.7 years ago by Joe 21k

Okay, I will look for similar posts. Thank you.

4.7 years ago by Bioinfo ▴ 20

score 0 · Answer 1 · 2020-03-05

In Snakemake:

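    # Collect sample names from the forward-read files.
    # glob_wildcards returns a namedtuple, so take its .sample field.
    SAMPLES = glob_wildcards("path/{sample}.F.fastq.gz").sample

    # Target rule: request one assembly per sample.
    rule all:
        input:
            expand("path/spades_out/{sample}/scaffolds.fasta", sample=SAMPLES)

    # Assemble one read pair with SPAdes (it writes scaffolds.fasta).
    rule spades:
        input:
            F="path/{sample}.F.fastq.gz",
            R="path/{sample}.R.fastq.gz"
        output:
            "path/spades_out/{sample}/scaffolds.fasta"
        params:
            out="path/spades_out/{sample}/"
        shell:
            "spades.py -1 {input.F} -2 {input.R} -o {params.out}"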
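
If you would rather stay with a plain loop, as the question asks, here is a minimal Python sketch of the pairing logic. It assumes the files follow the `<sample>.F.fastq.gz` / `<sample>.R.fastq.gz` naming from the question. The function name `paired_spades_commands` and the one-output-directory-per-sample layout are my own choices for illustration, not anything SPAdes requires. It only builds the commands, so you can inspect them before running anything.

```python
from pathlib import Path

FORWARD_SUFFIX = ".F.fastq.gz"
REVERSE_SUFFIX = ".R.fastq.gz"


def paired_spades_commands(read_dir, out_dir):
    """Return one spades.py command (as an argument list) per complete read pair."""
    read_dir, out_dir = Path(read_dir), Path(out_dir)
    commands = []
    # Loop over the forward files only; the reverse file name is derived from it.
    for forward in sorted(read_dir.glob("*" + FORWARD_SUFFIX)):
        sample = forward.name[: -len(FORWARD_SUFFIX)]
        reverse = read_dir / (sample + REVERSE_SUFFIX)
        if not reverse.exists():
            continue  # no matching reverse file, so skip this sample
        commands.append(
            ["spades.py", "-1", str(forward), "-2", str(reverse),
             "-o", str(out_dir / sample)]
        )
    return commands
```

To actually run the assemblies one after another, you could loop over `paired_spades_commands("path", "path/spades_out")` and pass each list to `subprocess.run(cmd, check=True)`. Printing the commands first is a cheap way to check that every SRR accession was paired as expected.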