5.1 years ago
Assa Yeroslaviz
I'm trying to implement a snakemake workflow for my fastq files.
This is my rule for mapping:
gz_command="--readFilesCommand zcat" if config["gzipped"] else ""
main snakefile
configfile:"config.yaml"
SAMPLES=["1_S1", "2_S2", "3_S3", "4_S4"]

rule all:
    input:
        directory("data/starIndex/"),
        bam=expand("mapped/star/bamFiles/{sample}.bam", sample=SAMPLES),
        counts=expand("mapped/star/CountsFiles/{sample}.counts.tab", sample=SAMPLES),
        expand("mapped/star/bamFiles/{sample}.bam.bai", sample=SAMPLES)

# Genome indexing
include:"./Star.GenomeIndexing.Snakefile"
# Genome Mapping
include:"./Star.Mapping.Snakefile"
Mapping step (Star.Mapping.Snakefile)
rule map_star:
    input:
        R1='data/samples/paired-end/{sample}_R1.fastq',
        R2='data/samples/paired-end/{sample}_R2.fastq',
        index=directory("data/starIndex/")
    output:
        bam='mapped/star/bamFiles/{sample}.bam',
        counts='mapped/star/CountsFiles/{sample}.counts.tab'
    params:
        prefix = 'mapped/bams/star/{sample}.',
        starlogs = 'mapped/starlogs',
        gz_support=gz_command
    threads: 16
    shell:
        r'''
        STAR --runThreadN {threads}\
        --genomeDir {input.index}\
        --outFileNamePrefix {params.prefix} --readFilesIn {input.R1} {input.R2} {params.gz_support}\
        --outSAMtype BAM SortedByCoordinate\
        --limitBAMsortRAM 50000000000\ #50 Gib
        --quantMode GeneCounts\
        --outReadsUnmapped Fastx &&\
        mv {params.prefix}Aligned.sortedByCoord.out.bam {output.bam} &&\
        mv {params.prefix}counts.tab {output.counts} &&\
        mkdir -p {params.starlogs} &&\
        mv {params.prefix}Log.final.out {params.prefix}Log.out {params.prefix}Log.progress.out {params.starlogs}
        '''

rule index:
    input:
        'mapped/star/bamFiles/{sample}.bam'
    output:
        'mapped/star/bamFiles/{sample}.bam.bai'
    shell:
        'samtools index {input}'
My problem is that the threads parameter is not being recognized.
When I test the workflow with --dryrun, I see this command for the mapping:
STAR --runThreadN 1\
--genomeDir data/starIndex/\
--outFileNamePrefix mapped/bams/star/1_S1. --readFilesIn data/samples/paired-end/1_S1_R1.fastq data/samples/paired-end/1_S1_R2.fastq \
--outSAMtype BAM SortedByCoordinate\
--limitBAMsortRAM 50000000000\ #50 Gib
--quantMode GeneCounts\
--outReadsUnmapped Fastx &&\
...
which tells me that there is only one thread active. Why is that?
How can I make the command use the thread count set in the rule itself?
Thanks, Assa
Thanks. I forgot to add it.
Please do not close a post after it has received a response. If a given answer resolves the question then please accept it so others can get an indication on how to resolve this.
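A note on the threads question above. Snakemake documents that a rule's threads value is an upper bound: at run time it is reduced to the number of cores given with --cores (or -j). Older Snakemake releases fell back to a single core when that option was omitted, which would be consistent with the --runThreadN 1 seen in the dry run, although the thread does not state which option was missing. The sketch below is not Snakemake code; effective_threads is a hypothetical helper that only mimics the capping rule.

```python
def effective_threads(rule_threads: int, cores: int) -> int:
    """Threads a job would get: the rule's request, capped by the core budget.

    Hypothetical helper for illustration only; it is not part of Snakemake.
    """
    if rule_threads < 1 or cores < 1:
        raise ValueError("rule_threads and cores must both be at least 1")
    return min(rule_threads, cores)


# map_star declares `threads: 16`; vary the core budget given on the command line.
for cores in (1, 8, 16, 32):
    print(f"--cores {cores}: STAR --runThreadN {effective_threads(16, cores)}")
```

Under that assumption, a dry run started with something like snakemake --cores 16 --dryrun should show --runThreadN 16 for map_star, while a budget of 1 core gives the behaviour described in the question.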