Entering edit mode
5.3 years ago
lawrence.hsu
▴
10
I'm running into a nameerror where says 'sample' is unknown in this context. I'm thinking it has to do with the --outFileNamePrefix in the shell where it won't allow {sample}. I've tested it with coding a single sample to be processed and it works fine. I'm not sure how to proceed.
configfile: "config.yaml"
SAMPLES= ['A','B','C']
rule all:
input:
expand("01-qc/{sample}_fastqc.html", sample=config['samples']),
expand('01-qc/{sample}_fastqc.zip', sample=config['samples']),
expand('02-trimmed/{sample}.fastq_trimming_report.txt', sample=config['samples']),
expand('02-trimmed/{sample}_trimmed.fq', sample=config['samples']),
expand("03-qc/{sample}_trimmed_fastqc.html", sample=config['samples']),
expand('03-qc/{sample}_trimmed_fastqc.zip', sample=config['samples']),
expand('04-aligned-2pass/{sample}_ID/Aligned.out.sam', sample=SAMPLES),
expand('04-aligned-2pass/{sample}_ID/Log.final.out', sample=SAMPLES)
rule fastQC01:
input:
sample=lambda wildcards: config['samples'][wildcards.sample]
output:
html="01-qc/fastqc/{sample}_fastqc.html",
zip="01-qc/fastqc/{sample}_fastqc.zip" # the suffix _fastqc.zip is necessary for multiqc to find the file. If not using multiqc, you are free to choose an arbitrary filename
params: ""
log:
"logs/fastqc/{sample}.log"
shell:
"fastqc {input} -o 01-qc/fastqc/"
rule trimgalore02:
input:
sample=lambda wildcards: config['samples'][wildcards.sample]
output:
report="02-trimmed/{sample}.fastq_trimming_report.txt",
fq="02-trimmed/{sample}_trimmed.fq"
log:
"logs/trimgalore/{sample}.log"
shell:
"trim_galore --length 20 -q 20 {input} -o 02-trimmed/"
rule fastQC03:
input:
"02-trimmed/{sample}_trimmed.fq"
output:
html="03-qc/{sample}_trimmed_fastqc.html",
zip="03-qc/{sample}_trimmed_fastqc.zip" # the suffix _fastqc.zip is necessary for multiqc to find the file. If not using multiqc, you are free to choose an arbitrary filename
params: ""
log:
"logs/03-qc/{sample}.log"
shell:
"fastqc {input} -o 03-qc/"
rule STAR04:
input:
expand("02-trimmed/{sample}_trimmed.fq", sample=SAMPLES)
output:
sam=expand("04-aligned-2pass/{sample}_ID/Aligned.out.sam", sample=SAMPLES),
sum=expand("04-aligned-2pass/{sample}_ID/Log.final.out", sample=SAMPLES)
log:
expand("logs/star/{sample}.log", sample=SAMPLES)
shell:
"/data/p_magnuson_lab/bin/STAR-2.6.0c/bin/Linux_x86_64/STAR "
"--genomeDir /data/p_magnuson_lab/reference/mouse/STAR_index "
"--runThreadN 8 "
"--readFilesCommand zcat "
"--twopassMode Basic "
"--alignEndsType EndToEnd "
"--alignIntronMax 1 "
"--readFilesIn {input} "
"--outFileNamePrefix 04-aligned-2pass/{sample}_ID/"
You don't want
expand
in a wildcard rule - leave that for your targets. A wildcard rule is a recipe, not a manifest.Have you tried running by changing {sample} to {wildcards.sample} in your shell commands?
Yes I have tried that but it comes back with the error: 'Wildcards' object has no attribute 'sample'
You get the error because
STAR04
doesn't have any wildcard.{sample}
is already resolved byexpand
, hencewildcards.sample
is not a valid attribute.Perhaps this post will help. Use a workflow management tool to manage your computational pipelines . Your use of
expand
in both input and output inSTAR04
doesn't make sense.That was my issue. Thanks!
May be it should be {wildcards.samples} since you have
SAMPLES= ['A','B','C']
at the beginning of your config file and you are trying to refer to that.