This is inspired by a comment on this post by /u/dariober:
Basically, you are asking snakemake to produce one log file per SRR id and this log file is produced by rule download_files. As a side product, download_files will give you the actual fastq files (things could be done differently in this respect but hopefully this will help...)
I have a list of SRA files that I need to download as split reads. The goal is to get from [SRR1, .....] a list of files SRR1_1.fastq, SRR1_2.fastq in a predetermined folder. To that end, I wrote the following file:
SRA_MAPPING = read_dictionary()
SRAFILES = list(SRA_MAPPING.keys())[1:]
RawSampleFolderName="raw_samples"
RawSampleFolder=RawSampleFolderName+"/"

rule all:
    input:
        expand("{RawSampleFolder}{srafiles}_1.fastq",srafiles=SRAFILES, RawSampleFolder=RawSampleFolder),
        expand("{RawSampleFolder}{srafiles}_2.fastq",srafiles=SRAFILES, RawSampleFolder=RawSampleFolder)

rule download_srafiles:
    output:
        expand("{RawSampleFolder}{srafiles}_1.fastq",srafiles=SRAFILES, RawSampleFolder=RawSampleFolder),
        expand("{RawSampleFolder}{srafiles}_2.fastq",srafiles=SRAFILES, RawSampleFolder=RawSampleFolder)
    params:
        download_folder = RawSampleFolderName
    shell:
        "fasterq-dump {wildcards.srafiles} -O {params.download_folder}"
(This is a proof of concept; I'll move the global variables into config as soon as I can.) In a nutshell, I have a Python list of SRA files and a preset download folder. I use fasterq-dump to get split read files: from SRR1 to SRR1_1.fastq and SRR1_2.fastq. In Snakemake fashion, I'd like the rule download_srafiles to have the fastq files as its output. My previous solution was given in the linked post, but that solution produced a .log file as its output and retrieved the fastq files as a side effect. I'd like to skip the part where I need a log file. Since all I have is a Python list, I skip the input. Running the above file does not download the samples. Instead I get the error:
'Wildcards' object has no attribute 'srafiles'
So what is it that I'm doing wrong?
This pretty much solved all my problems. I figured that expand does something funky to wildcards but couldn't figure out what. Many thanks.
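For anyone landing here with the same error, here is a rough plain-Python sketch of what I understand expand to be doing. The helpers expand_like and match_wildcards are hypothetical stand-ins written for illustration, not Snakemake's API, and SRR1/SRR2 are made-up IDs. The idea: expand fills in every placeholder up front, so the rule's output becomes a list of concrete file names with no wildcard left to match, and wildcards.srafiles never exists. A bare pattern, on the other hand, is matched against each requested file, and that match is where the wildcard value comes from.

```python
import re
from itertools import product


def expand_like(pattern, **values):
    """Simplified stand-in for expand(): fill each placeholder with every
    combination of the given values and return concrete strings."""
    keys = list(values)
    combos = product(*(values[k] for k in keys))
    return [pattern.format(**dict(zip(keys, combo))) for combo in combos]


def match_wildcards(pattern, target):
    """Simplified stand-in for wildcard matching: each {name} in the pattern
    becomes a named regex group that is matched against the target path."""
    regex = ""
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        regex += re.escape(pattern[pos:m.start()]) + f"(?P<{m.group(1)}>.+)"
        pos = m.end()
    regex += re.escape(pattern[pos:])
    found = re.fullmatch(regex, target)
    return found.groupdict() if found else None


SRAFILES = ["SRR1", "SRR2"]  # hypothetical IDs, for illustration only
requested = "raw_samples/SRR1_1.fastq"

# Output written with expand: already concrete, so no wildcard is captured.
expanded = expand_like(
    "{folder}{srafiles}_1.fastq", folder=["raw_samples/"], srafiles=SRAFILES
)
print(expanded)
print(match_wildcards(expanded[0], requested))

# Output written as a bare pattern: matching the requested file yields srafiles.
print(match_wildcards("raw_samples/{srafiles}_1.fastq", requested))
```

Under that reading, the fix is to keep expand in rule all only, and to write the outputs of download_srafiles as plain patterns such as "raw_samples/{srafiles}_1.fastq" and "raw_samples/{srafiles}_2.fastq", so that the rule runs once per SRR id with wildcards.srafiles filled in.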