Hello there!
A few days ago I started using Snakemake for the first time.
Mainly, I want to use fasterq-dump to download a large number of files from NCBI, and I do it like this:
sra = []
with open("run_ids") as f:
    for line in f:
        sra.append(line.strip())

rule all:
    input:
        expand("raw_reads/{sample}.fastq", sample=sra)

rule download:
    output:
        "raw_reads/{sample}.fastq"
    threads: 8
    params:
        "--split-spot --skip-technical"
    log:
        "logs/fasterq-dump/{sample}.log"
    shell:
        """
        fasterq-dump {params} --outdir /home/snakemake/raw_reads {wildcards.sample} -e {threads}
        """
This is working, but:
- How can I load the samples from a configure.yaml file instead? Right now I have an external txt file with a list of samples, and I read it with Python.
- Is it worth it? Will it make my script faster if I load the samples from a configure.yaml?
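To make the first question concrete, here is a rough sketch of what I imagine the top of the Snakefile could look like. As far as I understand, Snakemake's `configfile:` directive parses the YAML file into a global dict called `config`. The `samples` key, the helper `get_samples`, and the two run IDs are placeholders I made up for illustration, and a plain dict stands in for `config` so that the snippet runs on its own.

```python
# Sketch only. In a real Snakefile the line
#     configfile: "configure.yaml"
# would fill the global dict `config`; the dict below stands in for it.
#
# Hypothetical content of configure.yaml (placeholder run IDs):
#   samples:
#     - SRR0000001
#     - SRR0000002
config = {"samples": ["SRR0000001", "SRR0000002"]}


def get_samples(cfg):
    """Return the run IDs listed under the (assumed) 'samples' key."""
    samples = [str(s).strip() for s in cfg.get("samples", [])]
    if not samples:
        raise ValueError("no run IDs found under 'samples' in the config")
    return samples


# Same variable name as in my current Snakefile, so `rule all` stays unchanged.
sra = get_samples(config)
```

My guess is that either way the list is read once, when the Snakefile is parsed, so it probably would not change the run time, which is dominated by the downloads. The gain would mostly be keeping the settings in one place.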
Thank you in advance!