Hi everyone,
I'm trying to change my habits from bash scripting to Snakemake. I struggle to understand the basic logic of a few things, and the most important one is how to deal with input names. I don't want to hard-code a particular input name inside the Snakefile; I want it to work on any fastq file, and I know that's what Snakemake is for. So I thought the correct way to do this was to use wildcards, so that it knows where to look and what to look for (a fastq file, for example).
I have made the following script :
from Bio import SeqIO
import sys
import pandas

rule split_fastq:
    input:
        "data/{reads}.fastq",
        "data/{seqsum}.txt"
    output:
        "split_fastq/low.fastq",
        "split_fastq/high.fastq"
    run:
        cols = ["read_id","mean_qscore_template"]
        dataInter = pandas.read_csv(filepath_or_buffer=args.seqsum_path,sep="\t",usecols=cols)
        data = dataInter.rename(mapper={"mean_qscore_template": "quals"}, axis="columns").set_index("read_id").to_dict()["quals"]
        with open("split_fastq/low.fastq",'w+') as a, open("split_fastq/high.fastq",'w+') as b:
            try:
                for rec in SeqIO.parse("data/{reads}.fastq","fastq"):
                    if(data[rec.id] <= 6):
                        SeqIO.write(rec,a,"fastq")
                    else:
                        SeqIO.write(rec,b,"fastq")
            except KeyError:
                sys.exit('\nERROR: Mismatch between sequencing_summary and fastq file: {} was not found in the summary file.\nByeBye.'.format(rec.id))
And I keep going back and forth between this error:
(snakemake-4.8.0_venv) |sbsuser@genologin2 /work/sbsuser/test/roxane/alignement-ont|$snakemake
Building DAG of jobs...
WildcardError in line 5 of /work/sbsuser/test/roxane/alignement-ont/Snakefile:
Wildcards in input files cannot be determined from output files:
'reads'
So then I add the wildcard to the output as well, like this: "split_fastq/{reads}-low.fastq", "split_fastq/{reads}-high.fastq", and I end up with:
(snakemake-4.8.0_venv) |sbsuser@genologin2 /work/sbsuser/test/roxane/alignement-ont|$snakemake
Building DAG of jobs...
WorkflowError:
Target rules may not contain wildcards. Please specify concrete files or a rule without wildcards.
Can someone explain what I am doing wrong ? I really don't get it...
Thanks,
Roxane
Hello Roxane Boyer ,
Is this your whole workflow, or do you plan to include more rules? Depending on that, there are techniques you should have a look at:
The latter one I have used in the example of my tutorial.
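To see why both errors appear, it can help to picture how Snakemake works backwards from a requested file to a rule's inputs. The block below is a simplified model in plain Python, not Snakemake's actual code: `pattern_to_regex` and `resolve_inputs` are made-up helper names, `sampleA` is a hypothetical sample name, and the model assumes each wildcard appears only once per pattern.

```python
import re


def pattern_to_regex(pattern):
    """Compile 'dir/{name}.ext' into a regex with one named group per wildcard."""
    pieces = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for index, piece in enumerate(pieces):
        if index % 2 == 0:
            regex += re.escape(piece)        # literal text between wildcards
        else:
            regex += "(?P<%s>.+)" % piece    # a wildcard matches any non-empty text
    return re.compile("^" + regex + "$")


def resolve_inputs(target, output_pattern, input_patterns):
    """Return the concrete inputs needed to build `target`, or None if the rule cannot make it."""
    match = pattern_to_regex(output_pattern).match(target)
    if match is None:
        return None
    wildcards = match.groupdict()            # values come from the OUTPUT name only
    try:
        return [p.format(**wildcards) for p in input_patterns]
    except KeyError as missing:
        raise ValueError(
            "Wildcards in input files cannot be determined from output files: %s" % missing
        )


# Case 1: fixed output name, wildcard only in the input: nothing to infer 'reads' from.
try:
    resolve_inputs("split_fastq/low.fastq", "split_fastq/low.fastq", ["data/{reads}.fastq"])
except ValueError as err:
    print(err)

# Case 2: the wildcard is in output and input, and a concrete file is requested.
print(resolve_inputs(
    "split_fastq/sampleA-low.fastq",
    "split_fastq/{reads}-low.fastq",
    ["data/{reads}.fastq"],
))
```

Case 1 corresponds to the WildcardError: every wildcard used in the input (here `reads`, and `seqsum` in the original rule) has to appear in the output too. The second error comes from the missing `target`: a bare `snakemake` call takes the first rule as its target, and if that rule's output contains wildcards there is no concrete file name to match against. Either request a file explicitly, for example `snakemake split_fastq/sampleA-low.fastq`, or put a wildcard-free rule first whose input lists the files you want.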
fin swimmer
Hi finswimmer !
I was planning on adding more rules later, but I wanted to build my first Snakefile slowly and learn step by step. That's why it's frustrating to get stuck at the first rule!
I'll have a look at those links and try to figure out what I am missing. Thanks
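A note on the rule body, for anyone landing here with the same problem: once the wildcards resolve, the `run:` block would likely fail next, because `args` is not defined in a Snakefile and `"data/{reads}.fastq"` inside `run:` is a literal string that Snakemake does not fill in. A minimal sketch of the splitting step as a function that takes its paths as arguments is below. It swaps Biopython and pandas for the standard library so it runs anywhere, and it assumes plain four-line FASTQ records; the names `load_mean_qscores` and `split_fastq` are illustrative only.

```python
import csv


def load_mean_qscores(summary_path):
    """Map read_id -> mean_qscore_template from a tab-separated summary file."""
    with open(summary_path, newline="") as handle:
        rows = csv.DictReader(handle, delimiter="\t")
        return {row["read_id"]: float(row["mean_qscore_template"]) for row in rows}


def split_fastq(fastq_path, summary_path, low_path, high_path, threshold=6):
    """Write reads with mean quality <= threshold to low_path, all others to high_path."""
    qscores = load_mean_qscores(summary_path)
    with open(fastq_path) as reads, open(low_path, "w") as low, open(high_path, "w") as high:
        while True:
            # Assumption: every record is exactly four lines (no wrapped sequences).
            record = [reads.readline() for _ in range(4)]
            if not record[0].strip():
                break                                   # end of file
            read_id = record[0][1:].split()[0]          # text after '@' up to the first space
            if read_id not in qscores:
                raise KeyError("%s was not found in the summary file" % read_id)
            destination = low if qscores[read_id] <= threshold else high
            destination.writelines(record)
```

Inside the rule, the call would then be `split_fastq(input[0], input[1], output[0], output[1])`, so the file names come from the rule's `input` and `output` sections rather than being repeated in the code.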