Snakemake error
10 weeks ago
aUser ▴ 70

I am trying to write a workflow in Snakemake for my MD studies. This is a test as well as a learning experience. My workflow looks like this:


# This is to set the GMX workflow using SnakeMake workflow system

mdp_files = "/mnt/e2f3_dp2/mdp_files/"
em_files = "/mnt/e2f3_dp2/01_prep/"

rule nvt_prep:
    input:
        coordinate = em_files + "em.gro" ,
        topology =   em_files + "topol.top" ,
        mdparam =    mdp_files+ "03_nvt_amber_1ns.mdp" ,
        checkpoint = em_files + "em.gro"
    output:
        outf = "nvt.tpr"
    shell:
        "
        gmx grompp -f {input.mdparam} \
                     -c {input.coordinate} \
                     -r {input.checkpoint} \
                     -p {input.topology} \
                     -o {output.outf}
        "
rule nvt:
    input:
        rules.nvt_prep.output
    output:
        multiext("nvt", "edr", "xtc", "log", "cpt")
    shell:"""gmx mdrun -v -deffnm {input}""""

rule temp:
    output: "nvt_temp.xvg"

    input:
       energy= "nvt.tpr.edr"

    shell: "echo Temperature | gmx energy -f {input.energy}  -o {output}"


rule npt_prep:
    input:
      coordinate= "nvt.gro",
      topology=   em_files + "topol.top",
      md_param=   mdp_files+ "03_npt_amber_1ns.mdp",
      checkpoint= "nvt.gro"

    output: "npt.tpr"
    shell:
        "gmx grompp -f {input.md_param} \
                    -c  {input.coordinate} \
                    -r  {input.checkpoint} \
                    -p {input.topology} \
                    -o {output} "

rule npt:
    input:
        rules.npt_prep.output
    output:
        "nvt.edr",
        "nvt.gro",
        "nvt.cpt",
        "nvt.xtc"
    shell:
        "gmx mdrun -v -deffnm {input} -nb gpu -pin on -nt 12 -gpu_id 1 -pinoffset 24"

It keeps throwing this error:

SyntaxError in file /mnt/e2f3_dp2/md3/gmx_workflow.text, line 19:
unterminated string literal :
None

I tried googling, but could not figure out the problem. It's been many days, and I still cannot figure out what's wrong with this.

Tried to change the shell line with:

       f"""gmx grompp ... """

But it did not work. I tried single quotes and multiple quotes; none of them seemed to work.

Could anyone please help me figure out the problem? Many thanks.


I am not sure this will solve your issue, but I usually don't write anything on the same line after the opening """ of a multi-line string.

Try:

shell:
    """
    gmx grompp -f {input.mdparam} \
               -c {input.coordinate} \
               -r {input.checkpoint} \
               -p {input.topology} \
               -o {output.outf}
    """

Thank you. I tried with three quotes and it gave me the same error. However, when I changed to a single quote and used the indentation as shown, I still got the same error, but now the reported line number is 13 (some improvement).

    output:                     # specifically this line.
        outf = "nvt.tpr"

I am writing this in Vim, so I don't think there is a hidden character causing this. Any other ideas, please?

For indentation, I am using single or double tabs, if it matters.
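
One way to rule out hidden characters regardless of the editor is to scan the file for anything outside printable ASCII, such as typographic quotes pasted from a web page or stray carriage returns. The following is only a minimal sketch; the function name `find_suspicious_chars` and the sample text are hypothetical and not part of the workflow above.

```python
import unicodedata


def find_suspicious_chars(text):
    """Return (line, column, name) for each non-ASCII or control character.

    Tabs are skipped on purpose; they are ordinary (if risky) whitespace.
    """
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "\t":
                continue
            if ord(ch) < 32 or ord(ch) > 126:
                # Control characters have no Unicode name, so fall back to repr().
                hits.append((lineno, col, unicodedata.name(ch, repr(ch))))
    return hits


# Hypothetical sample: the opening quote is a typographic quote, not a plain ".
sample = 'rule a:\n    shell: \u201cecho hi"\n'
print(find_suspicious_chars(sample))
```

To check a real file, read it with `open(path, encoding="utf-8").read()` and pass the text to the function; an empty list means no such characters were found.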


Smells like an indentation issue for sure. Try to simplify your script as much as possible (echo something as output, remove your inputs).

I also remember having issues with tabs when moving my files from one text editor to another; some text editors translate 1 tab into 2 or 4 spaces. It gave me a lot of headaches. Try to use a fixed number of spaces and not tabs (even if tabs should work in theory).
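
A quick way to see whether a file still contains tab indentation is to list the affected lines. This is a minimal sketch under the assumption that 4 spaces per tab is the intended width; `indentation_report` and the sample text are hypothetical names used only for illustration.

```python
def indentation_report(text):
    """Return (tab_lines, mixed_lines): line numbers whose leading whitespace
    contains a tab, and the subset that mixes tabs with spaces."""
    tab_lines, mixed_lines = [], []
    for lineno, line in enumerate(text.split("\n"), start=1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            tab_lines.append(lineno)
            if " " in indent:
                mixed_lines.append(lineno)
    return tab_lines, mixed_lines


# Hypothetical sample: line 2 uses a tab, line 3 mixes spaces and a tab.
sample = "rule a:\n\tinput:\n    \t'x'\n    output: 'y'\n"
print(indentation_report(sample))

# str.expandtabs() replaces tabs with spaces up to the next tab stop.
cleaned = sample.expandtabs(4)
print(indentation_report(cleaned))
```

Note that `expandtabs` also converts tabs that appear inside strings, so it is worth reviewing the result before overwriting the original file.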


Thank you, let me simplify it first, then I will update you. I have changed all the tabs to spaces (1 tab == 4 spaces), but it did not solve the problem. I will add one rule at a time and see where I get the error.


I noticed that it is difficult to convey indentation properly in the Biostars code widget, which is a little odd because this is essentially Python code and Python is so widely used these days. However, indentation inside the quotes should not matter for shell code. The way Bastien shows it should work fine.


I was wondering whether Snakemake interprets """ in some specific way, on top of Python's usual interpretation, where it delimits multi-line strings (which are often used as multi-line comments).


If you have a multi-line command in shell, you should use three double quotes (""").


I tried your code literally and I got no syntax error in a dry run. If it was an indentation error, you should see something like

IndentationError in file <string>, line 13: unindent does not match any outer indentation level:

Are you sure this is your complete code? Possibly, you have some unterminated quotes higher up in the file?
Also, I haven't been able to get the exact error message you got. The closest I got was:

TokenError:
('unterminated string literal (detected at line 3)', (3, 1))

By inserting ' " in line 3. I would look for a non-matching pair of quotes and the like.


Thank you for the help. I have uploaded the complete script, from top to bottom. My error is not about indentation, but "unterminated string". After checking so many examples, I could not figure out where I am making a mistake. The following is the command I am running (copied from the terminal):

(snakemake) fsbserver:/mnt/e2f3_dp2/md3$ snakemake -np temp -s gmx_workflow.text
SyntaxError in file /mnt/e2f3_dp2/md3/gmx_workflow.text, line 12:
unterminated string literal :
None
9 weeks ago
Michael 55k

Good that you posted the complete script, because the first excerpt wasn't causing the error. The problem is that multi-line strings in Python are delimited by """, not by a single ". For every multi-line string, replace the single double quote with three and the syntax will be correct. I have not checked whether the workflow works as intended, for obvious reasons.

Further, in line 27, there was an extra closing quote (4 instead of 3). In general, using multi-line strings for all shell commands is good practice for readability, IMO. If the editor does not do the indentation correctly (like Emacs snakemake-mode), you can use r""" or f""" instead. I would switch to an editor or IDE with a dedicated Snakemake mode with syntax highlighting that will also do consistent indentation. This can be very hard in plain vi. Here is a link to a vi syntax file if you want to stick with it: https://mstamenk.github.io/2017/08/snakefile-syntax-file-for-vi-vim.html (haven't tried it)
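
The quoting rules can be checked in plain Python, independent of Snakemake. The sketch below is only an illustration: the helper `compiles` and the example command strings are hypothetical, and it tests Python syntax only, not whether the commands run.

```python
def compiles(source):
    """Return True if Python accepts the source, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# A string opened with a single " cannot simply continue on the next line.
single_quote_multiline = 'cmd = "\ngmx grompp -f nvt.mdp\n"\n'
# It may span lines only if every line break is escaped with a backslash.
single_quote_backslash = 'cmd = "gmx grompp -f nvt.mdp \\\n    -c em.gro"\n'
# Triple quotes may span lines freely.
triple_quote_multiline = 'cmd = """\ngmx grompp -f nvt.mdp\n"""\n'
# Four closing quotes: three close the string, the fourth opens a new one.
four_closing_quotes = 'cmd = """gmx mdrun -v -deffnm nvt""""\n'

for name, source in [
    ("single quote, multi-line", single_quote_multiline),
    ("single quote, backslash continuation", single_quote_backslash),
    ("triple quote, multi-line", triple_quote_multiline),
    ("four closing quotes", four_closing_quotes),
]:
    print(f"{name}: {'ok' if compiles(source) else 'SyntaxError'}")
```

The second case shows why a single-quoted string that starts on the opening line and ends every line with a backslash is still valid Python; the string that opens with a bare " followed directly by a line break is the one that fails.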

Also, did I say that I hate Snakemake's error handling? It gives the wrong line in case of the first error.

# This is to set the GMX workflow using SnakeMake workflow system

mdp_files = "/mnt/e2f3_dp2/mdp_files/"
em_files = "/mnt/e2f3_dp2/01_prep/"

rule nvt_prep:
    input:
        coordinate = em_files + "em.gro" ,
        topology =   em_files + "topol.top" ,
        mdparam =    mdp_files+ "03_nvt_amber_1ns.mdp" ,
        checkpoint = em_files + "em.gro"
    output:
        outf = "nvt.tpr" ## this is line 12, nothing wrong here
    ## The error was here: the shell string opened with a single " instead of """
    shell:
        """
        gmx grompp -f {input.mdparam} \
                   -c {input.coordinate} \
                   -r {input.checkpoint} \
                   -p {input.topology} \
                   -o {output.outf}
        """
rule nvt:
    input:
        rules.nvt_prep.output
    output:
        multiext("nvt", "edr", "xtc", "log", "cpt")
    shell:
        """
        gmx mdrun -v -deffnm {input}
        """ # deleted the extra quote here; this was not a multi-line string, so single double quotes would have sufficed, but it is better this way

rule temp:
    output: "nvt_temp.xvg"

    input:
       energy= "nvt.tpr.edr"

    shell: "echo Temperature | gmx energy -f {input.energy}  -o {output}"


rule npt_prep:
    input:
      coordinate= "nvt.gro",
      topology=   em_files + "topol.top",
      md_param=   mdp_files+ "03_npt_amber_1ns.mdp",
      checkpoint= "nvt.gro"

    output: "npt.tpr"
    shell:
        """
        gmx grompp -f {input.md_param} \
                    -c  {input.coordinate} \
                    -r  {input.checkpoint} \
                    -p {input.topology} \
                    -o {output} 
        """

rule npt:
    input:
        rules.npt_prep.output
    output:
        "nvt.edr",
        "nvt.gro",
        "nvt.cpt",
        "nvt.xtc"
    shell:
        """
        gmx mdrun -v -deffnm {input} -nb gpu -pin on -nt 12 -gpu_id 1 -pinoffset 24
        """ # always use multi-line strings

"Also, did I say that I hate Snakemake's error handling? It gives the wrong line in case of the first error."

This issue has resurfaced a number of times. At the moment it looks fixed; what version of Snakemake are you using?


Indeed, I was using $ snakemake --version 8.16.0

So possibly updating to the latest version could fix it....

but it doesn't:

(snakemake) michael@kjempefuru ~ $ snakemake --version
8.25.1
(snakemake) michael@kjempefuru ~ $ snakemake -n --snakefile test01.smk
SyntaxError in file test01.smk, line 12:
unterminated string literal :
None

The error is in line 15 not 12....

Also, there are more things wrong with the error handling.


I am using 8.25.0, and it gave me a headache.


It is possible that this is fixed on the current main branch (not 8.25.1, which is the latest release) or on some other branch, but unfortunately I cannot be bothered to install Snakemake from the GitHub main branch for the next few days. However, if it persists, I will post a new issue or re-open the old one.


I found that this is a new issue. The line number reported for "unterminated string literal" is often off by one or more lines. Here is the simplest possible example, which is off by 1:

rule error:
    shell:
        " Line 3
        "

SyntaxError in file test03.smk, line 2:
unterminated string literal :
None

While it should say:

TokenError:
('unterminated string literal (detected at line 3)', (3, 9))

Off by one doesn't seem a lot, but the difference can add up in unpredictable ways. Here is a more extreme example:

rule error: 
    input:
        a = "test",
        "test2"
    output:
        "test.out",
        "test2.out"
    message:
        "ERROR"
    conda: "error"

    benchmark:"error" #Line 12       
    log: a = "a.log",
       b = "b.log",
       c = "c.log",
       d = "d.log",
       e = "e.log",
       f = "f.log"





    shell:
        " Line 25
        "

Output:

SyntaxError in file test03.smk, line 12:
unterminated string literal :
None

Whereas even python gets that right (notwithstanding that it's otherwise not pure python because of the rules):

$ python test03.smk
  File "test03.smk", line 25
    " Line 25
    ^
SyntaxError: unterminated string literal (detected at line 25)

Thanks a lot. I used three quotes as you suggested, and it solved the issue for the moment. I accepted and upvoted the answer in case someone else stumbles upon this issue.

I tried using formatting-scripts for Vim, but it seems none of them worked. I have already consulted the link you provided, but no fix so far. I will try again later.


The Vim syntax file only provides syntax coloring; you need a color-capable terminal like xterm or the various Linux terminals (set the terminal type to xterm-256color in iTerm, for example). It doesn't help with indentation; you still need to keep track of whether you are using spaces or tabs. Here is some guidance for that: https://vi.stackexchange.com/questions/422/how-can-i-display-tabs-as-characters

I am using emacs with Melpa snakemake-mode. That does the indentation right (converting tab to 4 spaces by default) but has some quirks as well.

If I am uncertain about a TokenError in Snakemake, I tend to think that it is safe to rely on the Python 3 tokenizer alone. You can do that like so:

$ python -m tokenize test03.smk
test03.smk:25:9: error: unterminated string literal (detected at line 25)
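
The same check can be done from a script with the standard `tokenize` module. This is a hedged sketch rather than a definitive tool: the function name `first_string_error_line` is hypothetical, and it assumes that, depending on the Python version, the tokenizer either raises an error for an unterminated string or emits an ERRORTOKEN for the stray quote, so both cases are handled.

```python
import io
import tokenize


def first_string_error_line(source):
    """Return the line number of the first string tokenization problem, or None."""
    try:
        for tok in tokenize.generate_tokens(io.StringIO(source).readline):
            # Some Python versions report a stray quote as an ERRORTOKEN
            # instead of raising an exception.
            if tok.type == tokenize.ERRORTOKEN and tok.string in ('"', "'"):
                return tok.start[0]
    except tokenize.TokenError as err:
        # TokenError carries (message, (line, column)).
        return err.args[1][0]
    except SyntaxError as err:
        return err.lineno
    return None


# The minimal example from this thread: the string opens on line 3.
bad = 'rule error:\n    shell:\n        " Line 3\n        "\n'
good = 'rule ok:\n    shell:\n        """\n        echo hi\n        """\n'

print(first_string_error_line(bad))
print(first_string_error_line(good))
```

This works on a Snakefile because the tokenizer only splits the text into tokens and does not care that `rule` is not a Python keyword.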

Thank you, I will give it a try. This is really helpful.
