Loop for Slurm in Python
2.1 years ago

Hi there, does anyone have tips for fixing this code? The function fastqc() does not loop over the files. My guess is that once sbatch() has been called for the first file, fastqc() finishes and never moves on to the second file. I would prefer not to use something like if "R1" in file or if "R2" in file...

Does anyone have a fancy answer for this?

> # Inputs
> def sbatch(job_name, command, time=25, mem=60, tasks=4, dep=''):
>     if dep != '':
>         dep = '--dependency=afterok:{} --kill-on-invalid-dep=yes '.format(dep)
> 
>     sbatch_command = "sbatch -J {} -o {}.out -e {}.err -t 00:{}:00 --mem={}00 --ntasks-per-node={} --wrap='{}' {}".format(job_name, job_name, job_name, time, mem, tasks, command, dep)
> 
>     sbatch_response = subprocess.getoutput(sbatch_command)
>     #print(sbatch_response)
> 
>     job_id = sbatch_response.split(' ')[-1].strip()
>     return job_id
> 
> 
> def fastqc():
>     os.chdir("/home/")
> 
>     for file in glob.glob("*.gz"):
>         #print(file)
>         command = "fastqc {}".format(file)
>         job_id = sbatch('fastqc', command)
>         # Return the job id
>         return job_id
> 
> fastqc()
loop python slurm • 975 views

I don't have an answer to your question, and I'm not entirely sure what you are trying to do. I just wanted to say that managing cluster jobs with Python/Bash scripts is quite a pain. Depending on your situation, I would suggest investing in a workflow manager like Snakemake (my favorite) or Nextflow. These take away a lot of complications, such as checking that jobs completed successfully, handling job dependencies, and resuming after a failure from where you left off. There was some discussion here on Biostars recently about the scope and utility of workflow managers, and this is a good example of a case where one is a worthwhile investment.


Hi, thanks for your comments. I totally agree with you. It is hard to combine Python and Bash in one script.

The function fastqc() must work in a loop. It should first process R1.fastq.gz and then move on to R2.fastq.gz. If I run the fastqc command in fastqc() directly, without going through sbatch(), it works pretty well.

However, I have not managed to get the sbatch version to loop yet, and I would not want to do something like if "R1" in file ... else "R2" in file ... and so on.

I am looking for a fancy answer.

But I appreciate your comments. I am also learning Snakemake, which handles this kind of looping pretty well.

All the best,... Thanks
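
For reference, one likely reason the loop in the question stops after the first file is that `return job_id` sits inside the body of the `for` loop, so fastqc() returns as soon as the first job has been submitted. A minimal sketch of one way around this, without any R1/R2 checks, is to collect the job IDs in a list and return only after the loop ends. The `directory`, `pattern` and `submit` parameters and the per-file job name are assumptions added here for illustration (they are not in the original code); `submit` only exists so the loop can be tried without a cluster.

```python
import glob
import os
import subprocess


def sbatch(job_name, command, time=25, mem=60, tasks=4, dep=''):
    """Submit `command` via sbatch and return the job id (same logic as the question)."""
    if dep != '':
        dep = '--dependency=afterok:{} --kill-on-invalid-dep=yes '.format(dep)

    sbatch_command = (
        "sbatch -J {} -o {}.out -e {}.err -t 00:{}:00 --mem={}00 "
        "--ntasks-per-node={} --wrap='{}' {}"
    ).format(job_name, job_name, job_name, time, mem, tasks, command, dep)

    sbatch_response = subprocess.getoutput(sbatch_command)
    # sbatch normally answers "Submitted batch job <id>"; keep the last token.
    return sbatch_response.split(' ')[-1].strip()


def fastqc(directory, submit=sbatch, pattern="*.gz"):
    """Submit one FastQC job per matching file and return all job ids.

    `submit` defaults to sbatch(); passing another callable is a
    hypothetical hook for testing the loop without Slurm.
    """
    job_ids = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        command = "fastqc {}".format(path)
        # Assumption: a per-file job name, so each job gets its own
        # .out/.err file instead of all jobs sharing fastqc.out.
        job_name = "fastqc_{}".format(os.path.basename(path))
        job_ids.append(submit(job_name, command))
    # Return only after every file has been submitted.
    return job_ids
```

Calling `fastqc("/home/")` would then submit one job per `.gz` file in that directory and return a list such as one ID for R1 and one for R2. If the R2 job has to wait for the R1 job, the previous ID could be passed on through the existing `dep` argument of sbatch().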
