Job pipelines with dependencies

Computational workflows can often be broken into sequential steps with different computational needs. For example, suppose you need to generate simulation data and then post-process that data. The simplest approach is to submit the simulation job, wait for it to complete, and then submit the post-processing job by hand. It is often more convenient, however, to submit both jobs up front so that they execute in series without manual re-submission.

We can submit both steps to be executed in sequence with job dependencies. Suppose we have a simulation_script.py which generates a data file and a processing_script.py which performs some post-processing:

$ simulation_script.py
import numpy as np

# Generate some example "simulation" data and write it to disk
data = np.array([1, 2, 3, 4])
np.savetxt("data.txt", data)

and

$ processing_script.py
import numpy as np
import sys

# The data file to process is given as the first command-line argument
datafile = sys.argv[1]
data = np.loadtxt(datafile)
print(data**2)
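
It can be worth sanity-checking both scripts interactively before submitting anything. Assuming numpy is available in your login environment, a quick test might look like this (np.loadtxt reads the values back as floats, hence the decimal points):

[johnchris@l001 ~]$ python simulation_script.py
[johnchris@l001 ~]$ python processing_script.py data.txt
[ 1.  4.  9. 16.]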

We can write the corresponding batch scripts:

$ simulation.sl
#!/bin/bash
#SBATCH --job-name=my_job_simulation
#SBATCH --partition=shared
#SBATCH --time=00:10:00
#SBATCH --output=slurm_%x_%j.out
#SBATCH --error=slurm_%x_%j.err
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem-per-cpu=4gb
module load anaconda/2020.07-p3.8
cd (path to working directory)
python simulation_script.py

and

$ processing.sl
#!/bin/bash
#SBATCH --job-name=my_job_postprocessing
#SBATCH --partition=shared
#SBATCH --time=00:10:00
#SBATCH --output=slurm_%x_%j.out
#SBATCH --error=slurm_%x_%j.err
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=8gb
module load anaconda/2020.07-p3.8
cd (path to working directory)
python processing_script.py data.txt

Submitting the first job with sbatch simulation.sl prints the job id of the newly submitted job. You can also view the job id by running squeue -u (your user-id):

[johnchris@l001 ~]$ sbatch simulation.sl
Submitted batch job 2196925
[johnchris@l001 ~]$ squeue -u johnchris
JOBID    PARTITION  NAME      USER       ST  TIME  NODES  NODELIST(REASON)
2196925  shared     my_job_s  johnchris  R   0:02  1      c007

In this case, our job has job id 2196925. We can submit our post-processing step with sbatch --dependency=afterok:2196925 processing.sl. This will submit the post-processing job, but it will not start until the simulation job finishes successfully.

[johnchris@l001 ~]$ sbatch --dependency=afterok:2196925 processing.sl
Submitted batch job 2206906
[johnchris@l001 ~]$ squeue -u johnchris
JOBID    PARTITION  NAME      USER       ST  TIME  NODES  NODELIST(REASON)
2196925  shared     my_job_s  johnchris  R   0:15  1      c007
2206906  shared     my_job_p  johnchris  PD  0:00  1      (Dependency)
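
Capturing the job id by hand works, but the handoff can also be scripted. Below is a minimal sketch; the script name submit_pipeline.sh is our own invention, and it relies on sbatch's --parsable option, which prints just the job id (on federated clusters the output may also include a cluster name after a semicolon):

$ submit_pipeline.sh
#!/bin/bash
# Hypothetical helper: submit the simulation, then queue the post-processing
# step to run only if the simulation exits successfully.

# --parsable makes sbatch print only the job id instead of the usual
# "Submitted batch job <id>" message, so it can be captured in a variable.
sim_id=$(sbatch --parsable simulation.sl)
echo "Simulation submitted as job ${sim_id}"

# This job sits in the PD (Dependency) state until job ${sim_id}
# completes with exit code 0.
sbatch --dependency=afterok:${sim_id} processing.sl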

Slurm supports several other dependency types. See the sbatch documentation for the full list: https://slurm.schedmd.com/sbatch.html#OPT_dependency

The following table lists three dependency types that are commonly useful.

Option       Description                                                   Example
afterany     Start after the dependency has finished (success or failure)  --dependency=afterany:<job_id>
afterok      Start after the dependency has completed successfully         --dependency=afterok:<job_id>
afternotok   Start after the dependency has failed                         --dependency=afternotok:<job_id>
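
A dependency may also name several jobs at once, separated by colons. For example, with two (hypothetical) simulation job ids, the post-processing job below would wait until both have completed successfully:

[johnchris@l001 ~]$ sbatch --dependency=afterok:2196925:2196926 processing.sl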