The ‘squeue’ command shows the jobs that are currently in the scheduler queue, both pending and running:
[johnchris@l001 ~]$ squeue
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2197268 exclusive jobOne johnchris PD 0:00 12 (Priority)
2197266 exclusive jobTwo johnchris PD 0:00 12 (Priority)
2197267 exclusive jobThree johnchris PD 0:00 12 (Priority)
2197265 exclusive jobFour johnchris PD 0:00 12 (Resources)
2197269 exclusive jobFive johnchris PD 0:00 12 (Priority)
2197270 exclusive jobSix johnchris PD 0:00 12 (Priority)
2216941 shared Job johnchris R 6:51:59 1 c132
2214572 gpua100 python1 johndoe R 1-09:09:51 1 g012
2216584 gpuv100 python2 johndoe R 1-09:06:16 1 g002
2216679 shared python3 johndoe R 22:35:27 1 c012
2216680 shared matlab johnchris R 22:35:27 1 c014
2216677 shared jupyter johnchris R 22:35:58 1 c002
Field | Description |
---|---|
JOBID | Job or step ID |
PARTITION | Name of the partition, e.g. shared, exclusive, gpua100 or gpuv100 |
NAME | Name of the job in the queue |
USER | Owner of the job in the queue |
ST | State of the job, shown as a short code such as PD (pending), R (running), CG (completing) or CD (completed) |
TIME | Time used by the job, in the format days-hours:minutes:seconds |
NODES | Number of nodes allocated to the job |
NODELIST(REASON) | For pending and failed jobs, the reason the job is pending or failed, shown in parentheses. Otherwise, the list of allocated nodes |
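The TIME format is easier to compare or sort once it is converted to seconds. The following is a minimal sketch, assuming the value comes from the TIME column exactly as printed in the sample output above (leading days and hours are dropped when they are zero). The function name `squeue_time_to_seconds` is hypothetical and not part of Slurm.

```python
def squeue_time_to_seconds(value):
    """Convert a squeue TIME value such as '1-09:09:51' to seconds."""
    days = 0
    if "-" in value:
        # Everything before the dash is the number of days.
        day_part, value = value.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in value.split(":")]
    # Pad on the left so 'M:SS' and 'H:MM:SS' both become [hours, minutes, seconds].
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# Values taken from the sample squeue output above.
for sample in ("0:00", "6:51:59", "1-09:09:51"):
    print(sample, "->", squeue_time_to_seconds(sample), "seconds")
```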
The squeue command gives us the following information:
- JOBID: The unique ID for your job.
- PARTITION: The partition your job is running on (or scheduled to run on).
- NAME: The name of your job.
- USER: The username of whoever submitted the job.
- ST: The status of the job. The typical status codes you may see are:
  - CD (Completed): Job completed successfully.
  - CG (Completing): Job is finishing; Slurm is cleaning up.
  - PD (Pending): Job is queued but has not started, typically because the requested resources aren’t available yet.
  - R (Running): Job is actively running.
- TIME: How long your job has been running.
- NODES: How many nodes your job is using.
- NODELIST(REASON): Which nodes your job is running on. If your job is not running yet, this column instead shows a reason code in parentheses, such as:
  - Priority: One or more higher-priority jobs are ahead of yours in the queue. When Slurm calculates priority, it typically takes your recent usage into account: if you have run many jobs recently, your jobs get a lower priority than those of someone who has never submitted a job or submits jobs very infrequently. Don’t worry, your job will run eventually.
  - Resources: Slurm is waiting for the requested resources to be available before starting your job.
  - Dependency: If you are using dependent jobs, a job shows this reason while it waits for the job it depends on to complete.
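To see at a glance how many jobs are in each state and why the pending ones are waiting, the squeue output can be tallied. This is only a sketch: it works on a pasted copy of the output (here a few lines from the example above) rather than calling squeue itself, and it assumes job names contain no spaces. The helper names `parse_squeue` and `summarize` are hypothetical.

```python
from collections import Counter

# A few lines copied from the sample squeue output above.
SAMPLE = """\
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
2197265 exclusive jobFour johnchris PD 0:00 12 (Resources)
2197269 exclusive jobFive johnchris PD 0:00 12 (Priority)
2216941 shared Job johnchris R 6:51:59 1 c132
2214572 gpua100 python1 johndoe R 1-09:09:51 1 g012
"""


def parse_squeue(text):
    """Turn squeue text into a list of dicts keyed by the header names."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        # Limit the number of splits so the last column stays in one piece.
        fields = ln.split(None, len(header) - 1)
        rows.append(dict(zip(header, fields)))
    return rows


def summarize(rows):
    """Count jobs per state, and pending jobs per reason code."""
    states = Counter(r["ST"] for r in rows)
    reasons = Counter(
        r["NODELIST(REASON)"].strip("()") for r in rows if r["ST"] == "PD"
    )
    return states, reasons


states, reasons = summarize(parse_squeue(SAMPLE))
print("Jobs per state:", dict(states))
print("Pending reasons:", dict(reasons))
```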
“squeue -u username” shows only the jobs that belong to a specific user:
[johnchris@l001 ~]$ squeue -u johnchris
2213604 exclusive JobI johnchris PD 0:00 2 (Priority)
2213606 exclusive JobII johnchris PD 0:00 2 (Priority)
2213618 exclusive JobIII johnchris PD 0:00 2 (Priority)
2213620 exclusive JobIV johnchris PD 0:00 2 (Priority)
2213621 exclusive JobV johnchris PD 0:00 2 (Priority)
2214144 exclusive JobVI johnchris PD 0:00 2 (Priority)
2217039 shared JobVII johnchris PD 0:00 5 (Priority)
2213574 shared JobVIII johnchris R 2-05:17:14 1 c021
2217608 shared JobIX johnchris R 7:34:08 1 c106
2216796 shared JobX johnchris R 13:19:33 7 c[116-122]
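Node lists such as c[116-122] are a compact form: the bracket holds one or more ranges that share the prefix in front of it. On a Slurm system, `scontrol show hostnames` can expand such a list; the sketch below only illustrates the notation in Python. It assumes a single prefix with one bracketed group, as in the examples on this page, and the function name `expand_nodelist` is hypothetical.

```python
import re


def expand_nodelist(expr):
    """Expand 'c[116-122]' or 'c[151-155,200-206]' into individual node names."""
    match = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", expr)
    if match is None:
        return [expr]  # a plain node name such as 'c021'
    prefix, body = match.groups()
    nodes = []
    for item in body.split(","):
        start, _, end = item.partition("-")
        end = end or start  # a single number such as '200' is a range of one
        width = len(start)  # preserve zero padding, e.g. 'c[001-003]'
        for n in range(int(start), int(end) + 1):
            nodes.append(f"{prefix}{n:0{width}d}")
    return nodes


nodes = expand_nodelist("c[116-122]")
print(len(nodes), "nodes:", " ".join(nodes))
```

For c[116-122] this yields seven names, which matches the NODES column for that job.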
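Adding “--start” shows the estimated start time of each pending job and the nodes Slurm expects to schedule it on. N/A means no estimate is available yet:
[johnchris@l001 ~]$ squeue -u johnchris --start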
2197281 exclusive A johnchris PD N/A 6 (null) (Priority)
2197283 exclusive B johnchris PD N/A 6 (null) (Priority)
2197284 exclusive C johnchris PD N/A 6 (null) (Priority)
2216608 exclusive D johnchris PD N/A 24 (null) (Priority)
2216842 exclusive E johnchris PD N/A 36 (null) (Priority)
2197265 exclusive F johnchris PD 2024-04-12T07:33:39 12 c[163-174] (Resources)
2197266 exclusive G johnchris PD 2024-04-13T16:30:23 12 c[176-187] (Priority)
2197267 exclusive H johnchris PD 2024-04-13T16:30:23 12 c[188-199] (Priority)
2197268 exclusive I johnchris PD 2024-04-13T19:21:20 12 c[151-155,200-206] (Priority)
Note: The job start time is an estimate; it may change as other jobs are submitted or finish. The memory and wall clock time parameters require you to estimate how much memory and how much time your job will need to finish. If you underestimate, your job may be killed before it completes. If you overestimate, your job may take longer to start, because the queuing system has to find a way to allocate the larger amount of resources.
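To find which of your pending jobs is expected to start first, the estimated start times can be parsed and compared. This is a sketch under stated assumptions: it reads a pasted copy of the output above (without the header line), and it assumes the start time is the sixth whitespace-separated field, as in that output. The function name `estimated_starts` is hypothetical.

```python
from datetime import datetime

# A few lines copied from the sample "squeue --start" output above.
SAMPLE = """\
2197281 exclusive A johnchris PD N/A 6 (null) (Priority)
2197265 exclusive F johnchris PD 2024-04-12T07:33:39 12 c[163-174] (Resources)
2197266 exclusive G johnchris PD 2024-04-13T16:30:23 12 c[176-187] (Priority)
"""


def estimated_starts(text):
    """Map each job ID to its estimated start time, or None when it is N/A."""
    starts = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and any header line (job IDs begin with a digit).
        if len(fields) < 6 or not fields[0][0].isdigit():
            continue
        job_id, start = fields[0], fields[5]
        starts[job_id] = None if start == "N/A" else datetime.fromisoformat(start)
    return starts


starts = estimated_starts(SAMPLE)
known = {job: when for job, when in starts.items() if when is not None}
next_job = min(known, key=known.get)
print("Next job expected to start:", next_job, "at", known[next_job])
```

Because the times are estimates, the result is only a snapshot and may differ the next time squeue is run.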