The sinfo command reports useful information about the partitions available on the cluster: each partition’s time limit, how many nodes it contains, which nodes those are, and the state of those nodes.
[johnchris@l001 ~]$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
exclusive up 5-00:00:00 0 maint c[135,137]
exclusive up 5-00:00:00 137 alloc c[028-092,133-134,136,138-206]
full_nodes48 up 5-00:00:00 0 n/a
full_nodes64 up 5-00:00:00 0 n/a
gpu_test up 5-00:00:00 0 n/a
gpua100 up 5-00:00:00 7 mix g[004-007,010-012]
gpua100 up 5-00:00:00 1 alloc g002
gpuv100 up 5-00:00:00 1 alloc g001
partial_nodes up 5-00:00:00 0 n/a
partial_nodes up 5-00:00:00 3 mix c[093-095,097,108-109]
shared* up 5-00:00:00 1 down s130
shared* up 5-00:00:00 40 mix c[004,011,013,015-016,019,021-023,027,096,098-103,105-107,110-128,131]
shared* up 5-00:00:00 11 alloc c[001-003,005-010,012,014,017-018,020,024-026,132]
shared* up 5-00:00:00 1 idle c104
shared* up 5-00:00:00 1 down c129
PARTITION | Name of a partition: shared, exclusive, gpua100, gpuv100 |
---|---|
AVAIL | Partition state: up or down |
TIMELIMIT | Maximum time limit for any user job, in days-hours:minutes:seconds (default: 5 days) |
NODES | Count of nodes with this particular configuration |
STATE | State of the nodes. Possible states include: allocated, completing, down, drained, draining, fail, failing, future, idle, maint, mixed. The default output abbreviates some of these (for example, alloc and mix) |
NODELIST | Names of nodes associated with the configuration/partition |
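
Because the default output has a fixed column layout, it is easy to summarize with standard tools. The following is a minimal sketch, not an official utility: the function name count_nodes_by_state is hypothetical, and it assumes the six-column layout described above (PARTITION, AVAIL, TIMELIMIT, NODES, STATE, NODELIST). It sums the NODES column for each STATE and skips rows whose state is n/a.

```shell
#!/bin/sh
# count_nodes_by_state: hypothetical helper, not part of Slurm.
# Reads default sinfo output on stdin and prints one "STATE COUNT" line per state.
# Assumes the columns: PARTITION AVAIL TIMELIMIT NODES STATE NODELIST.
count_nodes_by_state() {
    # NR > 1 skips the header row; field 4 is NODES, field 5 is STATE.
    awk 'NR > 1 && $5 != "n/a" { total[$5] += $4 }
         END { for (s in total) print s, total[s] }' | sort
}

# Typical use on a login node:
#   sinfo | count_nodes_by_state
```

A node that belongs to more than one partition appears once per partition in the sinfo output, so these totals may count such nodes more than once.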
To display more specific information, pass a format string with the -o option; see man sinfo for the available fields.
For example:
[johnchris@l001 ~]$ sinfo -o '%11P %5D %22N %4c %21G %7m %11l'
PARTITION NODES NODELIST CPUS GRES MEMORY TIMELIMIT
exclusive 159 c[028-092,133-226] 48+ (null) 190000+ 5-00:00:00
full_nodes4 0 - (null) (null) 0 5-00:00:00
full_nodes6 0 - (null) (null) 0 5-00:00:00
gpu_test 0 - (null) (null) 0 5-00:00:00
gpua100 4 g[004-007] 64 gpu:v10064:4 256000 5-00:00:00
gpua100 3 g[010-012] 64 gpu:a100:4 250000 5-00:00:00
gpuv100 2 g[001-002] 48 gpu:v100:4 190000 5-00:00:00
partial_nod 0 - (null) (null) 0 5-00:00:00
shared* 67 c[001-027,093-132] 48+ (null) 190000+ 5-00:00:00
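
The format string used above can be read field by field: each entry is a percent sign, a minimum column width, and a letter selecting the field. The sketch below annotates that same string and wraps it in a small function; the names SINFO_FORMAT and show_partition are hypothetical and chosen only for illustration. It relies on the -p option of sinfo to restrict the output to one partition.

```shell
#!/bin/sh
# Each %<width><letter> entry selects one column and sets its minimum width:
#   %P partition name     %D number of nodes      %N node list
#   %c CPUs per node      %G generic resources (GRES, e.g. GPUs)
#   %m memory per node (megabytes)                %l job time limit
SINFO_FORMAT='%11P %5D %22N %4c %21G %7m %11l'

# show_partition: hypothetical wrapper, not part of Slurm.
# Prints the columns above for the single partition named in $1.
show_partition() {
    sinfo -p "$1" -o "$SINFO_FORMAT"
}

# Typical use on a login node:
#   show_partition gpua100
```

In this output, a value followed by a plus sign (such as 48+ or 190000+) means the grouped nodes do not all share the same value and the minimum is shown. Partition names longer than the 11-character width given to %P are truncated, which is why full_nodes48 appears as full_nodes4.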