Slurm Usage
Slurm is used to submit jobs on the different partitions from the Monolithe frontend. The available partitions are listed in the description page (see the "Slurm Partition" column in the summary table).
Frontend Connection
It is recommended to add some lines to your ~/.ssh/config file as explained
in the SSH access section. Connecting to the frontend from your computer
then takes a single ssh command using the host alias defined in that file.
Basic Slurm Commands
Here are some useful commands to start using Slurm:
- sinfo -l: lists the available partitions (= nodes in our case)

      $ sinfo -l
      Mon Mar 18 10:57:44 2024
      PARTITION  AVAIL  TIMELIMIT  JOB_SIZE    ROOT  OVERSUBS  GROUPS  NODES  STATE  NODELIST
      xu4        up     infinite   1-infinite  no    NO        all     1      idle   vroum
      tx2        up     infinite   1-infinite  no    NO        all     1      idle   tegrax2c
      xagx       up     infinite   1-infinite  no    NO        all     1      idle   tegraagx
      brub       up     infinite   1-infinite  no    NO        all     1      idle   brubeck
      xnano      up     infinite   1-infinite  no    NO        all     1      idle   jetson-nano1
      rpi4       up     infinite   1-infinite  no    NO        all     1      idle   selfix
      xnx        up     infinite   1-infinite  no    NO        all     1      idle   tegranx-1
      m1u        up     infinite   1-infinite  no    NO        all     1      idle   m1ultra
      onx        up     infinite   1-infinite  no    NO        all     1      idle   orinnx
      oagx       up     infinite   1-infinite  no    NO        all     1      mixed  orinagx
      onano      up     infinite   1-infinite  no    NO        all     1      idle   orinnano
      opi5       up     infinite   1-infinite  no    NO        all     1      idle   orangepi5

- srun -p [partition] command: runs a command on a partition. For example, srun -p m1u hostname submits a job that executes the hostname command on the m1u partition.

- srun -p [partition] --pty bash -i: runs an interactive session on a partition. For example, an interactive job on the Orange Pi 5 (opi5 partition) started with --cpus-per-task=8 uses all 8 cores in the session; by default, if --cpus-per-task is not specified, only one core is allocated.

  Note: An easier way to connect interactively to the nodes is to use a custom ~/.ssh/config file as detailed in the SSH Access page.

- sbatch [script]: runs a Slurm script on the cluster

- squeue -l: shows the jobs currently submitted on the cluster

      $ squeue -l
      Mon Mar 18 10:58:20 2024
      JOBID  PARTITION  NAME  USER      STATE    TIME   TIME_LIMI  NODES  NODELIST(REASON)
      1702   oagx       bash  galveze-  RUNNING  47:10  UNLIMITED  1      orinagx

  For instance, here one job from the galveze user is running on the oagx partition.

- scancel [jobid]: cancels a job

- scancel -u [user]: cancels all the jobs for a given user
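
To give a rough idea of what a script passed to sbatch can look like, the sketch below writes a minimal batch script and submits it only when sbatch is found on the machine. The file name hello_job.sh, the job name, and the output file pattern are made up for this example. The m1u partition is taken from the sinfo listing above. The #SBATCH options shown are standard Slurm options, not necessarily the ones recommended for Monolithe.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical file name, chosen only for this example.
script=hello_job.sh

# Write a minimal batch script. sbatch reads the "#SBATCH" lines at the
# top of the file as if they were command-line options.
cat > "$script" <<'EOF'
#!/bin/bash
# Name shown in the squeue listing (made up for this example).
#SBATCH --job-name=hello
# Partition to run on, taken from the sinfo listing.
#SBATCH --partition=m1u
# Number of cores for the task; one core is also the default.
#SBATCH --cpus-per-task=1
# File receiving the job output; %j is replaced by the job id.
#SBATCH --output=hello_%j.out

srun hostname
EOF

# Submit only when Slurm is available, i.e. when run from the frontend.
if command -v sbatch >/dev/null 2>&1; then
    sbatch "$script"
    # List the current user's jobs to check that the job was queued.
    squeue -l -u "$(id -un)"
else
    echo "sbatch not found: $script was written but not submitted"
fi
```

When run from the frontend, the job id printed by sbatch can then be passed to scancel [jobid] if the job has to be stopped.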