Hands-on: Checking GPU usage interactively using rocm-smi¶
Exercises on the course GitHub.
Q&A¶
-
So using sbatch bills compute time only as long as the scripts are running, while salloc keeps billing for as long as specified even if the scripts stopped running?
-
(Lukas) Essentially yes.
salloc
essentially allocated the resources and then enters into a new shell environment from which all following (srun) commands are run on the compute node. So in a way you are in an interactive terminal on the compute node and as long as that is active you are billed for the resources. As far as I now recall, you can actually step out of it by simply typingexit
, which will then also free the allocation. -
Both bill you for the time of the job, i.e., the time that the resources are set aside for you. With
sbatch
the job ends when the batch script ends and everything is cleaned up then. Withsalloc
, the job ends when you leave the shell created bysalloc
, unless you reach the time limit before that, and the resources are only freed for other users when the job ends, i.e., the time limit expires or you leave the shell created bysalloc
, effectively terminating thesalloc
command.As Lukas said, exit or CTRL-D takes you out of that shell, but be careful if you created subshells into that shell...
-