Exercise session 15: Omniperf
Exercise assignments can be found in the AMD exercise notes, section on Omniperf.
Exercise files can be copied from Exercises/AMD/HPCTrainingExamples.
Materials
Materials on the web:
- Exercise files: Download as .tar.bz2 or download as .tar
Archived materials on LUMI:
- Exercise assignments PDF: /appl/local/training/4day-20231003/files/LUMI-4day-20231003-Exercises_AMD.pdf
- Exercise files: /appl/local/training/4day-20231003/files/LUMI-4day-20231003-Exercises_AMD.tar.bz2 or /appl/local/training/4day-20231003/files/LUMI-4day-20231003-Exercises_AMD.tar
Q&A
- When I try executing commands from the first hackmd.io link, this is what I get:
  salloc -N 1 --ntasks=1 --partition=small-g --gpus=1 -A project_465000644 --time=00:15:00
  salloc: error: Job submit/allocate failed: Requested node configuration is not available
  salloc: Job allocation 4701325 has been revoked.
  - Could you source /project/project_465000644/Exercises/HPE/lumi_g.sh? It could be that you ran the lumi_c.sh script, which sets some variables that could clash with the salloc command.
  Yes, I did source the lumi_g.sh script.
  - OK. I checked the lumi_g.sh script, and even the environment variables that it sets influence salloc in a way that conflicts with your command line. So the trick is either to log in again and not source any of those scripts, after which the salloc line will work (but you will not be working in the reservation), or to leave out the -A and --partition arguments, as they are already set by environment variables. What is actually happening is that, because of the environment variables, the reservation is activated, but you're asking for nodes outside the reservation.
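To make the conflict described above easier to see, here is a minimal sketch for inspecting and clearing Slurm input environment variables before calling salloc. SALLOC_ACCOUNT, SALLOC_PARTITION and SALLOC_RESERVATION are documented Slurm input variables, but whether lumi_g.sh sets exactly these is an assumption; the two function names are hypothetical helpers. Slurm documents that command-line options take precedence over these variables, so a reservation variable, if set, would still apply when only -A and --partition are given on the command line.

```shell
#!/bin/sh
# Sketch only: which variables lumi_g.sh actually sets is an assumption.

# Print every SALLOC_*, SBATCH_* and SLURM_* variable in the current environment.
# "|| true" keeps the function from failing when nothing matches.
list_slurm_input_vars() {
    env | grep -E '^(SALLOC|SBATCH|SLURM)_' || true
}

# Unset all SALLOC_* variables so that salloc only sees command-line options.
clear_salloc_vars() {
    for name in $(env | sed -n 's/^\(SALLOC_[A-Za-z0-9_]*\)=.*/\1/p'); do
        unset "$name"
    done
}

# Show what is currently set before deciding how to call salloc.
list_slurm_input_vars
```

After clear_salloc_vars, the salloc line from the question should behave as it does in a fresh login, outside the reservation. The alternative is to keep the variables and drop -A and --partition from the command line.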
  Yes, it works when not sourcing anything. Is the salloc in the linked document really needed, or would it be better to forgo it and just use the reservation?
  - could you