Moving your AI training jobs to LUMI workshop - Copenhagen, May 29-30 2024¶
Course organisation¶
- Location: NORDUNets, Kastruplundgade 22, DK-2770 Kastrup, Denmark
- Questions with longer-term relevance have been incorporated into the pages linked below.
Setting up for the exercises¶
During the course¶
If you have an active project on LUMI, you should be able to do the exercises in that project. To reduce waiting times during the workshop, use the Slurm reservations we provide (see above).
You can find all exercises on our AI workshop GitHub page.
After the termination of the course project¶
Setting up for the exercises is a bit more elaborate now.
- The containers used in some of the exercises are no longer available in /scratch/project_465001063/containers. You'll have to replace that directory now with /appl/local/training/software/ai-20240529. Alternatively you can download the containers as a tar file and untar in a directory of your choice (and point the scripts to that directory where needed).
- The exercises as they were during the course are available as the tag ai-202405291 in the GitHub repository. Whereas the repository could simply be cloned during the course, now you have to either:
  - Download the content of the repository as a tar file or bzip2-compressed tar file or from the GitHub release where you have a choice of formats,
  - or clone the repository and then check out the tag ai-202405291:

        git clone https://github.com/Lumi-supercomputer/Getting_Started_with_AI_workshop.git
        cd Getting_Started_with_AI_workshop
        git checkout ai-202405291
- Note also that any reference to a reservation in Slurm has to be removed.
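
The two script adjustments described above (pointing to the new container directory and dropping reservation references) could be automated roughly as follows. This is a minimal sketch and not part of the course materials: the helper name `adapt_script` is hypothetical, and it assumes that reservations appear in the job scripts as `#SBATCH --reservation=...` lines. Reservations passed on the command line, or container paths built from variables, still need to be checked by hand.

```shell
#!/usr/bin/env bash
# Sketch: adapt a course job script for use after the course project ended.
# Assumptions (not confirmed by the course materials): reservations are
# requested via "#SBATCH --reservation=..." lines, and the old container
# directory is written out literally in the script.

OLD_DIR="/scratch/project_465001063/containers"
NEW_DIR="/appl/local/training/software/ai-20240529"

# adapt_script SCRIPT [NEW_CONTAINER_DIR]
# Hypothetical helper: rewrites SCRIPT in place via a temporary file.
adapt_script() {
    local script="$1"
    local new_dir="${2:-$NEW_DIR}"
    local tmp
    tmp="$(mktemp)" || return 1
    # 1) Replace the old container directory with the new one.
    # 2) Delete #SBATCH lines that request a reservation.
    sed -e "s|${OLD_DIR}|${new_dir}|g" \
        -e '/^#SBATCH[[:space:]].*--reservation/d' \
        "$script" > "$tmp" && cat "$tmp" > "$script"
    local status=$?
    rm -f "$tmp"
    return "$status"
}
```

After sourcing the snippet, calling `adapt_script path/to/job.sh` would update a single script; pass a second argument if you untarred the containers into a directory of your own. Working on a copy of the repository, or reviewing the result with `git diff`, is advisable before submitting any job.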
The exercises were thoroughly tested at the time of the course. LUMI is an evolving supercomputer, though,
so some exercises are expected to fail over time. The modules that need to be loaded will also
change: at every system update we have to drop some versions of the LUMI
module because the corresponding programming environment
is no longer functional. Likewise, at some point the ROCm driver on the system may
become incompatible with the ROCm versions used in the containers for the course.
Course materials¶
Note: Some links in the table below will remain invalid until after the course when all materials are uploaded.
Web links¶
- LUMI documentation