
Moving your AI training jobs to LUMI workshop - KTH, Stockholm, October 8-9, 2025

Course organisation

  • Location: KTH, Lindstedtsvägen 24, 114 28 Stockholm, Sweden, room 522 "Fantum".

    The building is a 5-minute walk from a metro station on the red line, which has excellent connections to the central station. See also the SL website for more information on public transportation in Stockholm.

  • Hotel suggestions for participants from outside Stockholm:

    • Elite Hotel Arcadia is close to the entrance of the KTH campus and the metro station serving KTH.

    • Hotels in the neighbourhood of the central station are also a good choice: a fast, high-frequency metro connection runs from there to the KTH campus, and the area is a good place to find restaurants or to head into the old town at night.

      The Scandic Continental Hotel even sits right on top of the main metro station where you can jump on any line.

  • Schedule

  • HedgeDoc for questions

    Questions with longer-term relevance will be incorporated into the pages linked below. This HedgeDoc document will not be monitored anymore for further questions after the course. The link will likely die over time.

  • Zoom link

  • There are two Slurm reservations for the course, one for each day:

    • First day: AI_workshop_Day1 (on the small-g Slurm partition)
    • Second day: AI_workshop_Day2 (on the standard-g Slurm partition)

    Project with the compute resources: project_465002178. These resources are limited and should only be used for the exercises during the course and not for your own work.
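
    As a hedged illustration of how the reservation, partition, and project fit together, the sketch below assembles the matching `sbatch` options for a given course day. The function name `course_sbatch_args` and the job script name `my_job.sh` are placeholders and not part of the course materials; `--account`, `--partition`, and `--reservation` are standard Slurm options. The script only prints the command line and submits nothing.

```shell
#!/bin/sh
# Sketch: print the sbatch options that match a course day.
# course_sbatch_args and my_job.sh are hypothetical names used for illustration.
course_sbatch_args() {
  case "$1" in
    1) reservation="AI_workshop_Day1"; partition="small-g" ;;
    2) reservation="AI_workshop_Day2"; partition="standard-g" ;;
    *) echo "day must be 1 or 2" >&2; return 1 ;;
  esac
  echo "--account=project_465002178 --partition=${partition} --reservation=${reservation}"
}

# Show the command for the first day instead of submitting a job.
echo "sbatch $(course_sbatch_args 1) my_job.sh"
```

    The same three options can also be written as `#SBATCH` directives at the top of a job script.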

Travel

  • Hotel options: The various hotels near the central station would be a good choice for travellers. The Odenplan area could be an alternative as there is a frequent bus service to the KTH campus. Elite Hotel Arcadia is right on the edge of the KTH campus and hence also a possible choice.

  • Travel from the airport: There are four ways to get to central Stockholm from the airport:

    • Taxi - Expensive. Go to the labeled taxi ranks at the airport. There is a fixed price to the city, which should be displayed on the taxi and differs between companies.

    • Arlanda Express - Moderately expensive, very fast direct train to Stockholm central station. There are two train stations for the Arlanda Express at the airport, one for terminals 2, 3 and 4 (Arlanda Södra) and one for terminal 5 (Arlanda Norra).

    • Commuter train - From Sky City at Arlanda you can take the regular trains. At the ticket barriers you pay 147 SEK (the toll for using the tunnel under Arlanda) on top of a normal SL ticket. This station is different from those for the Arlanda Express!

    • Airport bus - Cheapest option, but takes the longest. The central bus station is right next to the central station.

  • Using public transportation in Stockholm

    The simplest option is probably to install the SL (Storstockholms Lokaltrafik) app on a smartphone. You can buy tickets in the app with a credit card, and the app gives you a QR code that you can use at the ticket barriers. Physical cards are also available, but the card itself costs 50 SEK, plus the price of the tickets you put on it.
    A contactless credit card can also be used to buy a single ticket on a pay-as-you-go basis.

    Single tickets are fairly expensive. If you plan to use public transportation not only to travel to and from the venue, but also to get around Stockholm in the evening, a travelcard, which you can get in the app, may be a better option.

    Be careful when using the route planner in the app to travel to the course venue. If you search for KTH, several options will be offered, some of them in entirely different parts of the city. Search for "Tekniska högskolan", the metro station, or "Lindstedtsvägen (Stockholm)". Do NOT pick "KTH - Royal Institute of Technology (Södertälje)", as this will take you to a different place outside Stockholm.

  • Something for the night

Setting up for the exercises

During the course

If you have an active project on LUMI, you should be able to do the exercises in that project (i.e., store the files in your own project, but use the course project for running). That way you're guaranteed access to your work for the duration of your project. To reduce the waiting time during the workshop, use the Slurm reservations we provide (see above) and the course project for running.

You can find all exercises on our AI workshop GitHub page.

After the termination of the course project

Setting up for the exercises is a bit more elaborate now.

The exercises as they were during the course are available as the tag ai-20251009 in the GitHub repository. Whereas the repository could simply be cloned during the course, now you have to either:

  • Download the content of the repository as a tar file or bzip2-compressed tar file, or from the GitHub release, where you have a choice of formats,

  • or clone the repository and then check out the tag ai-20251009:

    git clone https://github.com/Lumi-supercomputer/Getting_Started_with_AI_workshop.git
    cd Getting_Started_with_AI_workshop
    git checkout ai-20251009
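
The clone and checkout can likely be combined into one step, since `git clone --branch` also accepts a tag name. A minimal sketch, wrapped in a function so that nothing is downloaded until you call it. The function name `fetch_exercises` and the use of `--depth 1` (a shallow clone without history) are illustrative choices, not part of the course instructions:

```shell
#!/bin/sh
repo_url="https://github.com/Lumi-supercomputer/Getting_Started_with_AI_workshop.git"
course_tag="ai-20251009"

# fetch_exercises is a hypothetical helper name.
fetch_exercises() {
  # --branch accepts a tag; git then leaves the clone in a detached HEAD state at that tag.
  git clone --depth 1 --branch "$course_tag" "$repo_url"
}

# Usage (requires network access):
#   fetch_exercises && cd Getting_Started_with_AI_workshop
```

Drop `--depth 1` if you also want the repository history.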
    

Note also that any reference to a reservation in Slurm has to be removed.
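
One way to do this, assuming the job scripts request the reservation through an `#SBATCH --reservation=...` directive (an assumption about the scripts, so check before editing), is to delete those lines with `sed`. The sketch below demonstrates this on a throwaway example script; the file contents, including `train.py` and the `<your_project>` placeholder, are made up for illustration:

```shell
#!/bin/sh
# Create a throwaway job script to demonstrate on (contents are hypothetical).
job_script="$(mktemp)"
cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --account=<your_project>
#SBATCH --partition=small-g
#SBATCH --reservation=AI_workshop_Day1
#SBATCH --gpus-per-node=1
srun python train.py
EOF

# Delete every #SBATCH line that mentions --reservation; keep a .bak backup.
sed -i.bak '/^#SBATCH.*--reservation/d' "$job_script"

# Show the result: the reservation line is gone, the other lines are untouched.
cat "$job_script"
```

If a reservation is passed on the command line instead (for example to `sbatch` or `srun`), it has to be removed there by hand.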

The exercises were thoroughly tested at the time of the course. LUMI is an evolving supercomputer though, so some exercises are expected to fail over time. The modules that need to be loaded will also change, as at every update we have to drop some versions of the LUMI module because their programming environment is no longer functional. Likewise, at some point the ROCm driver on the system may become incompatible with the ROCm versions used in the containers for the course.

Course materials

Note: Some links in the table below will remain invalid until after the course when all materials are uploaded.

Presentation | Slides | Recording
Welcome and course introduction | / | video
Introduction to LUMI | slides | video
Using the LUMI web-interface | slides | video
Hands-on: Run a simple PyTorch example notebook | / | video
Your first AI training job on LUMI | slides | video
Hands-on: Run a simple single-GPU PyTorch AI training job | / | video
Understanding GPU activity & checking jobs | slides | video
Hands-on: Checking GPU usage interactively using rocm-smi | / | video
Running containers on LUMI | slides | video
Hands-on: Pull and run a container | / | video
Building containers from Conda/pip environments | slides | video
Hands-on: Creating a conda environment file and building a container using cotainr | / | video
Extending containers with virtual environments for faster testing | slides | video
Scaling AI training to multiple GPUs | slides | video
Hands-on: Converting the PyTorch single GPU AI training job to use all GPUs in a single node via DDP | / | video
Extreme scale AI | slides | video
Demo/Hands-on: Using multiple nodes | / | video
Loading training data on LUMI | slides | video
Coupling machine learning with HPC simulation | slides | video
Hands-on: Advancing your project and general Q&A | / | video