PyTorch/2.6.0-rocm-6.2.4-python-3.12-singularity-20250404 (PyTorch-2.6.0-rocm-6.2.4-python-3.12-singularity-20250404.eb)

Install with the EasyBuild-user module in partition/container:

module load LUMI partition/container EasyBuild-user
eb PyTorch-2.6.0-rocm-6.2.4-python-3.12-singularity-20250404.eb
The module will be available in all versions of the LUMI stack and in the CrayEnv stack.

To access module help after installation use module spider PyTorch/2.6.0-rocm-6.2.4-python-3.12-singularity-20250404.
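
Once installed, the module can be used along the following lines (a sketch; the version
check is purely illustrative):

module load LUMI PyTorch/2.6.0-rocm-6.2.4-python-3.12-singularity-20250404
python -c 'import torch; print(torch.__version__)'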

EasyConfig:

# Developed by Kurt Lust and Mihkel Tiks for LUMI
#DOC Contains PyTorch 2.6.0 with torchaudio 2.6.0, torchdata 0.9.0+cpu, torchtext 0.18.0+cpu,
#DOC torchvision 0.21.0 GPU version, DeepSpeed 0.15.1, flash-attention 2.7.3, transformers 4.50.1,
#DOC xformers 0.0.30+a0a401e4.d20250322 and vllm 0.7.2.post2+rocm624, on Python 3.12 and ROCm 6.2.4.
#DOC The container also fully assists the procedure to add extra packages in a Python virtual environment.
#DOC
#DOC This version also includes a pre-set virtual environment, but the module together with the
#DOC container does all the initialisations, so $WITH_CONDA, $WITH_VENV, etc., are not needed.

easyblock = 'MakeCp'

local_c_rocm_version =    '6.2.4'
local_c_python_mm =       '3.12'
local_c_PyTorch_version = '2.6.0'
local_c_dockerhash =      '36e16fb5b67b'
local_c_date =            '20250404'

local_c_DeepSpeed_version =      '0.15.1'
local_c_flashattention_version = '2.7.3'
local_c_transformers_version =   '4.50.1'
local_c_xformers_version =       '0.0.30+a0a401e4.d20250322'
local_c_vllm_version =           '0.7.2+rocm624' 

local_conda_env = 'pytorch'
local_c_python_m = local_c_python_mm.split('.')[0]

name =          'PyTorch'
version =       local_c_PyTorch_version
versionsuffix = f'-rocm-{local_c_rocm_version}-python-{local_c_python_mm}-singularity-{local_c_date}'

local_sif =    f'lumi-pytorch-rocm-{local_c_rocm_version}-python-{local_c_python_mm}-pytorch-v{local_c_PyTorch_version}-dockerhash-{local_c_dockerhash}.sif'
#local_docker = f'lumi-pytorch-rocm-{local_c_rocm_version}-python-{local_c_python_mm}-pytorch-v2.2.0.docker'

homepage = 'https://pytorch.org/'

whatis = [
    'Description: PyTorch, a machine learning package',
    'Keywords: PyTorch, DeepSpeed, flash-attention, xformers, vllm'
]

description = f"""
This module provides a container with PyTorch %(version)s (with torchaudio,
torchdata, torchtext and torchvision) on Python {local_c_python_mm}. It also contains 
DeepSpeed {local_c_DeepSpeed_version}, flash-attention {local_c_flashattention_version}, transformers {local_c_transformers_version}, 
xformers {local_c_xformers_version} and vllm {local_c_vllm_version}.

The module defines a number of environment variables available outside the
container:

*   SIF and SIFPYTORCH: The full path and name of the Singularity SIF file 
    to use with singularity exec etc.
*   SINGULARITY_BIND: Mounts the necessary directories from the system,
    including /users, /project, /scratch and /flash so that you should be
    able to use your regular directories in the container.
*   RUNSCRIPTS and RUNSCRIPTSPYTORCH: The directory with some sample
    runscripts.
*   CONTAINERROOT: Root directory of the container installation. Alternative
    for EBROOTPYTORCH.

There are also a number of environment variables available inside the container.
These are not strictly needed though as the module already ensures that all
necessary environment variables are set to activate the Conda environment in
the container and on top of that the virtual environment for additional packages.
They are there for compatibility with scripts for older versions of the containers.

*   WITH_CONDA: Command to execute to activate the Conda environment used for 
    the Python installation.
*   WITH_VENV: Command to execute to activate the pre-created Python virtual
    environment.
*   INIT_CONDA_VENV: Command that can be used to initialise the Conda environment
    and then on top of it the Python virtual environment.

Outside of the container, the following commands are available:

*   start-shell: To start a bash shell in the container. Arguments can be used
    to, e.g., tell it to start a command. Use the -c flag of bash if you want to
    pass commands to that shell as otherwise the conda and virtual environments
    are not properly initialised.
*   make-squashfs: Make the user-software.squashfs file that would then be mounted
    in the container after reloading the module. This will enhance performance if
    the extra installation in user-software contains a lot of files.
*   unmake-squashfs: Unpack the user-software.squashfs file into the user-software
    subdirectory of $CONTAINERROOT to enable installing additional packages.
*   python, python{local_c_python_m} and python{local_c_python_mm} are wrapper scripts to start Python in the
    container, passing along all arguments.
    They should work in the same way as those in the pytorch modules in the 
    local CSC software stack.
*   pip, pip{local_c_python_m} and pip{local_c_python_mm} are wrapper scripts to start pip in the container,
    passing along all arguments.
    They should work in the same way as those in the pytorch modules in the 
    local CSC software stack.
*   Other such wrappers are accelerate, huggingface-cli, ray and torchrun.
    They should work in the same way as those in the pytorch modules in the 
    local CSC software stack.

Inside the container, the following scripts are available in /runscripts
(and can be checked or edited outside the container in $CONTAINERROOT/runscripts):

*   conda-python-simple: Start Python in the conda + Python venv environment.
*   conda-python-distributed: Example script that can be used to start Python
    in a distributed way compatible with the needs of PyTorch. You should pass
    the Python commands to be executed with the options that the python executable
    would take.
*   get-master: A script used by conda-python-distributed.

Note that these scripts are meant as examples and in no way do they cover all possible
use cases.

Note also that any change that you make to files in $CONTAINERROOT will be fully erased
whenever you reinstall the container with EasyBuild, so back up all changes or
additions!
"""

docurls = [
    'DeepSpeed web site: https://www.deepspeed.ai/',
    'Latest LUMI AI training: https://lumi-supercomputer.github.io/AI-latest',   
]

toolchain = SYSTEM

sources = [
    {
        'filename':    local_sif,
        'extract_cmd': '/bin/cp -L %s .'
    },
#    {
#        'filename':    local_docker,
#        'extract_cmd': '/bin/cp -L %s .'
#    },
]

skipsteps = ['build']

files_to_copy = [
    ([local_sif],    '.'),
#    ([local_docker], 'share/docker-defs/')    
]

####################################################################################################
#
# Scripts for bin and/or runscripts
#

#
# Script to start a shell in the container and/or execute commands in a shell.
#

local_bin_start_shell = """
#!/bin/bash -e

# Run application
if [ -d "/.singularity.d/" ]
then
    # In a singularity container, just in case a user would add this to the path.
    exec bash "$@"
else
    # Not yet in the container

    if [ -z $SIFPYTORCH ] || [ ! -f $SIFPYTORCH ]
    then
        >&2 echo "SIFPYTORCH is undefined or wrong, use this command with the PyTorch module properly loaded!"
        exit
    fi

    singularity exec $SIFPYTORCH bash "$@"
fi

""".replace( '$', '\\$' )

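# A hedged usage example for start-shell once the module is loaded (the command
# passed via bash's -c flag is purely illustrative):
#   start-shell -c 'python --version'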

#
# Python wrapper script, also works for several other commands by symlinking.
# Based on code from CSC but adapted for the approach with this module.
#
local_bin_python = """
#!/bin/bash
#
# Python wrapper script, also used for some other commands.
#
# This will start python, or whatever the name of the link to this script is,
# in the PyTorch container.
#
if [ -z $SIFPYTORCH ] || [ ! -f $SIFPYTORCH ]
then
    >&2 echo "SIFPYTORCH is undefined or wrong, use this command with the PyTorch module properly loaded!"
    exit
fi

#REAL_PYTHON="${BASH_SOURCE[0]}"  # Full path + name of the file being executed, which can be the symbolic link.
EXEC_BIN=$(basename "$0")         # Command executed without the path, so python or the name of the symbolic link.

if [ -d /.singularity.d/ ]; then
    1>&2 echo "These wrapper scripts are not meant to be executed in a container and will lead to infinite loops."
    exit 1
else
    # Note: The CSC script starts the command in the container with
    # singularity exec $SIFPYTORCH bash -c "exec -a $REAL_PYTHON $EXEC_BIN $( test $# -eq 0 || printf " %q" "$@" )"
    # where "exec -a $REAL_PYTHON $EXEC_BIN" executes $EXEC_BIN but sets argv[0] to $REAL_PYTHON.
    # This however breaks the working of the virtual environment, likely because of the overwriting
    # of argv[0] so that Python can't find the correct path from where it was started. This is because
    # of the way the EasyBuild modules work with virtual environments. They are not seen in the container
    # in their place in the file system, but in /user-software. We have done this to be able to squash
    # that whole directory structure in a SquashFS file mounted in the container to reduce the pressure
    # that big virtual environments cause on the filesystem.
    singularity exec $SIFPYTORCH $EXEC_BIN "$@"
fi
""".replace( '$', '\\$' )

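# With the module loaded, this wrapper (and the symlinks to it created in
# postinstallcmds below) run the named command inside the container, e.g.
# (illustrative only):
#   python -m torch.utils.collect_env
#   pip list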

#
# Make a SquashFS file of the virtual environment.
#
local_bin_make_squashfs = """
#!/bin/bash -e

if [[ -d "/.singularity.d" ]] 
then
    # In a singularity container, just in case a user would add this to the path.
    >&2 echo 'The make-squashfs command should not be run in the container.'
    exit 1
fi

cd "%(installdir)s"

if [[ ! -d "user-software" ]]
then
    >&2 echo -e 'The $CONTAINERROOT/user-software subdirectory does not exist, so there is nothing to put into the SquashFS file.'
    exit 2
fi

if [[ -f "user-software.squashfs" ]]
then
    >&2 echo -e '$CONTAINERROOT/user-software.squashfs already exists. Please remove the file by' \\\\
                '\\nhand if you are sure you wish to proceed and re-run the make-squashfs command.'
    exit 3
fi

mksquashfs user-software user-software.squashfs -processors 1 -no-progress |& grep -v Unrecognised

echo -e '\\nCreated $CONTAINERROOT/user-software.squashfs from $CONTAINERROOT/user-software.' \\\\
        '\\nYou need to reload the PyTorch module to ensure that the software is now mounted' \\\\
        '\\nfrom $CONTAINERROOT/user-software.squashfs. Note that /user-software in the' \\\\
        '\\ncontainer will then be a read-only directory.' \\\\
        '\\nAfter reloading the module, you can also remove the $CONTAINERROOT/user-software' \\\\
        '\\nsubdirectory if you so wish.\\n'

""".replace( '$', '\\$' )

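# A hedged sketch of the intended workflow (the package name is purely illustrative):
#   pip install some-extra-package   # lands in the venv under $CONTAINERROOT/user-software
#   make-squashfs                    # pack user-software into user-software.squashfs
# after which the module has to be reloaded so that the SquashFS file is mounted.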

#
# Bin script to restore the user-software directory from a SquashFS file for further 
# updating.
#
local_bin_unmake_squashfs = """
#!/bin/bash -e

if [[ -d "/.singularity.d" ]] 
then
    # In a singularity container, just in case a user would add this to the path.
    >&2 echo 'The unmake-squashfs command should not be run in the container.'
    exit 1
fi

cd "%(installdir)s"

if [[ ! -f "user-software.squashfs" ]]
then
    >&2 echo -e '$CONTAINERROOT/user-software.squashfs does not exist, so it cannot be unpacked.'
    exit 2
fi

if [[ -d "user-software" ]]
then
    >&2 echo -e 'The $CONTAINERROOT/user-software subdirectory already exists. Please remove this directory by hand' \\\\
                '(rm -r $CONTAINERROOT/user-software) if you are sure you wish to proceed and re-run the unmake-squashfs command.'
    exit 3
fi

unsquashfs -d ./user-software user-software.squashfs

echo -e '\\nCreated $CONTAINERROOT/user-software subdirectory from $CONTAINERROOT/user-software.squashfs.' \\\\
        '\\nYou need to reload the PyTorch module to ensure that the software is now mounted from the' \\\\
        '\\n$CONTAINERROOT/user-software directory and can now write to /user-software in the container.' \\\\
        '\\nYou can then also remove the $CONTAINERROOT/user-software.squashfs file if you so wish.\\n'

""".replace( '$', '\\$' )

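# A hedged counterpart of make-squashfs: unpack, reload the module so that /user-software
# is writable again, then extend the installation, e.g. (package name illustrative):
#   unmake-squashfs
#   pip install another-extra-package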

#
# Script to list packages in a container for compatibility with the CSC approach.
# Goes in bin to be available outside the container and runscript to also be 
# available inside.
#
local_bin_runscript_list_packages = """
#!/bin/bash -e

if [[ -d "/.singularity.d" ]] 
then
    # Running in a singularity container already
    pip3 list
else
    # Not running in a container. We could simply use the pip script,
    # or start in the container which is what we will do.

    if [ -z $SIFPYTORCH ] || [ ! -f $SIFPYTORCH ]
    then
        >&2 echo "SIFPYTORCH is undefined or wrong, use this command with the PyTorch module properly loaded!"
        exit
    fi

    singularity exec $SIFPYTORCH pip3 list
fi

""".replace( '$', '\\$' )

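# Illustrative use of list-packages (the grep filter is only an example):
#   list-packages | grep -i torch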

#
# Initialise the conda and virtual environment
#
local_runscript_init_conda_venv=f"""
#
# Source this file to initialize both the Conda environment and 
# predefined virtual environment in the container.
#
# This script is still useful to initialise the environment when the
# module is not loaded, e.g., to execute commands in the `postinstallcmds` section.
#
# Conda not needed anymore, done by initialisation in the container
#source /opt/miniconda3/bin/activate {local_conda_env}
source /user-software/venv/{local_conda_env}/bin/activate

"""

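# Illustrative use inside the container: source /runscripts/init-conda-venv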

#
# Runscript to run Python in the initialised container.
# This script is no longer needed but here for compatibility with older containers.
#
local_runscript_python_simple="""
#!/bin/bash -e

# Run application
python "$@"

""".replace( '$', '\\$' )

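# A hedged example of running it directly (the Python snippet is illustrative):
#   singularity exec $SIFPYTORCH /runscripts/conda-python-simple -c 'print("hello")'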

#
# Runscript template for distributed learning with PyTorch, doing all proper
# initialisations of MIOpen and RCCL and some PyTorch initialisations.
# Note that this script is not suitable for all types of distributed learning!
#
local_runscript_python_distributed="""
#!/bin/bash -e

# Make sure GPUs are up
if [ $SLURM_LOCALID -eq 0 ] ; then
    rocm-smi
fi
sleep 2

# MIOpen needs some initialisation for the cache as the default location
# does not work on LUMI because Lustre does not provide the necessary features.
export MIOPEN_USER_DB_PATH="/tmp/$(whoami)-miopen-cache-$SLURM_NODEID"
export MIOPEN_CUSTOM_CACHE_DIR=$MIOPEN_USER_DB_PATH

# Set MIOpen cache to a temporary folder.
if [ $SLURM_LOCALID -eq 0 ] ; then
    rm -rf $MIOPEN_USER_DB_PATH
    mkdir -p $MIOPEN_USER_DB_PATH
fi
sleep 2

# Set interfaces to be used by RCCL.
# This is needed as otherwise RCCL tries to use a network interface it has
# no access to on LUMI.
export NCCL_SOCKET_IFNAME=hsn
export NCCL_NET_GDR_LEVEL=3    # Not really needed anymore for ROCm 6.2 as this is now the default

# Set ROCR_VISIBLE_DEVICES so that each task uses the proper GPU
export ROCR_VISIBLE_DEVICES=$SLURM_LOCALID

# Report affinity to check 
echo "Rank $SLURM_PROCID --> $(taskset -p $$); GPU $ROCR_VISIBLE_DEVICES"

# The usual PyTorch initialisations (also needed on NVIDIA)
# Note that since we fix the port ID it is not possible to run, e.g., two
# instances via this script using half a node each.
export MASTER_ADDR=$(/runscripts/get-master "$SLURM_NODELIST")
export MASTER_PORT=29500
export WORLD_SIZE=$SLURM_NPROCS
export RANK=$SLURM_PROCID

# Run application
python "$@"

""".replace( '$', '\\$' )

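# A hedged example of a Slurm job step using this runscript (task/GPU counts and the
# training script name are placeholders; the script assumes one task per GCD):
#   srun --ntasks-per-node=8 --gpus-per-node=8 \
#       singularity exec $SIFPYTORCH /runscripts/conda-python-distributed -u my_training.py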

#
# Runscript used by other scripts to extract the first node of a 
# multi-node allocation.
# This version is based on a SLURM environment variable so that it can be
# used inside the container where we cannot run Slurm commands.
#
local_runscript_get_master="""
#!/usr/bin/env python3
# This way of starting Python should work both on LUMI and in the container, though
# this script is really meant to be used in the container.

import argparse
def get_parser():
    parser = argparse.ArgumentParser(description="Extract master node name from Slurm node list",
            formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument("nodelist", help="Slurm nodelist")
    return parser


if __name__ == '__main__':
    parser = get_parser()
    args = parser.parse_args()

    first_nodelist = args.nodelist.split(',')[0]

    if '[' in first_nodelist:
        a = first_nodelist.split('[')
        first_node = a[0] + a[1].split('-')[0]

    else:
        first_node = first_nodelist

    print(first_node)

"""

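# Illustrative behaviour (the nodelist value is made up):
#   /runscripts/get-master 'nid[005123-005126],nid005200'   prints   nid005123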

####################################################################################################
#
# Installation is mostly done in postinstallcmds, as we have no EasyBlock to work with the container.
#

#local_singularity_bind = '/var/spool/slurmd,/opt/cray,/usr/lib64/libcxi.so.1,/usr/lib64/libjansson.so.4,' + \
#                         '%(installdir)s/runscripts:/runscripts,' + \
#                         '/pfs,/scratch,/projappl,/project,/flash,/appl'
local_singularity_bind = '/var/spool/slurmd,/opt/cray,/usr/lib64/libcxi.so.1,' + \
                         '%(installdir)s/runscripts:/runscripts,' + \
                         '/pfs,/scratch,/projappl,/project,/flash,/appl'

postinstallcmds = [
    #
    # Install the scripts in the bin subdirectory
    #
    'mkdir -p %(installdir)s/bin',
    f'cat >%(installdir)s/bin/start-shell <<EOF {local_bin_start_shell}EOF',
    'chmod a+x %(installdir)s/bin/start-shell',
    f'cat >%(installdir)s/bin/make-squashfs <<EOF {local_bin_make_squashfs}EOF',
    'chmod a+x %(installdir)s/bin/make-squashfs',    
    f'cat >%(installdir)s/bin/unmake-squashfs <<EOF {local_bin_unmake_squashfs}EOF',
    'chmod a+x %(installdir)s/bin/unmake-squashfs',    
    f'cat >%(installdir)s/bin/list-packages <<EOF {local_bin_runscript_list_packages}EOF',
    'chmod a+x %(installdir)s/bin/list-packages',
    # CSC-style wrapper scripts
    f'cat >%(installdir)s/bin/python <<EOF {local_bin_python}EOF',
    'chmod a+x %(installdir)s/bin/python',
    f'ln -s ./python %(installdir)s/bin/python{local_c_python_m}',
    f'ln -s ./python %(installdir)s/bin/python{local_c_python_mm}',
    'ln -s  ./python %(installdir)s/bin/pip',
    f'ln -s ./python %(installdir)s/bin/pip{local_c_python_m}',
    f'ln -s ./python %(installdir)s/bin/pip{local_c_python_mm}',
    'ln -s  ./python %(installdir)s/bin/accelerate',
    'ln -s  ./python %(installdir)s/bin/huggingface-cli',
    'ln -s  ./python %(installdir)s/bin/ray',
    'ln -s  ./python %(installdir)s/bin/torchrun',
    #
    # Commands in runscripts
    #
    'mkdir -p %(installdir)s/runscripts',
    f'cat >%(installdir)s/runscripts/list-packages <<EOF {local_bin_runscript_list_packages}EOF',
    'chmod a+x %(installdir)s/runscripts/list-packages',
    f'cat >%(installdir)s/runscripts/init-conda-venv <<EOF {local_runscript_init_conda_venv}EOF',
    'chmod a-x %(installdir)s/runscripts/init-conda-venv',
    f'cat >%(installdir)s/runscripts/conda-python-simple <<EOF {local_runscript_python_simple}EOF',
    'chmod a+x %(installdir)s/runscripts/conda-python-simple',
    f'cat >%(installdir)s/runscripts/conda-python-distributed <<EOF {local_runscript_python_distributed}EOF',
    'chmod a+x %(installdir)s/runscripts/conda-python-distributed',
    f'cat >%(installdir)s/runscripts/get-master <<EOF {local_runscript_get_master}EOF',
    'chmod a+x %(installdir)s/runscripts/get-master',
    #
    # Create the virtual environment and space for other software installations that
    # can then be packaged.
    #
    'mkdir -p %(installdir)s/user-software/venv',
    # For the next command, we don't need all the bind mounts yet; the user-software one is enough.
    f'singularity exec --bind %(installdir)s/user-software:/user-software %(installdir)s/{local_sif} bash -c \'$WITH_CONDA ; cd /user-software/venv ; python -m venv --system-site-packages {local_conda_env}\'',
]

sanity_check_paths = {
    # We deliberately don't check for local_sif as the user is allowed to remove that file
    # but may still want to regenerate the module which would then fail in the sanity check.
    #'files': [f'share/docker-defs/{local_docker}'],
    'files': [],
    'dirs':  ['runscripts'],
}

sanity_check_commands = [
    # Full syntax check of bash scripts
    'echo "Syntax check of start-shell"     ; bash -n start-shell',
    'echo "Syntax check of python"          ; bash -n python',
    'echo "Syntax check of make-squashfs"   ; bash -n make-squashfs',
    'echo "Syntax check of unmake-squashfs" ; bash -n unmake-squashfs',
    'echo "Syntax check of list-packages"   ; bash -n list-packages',
    'echo "Syntax check of conda-python-simple"      ; bash -n %(installdir)s/runscripts/conda-python-simple',
    'echo "Syntax check of conda-python-distributed" ; bash -n %(installdir)s/runscripts/conda-python-distributed',
    # Check the list-packages wrapper script outside and inside the container
    'list-packages',
    'singularity exec $SIFPYTORCH list-packages',
    # Check python wrapper script and reported version
    ('echo "Testing Python wrapper script and version" ; '
    f'python --version | sed -e \'s|.* \([[:digit:]]\.[[:digit:]]\+\).*|\\1|\' | grep -q "{local_c_python_mm}"'),
    # Check pythonMAJOR wrapper script and reported version
    (f'echo "Testing python{local_c_python_m} wrapper script and version" ; '
    f'python{local_c_python_m} --version | sed -e \'s|.* \([[:digit:]]\.[[:digit:]]\+\).*|\\1|\' | grep -q "{local_c_python_mm}"'),
    # Check pythonMAJOR.MINOR wrapper script and reported version
    (f'echo "Testing python{local_c_python_mm} wrapper script and version" ; '
    f'python{local_c_python_mm} --version | sed -e \'s|.* \([[:digit:]]\.[[:digit:]]\+\).*|\\1|\' | grep -q "{local_c_python_mm}"'),
    # Check pip and deepspeed version
    (f'echo "Testing pip wrapper script and DeepSpeed version (expected {local_c_DeepSpeed_version})" ; '
    f'pip freeze | grep deepspeed | sed -e \'s|.*=\(.*\)|\\1|\' | grep -q "{local_c_DeepSpeed_version}"'),    
    # Check pipMAJOR and transformers version
    (f'echo "Testing pip{local_c_python_m} wrapper script and transformers version (expected {local_c_transformers_version})" ; '
    f'pip{local_c_python_m} freeze | grep transformers | sed -e \'s|.*=\(.*\)|\\1|\' | grep -q "{local_c_transformers_version}"'),    
    # Check pipMAJOR.MINOR and deepspeed version
    (f'echo "Testing pip{local_c_python_mm} wrapper script and DeepSpeed version (expected {local_c_DeepSpeed_version})" ; '
    f'pip{local_c_python_mm} freeze | grep deepspeed | sed -e \'s|.*=\(.*\)|\\1|\' | grep -q "{local_c_DeepSpeed_version}"'),    
    # Check pip and xformers version
    (f'echo "Testing pip wrapper script and xformers version (expected {local_c_xformers_version})" ; '
    f'pip list | grep xformers | awk \'{{ print $2}}\' | grep -q "{local_c_xformers_version}"'),    
    # Check pip and flashattention version
    (f'echo "Testing pip wrapper script and flash-attention version (expected {local_c_flashattention_version})" ; '
    f'pip list | grep flash_attn | awk \'{{ print $2}}\' | grep -q "{local_c_flashattention_version}"'),    
    # Check pip and vllm version
    (f'echo "Testing pip wrapper script and vllm version (expected {local_c_vllm_version})" ; '
    f'pip list | grep vllm | awk \'{{ print $2}}\' | grep -q "{local_c_vllm_version}"'),    
    # Check pip and torch version
    (f'echo "Testing pip wrapper script and torch version (expected {local_c_PyTorch_version})" ; '
    f'pip list | egrep "^torch " | awk \'{{ print $2}}\' | sed -e \'s|\\+ro.*||\' | grep -q "{local_c_PyTorch_version}"'),    
    # Check if the accelerate wrapper script can run accelerate
    'echo "Checking if the accelerate wrapper can run accelerate" ; accelerate -h',
    # Check if the huggingface-cli wrapper script can run huggingface-cli
    'echo "Checking if the huggingface-cli wrapper can run huggingface-cli" ; huggingface-cli version',
    # Check if the ray wrapper script can run ray
    'echo "Checking if the ray wrapper can run ray" ; ray --version',
    # Check if the torchrun wrapper script can run torchrun
    'echo "Checking if the torchrun wrapper can run torchrun" ; torchrun -h',
]

modextravars = {
    # SIF variables currently set by a function via modluafooter.
    #'SIF':                             '%(installdir)s/' + local_sif,
    #'SIFPYTORCH':                      '%(installdir)s/' + local_sif,
    'CONTAINERROOT':                    '%(installdir)s',
    'RUNSCRIPTS':                       '%(installdir)s/runscripts',
    'RUNSCRIPTSPYTORCH':                '%(installdir)s/runscripts',
    #'SINGULARITY_BIND':                local_singularity_bind,
    #
    # In containers made before 2025, we had the WITH_CONDA environment variable set in the container
    # with the commands to activate the conda environment, and the module defined the WITH_VENV
    # environment variable with the commands to activate the preset virtual environment and
    # WITH_CONDA_VENV to do both. As the combination of container and module now do that work,
    # these are no longer needed, but for compatibility with older scripts we let them execute 
    # dummy commands.
    #
    'SINGULARITYENV_WITH_VENV':         'true',
    'SINGULARITYENV_WITH_CONDA_VENV':   'true',
    #
    # The following lines have the same effect as activating the Python virtual environment
    # with the script created by venv. The conda environment is already correctly initialised
    # through settings in the container /.singularity.d/env/10-docker2singularity.sh script.
    #
    #'SINGULARITYENV_PREPEND_PATH':      '/runscripts:/user-software/venv/pytorch/bin:/opt/miniconda3/envs/pytorch/bin:/opt/miniconda3/condabin',
    'SINGULARITYENV_PREPEND_PATH':      '/runscripts:/user-software/venv/pytorch/bin',
    #'SINGULARITYENV_CONDA_DEFAULT_ENV': 'pytorch',                      # Set in the container already
    #'SINGULARITYENV_CONDA_EXE':         '/opt/miniconda3/bin/conda',    # Now set in the container already
    #'SINGULARITYENV_CONDA_PREFIX':      '/opt/miniconda3/envs/pytorch', # Now set in the container already
    #'SINGULARITYENV_CONDA_PYTHON_EXE':  '/opt/miniconda3/bin/python',   # This Python should not be used as-is. Instead the wrapper from the Python venv should be used.
    'SINGULARITYENV_VIRTUAL_ENV':       '/user-software/venv/pytorch',
    # Typical NCCL environment variables
    'NCCL_SOCKET_IFNAME':               'hsn',
    'NCCL_NET_GDR_LEVEL':               '3',  # Not really needed anymore for ROCm 6.2 as this is now the default
}

modluafooter = f"""
conflict( 'singularity-AI-bindings' )

-- Call a routine to set the various environment variables.
create_container_vars( '{local_sif}', 'PyTorch', '%(installdir)s', '{local_singularity_bind}' )
"""

moduleclass = 'devel'
