The HPE Cray Programming Environment


On LUMI, the main programming environment is the HPE Cray Programming Environment (further abbreviated as Cray PE). The environment provides several tools, including compilers, communication libraries, optimised math libraries and various other libraries, analyzers and debuggers.

The Cray PE is made available through environment modules that allow users to select particular versions of tools and to configure the environment in a flexible way.


HPE Cray PE components

Cray compiler environments

The Cray PE supports multiple compilers, both Cray's own and third-party ones: Cray, AOCC, Intel and GNU.

Users can access the compilers by loading a programming environment module (see below).

Compilers in the Cray Programming Environment are almost always used through the Cray compiler wrappers. In fact, until recently, some compilers could not be used without the wrappers as they failed to even detect their own include files and some components. The behaviour of the wrapper will depend on the other modules that are loaded in the environment. E.g., there is no separate wrapper for MPI as the MPI header files and libraries are automatically activated through the regular wrappers as soon as a Cray MPI module is loaded.

The commands to invoke the compiler wrappers are ftn (Fortran), cc (C) and CC (C++). They automatically invoke the right compiler, based on the programming environment module and the compiler module that are loaded.

The online help can be accessed with the -help option. E.g.: cc -help, CC -help. One of the most frequently used options of the compiler wrappers is -craype-verbose:

 ftn -help | grep verbose -A 1 
   -craype-verbose              Print the command which is forwarded 
                                to compiler invocation

More information is available with the info or man commands. E.g., both info cc and man cc will display the man page of the C compiler wrapper.

The compiler wrappers call the correct compiler in the currently loaded programming environment, with appropriate options to build and link applications with relevant libraries, as required by the modules loaded. Besides a number of generic options of their own, they pass all other options on to the underlying compiler, so you can still use the regular compiler options.

The compiler wrappers should replace direct calls to compiler drivers in Makefiles and build scripts to ensure that the proper libraries are used.
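
A minimal sketch of how a build might be pointed at the wrappers. It assumes the project's build system honours the conventional CC, CXX and FC variables, which is a property of the project and not something the Cray PE enforces. The presence of ftn is used as a rough indicator that the wrappers are installed, and the temporary hello.c file is purely illustrative.

```shell
#!/bin/sh
# Point a conventional build at the Cray compiler wrappers.
# Assumption: the Makefile or configure script reads CC, CXX and FC.
export CC=cc     # C wrapper
export CXX=CC    # C++ wrapper
export FC=ftn    # Fortran wrapper

# Rough check: ftn is assumed to exist only where the wrappers are installed.
if command -v ftn >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    printf 'int main(void) { return 0; }\n' > "$workdir/hello.c"
    # -craype-verbose prints the command that is forwarded to the compiler.
    "$CC" -craype-verbose -c "$workdir/hello.c" -o "$workdir/hello.o"
    rm -rf "$workdir"
else
    echo "No Cray compiler wrappers found; nothing compiled."
fi
```

With these variables exported, a subsequent make or configure run picks up the wrappers instead of calling a compiler driver directly.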

Note

For system libraries, only dynamic linking is supported by the compiler wrappers on the Cray EX system.


Cray Compiling Environment (CCE)

The Cray Compiling Environment is set by the module PrgEnv-cray, which is loaded by default at login. The compiler itself is contained in the cce module.

CCE consists of Cray compilers that perform code analysis during compilation to generate highly optimized code. Supported languages include Fortran, C, C++ and UPC (Unified Parallel C). The Cray C/C++ compiler is basically the Clang/LLVM compiler with a back-end configured by HPE Cray that also adds some Cray-specific options, while the Cray Fortran compiler uses an HPE Cray-specific front-end with a back-end based on LLVM. The Fortran compiler supports most of Fortran 2018 and tends to be considerably stricter than the GNU or Intel Fortran compilers.

The CCE compilers will also support OpenMP offload to NVIDIA and AMD GPUs but that is still very immature and very much work in progress at the time of the development of this tutorial. The Fortran compiler (but not the C/C++ compiler) also supports OpenACC for offloading to GPU and is being updated to the newest versions of this standard.

The classic pre-LLVM Cray C/C++ compiler is not available on Cray EX systems.

Compiler-specific manpages can be accessed on the system with man crayftn, man craycc or man crayCC.

More details are given in the Cray Fortran Reference Manual and the Cray Compiling Environment Release that unfortunately are hidden deep in the new support pages of HPE. The Clang Compiler User’s Manual is another source of information for the Cray C and C++ Clang compilers.

For more information about compiler pragmas and directives, see man intro_directives on the system.


Third-Party Compilers

GNU

The GNU C/C++ and Fortran compilers are probably the best supported third-party compilers in the Cray PE.

Compiler-specific manpages can be accessed on the system with man gfortran, man gcc or man g++.

More details are provided by the GCC online documentation.

AOCC

The AOCC compilers (the AMD Optimizing C/C++ Compiler and the matching Fortran compiler) are AMD's compiler offering for CPU-only systems. They have a matching programming environment module and are a full citizen of the Cray PE.

Compiler-specific documentation is available in the AOCC User Guide.

Cray provides a bundled package of support libraries to install into the PE environment to enable AOCC, and Cray PE utilities such as debuggers and performance tools work with AOCC.

AMD ROCm compilers

The AMD ROCm compilers (in the AMD world sometimes known as AOMP) are supported on systems with AMD CPUs. However, at the time of writing of this tutorial they are not yet available in the LUMI environment and the integration with the Cray environment still seems to be work in progress.

Intel (not on LUMI)

The Cray PE also provides a programming environment module to enable the Intel® oneAPI compiler and tools.

The documentation is available in the Intel® oneAPI Programming Guide.

Cray provides a bundled package of support libraries to install into the Cray PE to enable the Intel compiler, allowing utilities such as debuggers and performance tools to work with it.

NVIDIA HPC toolkit (not on LUMI)

The NVIDIA HPC Toolkit compilers (formerly PGI) are supported on systems with NVIDIA GPUs.

Cray Scientific and Math Library

  • Manpages: intro_libsci, intro_fftw3

The Cray Scientific and Math Libraries (CSML, also known as LibSci) are a collection of numerical routines optimized for best performance on Cray systems.

These libraries satisfy dependencies for many commonly used applications on Cray systems for a wide variety of domains.

When the module for a CSML package (such as cray-libsci or cray-fftw) is loaded, all relevant headers and libraries for these packages are added to the compile and link lines of the cc, ftn, and CC compiler wrappers, so linking with them is completely transparent (to the extent that users wonder where the libraries are).

The CSML collection contains the following Scientific Libraries:

  • BLAS (Basic Linear Algebra Subroutines)
  • BLACS (Basic Linear Algebra Communication Subprograms)
  • CBLAS (Collection of wrappers providing a C interface to the Fortran BLAS library)
  • IRT (Iterative Refinement Toolkit)
  • LAPACK (Linear Algebra Routines)
  • LAPACKE (C interfaces to LAPACK Routines)
  • ScaLAPACK (Scalable LAPACK)
  • libsci_acc (library of Cray-optimized BLAS, LAPACK, and ScaLAPACK routines)
  • HDF5
  • NetCDF (Network Common Data Format)
  • FFTW3 (the Fastest Fourier Transforms in the West, release 3)

Cray Message Passing Toolkit

MPI is a widely used parallel programming model that establishes a practical, portable, efficient, and flexible standard for passing messages between ranks in parallel processes.

Cray MPI is derived from Argonne National Laboratory MPICH and implements the MPI-3.1 standard as documented by the MPI Forum in MPI: A Message Passing Interface Standard, Version 3.1.

Support for MPI varies depending on system hardware. To see which functions and environment variables the system supports, please have a look at the corresponding man pages with man intro_mpi on the system.

Note that although at the time of the tutorial both the OpenFabrics Interfaces (OFI) based and the UCX-based versions of the library are supported on LUMI, the finalised LUMI system will only support OFI.

DSMML

The Distributed Symmetric Memory Management Library (DSMML) is an HPE Cray proprietary memory management library.

DSMML is a standalone memory management library for maintaining distributed shared symmetric memory heaps for top-level PGAS languages and libraries like Coarray Fortran, UPC, and OpenSHMEM.

DSMML allows user libraries to create multiple symmetric heaps and share information with other libraries.

In this way, DSMML makes interoperability between PGAS programming models possible.

Further details are available in the man page on the system with man intro_dsmml.

Other components

  • Cray Performance and Analysis Tools:

    Tools to analyze the performance and behavior of programs that are run on Cray systems, and a Performance API (PAPI).

  • Cray Debugging Support Tools:

    Debugging tools, including gdb4hpc and Valgrind4hpc.


Configuring the Cray PE through modules

Multiple releases of the HPE Cray PE can be installed simultaneously on the system and users can mix and match components. However, it is also possible to load only components for a specific release of the PE. Cray PE versions have version numbers of the form yy.mm, e.g., 22.02 for the version released in February 2022. However, each of the components has its own version number and it is not easy to see which version of a component came with which version(s) of the Cray PE.

Below we only discuss those modules that are important when building software with EasyBuild. Debuggers, profilers, etc., are not included in the list.

The programming environment modules

The usual way to initialise the Cray PE is by loading one of the PrgEnv-* modules. This module will then load the appropriate compiler, compiler wrappers and other libraries. Some of the components that will be loaded are configured through the /etc/cray-pe.d/cray-pe-configuration.sh file, so the list of modules may depend on the actual system that you are using.

The Cray PE supports the following PrgEnv-* modules. On LUMI, only the first three are currently available:

Module          Compiler module   What?
PrgEnv-cray     cce               The Cray Compiling Environment compilers
PrgEnv-gnu      gcc               The GNU compilers
PrgEnv-aocc     aocc              AMD compilers for CPU-only systems
PrgEnv-amd      rocm              AMD ROCm compilers for GPU systems
PrgEnv-intel    intel             The Intel compilers
PrgEnv-nvidia   nvidia            NVIDIA HPC toolkit compilers (formerly PGI)

All PrgEnv-* modules belong to the same module family. Hence Lmod will automatically unload any already loaded PrgEnv-* module when you load a different one.
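
As an illustration of the family behaviour, the sketch below switches programming environments with a single load. The helper name switch_prgenv is hypothetical and not part of the Cray PE; the accepted names are simply the suffixes of the PrgEnv-* modules listed above, of which only some will exist on a given system.

```shell
#!/bin/sh
# Switch to another programming environment via its PrgEnv-* module.
# switch_prgenv is a hypothetical helper, not part of the Cray PE.
switch_prgenv() {
    case "$1" in
        cray|gnu|aocc|amd|intel|nvidia) ;;
        *) echo "unknown programming environment: $1" >&2; return 1 ;;
    esac
    # One load is enough: all PrgEnv-* modules are in the same Lmod family,
    # so Lmod unloads the previously loaded PrgEnv-* module by itself.
    module load "PrgEnv-$1"
}

# Only call the helper when the module command exists on this machine.
if type module >/dev/null 2>&1; then
    switch_prgenv gnu || echo "PrgEnv-gnu could not be loaded on this system"
fi
```

Running module list afterwards should show that the matching compiler module has replaced the previous one.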

Selecting the version through the cpe meta-module

The Cray PE on the EX system provides the meta-module cpe. Its purpose is similar to that of the cdt and cdt-cuda meta-modules available on the XC systems.

Loading one of the cpe/yy.mm modules (e.g., cpe/22.02) has the following effects:

  • It sets the default versions of each of the Cray PE modules to the version that comes with that particular release of the Cray PE. E.g.,

    module load cpe/22.02
    module load cce
    

    would load that version of the cce compiler that comes with the 22.02 release of the Cray PE.

  • It will reload all already loaded Cray PE modules and switch them over to the version corresponding to that particular release of the Cray PE.

Limitations and bugs

Due to the way Lmod works and implementation bugs in the cpe modules, loading the cpe module does not always have the desired effect.

  • The Cray PE sets the default version of each module by adding a file to the list of files in the LMOD_MODULERCFILE environment variable. However, Lmod does not re-evaluate the visibility of modules and its internal list of default versions immediately when the value of LMOD_MODULERCFILE changes, but only the next time the module command is executed. Hence

    module load cpe/22.02 ; module load cce
    
    and
    module load cpe/22.02 cce
    
    do not have the same effect. In the first version, the version of cce loaded is the version that corresponds to the 22.02 release of the Cray PE. In the second case however the default version of the cce module is determined by whatever list of default modules was used when calling the module command so may or may not be the one of the 22.02 release.

  • Loading the cpe module after loading the other Cray PE modules also does not always have the desired effect in many versions of the Cray PE. This is because of a bug in the cpe module: it reloads the modules in the wrong order, which may trigger the reload of a module with whatever version was the default when the module command was called rather than the version that the cpe module intends to (re-)load.
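
One way to stay clear of both problems is to load the cpe module first and on its own, and to load everything else in later module commands. The sketch below wraps that pattern in a hypothetical helper, load_cpe_release, which is not part of the Cray PE; the release number and module names are only examples.

```shell
#!/bin/sh
# Load a Cray PE release and then further modules, each in a separate
# module command so that the defaults set by cpe/yy.mm are in effect.
# load_cpe_release is a hypothetical helper, not part of the Cray PE.
load_cpe_release() {
    release=$1
    shift
    # First command: only the cpe module, which changes the defaults.
    module load "cpe/$release" || return 1
    # Later commands: Lmod re-reads the defaults on each new invocation.
    for mod in "$@"; do
        module load "$mod" || return 1
    done
}

# Only call the helper when the module command exists on this machine.
if type module >/dev/null 2>&1; then
    load_cpe_release 22.02 cce || echo "cpe/22.02 is not available here"
fi
```

Because the helper never puts cpe/yy.mm and another module on the same command line, the versions that get loaded should be those of the requested release.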

The compiler wrapper module craype

The craype module loads the compiler wrappers. There is only one set of compiler wrappers for all compilers. Which compiler will be called, which libraries will be included, and which processor and GPU target options will be used is all determined by other modules. Hence it is in principle possible to use a single Makefile for a project and still reconfigure the build by loading certain modules.

Target modules

The targets for CPU and GPU optimization, the network library for MPI, but also some other compiler options, can be set through target modules:

  • craype-x86-* (and similar options can be expected on ARM-based systems) set the target for CPU optimisations. For LUMI, the craype-x86-rome, craype-x86-milan and craype-x86-trento modules are relevant.

    This can also be used to cross-compile to a different target architecture unless the compiler target gets overwritten by a compiler flag added to the command line through the Makefile or other means, something that unfortunately happens more and more often in faulty software installation procedures.

  • craype-accel-* sets the target for OpenMP offload (and likely other technologies in the future). E.g., loading craype-accel-amd-gfx90a tells the compilers to target AMD MI200 family GPUs, while loading craype-accel-host tells the compiler to use the CPU instead (according to the documentation, the latter is for PrgEnv-cray only).

  • craype-network-* selects the communication library to be used by Cray MPICH. On Slingshot 11 EX systems, only craype-network-ofi is supported, but Slingshot 10 EX systems also offer support for UCX through the craype-network-ucx module.

  • The craype-hugepages* modules enable Cray Huge Pages support. To fully enable this support they have to be used at link-time and at run-time. At link time, support is compiled into the binary while at run-time they are used to set the actual size of the huge pages.

    The craype-hugepages* modules are not supported by all compilers. E.g., the AOCC compiler does not support huge pages at the moment, and loading the module at link time will cause an error message from the linker.
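
The target modules can be combined before a build, as in the sketch below. The helper set_targets is hypothetical and not part of the Cray PE; it only assembles the module names mentioned in the list above from short suffixes.

```shell
#!/bin/sh
# Select CPU, accelerator and network targets before building.
# set_targets is a hypothetical helper, not part of the Cray PE.
set_targets() {
    cpu=$1; accel=$2; network=$3
    module load "craype-x86-$cpu" || return 1        # CPU optimisation target
    module load "craype-accel-$accel" || return 1    # OpenMP offload target
    module load "craype-network-$network"            # transport for Cray MPICH
}

# Example: Trento CPUs, MI200 GPUs and libfabric.
if type module >/dev/null 2>&1; then
    set_targets trento amd-gfx90a ofi || echo "some target modules are missing here"
fi
```

A Makefile that calls the wrappers without its own architecture flags can then be reused unchanged for a different combination of targets.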

The compiler modules

The compiler modules have already been discussed with the PrgEnv-* modules above. The different regular compiler modules also all belong to the same family so no two different compilers can be loaded simultaneously and Lmod will automatically unload the other compiler when a new one is loaded.

The MPI modules

To load the Cray MPI libraries, both one of the craype-network-* modules and a compiler module have to be loaded, as the MPI libraries are both network- and compiler-specific.

For some unknown reason, the MPI module for the libfabric (craype-network-ofi) transport is called cray-mpich while the module for the UCX transport (craype-network-ucx) is called cray-mpich-ucx. As a result, the MPI module fails to reload automatically when switching between both transports, but it does reload automatically when switching compilers.

Loading an MPI module will also automatically configure the regular compiler wrappers to compile with support for that MPI module. However, the libfabric and UCX versions of the MPI library have compatible interfaces, so it is always possible to swap between those versions at runtime.
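
Since the MPI module does not follow a change of transport automatically, the switch has to be done by hand. The sketch below is one possible way to do so. It assumes that the UCX flavour of Cray MPICH is provided by a module named cray-mpich-ucx, and the helpers mpi_modules_for and use_mpi_transport are hypothetical, not part of the Cray PE.

```shell
#!/bin/sh
# Map a transport name to the network target module and the MPI module.
# Assumption: the UCX flavour of Cray MPICH is provided by cray-mpich-ucx.
mpi_modules_for() {
    case "$1" in
        ofi) echo "craype-network-ofi cray-mpich" ;;
        ucx) echo "craype-network-ucx cray-mpich-ucx" ;;
        *)   echo "unknown transport: $1" >&2; return 1 ;;
    esac
}

# Switch transports by hand, since the MPI module is not reloaded
# automatically when the network target changes.
# use_mpi_transport is a hypothetical helper, not part of the Cray PE.
use_mpi_transport() {
    mods=$(mpi_modules_for "$1") || return 1
    # Drop whichever MPI module is loaded; Lmod ignores modules that
    # are not loaded.
    module unload cray-mpich cray-mpich-ucx
    for mod in $mods; do
        module load "$mod" || return 1
    done
}

# Only call the helper when the module command exists on this machine.
if type module >/dev/null 2>&1; then
    use_mpi_transport ofi || echo "the OFI transport is not available here"
fi
```

Because both flavours have compatible interfaces, the same helper could in principle be called in a job script to pick the transport at run time without rebuilding.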

The Cray Scientific libraries

The Cray Scientific Libraries are loaded through the cray-libsci module (or cray-libsci_acc for the GPU versions). Loading this module makes the BLAS, LAPACK, and ScaLAPACK libraries available, and also the Cray IRT (Iterative Refinement Toolkit), a Cray-specific library. It will also configure the compiler wrappers to link with these libraries, so no additional include or link options are needed.

The fftw module

As FFTW is a third party library, it gets its own module (cray-fftw) and is not included in the cray-libsci module.

The cray-fftw module can only be loaded if one of the processor target modules (the craype-x86-* modules) is loaded first.
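
The sketch below puts the module order and the transparent linking together. The helper build_numerics and the file names are hypothetical; the point is only that the processor target is loaded before cray-fftw and that the compile line carries no -I, -L or -l options for the libraries.

```shell
#!/bin/sh
# Build a program that calls BLAS/LAPACK and FFTW routines.
# build_numerics is a hypothetical helper, not part of the Cray PE.
build_numerics() {
    target=$1; src=$2; exe=$3
    module load "craype-x86-$target" || return 1   # must precede cray-fftw
    module load cray-libsci || return 1
    module load cray-fftw || return 1
    # No -I, -L or -l options: the wrapper adds them for the loaded modules.
    cc "$src" -o "$exe"
}

# Only build when the module command and the (hypothetical) source exist.
if type module >/dev/null 2>&1 && [ -f solver.c ]; then
    build_numerics milan solver.c solver || echo "build failed"
fi
```

Adding -craype-verbose to the cc line is a convenient way to see which include and link options the wrapper inserted.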


Some unexpected behaviour of the Cray PE

On Cray EX systems, dynamic linking is the preferred way of linking applications. With that comes some unexpected behaviour of the Cray modules. EasyBuild users expect that the versions of the libraries used at run time are the ones from the modules that are loaded. This is not always the case for the runtime libraries of the Cray PE. By default the Cray PE will use a default set of libraries that is determined by the system default version of the Cray PE (which is set by the sysadmins, not determined by the cpe module). This can only be avoided either by using rpath linking (which is also special in the Cray PE: the wrappers activate rpath linking if the environment variable CRAY_ADD_RPATH is defined and set to yes) or by manually adjusting the library path after loading the modules:

export LD_LIBRARY_PATH=${CRAY_LD_LIBRARY_PATH}:$LD_LIBRARY_PATH

The latter cannot be easily automated in modulefiles. Any technique that can be used (without actually reworking the Cray PE modules) has nasty side effects in some scenarios.

The net result of this feature is that some reproducibility of the results is lost. Programs may behave differently when the system default version of the Cray PE is changed, as that changes the set of default libraries used at runtime, unless rpath linking is used or users redefine LD_LIBRARY_PATH.
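
Both workarounds can be sketched in a few lines of shell. The helper prepend_cray_libs is hypothetical and not part of the Cray PE; compared with a plain export it only adds guards against empty variables, so that no stray ":" ends up in LD_LIBRARY_PATH.

```shell
#!/bin/sh
# Option 1: ask the wrappers to use rpath linking. This has to be set
# before the link step, i.e. at build time.
export CRAY_ADD_RPATH=yes

# Option 2: at run time, put the libraries of the loaded modules first.
# prepend_cray_libs is a hypothetical helper. It skips empty values
# because an empty entry in LD_LIBRARY_PATH is treated as the current
# directory by the dynamic loader.
prepend_cray_libs() {
    if [ -z "${CRAY_LD_LIBRARY_PATH:-}" ]; then
        return 0    # no Cray PE library path set: leave the path untouched
    fi
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        LD_LIBRARY_PATH="${CRAY_LD_LIBRARY_PATH}:${LD_LIBRARY_PATH}"
    else
        LD_LIBRARY_PATH="${CRAY_LD_LIBRARY_PATH}"
    fi
    export LD_LIBRARY_PATH
}

prepend_cray_libs
```

In a job script the helper would be called after all module load commands, since CRAY_LD_LIBRARY_PATH reflects the modules loaded at that moment.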



Last update: April 8, 2022