HyperQueue

License information

HyperQueue is licensed under the MIT License. See the LICENSE file in the GitHub repository.

From version 0.17.0 on, the LICENSE file is also available in $EBROOTHYPERQUEUE/share/licenses/HyperQueue after installation of the package and loading of the module.

User documentation

HyperQueue is software developed at LUMI-partner IT4Innovations.

It is a tool designed to simplify the execution of large workflows on HPC clusters. It allows you to execute a large number of tasks in a simple way, without having to manually submit jobs to batch schedulers like PBS or Slurm. You just specify what you want to compute; HyperQueue will automatically ask for computational resources and dynamically load-balance tasks across all allocated nodes and cores. Hence it is one of the tools that can be used to work around the strict limits on the number of jobs imposed by the Slurm scheduler on LUMI.
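
As an illustration, a minimal session could look roughly as follows. This is only a sketch based on the upstream HyperQueue documentation: command names and options can differ between versions, and the Slurm partition and project shown are placeholders, so check the HyperQueue documentation for the version you use.

# Start the HyperQueue server (e.g., on a login node).
hq server start &

# Let HyperQueue request Slurm allocations automatically; everything after --
# is passed to sbatch (partition and account are placeholders).
hq alloc add slurm --time-limit=30m -- --partition=small --account=project_465000000

# Submit a task; tasks are load-balanced over the workers in the allocations.
hq submit ./my_task.sh

# Check the state of the submitted jobs.
hq job list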

Besides the user documentation you may also want to have a look at the change log of the code as breaking changes do occur from time to time.

Installation

HyperQueue requires Rust as a build dependency. Hence each version has a preferred version of the LUMI software stack. However, from HyperQueue 0.17.0 on, we try to write our build recipes so that they can also be installed in versions of the LUMI stack other than the intended one, using a different version of the buildtools needed to compile Rust. We have not tested those combinations though.

As HyperQueue doesn't really benefit from processor-specific optimisations (and the Rust compiler itself isn't run frequently enough to need them either), we suggest the following slightly non-standard installation procedure: install in a LUMI software stack using partition/common, after which HyperQueue will be available in all main partitions of that software stack with just a single install.

E.g., HyperQueue 0.17.0 was tested specifically in LUMI/23.09 (as can be seen by opening the EasyConfig via the links on this page), so it can be installed as follows:

module load LUMI/23.09 partition/common
module load EasyBuild-user
eb HyperQueue-0.17.0.eb -r

This will also install the required version of the Rust compiler first, which is rather time-consuming, so don't be surprised if the build takes an hour.

After a successful installation, Rust and HyperQueue will be available in all partitions of the LUMI/23.09 stack.
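
For example, to then use the module on the LUMI-C compute nodes (the module name HyperQueue/0.17.0 is an assumption based on the default EasyBuild module naming):

module load LUMI/23.09 partition/C
module load HyperQueue/0.17.0
hq --version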

Installation of HyperQueue 0.17.0 or later may also work in other versions of the LUMI stack, but this has not been tested, nor do we support it.

Training materials

User-installable modules (and EasyConfigs)

Install with the EasyBuild-user module:

eb <easyconfig> -r

To access module help after installation and to see for which stacks and partitions the module is installed, use module spider HyperQueue/<version>.
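
For example, for the 0.17.0 version discussed above:

module spider HyperQueue/0.17.0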

EasyConfig:

Technical documentation

Installation

  • HyperQueue is installed via the Rust package manager cargo, which in its default setup may not be compatible with the file systems on the cluster: it uses $HOME/.cargo and tries to lock a file in there. This does not work on GPFS. It does work on a Lustre file system though.

    The workaround is to use the CARGO_HOME environment variable to point to a file system where locking is possible. Pointing to a subdirectory of $XDG_RUNTIME_DIR works.

  • At the end of the build with cargo build --release, the hq executable can be found in target/release in the source directory. A sketch of a manual build is given below.
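
As an illustration, a manual build outside of EasyBuild could look as follows. This is only a sketch: it assumes a suitable Rust toolchain is already in the PATH, and the name of the source directory is illustrative.

# Redirect cargo away from $HOME/.cargo to a file system that supports file locking.
export CARGO_HOME="$XDG_RUNTIME_DIR/cargo"
mkdir -p "$CARGO_HOME"

# Build in release mode; this downloads all crate dependencies into $CARGO_HOME.
cd hyperqueue-0.17.0
cargo build --release

# The only file that needs to be installed is the hq executable.
ls target/release/hq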

EasyBuild

  • There is no support for HyperQueue in the EasyBuilders repository

  • There is no support for HyperQueue in the CSCS repository

We build our own EasyConfig. Two big warnings are needed though:

  • It would likely not pass the criteria set forth by the EasyBuild community for inclusion, as the build process itself downloads a lot of files. It is very hard to figure out which files are needed and where they can be put in a way that cargo can find them, if this is possible at all.

    As a result of this there is no way to get this to work on a cluster that does not allow outgoing https connections.

    It also implies that it might be impossible to reproduce the build at a later time, as not all sources are stored locally. So if the sources that are downloaded change or are removed, the build may produce a different result or fail altogether.

  • The cargo command needs a directory in which it can lock files, which may not be possible on all file systems. We currently redirect cargo to the file system used for EASYBUILD_BUILDPATH, as that is often a local tmp directory or RAM disk that supports file locking. Building on GPFS fails.

0.4.0 for 21.08

  • We created our own EasyConfig file using the generic CmdCp EasyBlock.

    • In the build phase we execute the cargo build --release command via the command map (cmds_map), after first creating the directory that CARGO_HOME points to.

    • In the install step we copy the binary.

  • As this is not performance-critical software and as it should work with all toolchains and on all partitions, we decided to use the system compilers and install in partition/common.

  • As EasyBuild doesn't support Rust, we have to set the target CPU by hand or rely on the default, which should be to compile for the current CPU. To ensure compatibility with all nodes of LUMI we currently set RUSTFLAGS="-C target-cpu=znver2". You will have to change cmds_map for other platforms, or simply compile for the CPU that the build is run on. A sketch of the relevant parameters of such an EasyConfig is given below.
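
The sketch below illustrates the key parameters of such an EasyConfig. It is not the actual LUMI recipe: the source specification, Rust version and exact build command are assumptions, and only the overall structure (CmdCp with cmds_map for the build and files_to_copy for the install) follows the description above.

easyblock = 'CmdCp'

name =    'HyperQueue'
version = '0.4.0'

homepage = 'https://it4innovations.github.io/hyperqueue/'

description = """HyperQueue is a tool to simplify the execution of large numbers of
tasks on HPC clusters without submitting each task separately to the scheduler."""

# Not performance-critical, so the system compilers are sufficient.
toolchain = SYSTEM

# Source specification is illustrative only.
source_urls = ['https://github.com/It4innovations/hyperqueue/archive/refs/tags/']
sources     = ['v%(version)s.tar.gz']

# Rust is only needed at build time (version illustrative).
builddependencies = [
    ('Rust', '1.52.0'),
]

# Build step: create a CARGO_HOME in the build directory (a file system that
# supports file locking) and compile for zen2 so the binary runs on all LUMI nodes.
cmds_map = [
    ('.*', 'mkdir -p cargo && CARGO_HOME="$PWD/cargo" RUSTFLAGS="-C target-cpu=znver2" cargo build --release'),
]

# Install step: only the hq binary needs to be copied.
files_to_copy = [(['target/release/hq'], 'bin')]

sanity_check_paths = {
    'files': ['bin/hq'],
    'dirs':  [],
}

moduleclass = 'tools'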

0.5.0 for 21.08

  • This is a trivial edit of the 0.4.0 one.

0.16.0 for use with Rust/1.70.0

  • Trivial modification of the 0.5.0 one, but we do take extra care to avoid picking up toolchain components.

0.17.0 for use with Rust/1.75.0

  • Trivial modification of the 0.16.0 one. However, we now also copy the LICENSE and CHANGELOG.md files to the software installation directory, and took care to warn about a breaking change in the help information of the module.

Archived EasyConfigs

The EasyConfigs below are additional easyconfigs that are not directly available on the system for installation. Users are advised to use the newer ones; these archived ones are unsupported. They are still provided as a source of information should you need it, e.g., to understand the configuration that was used for earlier work on the system.