Implementing easyblocks¶
The basics¶
An easyblock is a Python module that implements a software build and installation procedure.
This concept is essentially implemented as a Python script that plugs into the EasyBuild framework.
EasyBuild will leverage easyblocks as needed, depending on which software packages it needs to install. Which easyblock is required is determined by the easyblock
easyconfig parameter, if it is present, or by the software name.
Generic vs software-specific easyblocks¶
Easyblocks can either be generic or software-specific.
Generic easyblocks implement a "standard" software build and installation procedure that is used by multiple different
software packages.
A commonly used example is the ConfigureMake generic easyblock, which implements the standard configure, make, make install installation procedure used by most GNU software packages.
Software-specific easyblocks implement the build and installation procedure for a particular software package.
Typically this involves highly customised steps, for example specifying dedicated configuration options, creating
or adjusting specific files, executing non-standard shell commands, etc. Usually a custom implementation of the
sanity check is also included. Much of the work done in software-specific easyblocks can often also be done
in generic easyblocks using parameters such as configopts etc., but a software-specific easyblock can
hide some of that complexity from the user. Other software-specific easyblocks implement very specific
installation procedures that do not fit in one of the generic ones.
Using a generic easyblock requires specifying the easyblock
parameter in the easyconfig file.
If it is not specified, EasyBuild will try and find the software-specific easyblock derived from the software name.
The distinction between generic and software-specific easyblocks can be made based on the naming scheme that is used for an easyblock (see below).
Naming¶
Easyblocks need to follow a strict naming scheme, to ensure that EasyBuild can pick them up automatically as needed. This involves two aspects:
- the name of the Python class;
- the name and location of the Python module file.
Python class name¶
The name of the Python class is determined by the software name for software-specific easyblocks.
It consists of a prefix 'EB_', followed by the (encoded) software name.
Because of limitations in Python on characters allowed in names of Python classes,
only alphanumeric characters and underscores (_) are allowed. Any other characters are replaced following an encoding scheme:
- spaces are replaced by underscores (_);
- dashes (-) are replaced by _minus_ (note the inconsistency with the naming of EBROOT and EBVERSION variables);
- underscores are replaced by _underscore_.
The encode_class_name function provided in easybuild.tools.filetools returns the expected class name for a given software name; for example:
$ python3 -c "from easybuild.tools.filetools import encode_class_name; print(encode_class_name('netCDF-Fortran'))"
EB_netCDF_minus_Fortran
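To make the encoding scheme more concrete, here is a minimal stdlib-only sketch that applies just the three replacement rules listed above. The function name toy_encode_class_name is hypothetical and this is not the EasyBuild implementation; the real encode_class_name may handle more characters than this sketch does.

```python
def toy_encode_class_name(name):
    """Encode a software name using only the three rules listed above (simplified)."""
    # map each special character individually, so that the underscores
    # introduced by '_minus_' are not encoded a second time
    replacements = {' ': '_', '-': '_minus_', '_': '_underscore_'}
    return 'EB_' + ''.join(replacements.get(char, char) for char in name)


encoded = toy_encode_class_name('netCDF-Fortran')
print(encoded)
```

Replacing character by character, rather than chaining string replacements, avoids encoding the underscores that the scheme itself introduces.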
Python class name for generic easyblocks
For generic easyblocks, the class name does not include an EB_ prefix (since there is no need for an escaping
mechanism) and hence the name is fully free to choose, taking into account the restriction to alphanumeric characters
and underscores.
For code style reasons, the class name should start with a capital letter and use CamelCasing.
Examples include Bundle, ConfigureMake, CMakePythonPackage.
Python module name and location¶
The filename of the Python module is directly related to the name of the Python class it provides.
It should:
- not include the EB_ prefix of the class name for software-specific easyblocks;
- consist only of lower-case alphanumeric characters ([a-z0-9]) and underscores (_):
    - dashes (-) are replaced by underscores (_);
    - any other non-alphanumeric characters (incl. spaces) are simply dropped.
Examples include:
- gcc.py (for GCC)
- netcdf_fortran.py (for netCDF-Fortran)
- gamess_us.py (for GAMESS (US))
The get_module_path function provided by the EasyBuild framework in the easybuild.framework.easyconfig.easyconfig module returns the (full)
module location for a particular software name or easyblock class name. For example:
>>> from easybuild.framework.easyconfig.easyconfig import get_module_path
>>> get_module_path('netCDF-Fortran')
'easybuild.easyblocks.netcdf_fortran'
>>> get_module_path('EB_netCDF_minus_Fortran')
'easybuild.easyblocks.netcdf_fortran'
The location of the Python module is determined by whether the easyblock is generic or software-specific.
Generic easyblocks are located in the easybuild.easyblocks.generic namespace, while software-specific easyblocks
live in the easybuild.easyblocks namespace directly.
To keep things organised, the actual Python module files
for software-specific easyblocks are kept in 'letter' subdirectories,
rather than in one large 'easyblocks' directory
(see https://github.com/easybuilders/easybuild-easyblocks/tree/main/easybuild/easyblocks),
but this namespace is collapsed transparently by EasyBuild (you don't need to import from letter subpackages).
To let EasyBuild pick up one or more new or customized easyblocks, you can use the --include-easyblocks
configuration option. As long as both the filename of the Python module and the name of the Python class
are correct, EasyBuild will use these easyblocks when needed.
On LUMI, the EasyBuild configuration modules take care of setting this parameter (using the corresponding environment variable), pointing to custom easyblocks in the LUMI software stack itself and to a repo (with a fixed name) that users can create themselves. At this moment it does not yet include easyblocks from any other repositories.
Structure of an easyblock¶
The example below shows the overall structure of an easyblock:
from easybuild.framework.easyblock import EasyBlock
from easybuild.tools.run import run_cmd


class EB_Example(EasyBlock):
    """Custom easyblock for Example"""

    def configure_step(self):
        """Custom implementation of configure step for Example"""

        # run configure.sh to configure the build
        run_cmd("./configure.sh --install-prefix=%s" % self.installdir)
Each easyblock includes an implementation of a class that (directly or indirectly) derives from the abstract EasyBlock class.
Typically some useful functions provided by the EasyBuild framework are imported at the top of the Python module.
In the class definition, one or more '*_step' methods (and perhaps a couple of others) are redefined,
to implement the corresponding step in the build and installation procedure.
Each easyblock must implement the configure, build and install steps, since these are not implemented
in the abstract EasyBlock class. This could be done explicitly by redefining the corresponding *_step methods,
or implicitly by deriving from existing (generic) easyblocks.
The full list of methods that can be redefined in an easyblock can be consulted in the API documentation.
Deriving from existing easyblocks¶
When implementing an easyblock, it is common to derive from an existing (usually generic) easyblock, and to leverage the functionality provided by it. This approach is typically used when only a specific part of the build and installation procedure needs to be customised.
In the (fictitious) example below, we derive from the generic ConfigureMake easyblock to redefine the configure step. In this case, we are extending the configure step as implemented by ConfigureMake rather than
redefining it entirely, since we call out to the original configure_step method at the end.
from easybuild.easyblocks.generic.configuremake import ConfigureMake
from easybuild.tools.filetools import copy_file


class EB_Example(ConfigureMake):
    """Custom easyblock for Example"""

    def configure_step(self):
        """Custom implementation of configure step for Example"""

        # use example make.cfg for x86-64
        copy_file('make.cfg.x86', 'make.cfg')

        # call out to original configure_step implementation of ConfigureMake easyblock
        super(EB_Example, self).configure_step()
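The interplay between the abstract base class, a generic easyblock and a software-specific easyblock can be tried out without EasyBuild installed. The sketch below uses hypothetical toy classes (ToyEasyBlock, ToyConfigureMake, EB_Toy) that only record the commands they would run; it illustrates the inheritance pattern and is not how the EasyBuild framework itself is implemented.

```python
class ToyEasyBlock:
    """Toy stand-in for the abstract EasyBlock class (hypothetical, stdlib only)."""

    def __init__(self):
        self.commands = []

    def configure_step(self):
        raise NotImplementedError("configure_step must be implemented")

    def build_step(self):
        raise NotImplementedError("build_step must be implemented")

    def install_step(self):
        raise NotImplementedError("install_step must be implemented")

    def run_all_steps(self):
        """Run the three steps in order and return the recorded commands."""
        self.configure_step()
        self.build_step()
        self.install_step()
        return self.commands


class ToyConfigureMake(ToyEasyBlock):
    """Toy generic easyblock: records the commands it would run."""

    def configure_step(self):
        self.commands.append('./configure')

    def build_step(self):
        self.commands.append('make')

    def install_step(self):
        self.commands.append('make install')


class EB_Toy(ToyConfigureMake):
    """Toy software-specific easyblock that extends only the configure step."""

    def configure_step(self):
        # custom preparation first ...
        self.commands.append('cp make.cfg.x86 make.cfg')
        # ... then hand over to the parent implementation
        super().configure_step()


commands = EB_Toy().run_all_steps()
print(commands)
```

Only configure_step is redefined in EB_Toy; the build and install steps are inherited from the toy generic class, which is the implicit route described earlier.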
Easyconfig parameters¶
All of the easyconfig parameters that are defined in an easyconfig file
are available via the EasyConfig instance that can be accessed through self.cfg in an easyblock.
For instance, if the easyconfig file specifies
name = 'example'
version = '2.5.3'
versionsuffix = '-Python-3.7.4'
then these three parameters are accessible within an easyblock via self.cfg['name'], self.cfg['version'] and self.cfg['versionsuffix'].
A few of the most commonly used parameters can be referenced directly:
- self.name is equivalent to self.cfg['name'];
- self.version is equivalent to self.cfg['version'];
- self.toolchain is equivalent to self.cfg['toolchain'].
Updating parameters¶
You will often find that you need to update some easyconfig parameters in an easyblock,
for example configopts, which specifies options for the configure command.
Because of implementation details (related to
how template values like %(version)s are handled), you need to be a bit careful here.
To completely redefine the value of an easyconfig parameter, you can use simple assignment. For example:
self.cfg['example'] = "A new value for the example easyconfig parameter."
If you want to add to the existing value, however, you must use the self.cfg.update method. For example:
self.cfg.update('some_list', 'example')
One could be tempted to use
# anti-pattern, this does NOT work as expected!
self.cfg['some_list'].append('example')
instead, but this will not work because self.cfg['some_list'] does not return a reference to the original value,
but to a temporary copy thereof.
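The effect of getting back a copy can be reproduced with a few lines of plain Python. The ToyConfig class below is a hypothetical stand-in written for this illustration; it is not EasyBuild's EasyConfig class, and the way its update method joins strings with a space is an assumption made for the sketch.

```python
import copy


class ToyConfig:
    """Toy stand-in for an easyconfig object (hypothetical, not EasyBuild's EasyConfig)."""

    def __init__(self, **params):
        self._params = dict(params)

    def __getitem__(self, key):
        # hand out a copy, so callers cannot modify the stored value in place
        return copy.deepcopy(self._params[key])

    def __setitem__(self, key, value):
        self._params[key] = value

    def update(self, key, value):
        """Append a value to the stored list or (space-separated) string."""
        current = self._params[key]
        if isinstance(current, list):
            self._params[key] = current + [value]
        else:
            self._params[key] = (current + ' ' + value).strip()


cfg = ToyConfig(some_list=['first'], configopts='--enable-foo')

cfg['some_list'].append('lost')         # only changes a temporary copy
cfg.update('some_list', 'example')      # changes the stored value
cfg.update('configopts', '--with-bar')  # same for string values

print(cfg['some_list'])
print(cfg['configopts'])
```

The value appended through plain indexing never reaches the stored list, while both calls to update do.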
Custom parameters¶
Additional custom easyconfig parameters can be defined in an easyblock to steer its behaviour.
This is done via the extra_options static method. Custom parameters can be specified to be mandatory.
The example below shows how this can be implemented:
from easybuild.easyblocks.generic.configuremake import ConfigureMake
from easybuild.framework.easyconfig import CUSTOM, MANDATORY


class EB_Example(ConfigureMake):
    """Custom easyblock for Example"""

    @staticmethod
    def extra_options():
        """Custom easyconfig parameters for Example"""
        extra_vars = {
            'required_example_param': [None, "Example required custom parameter", MANDATORY],
            'optional_example_param': [None, "Example optional custom parameter", CUSTOM],
        }
        return ConfigureMake.extra_options(extra_vars)
The first element in the list of a defined custom parameter corresponds to the default value for that parameter
(both None in the example above). The second element provides some informative help text
(which can then be displayed with eb -a -e <name_of_easyblock>, e.g., eb -a -e EB_GCC),
and the last element indicates whether the parameter is mandatory (MANDATORY) or just an optional custom parameter (CUSTOM).
Easyblock constructor¶
In the class constructor of the easyblock, i.e. the __init__ method, one or more class variables
(strictly speaking, instance attributes set on self) can be initialised. These can be used for sharing information between different *_step methods in the easyblock.
For example:
from easybuild.framework.easyblock import EasyBlock


class EB_Example(EasyBlock):
    """Custom easyblock for Example"""

    def __init__(self, *args, **kwargs):
        """Constructor for Example easyblock, initialises class variables."""

        # call out to original constructor first, so 'self' (i.e. the class instance) is initialised
        super(EB_Example, self).__init__(*args, **kwargs)

        # initialise class variables
        self.example_value = None
        self.example_list = []
File operations¶
File operations are common when implementing easyblocks, hence the EasyBuild framework provides a number of useful functions related to this, including:
- read_file(<path>): read the file at a specified location and return its contents;
- write_file(<path>, <text>): write a file at a specified location with the provided contents; to append to an existing file, use append=True as an extra argument;
- copy_file(<src>, <dest>): copy an existing file;
- apply_regex_substitutions(<path>, <list of regex substitutions>): patch an existing file.
All of these functions are provided by the easybuild.tools.filetools module.
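As an illustration of what patching a file with a list of regex substitutions amounts to, the sketch below re-implements the idea with the Python standard library only. The helper toy_apply_regex_substitutions, the Makefile contents and the /opt/example prefix are all hypothetical; the sketch assumes each substitution is a (regex, replacement) pair and does not reproduce the logging or dry-run handling of the EasyBuild function.

```python
import os
import re
import tempfile


def toy_apply_regex_substitutions(path, regex_subs):
    """Apply a list of (regex, replacement) pairs to the file at path (simplified stand-in)."""
    with open(path) as handle:
        text = handle.read()
    for regex, replacement in regex_subs:
        # MULTILINE lets '^' and '$' match at the start and end of every line
        text = re.sub(regex, replacement, text, flags=re.MULTILINE)
    with open(path, 'w') as handle:
        handle.write(text)


with tempfile.TemporaryDirectory() as tmpdir:
    makefile = os.path.join(tmpdir, 'Makefile')
    with open(makefile, 'w') as handle:
        handle.write("CC = gcc\nPREFIX = /usr/local\n")

    toy_apply_regex_substitutions(makefile, [
        (r'^CC\s*=.*$', 'CC = cc'),
        (r'^PREFIX\s*=.*$', 'PREFIX = /opt/example'),
    ])

    with open(makefile) as handle:
        patched = handle.read()

print(patched)
```

In a real easyblock the framework functions are preferable to hand-written file handling like this.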
Executing shell commands¶
For executing shell commands, two functions are provided by the easybuild.tools.run module:
- run_cmd(<cmd>) to run a non-interactive shell command;
- run_cmd_qa(<cmd>, <dict with questions & answers>) to run an interactive shell command.
Both of these accept a number of optional arguments:
- simple=True to just return True or False to indicate a successful execution, rather than the default return value, i.e., a tuple that provides the command output and the exit code (in that order);
- path=<path> to run the command in a specific subdirectory.
The run_cmd_qa function takes two additional specific arguments:
- no_qa=<list> to specify a list of patterns to recognize non-questions;
- std_qa=<dict> to specify regular expression patterns for common questions, and the matching answer.
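The return conventions described above (an output and exit code tuple by default, a boolean with simple=True, and an optional working directory via path) can be mimicked with subprocess. The function toy_run_cmd below is a hypothetical stdlib stand-in for experimenting with these conventions; it is not EasyBuild's run_cmd and does not model its logging, error handling or dry-run behaviour.

```python
import subprocess


def toy_run_cmd(cmd, simple=False, path=None):
    """Run a shell command; stdlib stand-in mimicking the return conventions described above."""
    proc = subprocess.run(cmd, shell=True, cwd=path, text=True,
                          stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    if simple:
        # only report whether the command succeeded
        return proc.returncode == 0
    # default: command output and exit code, in that order
    return proc.stdout, proc.returncode


out, ec = toy_run_cmd("echo hello")
ok = toy_run_cmd("exit 3", simple=True)
print(out.strip(), ec, ok)
```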
Manipulating environment variables¶
To (re)define environment variables, the setvar function provided by the easybuild.tools.environment module should be used.
This makes sure that the changes being made to the specified environment variable are kept track of,
and that they are handled correctly under --extended-dry-run.
Logging and errors¶
It is good practice to include meaningful log messages in the *_step methods being customised in the easyblock,
to enrich the EasyBuild log with useful information for later debugging or diagnostics.
For logging, the provided self.log logger class should be used.
You can use the self.log.info method to log an informative message.
Similar methods are available for logging debug messages (self.log.debug), which are
only emitted when eb is run with debugging mode enabled (--debug or -d),
and for logging warning messages (self.log.warning).
If something goes wrong, you can raise an EasyBuildError instance to report the error.
For example:
from easybuild.framework.easyblock import EasyBlock
from easybuild.tools.build_log import EasyBuildError
from easybuild.tools.run import run_cmd


class EB_Example(EasyBlock):
    """Custom easyblock for Example"""

    def configure_step(self):
        """Custom implementation of configure step for Example"""

        cmd = "./configure --prefix %s" % self.installdir
        out, ec = run_cmd(cmd)

        # look for the expected pattern in the command output
        success = 'SUCCESS'
        if success in out:
            self.log.info("Configuration command '%s' completed with success." % cmd)
        else:
            raise EasyBuildError("Pattern '%s' was not found in output of '%s'." % (success, cmd))
Custom sanity check¶
For software-specific easyblocks, a custom sanity check is usually included to verify whether the installation was successful.
This is done by redefining the sanity_check_step method in the easyblock. For example:
from easybuild.framework.easyblock import EasyBlock


class EB_Example(EasyBlock):
    """Custom easyblock for Example"""

    def sanity_check_step(self):
        """Custom sanity check for Example."""

        custom_paths = {
            'files': ['bin/example'],
            'dirs': ['lib/examples/'],
        }
        custom_commands = ['example --version']

        # call out to parent to do the actual sanity checking, pass through custom paths and commands
        super(EB_Example, self).sanity_check_step(custom_paths=custom_paths, custom_commands=custom_commands)
You can specify both file paths and subdirectories to check for (relative to the installation directory), as well as simple commands that should execute successfully after completing the installation and loading the generated module file.
It is up to you how extensive you make the sanity check, but it is recommended to make the check as complete as possible to catch any potential build or installation problems that may occur, while ensuring that it can run relatively quickly (in seconds, or at most a couple of minutes).
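To get a feel for what the path part of such a check involves, the following stdlib-only sketch tests a files/dirs dictionary against a throwaway directory. The check_paths helper is hypothetical and only covers existence checks relative to the installation directory; the actual sanity check in EasyBuild does more, such as running the custom commands with the generated module loaded.

```python
import os
import tempfile


def check_paths(installdir, custom_paths):
    """Return the relative paths from custom_paths that are missing under installdir."""
    missing = []
    for relpath in custom_paths['files']:
        if not os.path.isfile(os.path.join(installdir, relpath)):
            missing.append(relpath)
    for relpath in custom_paths['dirs']:
        if not os.path.isdir(os.path.join(installdir, relpath)):
            missing.append(relpath)
    return missing


# build a throwaway "installation" that has the binary but lacks the examples directory
with tempfile.TemporaryDirectory() as installdir:
    os.makedirs(os.path.join(installdir, 'bin'))
    open(os.path.join(installdir, 'bin', 'example'), 'w').close()

    missing = check_paths(installdir, {'files': ['bin/example'], 'dirs': ['lib/examples/']})

print(missing)
```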
Version-specific parts¶
In some cases version-specific actions or checks need to be included in an easyblock.
For this, it is recommended to use LooseVersion rather than directly comparing version numbers using string values.
For example:
from distutils.version import LooseVersion
from easybuild.framework.easyblock import EasyBlock


class EB_Example(EasyBlock):
    """Custom easyblock for Example"""

    def sanity_check_step(self):
        """Custom sanity check for Example."""

        custom_paths = {
            'files': [],
            'dirs': [],
        }

        # in older versions, the binary used to be named 'EXAMPLE' rather than 'example'
        if LooseVersion(self.version) < LooseVersion('1.0'):
            custom_paths['files'].append('bin/EXAMPLE')
        else:
            custom_paths['files'].append('bin/example')

        super(EB_Example, self).sanity_check_step(custom_paths=custom_paths)
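The reason for avoiding plain string comparison becomes clear as soon as a version component has more than one digit. The sketch below uses a hypothetical version_key helper built on the standard library (distutils itself is no longer part of the standard library as of Python 3.12); it only handles purely numeric components and is much simpler than LooseVersion.

```python
import re


def version_key(version):
    """Split a version string into a tuple of ints for numeric comparison (simplified)."""
    return tuple(int(part) for part in re.findall(r'\d+', version))


# strings are compared character by character, so '1' < '9' decides the outcome
string_says_older = '10.0' < '9.0'

# tuples of integers compare component by component, as intended
numeric_says_older = version_key('10.0') < version_key('9.0')

print(string_says_older, numeric_says_older)
```

String comparison wrongly reports version 10.0 as older than 9.0, while the numeric comparison does not.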
Compatibility with --extended-dry-run and --module-only¶
Some special care must be taken to ensure that an easyblock is fully compatible with --extended-dry-run / -x (see Inspecting install procedures) and --module-only.
For compatibility with --extended-dry-run, you need to take into account that specified operations
like manipulating files or running shell commands will not actually be executed. You can check
whether an easyblock is being run in dry run mode via self.dry_run.
For example:
from easybuild.framework.easyblock import EasyBlock
from easybuild.tools.build_log import EasyBuildError
from easybuild.tools.run import run_cmd


class EB_Example(EasyBlock):
    """Custom easyblock for Example"""

    def configure_step(self):
        """Custom implementation of configure step for Example"""

        cmd = "./configure --prefix %s" % self.installdir
        out, ec = run_cmd(cmd)

        success = 'SUCCESS'
        if success in out:
            self.log.info("Configuration command '%s' completed with success." % cmd)

        # take into account that in dry run mode we won't get any output at all
        elif self.dry_run:
            self.log.info("Ignoring missing '%s' pattern since we're running in dry run mode." % success)

        else:
            raise EasyBuildError("Pattern '%s' was not found in output of '%s'." % (success, cmd))
For --module-only, you should make sure that both the make_module_step, including the make_module_* submethods,
and the sanity_check_step methods do not make any assumptions about the presence of certain environment variables, or that class variables have been defined already.
This is required because under --module-only the large majority of the *_step functions are
simply skipped entirely. So, if the configure_step method is responsible for defining class variables that are
picked up in sanity_check_step, the latter may run into unexpected initial values like None.
A possible workaround is to define a separate custom method to define the class variables, and to call out to this
method from configure_step and sanity_check_step (for the latter, conditionally, i.e., only if the class
variables still have the initial values).
For example:
import os

from easybuild.framework.easyblock import EasyBlock


class EB_Example(EasyBlock):
    """Custom easyblock for Example"""

    def __init__(self, *args, **kwargs):
        """Easyblock constructor."""
        super(EB_Example, self).__init__(*args, **kwargs)

        self.command = None

    def set_command(self):
        """Initialize 'command' class variable."""
        # $CC environment variable set by 'prepare' step determines exact command
        self.command = self.name + '-' + os.getenv('CC')

    def configure_step(self):
        """Custom configure step for Example."""

        self.set_command()
        self.cfg.update('configopts', "COMMAND=%s" % self.command)

        super(EB_Example, self).configure_step()

    def sanity_check_step(self):
        """Custom implementation of sanity check step for Example"""

        # configure_step is skipped under --module-only, so define the command here if needed
        if self.command is None:
            self.set_command()

        super(EB_Example, self).sanity_check_step(custom_commands=[self.command])
Easyblocks in the Cray ecosystem¶
Generic easyblocks are usually rather independent of the compiler and tend to work well with all toolchains. Software-specific easyblocks, however, may contain code that is specific to certain toolchains, and are often only tested with the common toolchains (foss and intel and their subtoolchains). Many of those easyblocks fail on Cray systems (or on any system that uses other toolchains) because they don't recognise the compiler: rather than falling back to some generic behaviour that may or may not work, they report an error saying that the compiler toolchain is not supported.
Several packages on LUMI therefore use generic easyblocks rather than the software-specific easyblocks that may exist for those applications. Adapting those software-specific easyblocks for LUMI poses an interesting maintenance problem. One option is to not contribute the changes back to the community, but then all modifications made to the corresponding easyblocks in the EasyBuild distribution have to be monitored and ported to the custom easyblocks for Cray. Contributing back to the community, on the other hand, poses two problems. First, it would require implementing the Cray toolchains as used on LUMI in the core of EasyBuild (which already contains a different set of toolchains, targeted more at how the Cray PE works with the regular environment modules), and that only makes sense if these toolchains are first extended to cover not only the programming environments supported on LUMI but also the Intel and NVIDIA programming environments. Second, the EasyBuild community has no easy way of testing modifications to such an easyblock on a Cray PE system, so any update made in the community may break the Cray PE support again.
Exercise¶
Exercise I.1¶
Try implementing a new custom easyblock for eb-tutorial, which derives directly from the base EasyBlock class.
Your easyblock should:
- define a custom mandatory easyconfig parameter named message;
- run cmake to configure the installation, which includes at least:
    - specifying the correct installation prefix (using the -DCMAKE_INSTALL_PREFIX=... option);
    - passing down the value of the message easyconfig parameter via -DEBTUTORIAL_MSG=...;
- run make to build eb-tutorial;
- run make install to install the generated binary;
- perform a custom sanity check to ensure the installation is correct;
- pick up on commonly used easyconfig parameters like configopts and preinstallopts where appropriate.
Here's a complete custom easyblock for eb-tutorial that derives from the base EasyBlock class,
which should be included in a file named eb_tutorial.py.
We need to implement the configure_step, build_step, and install_step methods in
order to have a fully functional easyblock.
The configure, build, and install steps take into account the corresponding easyconfig parameters that allow customizing these commands from an easyconfig file.
from easybuild.framework.easyblock import EasyBlock
from easybuild.framework.easyconfig import MANDATORY
from easybuild.tools.run import run_cmd


class EB_eb_minus_tutorial(EasyBlock):
    """Custom easyblock for eb-tutorial."""

    @staticmethod
    def extra_options():
        extra = EasyBlock.extra_options()
        extra.update({
            'message': [None, "Message that eb-tutorial command should print", MANDATORY],
        })
        return extra

    def configure_step(self):
        """Custom configure step for eb-tutorial: define EBTUTORIAL_MSG configuration option."""
        cmd = ' '.join([
            self.cfg['preconfigopts'],
            'cmake',
            '-DCMAKE_INSTALL_PREFIX=\'%s\'' % self.installdir,
            '-DEBTUTORIAL_MSG="%s"' % self.cfg['message'],
            self.cfg['configopts'],
        ])
        run_cmd(cmd)

    def build_step(self):
        """Build step for eb-tutorial"""
        cmd = ' '.join([
            self.cfg['prebuildopts'],
            'make',
            self.cfg['buildopts'],
        ])
        run_cmd(cmd)

    def install_step(self):
        """Install step for eb-tutorial"""
        cmd = ' '.join([
            self.cfg['preinstallopts'],
            'make install',
            self.cfg['installopts'],
        ])
        run_cmd(cmd)

    def sanity_check_step(self):
        custom_paths = {
            'files': ['bin/eb-tutorial'],
            'dirs': [],
        }
        custom_commands = ['eb-tutorial']
        return super(EB_eb_minus_tutorial, self).sanity_check_step(custom_paths=custom_paths,
                                                                   custom_commands=custom_commands)
We also need to adapt our easyconfig file for eb-tutorial:
- The easyblock line is no longer needed as we will rely on the automatic selection of the software-specific easyblock.
- We don't need to define the message through configopts but via the easyblock-specific configuration parameter message. In fact, we were so careful when implementing the configure_step that even variable expansion will still work so we can still include $USER in the message.
- The sanity check is also no longer needed as it is done by the software-specific easyblock.
So the easyconfig file simplifies to:
name = 'eb-tutorial'
version = "1.1.0"
homepage = 'https://easybuilders.github.io/easybuild-tutorial'
whatis = [ 'Description: EasyBuild tutorial example']
description = """
This is a short C++ example program that can be built using CMake.
"""
toolchain = {'name': 'cpeCray', 'version': '21.12'}
builddependencies = [
('buildtools', '%(toolchain_version)s', '', True)
]
source_urls = ['https://github.com/easybuilders/easybuild-tutorial/raw/main/docs/files/']
sources = [SOURCE_TAR_GZ]
checksums = ['def18b69b11a3ec34ef2a81752603b2118cf1a57e350aee41de9ea13c2e6a7ef']
message = 'Hello from the EasyBuild tutorial! I was installed by $USER.'
moduleclass = 'tools'
Running this example on LUMI is a little tricky as using --include-easyblocks to point EasyBuild to
our new easyblock interferes with settings already made by the EasyBuild configuration modules (EasyBuild-user)
and causes error messages about the toolchains. So either the easyblock needs to be copied to the user location
that can be found by looking at the output of eb --show-config, or we simply need to extend the list of
easyblocks that EasyBuild searches with the easyblocks in the current directory:
EASYBUILD_INCLUDE_EASYBLOCKS="$EASYBUILD_INCLUDE_EASYBLOCKS,./*.py"
Exercise I.2¶
Try implementing another new custom easyblock for eb-tutorial, which derives from the generic CMakeMake easyblock.
Your easyblock should only:
- define a custom mandatory easyconfig parameter named message;
- pass down the value of the message easyconfig parameter via -DEBTUTORIAL_MSG=...;
- perform a custom sanity check to ensure the installation is correct.
When deriving from the CMakeMake generic easyblock, there is a lot less to worry about.
We only need to customize the configure_step method to ensure that the -DEBTUTORIAL_MSG configuration
option is specified; the CMakeMake easyblock already takes care of specifying the location of
the installation directory (and a bunch of other configuration options, like compiler commands and flags, etc.).
Implementing the build_step and install_step methods is no longer needed;
the standard procedure that is run by the CMakeMake generic easyblock is fine,
and even goes beyond what we did in the previous exercise (like building in parallel with make -j).
from easybuild.easyblocks.generic.cmakemake import CMakeMake
from easybuild.framework.easyconfig import MANDATORY


class EB_eb_minus_tutorial(CMakeMake):
    """Custom easyblock for eb-tutorial."""

    @staticmethod
    def extra_options():
        extra = CMakeMake.extra_options()
        extra.update({
            'message': [None, "Message that eb-tutorial command should print", MANDATORY],
        })
        return extra

    def configure_step(self):
        """Custom configure step for eb-tutorial: define EBTUTORIAL_MSG configuration option."""
        # add the message option, then let CMakeMake run the actual cmake command
        self.cfg.update('configopts', '-DEBTUTORIAL_MSG="%s"' % self.cfg['message'])
        super(EB_eb_minus_tutorial, self).configure_step()

    def sanity_check_step(self):
        custom_paths = {
            'files': ['bin/eb-tutorial'],
            'dirs': [],
        }
        custom_commands = ['eb-tutorial']
        return super(EB_eb_minus_tutorial, self).sanity_check_step(custom_paths=custom_paths,
                                                                   custom_commands=custom_commands)
This is a much simpler easyblock as we already use all the logic that has been written for us to build with CMake.