R
License information
R itself is available as free software under the GNU General Public License; see the "COPYING" page on the R web site.
Several of the EasyConfigs provide additional packages from CRAN and other sources that may be licensed differently, though. It is the user's responsibility to ensure that all packages can be legally used, as some may have licenses that restrict use.
User documentation
We provide some EasyBuild recipes for a plain R installation with only the standard packages and Rmpi (the -raw versions; Rmpi is included as it needs special options to install on LUMI), and a regular version with many packages for parallel computing already included.
Known restrictions
It should be said that R was never developed for parallel computing. Parallel computing is only added through a mess of packages and is not an intrinsic part of the language. Moreover, the packages for parallel computing often evolved from multicore computing on a workstation, or from distributed computing on a network of workstations not managed by a scheduler. As a result, some packages are not fully functional on LUMI.
Some known restrictions:
- Rmpi: mpi.spawn.Rslaves is not supported as Cray MPI does not support MPI_Comm_spawn.
- parallel: detectCores detects the total number of (virtual) cores, not the number of cores available to the application. This happens with both Cray R and R built with EasyBuild. A solution is to use the availableCores() function from the parallelly package. That package can, e.g., recognise the CPU set a program will be running in when started through srun. Because many packages are not sophisticated enough to recognise that they are not running on the full node, it can be expected that some packages using shared memory multiprocessing will launch one thread per virtual core rather than one thread per available core, which can lead to heavy oversubscription of the cores in your job and very bad parallel performance. (See also the sketch after this list.)
- We have seen issues with some linear algebra operations when running in multithreaded mode. Two workarounds seem to help:
  - setting OMP_STACKSIZE=256M (and exporting this variable), which we have implemented in the module for the 24.03 versions, and
  - using ulimit -s 300000.
  The problem has been reported to HPE Cray. It might be fixed in the 25.03 release of the programming environment. Note that the issue also occurs with the cray-R module, but there you will also have to set OMP_STACKSIZE by hand as we cannot change those modules.
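A minimal sketch of the core detection workaround (our illustration, not part of the installed examples; it assumes the parallelly package from the non-raw modules and the documented fallback of mclapply to the mc.cores option):

library(parallel)
library(parallelly)

# detectCores() reports all (virtual) cores of the node,
# availableCores() respects the CPU set handed out by Slurm/srun.
sprintf("detectCores: %d, availableCores: %d", detectCores(), availableCores())

# mclapply() falls back to getOption("mc.cores") when no mc.cores argument
# is given, so setting the option makes later calls respect the allocation.
options(mc.cores = availableCores())
res <- mclapply(1:8, function(x) sum(sort(runif(1e6))))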
Some noteworthy packages included in the regular EasyConfig
- Shared memory parallel computing:
  - parallelly, with improvements over the system parallel package that are more aware of the typical environment on an HPC cluster.
- Distributed memory computing:
  - Rmpi, snow and snowfall (see the package list in the technical documentation below).
- Unified shared and distributed memory computing:
  - foreach with several adapters:
    - doParallel for multicore computing.
    - doMPI, interfacing with Rmpi for distributed memory parallel computing.
    - doSNOW, interfacing with the snow package for distributed memory parallel computing.
  - future, see also the documentation web site. (A minimal sketch follows after this list.)
- Interfacing with Slurm:
  - batchtools and future.batchtools (see the package list in the technical documentation below).
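The future framework is only mentioned above; here is a minimal sketch of how it could be used with the packages in the regular EasyConfig (our illustration, relying on the standard multisession backend and future.apply as documented upstream):

library(parallelly)
library(future)
library(future.apply)

# One R worker process per core that Slurm actually gave to this job step.
plan(multisession, workers = availableCores())

# future_lapply() is a drop-in replacement for lapply() that distributes
# the iterations over the workers defined by plan().
system.time(
  res <- future_lapply(1:20, function(x) sum(sort(runif(1e7))),
                       future.seed = TRUE)  # reproducible parallel RNG
)

plan(sequential)  # shut the workers down again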
Some examples
These examples are inspired by those in the LRZ documentation, section "Parallelization Using R".
A shared memory (multicore) computing example with the parallel and parallelly packages
Example R script that you can store in script_parallel.R:
library(parallel)
library(parallelly)
ac <- detectCores()
sprintf( "Number of cores according to parallel::detectCores: %d", ac )
nc <- availableCores()
sprintf( "Number of cores according to parallelly::availableCores: %d", nc )
print( "lapply:" )
system.time(
lapply(1:20, function(x) sum(sort(runif(1e7))))
)
print( "mclapply with mc.cores = 1:" )
system.time(
mclapply(1:20, function(x) sum(sort(runif(1e7))), mc.cores = 1)
)
sprintf( "mclapply with mc.cores = %d:", nc )
system.time(
mclapply(1:20, function(x) sum(sort(runif(1e7))), mc.cores = nc)
)
print( "mclapply without mc.cores argument:" )
system.time(
mclapply(1:20, function(x) sum(sort(runif(1e7))))
)
print( "Not sure what is happening here as user time goes down but elapsed goes up." )
Example submit script to store in submit.slurm:
#!/bin/bash
#SBATCH --job-name=R_parallel_test
#SBATCH --partition=small
#SBATCH --time=2:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --output=%x_%j.txt
#SBATCH --account=project_46YXXXXXX
module load LUMI/23.09 partition/C R/4.3.2-cpeGNU-23.09
echo -e "Running Rscript from $(which Rscript).'\n"
srun Rscript script_parallel.R
The srun
command doesn't seem to make a difference here, but it usually ensures that
all Slurm flags, including hints for multithreading, are correctly applied to the
executable being started.
Note the difference between the detectCores
routine from parallel
and the
availableCores
routine from parallelly
. The former fails to correctly detect the
resources available for R while the latter does detect that there are only 8 CPUs
available.
This script was written by someone who is not an R expert and to whom it is completely
unclear what is happening in the last call to mclapply, as the user time goes down
while the elapsed time goes up.
A distributed memory computing example with the snow package
Example R script script_snow.R:
library(snow)
cl <- makeCluster()
system.time(
  parLapply(cl, 1:167, function(x) {
    sum(sort(runif(1e7)))
  })
)
stopCluster(cl)
q(save="no")
Example submit script submit.slurm:
#!/usr/bin/bash
#SBATCH --job-name=snow_test
#SBATCH --partition=small
#SBATCH --time=5:00
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1
#SBATCH --hint=nomultithread
#SBATCH --output=%x_%j.txt
#SBATCH --account=project_46YXXXXXX
module load LUMI/23.09 partition/C
module load R/4.3.2-cpeGNU-23.09
echo -e "\n# Running on 2 tasks\n\n"
srun -W 10 -n 2 $EBROOTR/lib/R/library/snow/RMPISNOW <script_snow.R
echo -e "\n# Running on 3 tasks\n\n"
srun -W 10 -n 3 $EBROOTR/lib/R/library/snow/RMPISNOW <script_snow.R
echo -e "\n# Running on 4 tasks\n\n"
srun -W 10 -n 4 $EBROOTR/lib/R/library/snow/RMPISNOW <script_snow.R
echo -e "\n# Running on $((SLURM_NTASKS + 1)) tasks\n\n"
srun -W 10 -n $((SLURM_NTASKS + 1)) -O $EBROOTR/lib/R/library/snow/RMPISNOW <script_snow.R
The -W
option ensures, as an additional precaution, that the srun
command will end 10 seconds after the first process ends.
Note that we use the EBROOTR
environment variable for the path to the RMPISNOW
script, which is needed to run this example. This variable is defined by all EasyBuild-installed
R modules, so one does not need to adapt the line when switching to a newer version of R.
The RMPISNOW
command takes care of some initialisations, in
particular when the job spans multiple nodes.
There is another important thing to note in this example. If you analyse the timings
in the results carefully, you will see that the 3-task case runs in half the time of
the 2-task case and the 4-task case in a little over one third of the time of the
2-task case. This is because one of the MPI tasks is used internally as the master
process and is in fact not doing much, while the remaining tasks are used
to build the cluster and do the computations in the parLapply
call. This is solved
with the last srun
call: we start one more task than requested in the #SBATCH
lines, which we can do by adding the -O
or --overcommit
flag to the srun
command.
We now get a speedup of about 4 compared to the timing we got with 2 tasks, which
effectively was running sequentially as there was a master and only one slave.
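To check the master/worker split yourself, a few extra lines in the R script can report what the cluster actually looks like. This is our illustration using standard snow functions (length of the cluster object, clusterCall), not part of the example above:

library(snow)
cl <- makeCluster()  # RMPISNOW passes the MPI layout to makeCluster()

# One entry per worker; the master process running this script is not included.
sprintf("Cluster with %d worker(s)", length(cl))

# Ask every worker for the node it runs on and its process id.
clusterCall(cl, function() paste(Sys.info()[["nodename"]], Sys.getpid()))

stopCluster(cl)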
Examples with foreach
foreach for shared memory (multi-core) computation
For this example, we combine the foreach package with the doParallel adapter package and with parallelly to determine the number of available cores.
The R script script_foreach_MC.R is:
library(foreach)
library(doParallel)
library(parallelly)
sprintf( "Running on %d core(s).", availableCores())
registerDoParallel(cores = availableCores())
system.time(
foreach(i = 1:100) %dopar% sum(sort(runif(1e7))) # parallel execution
)
The corresponding job script submit.slurm is:
#!/bin/bash
#SBATCH --job-name=R_foreach_doParallel_test
#SBATCH --partition=standard
#SBATCH --time=2:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --hint=nomultithread
#SBATCH --output=%x_%j.txt
#SBATCH --account=project_46YXXXXXX
module load LUMI/23.09 partition/C R/4.3.2-cpeGNU-23.09
echo -e "Running Rscript from $(which Rscript).'\n"
echo -e "\n\nStarting on 1 core.\n\n"
srun -n 1 -c 1 Rscript script_foreach_MC.R
echo -e "\n\nStarting on 4 cores.\n\n"
srun -n 1 -c 4 Rscript script_foreach_MC.R
The first srun
command starts the example on a single core, so it effectively runs in a serial way,
while the second srun
command uses 4 cores. When you inspect the results, you'll notice that the user
and system
time in the second case are a bit higher, which is normal as parallel computing always
comes with some overhead, but from the elapsed
time we get a speedup of 3.4.
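A small variation on the script above (our illustration, assuming the same doParallel setup): by default %dopar% returns a list, but the .combine argument of foreach can reduce the results on the fly.

library(foreach)
library(doParallel)
library(parallelly)

registerDoParallel(cores = availableCores())

# .combine = c collects the per-iteration results into a numeric vector
# instead of the default list.
totals <- foreach(i = 1:100, .combine = c) %dopar% sum(sort(runif(1e7)))
summary(totals)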
foreach with the doSNOW adapter for distributed memory computing
The R script script_foreach_SNOW.R is now:
library(foreach)
library(doSNOW)
cl <- makeCluster()
registerDoSNOW(cl)
system.time(
foreach(i = 1:100) %dopar% sum(sort(runif(1e7))) # parallel execution
)
stopCluster(cl)
The foreach
line is the same as before; only the setup of the parallel backend is different,
so the core of the code remains unchanged.
The script is started with the jobscript submit.slurm:
#!/usr/bin/bash
#SBATCH --job-name=R_foreach_SNOW_test
#SBATCH --partition=small
#SBATCH --time=5:00
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1
#SBATCH --hint=nomultithread
#SBATCH --output=%x_%j.txt
#SBATCH --account=project_46YXXXXXX
module load LUMI/23.09 partition/C
module load R/4.3.2-cpeGNU-23.09
echo -e "\n# Running on 2 tasks\n\n"
srun -W 10 -n 2 $EBROOTR/lib/R/library/snow/RMPISNOW <script_foreach_SNOW.R
echo -e "\n# Running on 3 tasks\n\n"
srun -W 10 -n 3 $EBROOTR/lib/R/library/snow/RMPISNOW <script_foreach_SNOW.R
echo -e "\n# Running on 4 tasks\n\n"
srun -W 10 -n 4 $EBROOTR/lib/R/library/snow/RMPISNOW <script_foreach_SNOW.R
echo -e "\n# Running on $((SLURM_NTASKS + 1)) tasks\n\n"
srun -W 10 -n $((SLURM_NTASKS + 1)) -O $EBROOTR/lib/R/library/snow/RMPISNOW <script_foreach_SNOW.R
The user
and system
time is rather irrelevant now, as it is measured for the master process. That it is still
rather high is likely due to a busy waiting strategy. The elapsed
time shows the behaviour we have already seen:
the time for 2 tasks corresponds to serial execution of the foreach
call.
foreach with the doMPI adapter for distributed memory computing
The R script script_foreach_MPI.R is now:
library(foreach)
library(doMPI)
cl <- startMPIcluster() # use verbose = TRUE for detailed worker message output
registerDoMPI(cl)
system.time(
foreach(i = 1:100) %dopar% sum(sort(runif(1e7))) # parallel execution
)
closeCluster(cl)
mpi.quit()
The foreach
line is again the same as before, but the setup is different.
The script is started with the jobscript submit.slurm:
#!/usr/bin/bash
#SBATCH --job-name=R_foreach_MPI_test
#SBATCH --partition=small
#SBATCH --time=5:00
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=1
#SBATCH --hint=nomultithread
#SBATCH --output=%x_%j.txt
#SBATCH --account=project_46YXXXXXX
module load LUMI/23.09 partition/C
module load R/4.3.2-cpeGNU-23.09
echo -e "\n# Running on 2 tasks\n\n"
srun -n 2 Rscript script_foreach_MPI.R
echo -e "\n# Running on 3 tasks\n\n"
srun -n 3 Rscript script_foreach_MPI.R
echo -e "\n# Running on 4 tasks\n\n"
srun -n 4 Rscript script_foreach_MPI.R
echo -e "\n# Running on $((SLURM_NTASKS + 1)) tasks\n\n"
srun -n $((SLURM_NTASKS + 1)) -O Rscript script_foreach_MPI.R
It appears that here also one process is used as the master in the background, while the other processes
execute the foreach
commands. With this adapter we can actually start the script in the way we are
used to starting parallel programs, but we still need to oversubscribe (because of the master process)
to get the best speedup from using 4 tasks (and hence 4 cores).
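Whichever adapter is used, the foreach package itself can report which backend is registered and how many workers it will use, which is a handy sanity check. The sketch below is our illustration with doParallel; getDoParName() and getDoParWorkers() are standard foreach functions and work the same after registerDoSNOW or registerDoMPI:

library(foreach)
library(doParallel)
library(parallelly)

registerDoParallel(cores = availableCores())

# Name of the registered backend and the number of workers foreach will use.
sprintf("Backend '%s' with %d worker(s)", getDoParName(), getDoParWorkers())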
Information on the web
- LRZ documentation: "Parallelization Using R". The job scripts need to be adapted to LUMI, but the general principles are valid.
- CRAN Task View: High-Performance and Parallel Computing with R
User-installable modules (and EasyConfigs)
Install with the EasyBuild-user module. To access module help after installation and get reminded for which stacks and partitions the module is installed, use module spider R/<version>.
EasyConfig:
- EasyConfig R-4.3.2-cpeGNU-23.09-raw.eb, will build R/4.3.2-cpeGNU-23.09-raw
- EasyConfig R-4.3.2-cpeGNU-23.09.eb, will build R/4.3.2-cpeGNU-23.09
- EasyConfig R-4.4.1-cpeGNU-24.03-OpenMP-raw.eb, will build R/4.4.1-cpeGNU-24.03-OpenMP-raw
  This EasyConfig offers R with only the base R packages and - because it is hard to install properly with Rcmd - Rmpi already pre-installed.
- EasyConfig R-4.4.1-cpeGNU-24.03-OpenMP.eb, will build R/4.4.1-cpeGNU-24.03-OpenMP
  This EasyConfig provides R compiled with the multithreaded LibSci libraries (hence the -OpenMP suffix) and with the packages for parallel computing and other packages offered by cray-R already installed. It also contains some packages for benchmarking.
Technical documentation
For some users, this module may be a replacement for the HPE-provided version in cray-R.
The -raw versions of the EasyConfigs provide R without any additional packages, while
the other EasyConfigs offer examples of how to add packages.
Some notes
bin/R is really only a script that starts R. The relevant executable is in lib64/R/bin/exec.
EasyBuild
R packages considered for inclusion
- Packages for parallel computing and their dependencies
- Rmpi
- Rcpp
- codetools
- RUnit
- tinytest
- backports
- rlang
- parallelly
- iterators
- foreach
- doParallel
- doMPI
- snow
- snowfall
- doSNOW
- base64url
- brew
- checkmate
- data.table
- fs
- cli
- glue
- lifecycle
- pkgconfig
- vctrs
- hms
- prettyunits
- R6
- crayon
- progress
- rappdirs
- stringi
- withr
- digest
- batchtools
- globals
- listenv
- future
- future.apply
- future.batchtools
- Additional packages included in Cray-R
- Additional packages on top of the previous two groups for benchmarking and their dependencies (a small usage sketch follows after this list)
- fansi
- utf8
- pillar
- profmem
- magrittr
- tibble
- bench
- microbenchmark
- SuppDists. Used by the popular R-benchmark-25.R script, originally developed by Philippe Grosjean, UMons. See also the "R benchmarks" page on the R for macOS site.
- generics
- tidyselect
- dplyr
- curl
- jsonlite
- mime
- sys
- askpass
- openssl
- httr
- stringi
- benchmarkmeData
- benchmarkme. For the "Crowd sourced benchmarks" page at CRAN.
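As a quick illustration of how the benchmarking packages listed above might be used (our sketch, not part of the EasyConfig):

library(microbenchmark)

# Time two expressions, 10 repetitions each, and print the summary table.
microbenchmark(
  with_sort    = sum(sort(runif(1e6))),
  without_sort = sum(runif(1e6)),
  times = 10
)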
Version 4.2.1 for cpeGNU 22.06, 22.08
- Worked from the EasyBuilders EasyConfig, but removed all OpenGL-related stuff and also removed all extensions to create the -raw version, which has no external packages at all.
- PROBLEM: Suffers from the problem of multiple Cray LibSci libraries being linked in, and it is not clear how they come in.
  - The R sessionInfo() command shows that the non-MPI multithreaded library is used.
  - The configure script does discover the right option to compile with OpenMP.
  - Enforcing the OpenMP compiler flag through toolchainopts though fails: it is not used at link time, resulting in undefined OpenMP symbols when linking.
  SOLUTION: Enforce OpenMP through toolchainopts and manually add -fopenmp to LDFLAGS in preconfigopts.
- Extensions: List based on the foss-2022a version of R in EasyBuild.
- Rmpi:
  - Typically uses an EasyBlock, but that EasyBlock does not support the Cray environment, so we avoid using it and instead manually build the right --configure-args flag for the R installation command.
  - There is still a problem: when loading Rmpi.so, the load fails because mpi_universe_size cannot be found. This is a routine defined in Rmpi itself, but it is only compiled in certain cases. However, it looks like routines that reference this function are not correctly disabled when this routine is not included in the compilation. In fact, the configure script does correctly determine that it should add -DMPI2 to the command line. However, it does so after a faulty -I flag that does not contain a directory, so the compiler interprets the -DMPI2 flag as the argument of -I instead. The solution to this last problem is a bit complicated:
    - The empty -I argument is a bug in the configure script that occurs when --with-Rmpi-include is not used.
    - Adding just any directory with that argument does not work; the configure script checks if it contains mpi.h. We used an environment variable to find the directory that Cray MPI uses.
    - But adding --with-Rmpi-include also requires adding --with-Rmpi-libpath, for which we used the same environment variable.
    - So hopefully these options do not conflict with anything the Cray compiler wrappers add. --with-Rmpi-include and --with-Rmpi-libpath really should not be needed when using the Cray wrappers.
Version 4.3.1 for 22.12
- Based on the 4.2.1 work, but now focused on adding packages for parallel computing (and developed the USER.md page explaining options for parallel computing).
- One way to figure out how to do this is to install the -raw version, then add the desired packages by hand and see what other packages R pulls in and when it does so.
Version 4.3.2 for 23.09
- Quick port with minor version updates of the 22.12 one.
Version 4.4.1 for 24.03
- Quick port with version updates of the 23.09 EasyConfigs.
- Changed the naming to stress that we link with the multithreaded BLAS libraries.
Archived EasyConfigs
The EasyConfigs below are additional ones that are not directly available on the system for installation. Users are advised to use the newer ones; these archived ones are unsupported. They are still provided as a source of information should you need this, e.g., to understand the configuration that was used for earlier work on the system.
- Archived EasyConfigs from LUMI-EasyBuild-contrib - previously user-installable software
  - EasyConfig R-4.2.1-cpeGNU-22.06-raw.eb, with module R/4.2.1-cpeGNU-22.06-raw
  - EasyConfig R-4.2.1-cpeGNU-22.06.eb, with module R/4.2.1-cpeGNU-22.06
  - EasyConfig R-4.2.1-cpeGNU-22.08-raw.eb, with module R/4.2.1-cpeGNU-22.08-raw
  - EasyConfig R-4.2.1-cpeGNU-22.08.eb, with module R/4.2.1-cpeGNU-22.08
  - EasyConfig R-4.2.3-cpeGNU-22.12-raw.eb, with module R/4.2.3-cpeGNU-22.12-raw
  - EasyConfig R-4.2.3-cpeGNU-22.12.eb, with module R/4.2.3-cpeGNU-22.12
  - EasyConfig R-4.3.1-cpeGNU-22.12-raw.eb, with module R/4.3.1-cpeGNU-22.12-raw
  - EasyConfig R-4.3.1-cpeGNU-22.12.eb, with module R/4.3.1-cpeGNU-22.12