Nextflow
License information
Nextflow is currently licensed under the Apache License version 2.0, a copy of which can also be found in the COPYING file in the Nextflow GitHub repository.
More copyright notices for software used by Nextflow can be found in the NOTICE file in the Nextflow GitHub repository.
Note that Nextflow should be cited in your papers if you use it in your research. More information is available in the "Citations" section of the README file in the Nextflow GitHub repository.
User documentation
- CSC Nextflow documentation (and pointer to a module in the local software stack of CSC)
General Information
Nextflow is a workflow system for creating scalable, portable and reproducible workflows. It is based on the dataflow programming model, which greatly simplifies the writing of parallel and distributed pipelines, allowing you to focus on the flow of data and computation. Nextflow can deploy workflows on a variety of execution platforms, including your local machine, HPC schedulers, AWS Batch, Azure Batch, Google Cloud Batch, and Kubernetes. It also supports many ways to manage your software dependencies, including Conda, Spack, Docker, Podman, Singularity, and more.
Installing Nextflow
Nextflow needs Java, but Java is already installed in the system image on LUMI. Other versions of Java can be installed via the LUMI Software Library, but keep in mind that some Java packages may still select the system one over the one installed as a module.
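To see which Java a shell would most likely pick up, a small check like the following may help. This is only a sketch: the function name `which_java` is hypothetical, and the lookup order (NXF_JAVA_HOME, then JAVA_HOME, then the first java on the search path) is an assumption based on the environment variables that Nextflow documents, not something stated on this page.

```shell
#!/bin/bash
# which_java: hypothetical helper that prints the java binary a launcher
# would most likely use. Assumed lookup order (not taken from this page):
# NXF_JAVA_HOME, then JAVA_HOME, then the first java on PATH.
which_java() {
    if [ -n "${NXF_JAVA_HOME:-}" ] && [ -x "$NXF_JAVA_HOME/bin/java" ]; then
        echo "$NXF_JAVA_HOME/bin/java"
    elif [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
        echo "$JAVA_HOME/bin/java"
    else
        # Returns a non-zero status when no java is on PATH.
        command -v java
    fi
}
```

Running `which_java` before and after loading a Java module shows whether the module actually changes the binary that would be selected; `"$(which_java)" -version` then prints its version.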
Nextflow is not very well suited for traditional package managers, as it tends to write in its own directories (trying to update itself) and also puts part of the installation in the ~/.nextflow subdirectory of your home directory, so any central installation is also incomplete. However, installation instructions that should work on LUMI are easily found in the "Installation" section of the Nextflow documentation.
Assuming an installation in the directory /project/project-46YXXXXXX/software, the following worked at the time of writing:
- Install Nextflow

  This will create the nextflow executable in the current directory. It will, however, also download a lot of files that will be stored in the hidden directory ~/.nextflow (in your home directory), so it will eat into your quota and you may have to move that directory to a different filesystem and link to it if it becomes too large!

- Make Nextflow executable

- Set the executable path for using Nextflow.

  Note: You will also need to set the path in the bash script that you use to submit the job.

- Confirm that Nextflow is installed correctly
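
The command listings for these steps are not reproduced on this page, so the following is a minimal sketch of what the four steps could look like, not the original commands. The download, `chmod` and `nextflow info` commands follow the Nextflow installation documentation; the wrapper function `install_nextflow` and its argument are hypothetical and only serve to keep the steps together.

```shell
#!/bin/bash
# install_nextflow: hypothetical wrapper around the four steps above.
# The individual commands follow the Nextflow installation documentation;
# they are not copied from this page.
install_nextflow() {
    local install_dir="$1"
    mkdir -p "$install_dir" || return 1

    # 1. Download the launcher. It is written to the current directory,
    #    so the download runs in a subshell inside the install directory.
    ( cd "$install_dir" && curl -s https://get.nextflow.io | bash ) || return 1

    # 2. Make the launcher executable.
    chmod +x "$install_dir/nextflow" || return 1

    # 3. Put the install directory on the search path. The same export
    #    line is needed in the batch script that submits the job.
    export PATH="$install_dir:$PATH"

    # 4. Confirm that the launcher runs.
    nextflow info
}
```

With the directory assumed above, this would be called as `install_nextflow /project/project-46YXXXXXX/software`. Keep in mind that both the download and the final check can populate ~/.nextflow in your home directory.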
Note that if you use the CSC-provided module, the nextflow command will still trigger the download of a lot of files to the hidden directory ~/.nextflow (in your home directory), so it will eat into your quota and you may have to move that directory to a different filesystem and link to it if it becomes too large!
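
Moving the hidden directory and linking to it could look roughly like this. It is a sketch under stated assumptions: the helper name `relocate_dir` is hypothetical, and the target location is whatever directory on another filesystem you have access to.

```shell
#!/bin/bash
# relocate_dir: hypothetical helper that moves a directory (for example
# ~/.nextflow) to another filesystem and leaves a symbolic link behind.
relocate_dir() {
    local src="$1" dest="$2"
    # Nothing to do if the source has already been replaced by a link.
    [ -L "$src" ] && return 0
    # Fail if there is no directory to move.
    [ -d "$src" ] || return 1
    # Refuse to overwrite an existing target.
    [ -e "$dest" ] && return 1
    mkdir -p "$(dirname "$dest")" || return 1
    mv "$src" "$dest" || return 1
    ln -s "$dest" "$src"
}
```

A call such as `relocate_dir "$HOME/.nextflow" /project/project-46YXXXXXX/nextflow-home` would do the move, where the target directory name is only an example. The Nextflow documentation also lists an NXF_HOME environment variable for the location of this directory, which may be an alternative to linking; that has not been tested here.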
Technical documentation
What is the problem?
The problem with Nextflow is that even though the installation procedure claims to install Nextflow, it really only installs a small shell script, while other components are downloaded and put outside the installation directory, typically in ~/.nextflow. Hence what claims to be a central installation really is not. It is also not clear how reproducible these downloads are: are they version-specific, or are some or all of those packages always the newest version?
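
One way to start looking into that question is to inspect what a given run actually left behind in the hidden directory. The sketch below only lists directory sizes two levels deep; the function name `nxf_home_contents` is hypothetical, and the listing by itself does not settle whether the downloads are reproducible.

```shell
#!/bin/bash
# nxf_home_contents: hypothetical helper that lists the directories found
# in the Nextflow home directory (default ~/.nextflow), two levels deep,
# with their size in KiB, largest first.
nxf_home_contents() {
    local dir="${1:-$HOME/.nextflow}"
    [ -d "$dir" ] || return 1
    du -k -d 2 "$dir" | sort -rn
}
```

Comparing the output of `nxf_home_contents` before and after running the nextflow command shows which directories were added and whether their names carry a version number. The Nextflow documentation also lists an NXF_VER environment variable for selecting the Nextflow version; whether that fixes everything that gets downloaded is exactly the open question raised above.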