Nextflow

License information

Nextflow is currently licensed under the Apache License version 2.0, a copy of which can also be found in the COPYING file in the Nextflow GitHub repository.

More copyright notices for software used by Nextflow can be found in the NOTICE file in the Nextflow GitHub repository.

If you use Nextflow in your research, please cite it in your papers. More information is available in the "Citations" section of the README file in the Nextflow GitHub repository.

User documentation

General information

Nextflow is a workflow system for creating scalable, portable and reproducible workflows. It is based on the dataflow programming model, which greatly simplifies the writing of parallel and distributed pipelines, allowing you to focus on the flow of data and computation. Nextflow can deploy workflows on a variety of execution platforms, including your local machine, HPC schedulers, AWS Batch, Azure Batch, Google Cloud Batch, and Kubernetes. It also supports many ways to manage your software dependencies, including Conda, Spack, Docker, Podman, Singularity, and more.

Installing Nextflow

Nextflow needs Java, but Java is already installed in the system image on LUMI. Other versions of Java can be installed via the LUMI Software Library, but keep in mind that some Java packages may still select the system one over the one installed as a module.
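
Because a module-provided Java does not always take precedence, it can help to check which Java the current shell would actually run. The following is a minimal sketch; the function name show_java is made up for this example, and the sketch only reports what it finds on the PATH and in JAVA_HOME (a variable many Java tools consult).

```shell
# Report which java executable the current shell would pick up.
# show_java is a hypothetical helper name, not part of Nextflow or LUMI.
show_java() {
    # JAVA_HOME is consulted by many Java-based tools; show it if set.
    echo "JAVA_HOME: ${JAVA_HOME:-<not set>}"
    if command -v java >/dev/null 2>&1; then
        echo "java resolves to: $(command -v java)"
        # java -version writes to stderr, so redirect it to stdout.
        java -version 2>&1 | head -n 1
    else
        echo "no java found on PATH"
    fi
}

show_java
```

Running this once before and once after loading a Java module shows whether the module changed anything.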

Nextflow is not well suited to traditional package managers: it tends to write to its own directories (when trying to update itself), and it puts part of the installation in the ~/.nextflow subdirectory of your home directory, so any central installation is incomplete. However, installation instructions that should work on LUMI can be found in the "Installation" section of the Nextflow documentation.

Assuming an installation in the directory /project/project-46YXXXXXX/software, the following worked at the time of writing:

  • Install Nextflow

    cd /project/project-46YXXXXXX/software
    
    curl -s https://get.nextflow.io | bash
    
    This will create the nextflow executable in the current directory.

    It will, however, also download many files into the hidden directory ~/.nextflow in your home directory. These files count against your home directory quota, so if the directory becomes too large you may have to move it to a different filesystem and link to it.

  • Make Nextflow executable

    chmod +x nextflow
    
  • Add the installation directory to the search path for executables.

    Note: You will also need to set the path in the bash script that you use to submit the job.

    export PATH=/project/project-46YXXXXXX/software:$PATH
    
  • Confirm that Nextflow is installed correctly

    nextflow info
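
The note about setting the path in the job script can be sketched as a Slurm batch script like the one below. The account name, partition and time limit are placeholders chosen for illustration and are not taken from the text above; only the installation directory and the nextflow info command come from the steps in the list.

```shell
#!/bin/bash
# Placeholder Slurm settings (assumptions): replace with your own values.
#SBATCH --account=project_46YXXXXXX
#SBATCH --partition=small
#SBATCH --time=00:10:00
#SBATCH --ntasks=1

# Same path setting as in the interactive shell, repeated here for the job.
export PATH=/project/project-46YXXXXXX/software:$PATH

# Only call nextflow if the shell can find it; otherwise report the problem.
if command -v nextflow >/dev/null 2>&1; then
    nextflow info
else
    echo "nextflow not found on PATH; check the installation directory" >&2
fi
```

Replacing nextflow info with the actual workflow command turns this into a real job script.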
    

Note that if you use the CSC-provided module, the nextflow command will still download many files into the hidden directory ~/.nextflow in your home directory. These files count against your home directory quota, so if the directory becomes too large you may have to move it to a different filesystem and link to it.
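
Moving ~/.nextflow to a different filesystem and linking to it could be sketched as follows. The function name relocate_dir and the target directory in the usage comment are hypothetical; the sketch assumes no Nextflow process is running while the directory is moved.

```shell
# relocate_dir SRC DEST: move directory SRC to DEST and leave a symbolic
# link at SRC that points to DEST. (Hypothetical helper for illustration.)
relocate_dir() {
    local src=$1 dest=$2

    # Already relocated: nothing to do.
    if [ -L "$src" ]; then
        echo "$src is already a symbolic link" >&2
        return 0
    fi
    # The source must exist as a real directory.
    if [ ! -d "$src" ]; then
        echo "$src is not a directory" >&2
        return 1
    fi
    # Refuse to overwrite anything at the destination.
    if [ -e "$dest" ]; then
        echo "$dest already exists" >&2
        return 1
    fi

    # Create the parent of DEST, move the data, then link back.
    mkdir -p "$(dirname "$dest")" &&
        mv "$src" "$dest" &&
        ln -s "$dest" "$src"
}

# Example with an assumed target directory (adjust to your project):
# relocate_dir "$HOME/.nextflow" /project/project-46YXXXXXX/software/nextflow-home
```

The Nextflow documentation also lists an NXF_HOME environment variable for changing the location of this directory; that alternative has not been tested here.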

Technical documentation

What is the problem?

The problem with Nextflow is that its installation procedure really only installs a small shell script, while other components are downloaded and stored outside the installation directory, typically in ~/.nextflow. Hence what is presented as a central installation really is not one. It is also not clear how reproducible these downloads are: are they version-specific, or are some or all of those packages always the newest version?
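
To see what actually ends up outside the installation directory, the contents of ~/.nextflow can be inspected after running nextflow. This is a minimal sketch with a hypothetical helper name, inspect_nxf_home; it makes no assumption about the layout of the directory and simply prints the total size and the top two directory levels.

```shell
# inspect_nxf_home [DIR]: summarise a Nextflow home directory.
# DIR defaults to ~/.nextflow. (Hypothetical helper for illustration.)
inspect_nxf_home() {
    local dir=${1:-$HOME/.nextflow}

    if [ ! -d "$dir" ]; then
        echo "$dir does not exist (yet)"
        return 0
    fi

    # Total disk usage, which is what counts against the quota.
    du -sh "$dir"
    # Top two directory levels, to see which components were downloaded.
    find "$dir" -maxdepth 2 -type d | sort
}

inspect_nxf_home
```

Comparing the output before and after running a workflow, or between two Nextflow versions, gives a first impression of which downloads are tied to a version.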