Singularity
This page is a simple guide to the creation of Singularity-based containers.
Containers are essential for Mufasa users, since user processes must always be run within containers. This is not only due to the way Mufasa is set up: working with containers is also a good practice worth learning for HPC applications, adopted both in academia and in industry.
In a nutshell, containers are complete, self-contained development environments that allow users to reproduce the exact same feature set they need on any machine, including a specific operating system, libraries, applications, and packages. They are, essentially, lightweight virtual machines you can deploy on any system featuring a container runtime (e.g., Singularity, Docker, or Podman) by simply bringing a definition file with you. A definition file is an intuitive script containing the instructions required to create (build) your container image, making it particularly convenient to migrate containers across machines and share them with your coworkers.
Though the syntax of definition files can change depending on the specific runtime (e.g., Docker and Podman share the same syntax for their Dockerfile, while Singularity's def file differs slightly), the principles and main sections are generally the same, meaning that learning how to create clear, good-quality Singularity def files is a valuable skill you can reuse with any other container runtime. Likewise, the time spent learning the Singularity command syntax to build, run, shell into, or exec commands on an interactive container or running instance (or service) is definitely a worthwhile investment.
On Mufasa, containers must be built and run via the SLURM scheduling system, either interactively or non-interactively, using Singularity as the only available container runtime.
Basic concepts
Images
In Singularity, an image is a single file representing the full container filesystem. Images can be:
- a compressed file (usually a .sif file, the standard Singularity format);
- an unpacked directory for development (a sandbox).
Often, an image is based on another image, with some additional customization. For example, you can instruct Singularity to build an image starting from the Ubuntu base image, adding any other applications and configurations you need to have the resulting container provide a specific service.
You might create your own images or use those created by others and published in a registry. To build your own image, you first create a definition file, using a simple syntax that defines the steps needed to create the image; then you build the image from this file and run it.
In general, it is possible to build an image on any machine (e.g., your laptop) and then move it to another machine featuring Singularity as a container runtime, including Mufasa. However, since containers you wish to complement with GPU support require access to the Nvidia libraries installed on the system where you want to run them, it is recommended to build images needing GPU resources on Mufasa itself.
Mufasa provides a specific QOS for users that need to build an image.
Containers
A Singularity container is simply a filesystem image being executed. Singularity containers:
- are unprivileged by default, i.e. they run with the privileges of the calling user (instead of those of the host's root), which makes them safe and practical in multi-user environments (more about this later);
- if run from SIF files, cannot modify the filesystem image during execution: any change is applied only to the container instance being executed (i.e., a temporary sandbox);
- can modify the filesystem image, instead, if they are created from images built as a sandbox (--sandbox), made writable (--writable), and run as container-exclusive root (--fakeroot).
These features make Singularity much better suited to an HPC environment such as Mufasa than many alternatives (such as Docker).
Mufasa does not include software libraries: to use them, users need to install the libraries in the same Singularity image where the user software runs.
Installation of software libraries in a Singularity image is possible either at building time, by including all those you need in the def file specifications, or at run time if you built the image as a sandbox and applied the above-mentioned flags when running it. The main advantage of the def file-based approach is that you can easily recreate the same image whenever you need by running a single command, while with the second approach, you can simply run commands within the container itself when it gets executed and modify it (e.g., update/install libraries and applications) interactively from the command line.
Creation of a custom Singularity image
Custom Singularity images are built using a definition file, usually stored with a .def extension (and no spaces inside the file name). This file describes how your image will be built and should be placed in an empty folder containing only the files needed to perform the building process.
For all the available specifications and options that can be provided in a def file, see the official documentation. A minimal structure comprises the following elements:
Bootstrap
- Specifies the base image source, i.e. the registry from which the base image will be pulled (e.g., docker for Docker Hub or library for the standard repository of Singularity-native images).

From
- The path to the base image in the specified repository (e.g., ubuntu:22.04 for the Ubuntu base image, available from both the docker and library repositories). Usually, you can start from a base image that already includes most of the libraries you need (e.g., PyTorch or TensorFlow). The last part of this path (i.e., the image name) usually takes the form name:version, as in ubuntu:22.04. If the version tag is omitted from the image name, it is assumed to be latest.
- Suggestion: to make your def file and builds more explicit and resilient, never omit the version tag.

%environment
- Defines environment variables set at runtime (not at build time).

%files
- Copies files from the host into the container.

%post
- Commands executed inside the container during the build (executed as root). In this section you should include all the commands needed to install the Linux and Python packages you want in the container. You can also define variables needed at build time here.
- Note: commands issued at build time must not require interaction with the user, since such interaction is not possible (e.g., apt install <package names> should be executed with the -y option).

%runscript
- Commands executed when the container is run (i.e., when using the singularity run or singularity instance run commands).
In a definition file, lines beginning with # are comments.
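As an illustration of the %environment section (which the full example in the next subsection does not use), a minimal sketch could look as follows; the variable names and values are only placeholders:

%environment
    # These variables are available inside the container at runtime (not during the %post build steps)
    export LC_ALL=C
    export PROJECT_DATA_DIR=/home/data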
Example of definition file
This is an example definition file that creates a TensorFlow-ready image.
# Base image of the container from which we start to create our custom one
Bootstrap: docker
From: tensorflow/tensorflow:2.16.1-gpu
# Note: if you do not need GPU support, you can use this alternative path instead:
# From: tensorflow/tensorflow:2.16.1

%files
    # We copy the 'requirements.txt' file contained in the host build directory to the container.
    # This file contains all the Python libraries we wish to include in our container.
    requirements.txt

%post
    # Set the desired Python version (environment variable)
    # Note: it is suggested to set the same version of Python already installed in the image pulled at the beginning.
    # For example, the image "tensorflow/tensorflow:2.16.1-gpu" runs Python 3.11
    python_version=3.11

    # Install the desired Python version and the other applications you need
    # (the current TF image is based on Ubuntu, that's why we use apt as the package manager)
    apt-get update
    apt-get install -y python${python_version}-full graphviz libgl1-mesa-glx

    # Set the default Python executable in the container, so you will not need to call it in its extended form
    # (e.g., "python3.11") when executing scripts in the container.
    # Set default version for root user - modified version of this solution: https://jcutrer.com/linux/upgrade-python37-ubuntu1810
    update-alternatives --install /usr/local/bin/python python /usr/bin/python${python_version} 1

    # Clean pip cache before updating pip
    python -m pip cache purge

    # Update pip, setuptools, and wheel
    python -m pip install --upgrade pip setuptools wheel

    # Install the Python libraries we need
    python -m pip install -r requirements.txt

    # Clean pip cache again to free up space
    python -m pip cache purge
Examples of requirements files can be found in this GitHub repo.
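For reference, a requirements file is just a plain-text list of Python packages, one per line, optionally pinned to exact versions. The entries below are purely illustrative; list the packages your project actually needs:

numpy==1.26.4
pandas==2.2.2
scikit-learn==1.4.2
matplotlib==3.8.4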
When your def file is ready, follow the instructions in the next section to build the image.
Building Singularity images
With Singularity, users can either download and run ready-to-use container images from several preset repositories or customize such images to their needs, as seen in the previous section. Unlike Docker or other OCI-based container runtimes, which can simply pull images from OCI repositories, Singularity always requires building compatible filesystem images even when users don't need to customize them. Consequently, the pull command, typical of OCI container runtimes, is always replaced by the build command in Singularity, whether users want to run preassembled images or build a custom image starting from a pre-existing one. In this section, both kinds of building operations are described.
Useful options
The most useful options available when building Singularity images are the following:
--sandbox
- builds the filesystem image as a directory that the user can modify, instead of a single SIF file
--writable
- makes the filesystem image writable
--fakeroot
- lets the user be root (i.e., have full administrator privileges) within the container
--nv
- enables the image to make use of the system GPUs
The following sections provide examples of use of these options.
Using SLURM to build Singularity images
As previously mentioned, image-building operations must be performed as SLURM jobs on Mufasa. The recommended QOS for such operations is the build one.
The general suggestion is to submit a non-interactive SLURM job with an sbatch execution script (named, for example, singularity_build.sbatch), placed in the Mufasa directory where you want to perform the build and structured as follows:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=16g
#SBATCH --partition=jobs
#SBATCH --qos=build
#SBATCH --output=./logs/build_sbatch-%j.out
#SBATCH --error=./logs/build_sbatch-%j.err
#---Load the Singularity module
module load amd/singularity

#---Build the container image
singularity build [options] <destination_image_folder_OR_file> <path_to_image_in_public_repo_OR_definition_file>
After adapting the build command to your needs, you can simply launch the building process with
sbatch singularity_build.sbatch
(make sure to adapt the name of the sbatch script to the one you used).
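A possible submission-and-monitoring sequence is sketched below; <jobid> stands for the job ID reported by squeue, and the ./logs directory referenced in the #SBATCH directives should exist before submission (SLURM does not create it):

mkdir -p logs
sbatch singularity_build.sbatch
squeue -u $USER
tail -f logs/build_sbatch-<jobid>.out logs/build_sbatch-<jobid>.err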
The build commands to adopt in the most common situations are explained in the next subsections.
Build a SIF or sandbox image from a pre-assembled image in Docker Hub (or other repos)
To build a SIF image from a readily available image in a public repository, use this command:
singularity build <image_name>.sif <path_to_image_in_public_repo>
Example:
singularity build ubuntu.sif docker://ubuntu:22.04
SIF images are only convenient when you want an immutable or easy-to-share image. If you prefer an uncompressed, fast-to-access, and easy-to-modify sandbox image, use the following form of the build command:
singularity build --sandbox <image_folder>/ <path_to_image_in_public_repo>
Example:
singularity build --sandbox ubuntu/ docker://ubuntu:22.04
Note that, although images built in sandbox mode can be interactively personalized when accessed with the appropriate flags (see Running Singularity images), leveraging def files as documented in the next section is always recommended to create custom images. In particular, in systems where storage per user is limited, as in Mufasa, def files allow recreating the exact same images when needed and deleting them when they are not useful in the short term.
If you wish to run the resulting SIF or sandbox image with GPU support, you must include the --nv flag, as shown below.
When building the image as a SIF file:
singularity build --nv <image_name>.sif <path_to_image_in_public_repo>
When building the image as a sandbox:
singularity build --nv --sandbox <image_folder>/ <path_to_image_in_public_repo>
Example:
singularity build --nv --sandbox ollama/ docker://ollama/ollama
Build a SIF or sandbox custom image from a definition file
As explained in the dedicated section above, a definition file lists all the operations that Singularity must perform to create a custom image. As many of these operations (e.g., apt install ...) need the user to have root permissions on the container, building an image from a definition file always requires the --fakeroot flag. This option ensures the user appears as root inside the container while remaining an unprivileged user outside (i.e., on the host system, Mufasa).
The following syntax allows building an image from a definition file. Of course, it can be combined with the --nv option to enable GPU support in the resulting image.
As a SIF file:
singularity build --fakeroot <image_name>.sif <path_to_def_file>
As a sandbox:
singularity build --fakeroot --sandbox <image_folder>/ <path_to_def_file>
Example (using the definition file described in the previous section):
singularity build --nv --fakeroot --sandbox tf_custom/ Singularity.def
Running Singularity images
Once a Singularity image has been built, you can access or use it with various Singularity sub-commands. The most common ones, namely exec, run, shell, and instance run, are described in this section.
Useful options
The most useful options available when running Singularity images are the following:
--nv
- This option is needed when access to GPU resources, allocated through a properly set up SLURM interactive or non-interactive session, should be granted to the container. As explained previously, only containers built with the same --nv flag are expected to work properly in this modality.
--fakeroot
- As mentioned in regard to the build process, this flag is needed only when we have to perform operations that change the container's operating system (OS) or access container resources requiring administrative privileges. Examples include interactively upgrading or installing system packages in the container, or mounting host paths on privileged container directories using the --bind option, as sometimes required by specific container configurations (e.g., --bind /home/username/appdata:/root/.ollama). The fakeroot option should be avoided in any other case, as it is usually unnecessary and, as a general security measure, any process must be run with the lowest privileges it requires. E.g., --fakeroot is not required and must be avoided when mounting a host path on an unprivileged container path, such as /home (e.g., --bind /home/username/data:/home/data). Besides, the --fakeroot option is not required to leverage host GPU support, for which the --nv flag suffices.
--writable
- This flag is needed only if we need to modify the content of paths owned by the container's root user (e.g., to install new applications interactively or modify system files), in which case it is usually combined with the --fakeroot option. The --writable option is not needed if we just need to mount a local path on a privileged directory within the container, as explained earlier. Moreover, it generally makes sense to use this flag only when running containers created from sandbox images. Limitation: the --writable and --nv options cannot be applied simultaneously (the container starts properly, but GPU resources cannot be accessed).
The following sections provide examples of use of these options.
exec
The singularity exec command executes a specific command within a container without opening a shell, hence non-interactively.
The major advantage of this modality is that when the execution of the program terminates, the SLURM job and the related container are stopped, freeing up the associated resources. This makes exec the preferred method to run Singularity images on Mufasa, i.e. the one that uses the fewest resources and thus maximises the priority of your future jobs.
exec is especially suitable to execute machine learning (ML) model training, validation, and testing, as it ensures that the resources locked for a specific job are released and made available to other Mufasa users as soon as the job completes. More generally, exec should be preferred, in combination with sbatch, to execute any job not requiring interaction with the user or real-time supervision.
Errors and outputs from your scripts can be comprehensively captured using the specific #SBATCH --output and #SBATCH --error directives documented for running a non-interactive build job.
The general syntax to use the exec sub-command is:
singularity exec <image_folder_OR_file_OR_running_instance> <command_to_be_executed_in_the_container>
This is an example using the basic Ubuntu image built previously:
singularity exec --sandbox ubuntu/ uname -a
The following is another example, making use of the TensorFlow custom container built previously, with GPU support.
Note that your home directory (including the current working directory, when it lies inside your home) is mounted by default when running Singularity containers. Thus, the following will work if a 'main.py' script is present in the same folder (or a subfolder) where the Singularity command is executed.
singularity exec --nv tf_custom/ python main.py
A third example, below, uses the TensorFlow custom container built previously, with GPU support, but mounting a selected home folder (containing code and data subfolders) on a specific unprivileged path within the container:
singularity exec --nv --bind /home/username/testProject:/home/project tf_custom/ python /home/project/code/main.py
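Before launching a long training run, it can be useful to verify that the container actually sees the GPUs allocated by SLURM. A quick sanity check, assuming the interactive session or job was granted at least one GPU, is:

singularity exec --nv tf_custom/ nvidia-smi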
Example script for model training
Below is an example of how an sbatch script for running an ML model training task should look.
In this example, we assume the Singularity image called in the script has already been built according to the instructions provided in the relevant sections above. Details on possible and suggested values for the various #SBATCH directives can be found in the related section of this Wiki.
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=<number of CPU cores your container needs> #---4 cores are usually sufficient for GPU-intensive tasks: ask for more only if actually needed
#SBATCH --mem=<total amount of RAM your container needs>
#SBATCH --partition=jobs
#SBATCH --qos=<the QOS of your choice> #---a QOS with GPU complement is required if the --nv flag is applied below
#SBATCH --gres=<requested GPU resources (provided the selected QOS supports such a request)>
#SBATCH --output=./logs/train_sbatch-%j.out
#SBATCH --error=./logs/train_sbatch-%j.err
#---Load the Singularity module
module load amd/singularity

#---Run your Python training script in a container created from the previously built image WITH GPU SUPPORT.
# Notes:
# - The --bind option can also be added if needed in your specific case.
# - If you don't need GPU support, build the image and 'exec' it without the --nv flag.
singularity exec --nv tf_custom/ python main_train.py
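Assuming the script above is saved as train_model.sbatch (the name is arbitrary), a typical submission and monitoring sequence might be:

sbatch train_model.sbatch
squeue -u $USER
tail -f logs/train_sbatch-<jobid>.out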
run
The singularity run command is similar to the exec sub-command, as it executes a single command (or a sequence of them) in a container before exiting. However, with the run sub-command, only the commands indicated in the %runscript section of the definition file get executed, while it is not possible to pass additional arbitrary commands as command line arguments.
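For reference, a %runscript section in a definition file might look like the following sketch (the application path and script name are only placeholders); these are the commands that singularity run and singularity instance run execute:

%runscript
    # Executed by 'singularity run' and 'singularity instance run'
    echo "Starting the application..."
    exec python /opt/app/main.py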
For example, building and running the helloworld Docker container in a SLURM interactive session with commands
module load amd/singularity
singularity build helloworld.sif docker://hello-world
singularity run helloworld.sif
prints out the typical welcome message but doesn't allow further interaction with the container, as shown below.
gfontana@mufasa2:~$ module load amd/singularity
Loading amd/singularity
Loading requirement: amd/go/go-1.25.3
gfontana@mufasa2:~$ singularity build helloworld.sif docker://hello-world
INFO: Starting build...
INFO: Fetching OCI image...
INFO: Extracting OCI image...
INFO: Inserting Singularity configuration...
INFO: Creating SIF file...
INFO: Build complete: helloworld.sif
gfontana@mufasa2:~$ singularity run helloworld.sif
WARNING: passwd file doesn't exist in container, not updating
WARNING: group file doesn't exist in container, not updating
Hello from Docker!
This message shows that your installation appears to be working correctly.
To generate this message, Docker took the following steps:
1. The Docker client contacted the Docker daemon.
2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
(amd64)
3. The Docker daemon created a new container from that image which runs the
executable that produces the output you are currently reading.
4. The Docker daemon streamed that output to the Docker client, which sent it
to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
For more examples and ideas, visit:
https://docs.docker.com/get-started/
gfontana@mufasa2:~$
shell
The singularity shell command starts a container image and opens an interactive shell in it. Similar to the exec sub-command, the shell sub-command allows executing commands in a container, but interactively. As a consequence, it is generally used in interactive SLURM sessions to, for example, launch interactive jobs, personalize existing containers, or connect to containers running as instances.
The general syntax for the shell sub-command closely resembles that of the exec sub-command, with the only difference being that we can't append at the end, as arguments, commands to be executed in the container:
singularity shell <image_folder_OR_file_OR_running_instance>
For example, from within an interactive SLURM session, we can start a container from a previously built sandbox image and open a terminal in it to install a desired Ubuntu package:
# Start the container in interactive mode.
# Note: remember the appropriate flags to make the container filesystem writable and to connect to the terminal as
# the container's root user.
singularity shell --fakeroot --writable ubuntu/

# From within the container, update the 'apt' package list and install a package of interest (e.g., htop)
apt-get update
apt-get install htop
Running an image as a service
The singularity instance run command executes the instructions written in the %runscript section of an image definition file, like the run sub-command. However, instead of exiting when finished, it keeps the container running in the background (i.e., detached, as a service or instance) and accessible after the last command in the sequence has been executed. This command is used to run containers providing services that should remain continuously accessible for a specific time span, even when idle (e.g., web services or LLM runners such as Ollama). This time span can be specified using the #SBATCH --time=hh:mm:ss directive when the singularity instance run command is executed as a non-interactive SLURM job through sbatch (as it should be done in production).
The general syntax to run a container as an instance is as follows (to be run through an interactive/non-interactive SLURM session/job):
singularity instance run <image_folder_OR_file> <instance_name>
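As an illustrative sketch of the non-interactive case, the sbatch script below starts an instance and keeps the job (and thus the instance) alive for the requested time span; the resource amounts, QOS, time limit, image, instance name, and sleep duration are placeholders to adapt, and --nv, --fakeroot, or --bind options can be added as required by your service:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=<total amount of RAM your service needs>
#SBATCH --partition=jobs
#SBATCH --qos=<the QOS of your choice>
#SBATCH --time=08:00:00
#SBATCH --output=./logs/instance_sbatch-%j.out
#SBATCH --error=./logs/instance_sbatch-%j.err

#---Load the Singularity module
module load amd/singularity

#---Start the instance (add --nv, --fakeroot, or --bind as needed)
singularity instance run <image_folder_OR_file> <instance_name>

#---Keep the job alive so the instance is not stopped immediately
sleep 8h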
While an instance is running under a SLURM job launched from the Mufasa login subsystem, we can open another SSH session and log in directly to the Mufasa host (through its dedicated IP). From there, we can:
# Load the Singularity module, as usual, to access Singularity commands
module load amd/singularity

# Check which Singularity instances are currently running
singularity instance list

# Attach the terminal to a specific instance to run commands in it
singularity shell instance://<instance_name>

# Send single commands to a specific instance
singularity exec instance://<instance_name> <command_to_be_executed_in_the_instance>
When we don't need the running instance anymore, we can close the SSH session on the Mufasa host and simply stop the instance by terminating the associated SLURM job from the Mufasa login subsystem:
# Check the identification number of the SLURM job supporting the running Singularity instance
squeue

# Terminate the corresponding SLURM job
scancel <slurm_job_number>
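Alternatively, while still logged in on the Mufasa host, the instance can first be stopped explicitly with Singularity's own command (the instance name below is a placeholder) before cancelling the SLURM job:

singularity instance stop <instance_name>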
IMPORTANT: This usage modality is generally not needed for typical Mufasa users' use cases. Given the limited resources available on the system, this modality must be restricted to cases that strictly require it, as it locks up resources for a certain time period, subtracting them from other potential users even if the service running is completely inactive. Whenever possible, the exec sub-command must always be preferred!
This example shows how to run a container as an instance (sometimes called detached mode) and attach to it to perform operations supported by the services running in that container. Note: if you really need this modality, first try it out in an interactive SLURM session (requesting only a few CPU cores and little memory) to get familiar with the process before launching your actual instances as non-interactive SLURM jobs.
# The following command builds the Ollama container image (it should be run only once)
singularity build --sandbox ollama_container/ docker://ollama/ollama

# The following must be run in an interactive SLURM session / non-interactive SLURM job to launch the container as an instance,
# to provide the Ollama main service (i.e., 'ollama serve') and associate a persistent storage for local LLMs.
# Notes:
# - 'ollama_data' is an empty folder located in a path of your choice within your home.
# - If launched from a non-interactive SLURM job, the 'singularity instance run' command must be followed by an appropriate
#   'sleep' instruction to ensure the container doesn't stop after the Ollama main service has been started.
#   A possible instruction of this kind: sleep 24h
module load amd/singularity
singularity instance run --fakeroot \
    --bind /home/username/singularity_ollama/ollama_data:/root/.ollama \
    ollama_container/ ollama

# The following must be run directly on the Mufasa host, while the SLURM session/job is running, to connect to the Ollama instance
module load amd/singularity
singularity shell instance://ollama

# Once we are connected to the instance, we can, for example, download a small LLM and run it:
ollama pull gemma3n:e2b
ollama run gemma3n:e2b