System
Mufasa is a Linux server located in a server room managed by the System Administrators.
Job Users and Job Administrators can only access Mufasa remotely.
Remote access to Mufasa is performed using the SSH protocol for the execution of commands and the SFTP protocol for the exchange of files. Once logged in, a user interacts with Mufasa via a terminal (text-based) interface.
Mufasa 2.0
At the beginning of November 2025 Mufasa underwent a comprehensive hardware and software overhaul. The new system is sometimes called "Mufasa 2.0"; to distinguish it, the old system is sometimes called "Mufasa 1.0".
This wiki is currently (November 2025) being updated to reflect the changes. Elements that changed significantly from Mufasa 1.0 to Mufasa 2.0 are highlighted in yellow. When the title of a section is highlighted, it means either that it is a new section, or that the section's contents significantly changed.
Hardware
Mufasa is a server for massively parallel computation. It has been set up and configured by E4 Computer Engineering with the support of the Biomechanics Group, the CartCasLab laboratory and the NearLab laboratory.
Mufasa's main hardware components are:
- Supermicro A+ Server 4124GS-TNR
- 2 AMD Epyc 7542 32-core processors (64 CPU cores total)
- 1 TB RAM
- 7 TB of SSDs (fast temporary repository for datasets actively being used)
- 25 TB of HDDs (for user /home directories)
- 8 Nvidia A100 GPUs [based on the Ampere architecture]
- Ubuntu Linux 24.04 LTS server operating system
System resources are shared among different users and processes in order to optimise their usage and availability. This sharing is managed by SLURM.
CPUs and GPUs
Mufasa is fitted with two 32-core CPUs, so the system has a total of 64 physical CPU cores. Of these 64 cores, most are reserved for the SLURM job scheduling system and can only be accessed via SLURM; the remaining few are used by the bastion server.
As for GPUs, some of the physical A100 GPUs have been subdivided into “virtual” GPUs with different capabilities using Nvidia's MIG system. Thus, the GPU complement of Mufasa comprises the following devices:
- 10 GPUs with 20 GB of RAM
- 3 GPUs with 40 GB of RAM
Thanks to MIG, users can use all the GPUs listed above as if they were all physical devices installed on Mufasa, without having to worry (or even know) which actually are and which instead are virtual GPUs. How these devices are made available to Mufasa users is explained in User Jobs.
You can use command
nvidia-smi -L
to get in-depth information about the physical and virtual GPUs available to users in a system based on MIG. (On Mufasa, this command needs to be launched in a bash shell opened through SLURM in order to be able to access the GPUs.)
Accessing Mufasa
User access to Mufasa is always remote and exploits the SSH (Secure SHell) protocol.
To open a remote connection to Mufasa, open a local terminal on your computer and, in it, run command
ssh <username>@<IP_address>
where <username> is the user's username on Mufasa and <IP_address> is one of the IP addresses of Mufasa, i.e. either 10.79.23.96 or 10.79.23.97.
For example, user mrossi may access Mufasa with command
ssh mrossi@10.79.23.97
Access via SSH works with Linux, macOS and Windows 10 (and later) terminals. For other Windows users, a handy alternative tool is MobaXterm, which also includes an X server (required to run Linux programs with a graphical user interface on Mufasa).
If you don't have a user account on Mufasa, you first have to ask your supervisor for one. See Users for more information about Mufasa's users.
As soon as you launch the ssh command, you will be asked to type the password (i.e., the one of your user account on Mufasa). Once you provide the password, the local terminal on your computer becomes a remote terminal (a “remote shell”) through which you interact with Mufasa. The remote shell sports a command prompt such as
<username>@mufasa2:~$
(mufasa2 is the Linux hostname of Mufasa). For instance, user mrossi will see a prompt similar to this:
mrossi@mufasa2:~$
In the remote shell, you can issue commands to Mufasa by typing them after the prompt, then pressing the enter key. Since Mufasa is a Linux server, it responds to all the standard Linux system commands such as pwd (which prints the path to the current directory) or cd <destination_dir> (which changes the current directory). On the internet you can find many tutorials about the Linux command line, such as this one.
To close the SSH session run
exit
from the command prompt of the remote shell.
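To avoid retyping the address at every connection, OpenSSH clients can read a host alias from the file ~/.ssh/config on your own computer. The following is only a sketch under stated assumptions, not part of Mufasa's official setup: the alias name mufasa and the helper function add_mufasa_alias are invented for illustration, while the IP address and the username mrossi are the ones used in the example above.

```shell
# Hypothetical helper: appends a host alias for Mufasa to an OpenSSH client
# configuration file.
# Arguments: <config_file> <username> [<IP_address>]
add_mufasa_alias() {
    config_file="$1"
    username="$2"
    ip_address="${3:-10.79.23.97}"   # one of Mufasa's two IP addresses
    mkdir -p "$(dirname "$config_file")"
    # "mufasa" is an arbitrary alias name chosen for this example
    cat >> "$config_file" <<EOF
Host mufasa
    HostName $ip_address
    User $username
EOF
    chmod 600 "$config_file"   # keep the file private to its owner
}

# Example call (run on your own computer, not on Mufasa):
#   add_mufasa_alias ~/.ssh/config mrossi
# After that, "ssh mufasa" is equivalent to "ssh mrossi@10.79.23.97".
```

The password is still requested at every connection; the alias only saves typing.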
VPN
To be able to connect to Mufasa, your computer must belong to Polimi's LAN. This happens either because the computer is physically located at Politecnico di Milano and connected via ethernet, or because you are using Polimi's VPN (Virtual Private Network) to connect to its LAN from somewhere else (such as your home). In particular, using the VPN is the only way to use Mufasa from outside Polimi. See this DEIB webpage for instructions about how to activate VPN access.
SSH timeout
SSH sessions to Mufasa may be subject to an inactivity timeout: i.e., after a given period of inactivity the SSH session is automatically closed. Users who need to be able to reconnect to the very same shell where they launched a program (for instance because their program is interactive or because it provides progress update messages) should use the screen command.
SSH and graphics
The standard form of the ssh command, i.e. the one described at the beginning of Accessing Mufasa, should always be preferred. However, it only allows text communication with Mufasa. In special cases it may be necessary to remotely run (on Mufasa) Linux programs that have a graphical user interface. These programs require interaction with the X server of the remote user's machine (which must use Linux as well). A special mode of operation of ssh is needed to enable this. This mode is engaged by running command ssh like this:
ssh -X <your username on Mufasa>@<Mufasa's IP address>
Bastion server
Unlike Mufasa 1.0, Mufasa 2.0 employs a bastion server to manage user connections. The bastion server is a Linux virtual machine with very limited resources (no GPUs, few CPUs, little RAM). Its only task is to provide users with a way to log into the system and launch User Jobs with SLURM. Jobs launched via SLURM run on Mufasa 2.0's physical hardware (not on the virtual hardware of the bastion server) and therefore can access the hardware resources of Mufasa, such as the GPUs.
When you access Mufasa via SSH, the remote shell you are provided with is a shell in the bastion server, unable to perform computationally heavy tasks: for heavy tasks you have to launch a SLURM job. The only tasks you can execute directly from the bastion server shell are simple "housekeeping" tasks on your home directory, such as deleting files you do not need anymore.
Please note that if you try to run computationally heavy processes in the bastion server you can easily overwhelm its scarce resources, making it unavailable to all users and thus making Mufasa unreachable by anyone. Don't do that!
File transfer
Uploading files from a local machine to Mufasa and downloading files from Mufasa to a local machine is done using the SFTP protocol (Secure File Transfer Protocol).
Linux and macOS users can directly use the sftp package, as explained (for instance) by this guide. Windows users can interact with Mufasa via the SFTP protocol using the MobaXterm software package. macOS users can also interact with Mufasa via SFTP with the Cyberduck software package.
For Linux and macOS users, file transfer to/from Mufasa occurs via an interactive sftp shell, i.e. a remote shell very similar to the one described in Accessing Mufasa. The first thing to do is to open a terminal and run the following command (note the similarity to SSH connections):
sftp <username>@<IP_address>
where <username> is the user's username on Mufasa, and <IP_address> is either 10.79.23.96 or 10.79.23.97.
You will be asked your password. Once you provide it, you access an interactive sftp shell, where the command prompt takes the form
sftp>
From this shell you can run the commands to exchange files. Most of these commands have two forms: one to act on the remote machine (in this case, Mufasa) and one to act on the local machine (i.e. your own computer). To differentiate, the “local” versions usually have names that start with the letter “l” (lowercase L).
cd <path>
to change directory to <path> on the remote machine.
lcd <path>
to change directory to <path> on the local machine.
get <filename>
to download (i.e. copy) <filename> from the current directory of the remote machine to the current directory of the local machine.
put <filename>
to upload (i.e. copy) <filename> from the current directory of the local machine to the current directory of the remote machine.
Naturally, a user can only upload files to directories where they have write permission (usually only their own /home directory and its subdirectories). Also, users can only download files from directories where they have read permission. (File permissions on Mufasa follow the standard Linux rules.)
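When the same transfer has to be repeated often, the commands listed above do not need to be typed by hand: sftp also reads its commands from standard input. The following sketch illustrates the idea. The helper function sftp_commands and all file and directory names (datasets, train.tar.gz, results.csv) are hypothetical, and names containing spaces are not handled.

```shell
# Hypothetical helper: prints the sftp commands needed to upload one file
# into a remote directory and to download another file from it.
# Arguments: <remote_dir> <file_to_upload> <file_to_download>
sftp_commands() {
    printf 'cd %s\n' "$1"    # move to the remote directory
    printf 'put %s\n' "$2"   # upload from the local current directory
    printf 'get %s\n' "$3"   # download to the local current directory
    printf 'bye\n'           # close the sftp session
}

# Example (all names are placeholders), run on your own computer:
#   sftp_commands datasets train.tar.gz results.csv | sftp mrossi@10.79.23.97
```

Used this way, sftp still asks for your password before executing the commands.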
In addition to the terminal interface, users of Linux distributions based on Gnome (such as Ubuntu) can use a handy graphical tool to exchange files with Mufasa. In Gnome's Nautilus file manager, write
sftp://<username>@<IP_address>
in the address bar of Nautilus, where <username> is your username on Mufasa and <IP_address> is either 10.79.23.96 or 10.79.23.97. Nautilus becomes a graphical interface to Mufasa's remote filesystem.
Using Mufasa
This section provides a brief guide for Mufasa users (especially those who are not experienced in the use of Linux and/or remote servers) about interacting with the system.
Storage spaces
User jobs require storage of programs and data files. On Mufasa, the space available to users for data storage is the /home/ directory. /home/ contains two types of directories:
- Personal directories
  - Location and access
    - Personal directories are in /home/. They are dedicated to individual users of Mufasa.
    - The home directory of user UserName is /home/UserName/.
  - Usage
    - The home directory of a user is their own personal space on Mufasa. Space is limited (see Disk quotas), so you'll need to do some "housekeeping" to avoid filling it up.
    - The general rule is: keep in your home directory only the files that the work you are doing on Mufasa right now needs; remove a file as soon as it is not needed anymore for your current work.
    - Mufasa is not a storage space!
- Shared directories
  - Location and access
    - Shared directories are in /home/shared/. They are dedicated to research groups, and each group decides internally how to manage the group's directory.
    - The shared directory of group GroupName is /home/shared/GroupName/. Users who belong to that group can read from and write to the directory.
    - Directory /home/shared/common/ is available to all research groups. Any user can read from and write to the directory.
  - Usage
    - Shared directories are used:
      - To share data. If multiple users are using the same data, it makes sense to put the data in a shared directory instead of keeping a separate copy in each user's home directory.
      - For faster read/write. Shared spaces are physically located on faster disks than the personal home directories (SSDs instead of mechanical HDDs). When a processing job requires reading or writing very large amounts of data, placing such data in a shared directory can significantly speed up the job.
    - Important! Shared directories are used by several people, so it's important to quickly remove from them any file that is not actively in use.
Disk quotas
On Mufasa, storage spaces are subject to quotas: i.e., the files that are stored in them cannot occupy more than a given amount of disk space. Quotas apply both to personal directories (e.g., /home/UserX/) and to shared directories (e.g., /home/shared/ResearchGroupY/).
The quotas assigned to your user, and how much of them you are currently using, can be inspected with command
quota -s
The output of quota -s is similar to the following:
Filesystem   space   quota   limit   grace   files   quota   limit   grace
/dev/sdc2    5552K    100G    150G              60       0       0
Here is a simple guide to the output of quota -s.
- Column "Filesystem"
  - each line is associated with one of the filesystems where the user has been assigned a disk quota. There will be at least one line, corresponding to /home/
- Columns titled "space" and "files"
  - tell the user how much of their quota they are actually using: the first in terms of disk space, the second in terms of number of files (more precisely, of inodes).
- Columns titled "quota"
  - tell the user what their soft limit is, in terms of disk space and files respectively. If the value is 0, it means there is no limit.
- Columns titled "limit"
  - tell the user what their hard limit is, in terms of disk space and files respectively. If the value is 0, it means there is no limit.
- Columns titled "grace"
  - tell the user how long they are allowed to stay above their soft limit, as regards disk space and files respectively. When these columns are empty (as in the example above) the user is not over quota.
The meaning of soft limit and hard limit is the following.
The hard limit cannot be exceeded. When a user reaches their hard limit, they cannot use any more disk space: for them, the filesystem behaves as if the disks are out of space. Disk writes will fail, temporary files will fail to be created, and the user will start to see warnings and errors while performing common tasks. The only disk operation allowed is file deletion.
The soft limit is, as the name suggests, softer. When a user exceeds it, they are not immediately prevented from using more disk space (provided that they stay below the hard limit). However, as soon as the user goes beyond the soft limit, their grace period begins: i.e. a period within which the user must reduce their amount of data back to below the soft limit. During the grace period, the "grace" column(s) of the output of quota show how much of the grace period remains to the user. If the user is still above their soft limit at the end of the grace period, the quota system will treat the soft limit as a hard limit: i.e. it will force the user to delete data until they are below the soft limit before they can write to disk again.
In the output of quota -s, the grace columns are blank except when a soft limit has been exceeded.
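To judge how close you are to the soft limit, the values in the "space" and "quota" columns can be turned into a percentage. The following sketch shows the arithmetic. The helper function quota_percent is hypothetical, and it assumes that the K/M/G/T suffixes printed by quota -s denote powers of 1024 and that a value may carry a trailing "*" marker (which is simply discarded). It relies on the numfmt tool from GNU coreutils.

```shell
# Hypothetical helper: percentage of the soft limit currently in use.
# Arguments: <space> <quota>, written as in the output of quota -s
# (for instance 5552K and 100G).
# Assumption: the K/M/G/T suffixes are powers of 1024.
quota_percent() {
    used=$(numfmt --from=iec "${1%\*}")   # drop a trailing "*", if any
    soft=$(numfmt --from=iec "$2")
    if [ "$soft" -eq 0 ]; then
        echo "no limit"                   # a quota of 0 means no limit
        return 0
    fi
    LC_ALL=C awk -v u="$used" -v s="$soft" 'BEGIN { printf "%.1f\n", 100 * u / s }'
}

quota_percent 75G 100G    # prints 75.0
```

With the values of the example output above (5552K used out of a 100G soft limit), the result rounds to zero: that user is nowhere near their quota.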
Finding out how much disk space you are using
If your user is the owner of directory /path/to/dir/ you can find out how much disk space is used by the directory with command du like this:
du -sh /path/to/dir/
The -sh flag is used to ask for options -s (which provides the overall size of the directory) and -h (which provides human-readable values using measurement units such as K (KBytes), M (MBytes), G (GBytes)).
In particular, you can find out how much disk space is used by your home directory with command
du -sh ~
In fact, in Linux the symbol ~ is shorthand for the path to the user's home directory.
If you want a detailed summary of how much disk space is used by each subdirectory (at any depth) of a directory you own, use the command below; add option -a (i.e., du -ah) to also list individual files.
du -h /path/to/dir/
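When a directory is large, the full listing produced by du can be hard to read. A possible way to spot what is taking up space is to limit du to the top level of the directory and sort the result by size. The sketch below assumes the GNU versions of du and sort (the ones shipped with Ubuntu); the helper function usage_by_item is hypothetical.

```shell
# Hypothetical helper: size of each item (file or subdirectory) directly
# inside a directory, sorted from smallest to largest. The total for the
# directory itself is also part of the output.
# Argument: <directory> (defaults to your home directory)
usage_by_item() {
    # -a            also list files, not only directories
    # --max-depth=1 do not descend below the first level
    # sort -h       sort human-readable sizes (K, M, G) correctly
    du -ah --max-depth=1 "${1:-$HOME}" 2>/dev/null | sort -h
}

# Example: usage_by_item /path/to/dir/
```

The items at the bottom of the output are the best candidates for housekeeping.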
Hidden files and directories
In Linux, directories and files with a leading "." in their name are hidden. Usually these do not appear in listings, such as the output of the ls command, to avoid cluttering them up: however, they still occupy disk space.
The output of command du, however, also considers hidden elements and provides their size: therefore it can help you understand why the quota system says that you are using more disk space than reported by ls.
To get a list of all the files in a directory, including hidden ones, use command
ls -a
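To see how much space the hidden items actually occupy, du can be applied to them only. This is a sketch assuming GNU find, du and sort; the helper function hidden_usage is hypothetical.

```shell
# Hypothetical helper: disk space used by each hidden item (name starting
# with ".") directly inside a directory, smallest first.
# Argument: <directory> (defaults to your home directory)
hidden_usage() {
    find "${1:-$HOME}" -mindepth 1 -maxdepth 1 -name '.*' \
        -exec du -sh {} + | sort -h
}

# Example: hidden_usage ~
```

If the directory contains no hidden items, the function prints nothing.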
Changing file/directory ownership and permissions
Every file or directory in a Linux system is owned by both a user and a group. User and group ownerships are not connected, so a file can have as group owner a group that its user owner does not belong to.
Being able to manipulate who owns a file and what permissions any user has on that file is often important in a multi-user system such as Mufasa. This is a recapitulation of the main Linux commands to manipulate file permissions. Key commands are
- chown to change user ownership
- chgrp to change group ownership
- chmod to change access permissions
All three accept option -R (uppercase) for recursive operation, so, if needed, you can change ownership and/or permissions of all contents of a directory and its subdirectories with a single command.
The syntax of chown commands is
chown <new_user_owner> <path/to/file>
where <new_user_owner> is the user part of the new file ownership.
The syntax of chgrp commands is
chgrp <new_group_owner> <path/to/file>
where <new_group_owner> is the group part of the new file ownership.
User and group ownership for a file can also be both changed at the same time with
chown <new_user_owner>:<new_group_owner> <path/to/file>
As for chmod, the easiest way to use it is with symbolic descriptions of the permissions. The format for this is
chmod [users]<+|-><permissions> <path/to/file>
where
- <path/to/file> is the file or directory that the change is applied to
- [users] is ugo or a subset of it; the three letters correspond respectively:
  - to the user who owns <path/to/file>
  - to the group that owns <path/to/file>
  - to everyone else (others)
  - If [users] is not specified, the change applies to all three (as with ugo), except for the permission bits that are set in the user's umask
- + or - correspond to adding or removing permissions
- <permissions> is rwx or a subset, corresponding to read, write and execute permissions
Note that r, w and x permissions have a different meaning for files and for directories.
- For files
  - permission r allows reading the contents of the file
  - permission w allows changing the contents of the file
  - permission x allows executing the file (provided that it is a program: e.g., a shell script)
- For directories
  - permission r allows listing the files within the directory
  - permission w allows creating, renaming, or deleting files within the directory
  - permission x allows entering the directory (i.e., cd into it) and accessing its files
For instance, if the owner of file myfile.txt runs
chmod g+rwx myfile.txt
they are granting permission to read, write and execute myfile.txt to all the Linux users belonging to the group that owns the file.
If the owner of directory mydir runs
chmod go-x mydir
they are taking away permission to enter directory mydir from everyone except the user who owns the directory.
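The effect of the two commands above can be checked safely in a scratch directory. The sketch below uses two things not covered in the text: the "=" operator of chmod, which sets permissions exactly and is used here only to start from a known state, and the GNU command stat -c '%A', which prints permissions in the same rwx notation as ls -l. The file and directory names are the ones of the examples above.

```shell
# Work in a scratch directory so that no real file is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

touch myfile.txt
chmod u=rw,go= myfile.txt     # known starting point: "=" sets exact permissions
chmod g+rwx myfile.txt        # the first example above
stat -c '%A %n' myfile.txt    # prints: -rw-rwx--- myfile.txt

mkdir mydir
chmod u=rwx,go=rx mydir       # known starting point
chmod go-x mydir              # the second example above
stat -c '%A %n' mydir         # prints: drwxr--r-- mydir
```

In the printed string, the first character is the item type (- for a file, d for a directory), followed by the rwx permissions of user, group and others, in this order.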
If you want additional information about how file and directory permissions work in a Linux system, this is a good online guide.
Docker containers
As a general rule, all computation performed on Mufasa must occur within Docker containers. From Docker's documentation:
“Docker is an open platform for developing, shipping, and running applications. Docker enables you to separate your applications from your infrastructure.
Docker provides the ability to package and run an application in a loosely isolated environment called a container. The isolation and security allow you to run many containers simultaneously on a given host. Containers are lightweight and contain everything needed to run the application, so you do not need to rely on what is currently installed on the host.
A container is a sandboxed process on your machine that is isolated from all other processes on the host machine. When running a container, it uses an isolated filesystem. [containing] everything needed to run an application - all dependencies, configuration, scripts, binaries, etc. The image also contains other configuration for the container, such as environment variables, a default command to run, and other metadata.”
Using Docker allows each user of Mufasa to build the software environment that their job(s) require. In particular, using Docker containers enables users to configure their own (containerized) system and install any required libraries on their own, without needing to ask administrators to modify the configuration of Mufasa. As a consequence, users can freely experiment with their (containerized) system without risk to the work of other users or to the stability and reliability of Mufasa. Containers also allow users to run jobs that require multiple and/or obsolete versions of the same library.
A large number of preconfigured Docker containers are already available, so users do not usually need to start from scratch in preparing the environment where their jobs will run on Mufasa. The official Docker container repository is dockerhub.
How to run Docker containers on Mufasa is explained in User Jobs. There is also a page of this wiki dedicated to the preparation of Docker containers.
The SLURM job scheduling system
Mufasa uses SLURM (Slurm Workload Manager, formerly known as Simple Linux Utility for Resource Management) to manage shared access to its resources.
From SLURM's documentation:
“Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. Slurm has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.”
Users of Mufasa must use SLURM to run any resource-heavy process.
A "resource-heavy process" is any computing job that requires one or more of the following:
- GPUs
- multiple CPUs
- more than a small amount of RAM.
Jobs run via SLURM have access to all the resources of Mufasa. Jobs run outside SLURM are executed by the bastion server virtual machine, which has minimal resources and no GPUs. Using SLURM is therefore the only way to execute resource-heavy jobs on Mufasa (this is a key difference between Mufasa 1.0 and Mufasa 2.0).
The use of a job scheduling system such as SLURM ensures that Mufasa's resources are exploited in an efficient way. The fact that a schedule exists means that usually a job does not get immediately executed as soon as it is launched: instead, the job gets queued and will be executed as soon as possible, according to the availability of resources in the machine.
Useful references for SLURM users are the collected man pages and the command overview.
SLURM is capable of managing complex computing systems composed of multiple clusters (i.e. sets) of machines, each comprising one node (i.e. machine) or more. The case of Mufasa is the simplest of all: Mufasa is the single node (called gn01) of a SLURM computing cluster composed of that single machine.
In order to let SLURM schedule job execution, before launching a job a user must specify what resources (such as RAM, processor cores, GPUs, ...) it requires. In managing process queues, SLURM considers such requirements and matches them with available resources. As a consequence, resource-heavy jobs generally wait longer before they get executed, while less demanding jobs are given higher priority in the execution queue.
All in all, the take-away message is: do not request more resources than your job actually needs.
User Jobs explains how the process of requesting resources is greatly simplified by making use of process queues with predefined resource allocations, called partitions.
