Difference between revisions of "System"

From Mufasa (BioHPC)
Mufasa is a Linux server located in a server room managed by the [[Roles|System Administrators]].


[[Roles|Job Users]] and [[Roles|Job Administrators]] can only access Mufasa remotely.  


Remote access to Mufasa is performed using the [[System#Accessing Mufasa|SSH protocol]] for the execution of commands and the [[System#File transfer|SFTP protocol]] for the exchange of files. Once logged in, a user interacts with Mufasa via a terminal (text-based) interface.


= Hardware =


[[File:hw.png|right|320px]]
Mufasa is a server for massively parallel computation. It has been set up and configured by [https://www.e4company.com/en/ E4 Computer Engineering] with the support of the [http://www.biomech.polimi.it/ Biomechanics Group], the [http://www.cartcas.polimi.it/ CartCasLab] laboratory and the [https://nearlab.polimi.it/ NearLab] laboratory.


Mufasa's main hardware components are:


* 2 AMD Epyc 7542 32-core processors (64 CPU cores total)
* 1 TB RAM
* 9 TB of SSDs (for OS and [[User Jobs#Automatic job caching|job caching]])
* 28 TB of HDDs (for user <code>/home</code> directories)
* 5 Nvidia A100 GPUs (based on the ''Ampere'' architecture)
* [https://ubuntu.com/ Ubuntu Linux] operating system
 
Usually each of these resources (e.g., a GPU) is not fully assigned to a single user or a single job. On the contrary, resources are shared among different users and processes in order to optimise their usage and availability. Most of the management of this sharing is done by [[System#The SLURM job scheduling system|SLURM]].
 
== CPUs and GPUs ==
 
Mufasa is fitted with two 32-core CPUs, so the system has a total of 64 physical CPU cores (each of which can run 2 threads). Of the 64 cores, 2 are reserved for jobs run outside the [[System#The SLURM job scheduling system|SLURM job scheduling system]] (i.e., for low-power "housekeeping" tasks) while the remaining 62 are reserved for jobs run via SLURM.
 
As for GPUs, some of the 5 physical A100 processing cards (i.e., GPUs) are subdivided into “virtual” GPUs with different capabilities using [https://docs.nvidia.com/datacenter/tesla/mig-user-guide/ Nvidia's MIG system]. The command
 
<pre style="color: lightgrey; background: black;">
nvidia-smi -L
</pre>
 
provides an overview of the physical and virtual GPUs available to users in a system (“smi” stands for System Management Interface). On Mufasa, this command may need to be launched in a bash shell via the SLURM job scheduling system (as explained in Section 2 of this document) in order to access the GPUs. The output of <code>nvidia-smi -L</code> is similar to the following:
 
<small><pre style="color: lightgrey; background: black;">
GPU 0: NVIDIA A100-PCIE-40GB (UUID: GPU-a9f6e4f2-2877-8642-1802-5eeb3518d415)
  MIG 3g.20gb    Device  0: (UUID: MIG-dd1ccc27-d106-5cd9-80f1-b6291f0d682d)
  MIG 3g.20gb    Device  1: (UUID: MIG-abe13a42-013b-5bef-aa5e-bbd268d72447)
GPU 1: NVIDIA A100-PCIE-40GB (UUID: GPU-5f28ca0a-5b2c-bfc7-5b9f-581b5ca1d110)
  MIG 3g.20gb    Device  0: (UUID: MIG-07372a92-2e37-5ad6-b334-add0100cf5e3)
  MIG 3g.20gb    Device  1: (UUID: MIG-a704d927-7303-5077-ab7c-6ead57329233)
GPU 2: NVIDIA A100-PCIE-40GB (UUID: GPU-fb86701b-5781-b63c-5cda-911cff3a5edb)
GPU 3: NVIDIA A100-PCIE-40GB (UUID: GPU-bbeed512-ab4c-e984-cfea-8067c009a600)
  MIG 3g.20gb    Device  0: (UUID: MIG-0d1232cd-6b37-5ac7-b00f-a9fdf6997b72)
  MIG 3g.20gb    Device  1: (UUID: MIG-bdbcf24a-a0aa-56fb-a7e4-fc18f17b7f24)
GPU 4: NVIDIA A100-PCIE-40GB (UUID: GPU-a9511357-2476-7ddf-c4c5-c90feb68acfd)
</pre></small>


This output shows that the physical Nvidia A100 GPUs installed on Mufasa have been subdivided as follows:

* two of the physical GPUs (GPU 2 and GPU 4) have not been subdivided at all
* three of the physical GPUs (GPU 0, GPU 1 and GPU 3) have each been subdivided into 2 virtual GPUs with 20 GB of RAM each

From [https://docs.nvidia.com/datacenter/tesla/mig-user-guide/ MIG's user guide]:

<blockquote>“''The Multi-Instance GPU (MIG) feature allows GPUs based on the NVIDIA Ampere architecture (such as NVIDIA A100) to be securely partitioned into up to seven separate GPU Instances for CUDA applications, providing multiple users with separate GPU resources for optimal GPU utilization. This feature is particularly beneficial for workloads that do not fully saturate the GPU’s compute capacity and therefore users may want to run different workloads in parallel to maximize utilization.''”
</blockquote>

In practice, MIG allows flexible partitioning of a very powerful (but single) GPU to create multiple virtual GPUs with different capabilities, which are then made available to users as if they were separate devices. Thanks to MIG, users can use all the GPUs listed above as if they were all physical devices installed on Mufasa, without having to worry about (or even know) which are physical and which are virtual.

All in all, then, users of Mufasa are provided with the following set of '''8 GPUs''':

:; 2 GPUs with 40 GB of RAM each
:; 6 GPUs with 20 GB of RAM each

How these devices are made available to Mufasa users is explained in [[User Jobs]].
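
The count given above can be derived mechanically from the output of <code>nvidia-smi -L</code>. The following is only a minimal sketch, not part of Mufasa's tooling: the helper name <code>summarise_gpus</code> is hypothetical, and it assumes the output layout shown earlier in this section (one unindented line starting with <code>GPU</code> per physical card, followed by one indented line starting with <code>MIG</code> per virtual GPU carved out of it).

```shell
# Summarise `nvidia-smi -L` style output read from standard input.
# Hypothetical helper: assumes the line layout shown in this section.
summarise_gpus() {
    awk '
        # An unindented "GPU n:" line starts a new physical card.
        /^GPU [0-9]+:/ {
            physical++
            if (pending) whole++   # previous card had no MIG devices
            pending = 1
            next
        }
        # An indented "MIG" line is a virtual GPU of the current card.
        /^[ \t]+MIG / {
            mig++
            pending = 0
        }
        END {
            if (pending) whole++   # last card had no MIG devices
            printf "physical=%d whole=%d mig=%d usable=%d\n", physical, whole, mig, whole + mig
        }
    '
}
```

On a machine where the command is available, the helper would be used as <code>nvidia-smi -L | summarise_gpus</code>. Here <code>whole</code> counts the physical cards that have not been subdivided, and <code>usable</code> is the number of devices that users see (undivided cards plus virtual GPUs).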


= Accessing Mufasa =


User access to Mufasa is always remote and uses the ''SSH'' (''Secure SHell'') protocol.

To open a remote connection to Mufasa, open a local terminal on your computer and, in it, run the command

<pre style="color: lightgrey; background: black;">
ssh <username>@<IP_address>
</pre>

where <code><username></code> is your username on Mufasa and <code><IP_address></code> is one of the IP addresses of Mufasa, i.e. either <code>'''10.79.23.96'''</code> or <code>'''10.79.23.97'''</code>.

For example, user <code>mrossi</code> may access Mufasa with the command

<code>ssh mrossi@10.79.23.97</code>

Access via SSH works with Linux, macOS and Windows 10 (and later) terminals. For Windows users, a handy alternative tool (which also includes an X server, required to run Linux programs with a graphical user interface on Mufasa) is [https://mobaxterm.mobatek.net/ MobaXterm].

If you don't have a user account on Mufasa, you first have to ask your supervisor for one. See [[Users]] for more information about Mufasa's users.

As soon as you launch the ''ssh'' command, you will be asked to type your password (i.e., the one of your user account on Mufasa). Once you provide the password, the local terminal on your computer becomes a remote terminal (a “remote shell”) through which you interact with Mufasa. The remote shell shows a ''command prompt'' such as

<pre style="color: lightgrey; background: black;">
<username>@rk018445:~$
</pre>

(''rk018445'' is the Linux hostname of Mufasa). For instance, user <code>mrossi</code> will see a prompt similar to this:

<code>mrossi@rk018445:~$</code>

In the remote shell, you can issue commands to Mufasa by typing them after the prompt, then pressing the ''enter'' key. Since Mufasa is a Linux server, it responds to all the standard Linux system commands such as <code>pwd</code> (which prints the path to the current directory) or <code>cd <destination_dir></code> (which changes the current directory). On the internet you can find many tutorials about the Linux command line, such as [https://linuxcommand.org/index.php this one].

To close the SSH session, run

<pre style="color: lightgrey; background: black;">
exit
</pre>

from the command prompt of the remote shell.

== VPN ==
 
To be able to connect to Mufasa, your computer must belong to Polimi's LAN. This happens either because the computer is physically located at Politecnico di Milano and connected via ethernet, or because you are using Polimi's VPN (Virtual Private Network) to connect to its LAN from somewhere else (such as your home). In particular, using the VPN is the ''only'' way to use Mufasa from outside Polimi. See [https://intranet.deib.polimi.it/ita/vpn-wifi this DEIB webpage] for instructions about how to activate VPN access.
 
== SSH timeout ==
 
SSH sessions to Mufasa may be subject to an inactivity timeout: after a given period of inactivity, the SSH session is closed automatically. Users who need to be able to reconnect to the very same shell where they launched a program (for instance because their program is interactive or because it provides progress update messages) should [[User Jobs#Detaching from a running job with screen|use the ''screen'' command]].
 
== SSH and graphics ==
 
The standard form of the ''ssh'' command, i.e. the one described at the beginning of [[system#Accessing Mufasa|Accessing Mufasa]], should always be preferred. However, it only allows text communication with Mufasa. In special cases it may be necessary to remotely run (on Mufasa) Linux programs that have a graphical user interface. These programs require interaction with the X server of the user's own machine (which must use Linux as well). A special mode of operation of ''ssh'' is needed to enable this. This mode is engaged by running <code>ssh</code> with the <code>-X</code> option, like this:

<code>ssh -X <username>@<IP_address></code>
 
== File transfer ==
 
Uploading files from your local machine to Mufasa and downloading files from Mufasa to your local machine is done using the ''SFTP'' protocol (''Secure File Transfer Protocol'').

Linux and macOS users can directly use the ''sftp'' package, as explained (for instance) by [https://geekflare.com/sftp-command-examples/ this guide]. Windows users can interact with Mufasa via the SFTP protocol using the [https://mobaxterm.mobatek.net/ MobaXterm] software package. macOS users can also use the [https://cyberduck.io/ Cyberduck] software package.

For Linux and macOS users, file transfer to/from Mufasa occurs via an ''interactive sftp shell'', i.e. a remote shell very similar to the one described in [[System#Accessing Mufasa|Accessing Mufasa]].
The first thing to do is to open a terminal and run the following command (note the similarity to SSH connections):
 
<pre style="color: lightgrey; background: black;">
sftp <username>@<IP_address>
</pre>
 
where <code><username></code> is your username on Mufasa, and <code><IP_address></code> is either <code>'''10.79.23.96'''</code> or <code>'''10.79.23.97'''</code>.
 
You will be asked for your password. Once you provide it, you access an interactive sftp shell, where the command prompt takes the form
 
<pre style="color: lightgrey; background: black;">
sftp>
</pre>
 
From this shell you can run the commands to exchange files (and <code>exit</code> to quit sftp). Most of these commands have two forms: one to act on the remote machine (in this case, Mufasa) and one to act on the local machine (i.e. your own computer). To tell them apart, the “local” versions usually have names that start with the letter “l” (lowercase L).
 
<pre style="color: lightgrey; background: black;">
cd <path>
</pre>
to change directory to <code><path></code> on the remote machine.
 
<pre style="color: lightgrey; background: black;">
lcd <path>
</pre>
to change directory to <code><path></code> on the local machine.
 
<pre style="color: lightgrey; background: black;">
get <filename>
</pre>
to download (i.e. copy) <code><filename></code> from the current directory of the remote machine to the current directory of the local machine.
 
<pre style="color: lightgrey; background: black;">
put <filename>
</pre>
to upload (i.e. copy) <code><filename></code> from the current directory of the local machine to the current directory of the remote machine.
 
Naturally, a user can only upload files to directories where they have write permission (usually only their own <code>/home</code> directory and its subdirectories). Also, users can only download files from directories where they have read permission. (File permissions on Mufasa follow the standard Linux rules.)
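
The four commands above are typically combined into a short sequence: move to the right local and remote directories, then transfer the file. The following is a minimal sketch of such a sequence; the helper name <code>sftp_upload_commands</code> is hypothetical, it only prints the commands, and it does not handle paths containing spaces.

```shell
# Print the sftp commands needed to upload one file.
# Hypothetical helper. Arguments: <local_dir> <remote_dir> <filename>
sftp_upload_commands() {
    printf 'lcd %s\n' "$1"   # go to the local directory holding the file
    printf 'cd %s\n' "$2"    # go to the destination directory on the remote machine
    printf 'put %s\n' "$3"   # upload the file
    printf 'exit\n'          # leave the sftp shell
}
```

The printed lines can be typed one by one at the <code>sftp></code> prompt. As an untested assumption about how <code>sftp</code> behaves on your machine, they can also be piped into it, e.g. <code>sftp_upload_commands ~/data data results.csv | sftp mrossi@10.79.23.97</code>, where the directory and file names are purely illustrative.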
 
In addition to the terminal interface, users of Linux distributions based on Gnome (such as Ubuntu) can use a handy graphical tool to exchange files with Mufasa. In Gnome's Nautilus file manager, write
 
<code>sftp://<username>@<IP_address></code>
 
in the address bar of Nautilus, where <code><username></code> is your username on Mufasa and <code><IP_address></code> is either <code>10.79.23.96</code> or <code>10.79.23.97</code>. Nautilus then becomes a graphical interface to Mufasa's remote filesystem.
 
= Using Mufasa =
 
This section provides a brief guide for Mufasa users (especially those who are not experienced in the use of Linux and/or remote servers) about interacting with the system.
 
== Storage spaces ==
 
User jobs require storage of programs and data files. On Mufasa, the space available to users for data storage is the <code>/home/</code> directory. <code>/home/</code> contains three types of directories:
 
; Personal directories
: Each user has a personal ''home directory'' where they can store their own files. The home directory is the one with the same name as the user. By default, only the owner of a home directory can access its contents.
 
; Group directories
: Each research group has a common ''group directory'' where group members can store files that they share with other group members. The group directory is the one called <code>shared-<groupname></code>, where <code><groupname></code> is the corresponding [[Users#Group_names|user group]]. The owner of a group directory is user <code>root</code>, while group ownership is assigned to <code><groupname></code>. On Mufasa, group directories have the SGID (setgid) bit set. This means that any file or directory created inside <code>shared-<groupname></code> has its group ownership assigned to <code><groupname></code>, so editing permissions on the new file or directory extend to all group members.
 
; The ''<code>shared-public</code>'' directory
: This is a shared directory common to all users of Mufasa. Users that share files but do not belong to the same research group can use it to store their shared files.
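
Whether a directory passes its group on to new files, as described above for group directories, can be checked from the shell. The following is a minimal sketch; the helper name <code>inherits_group</code> is hypothetical, and it relies only on the standard <code>-g</code> test, which is true when the set-group-ID bit is set.

```shell
# Report whether a directory has the set-group-ID (setgid) bit set,
# i.e. whether files created inside it inherit the directory's group.
# Hypothetical helper. Argument: <directory>
inherits_group() {
    if [ -g "$1" ]; then
        echo "yes"
    else
        echo "no"
    fi
}
```

For a group directory the expected answer is <code>yes</code>; for a personal home directory it is normally <code>no</code>.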
 
== Disk quotas ==
 
On Mufasa, the directories in <code>/home/</code> must be used as a temporary storage area for user programs and their data, limited to the execution period of the jobs that use the data. They are not intended for long-term storage. For this reason, disk usage is subject to a quota system.
 
=== User quotas ===
 
Each user is assigned a ''disk quota'', i.e. an amount of space that they can use before the user is blocked by the quota system. Note that the quota applies not only to the data created and/or uploaded by you as a user, but also to data created by programs run by your user.
 
The quotas assigned to your user, and how much of them you are currently using, can be inspected with the command
 
<pre style="color: lightgrey; background: black;">
quota -s
</pre>
 
The output of <code>quota -s</code> is similar to the following:
 
<pre style="color: lightgrey; background: black;">
Filesystem  space  quota  limit  grace  files  quota  limit  grace
/dev/sdb1  11104K    100G    150G              1      0      0       
/dev/sdc2  5552K    100G    150G              60      0      0       
</pre>
 
Here is a simple guide to the output of <code>quota -s</code>.
 
:; Column "Filesystem"
:: identifies the filesystems where the user has been assigned a disk quota. On Mufasa, <code>/dev/sdb1</code> is the SSD disk space used as [[User Jobs#Automatic job caching|cache space]], while <code>/dev/sdc2</code> is the HDD space used for the <code>/home</code> directories.

:; Columns titled "space" and "files"
:: tell the user how much of their quota they are actually using: the first in terms of disk space, the second in terms of number of files (more precisely, of ''inodes'').

:; Columns titled "quota"
:: tell the user what their ''soft limit'' is, in terms of disk space and files respectively. A value of 0 means that there is no limit.

:; Columns titled "limit"
:: tell the user what their ''hard limit'' is, in terms of disk space and files respectively. A value of 0 means that there is no limit.

:; Columns titled "grace"
:: tell the user how long they are allowed to stay above their ''soft limit'', in terms of disk space and files respectively. When these columns are empty (as in the example above) the user is not over quota.
 
The meaning of '''soft limit''' and '''hard limit''' is the following.
 
The hard limit cannot be exceeded. When a user reaches their hard limit, they cannot use any more disk space: for them, the filesystem behaves as if the disks are out of space. Disk writes will fail, temporary files will fail to be created, and the user will start to see warnings and errors while performing common tasks. The only disk operation allowed is file deletion.
 
The soft limit is, as the name suggests, softer. When a user exceeds it, they are not immediately prevented from using more disk space (provided that they stay below the hard limit). However, as soon as the user goes beyond the soft limit, their '''grace period''' begins: i.e. a period within which the user must reduce their amount of data back to below the soft limit. During the grace period, the "grace" column(s) of the output of <code>quota</code> show how much of the grace period is left. If the user is still above their soft limit at the end of the grace period, the quota system will treat the soft limit as a hard limit: i.e. it will force the user to delete data until they are below the soft limit before they can write to disk again.
 
In the output of <code>quota -s</code>, the grace columns are blank except when a soft limit has been exceeded.
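
To see at a glance how close you are to a soft limit, the "space" and "quota" columns can be compared. The following is a minimal sketch under stated assumptions: the helper name <code>quota_usage</code> is hypothetical; it assumes the column layout of the example above, with sizes ending in <code>K</code>, <code>M</code>, <code>G</code> or <code>T</code> and data lines starting with <code>/dev/</code>. It also assumes that a trailing <code>*</code> may mark a value that exceeds its limit, and strips it.

```shell
# Read `quota -s` output on standard input and print, for each filesystem,
# the space in use as a percentage of the soft limit.
# Hypothetical helper: assumes the column layout shown in this section.
quota_usage() {
    awk '
        # Convert a size such as 11104K, 512M or 100G to KiB.
        function kib(s,    n, u) {
            sub(/\*$/, "", s)            # assumption: "*" may flag an exceeded limit
            n = s + 0                    # numeric prefix
            u = substr(s, length(s))     # trailing unit letter, if any
            if (u == "M") n *= 1024
            if (u == "G") n *= 1024 * 1024
            if (u == "T") n *= 1024 * 1024 * 1024
            return n
        }
        # Data lines start with a device path; a soft limit of 0 means no limit.
        $1 ~ /^\/dev\// && kib($3) > 0 {
            printf "%s %.1f%%\n", $1, 100 * kib($2) / kib($3)
        }
    '
}
```

It would be used as <code>quota -s | quota_usage</code>. A value above 100% means that the soft limit has been exceeded and the grace period is running.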
 
=== Group and project quotas ===
 
While on Mufasa disk quotas are usually assigned ''per-user'', the quota system also enables the setup of ''per-group'' quotas (i.e., limits to the disk space that, collectively, a group of users can use) and ''per-project'' quotas (i.e., limits to the amount of data that a specific directory and all its subdirectories can contain).
 
A comprehensive view of the quota situation for your user and your user groups is provided by the command
 
<pre style="color: lightgrey; background: black;">
quotainfo
</pre>
 
As for project quotas, on Mufasa they are applied to group directories in <code>/home/</code>.
 
== Finding out how much disk space you are using ==
 
If your user is the owner of directory <code>/path/to/dir/</code> you can find out how much disk space is used by the directory with command <code>du</code> like this:
 
<pre style="color: lightgrey; background: black;">
du -sh /path/to/dir/
</pre>
 
The <code>-sh</code> flag is used to ask for options <code>-s</code> (which provides the overall size of the directory) and <code>-h</code> (which provides ''human-readable'' values using measurement units such as K (KBytes), M (MBytes), G (GBytes)).
 
In particular, you can find out how much disk space is used by your home directory with command
 
<pre style="color: lightgrey; background: black;">
du -sh ~
</pre>
 
In fact, in Linux the symbol <code>~</code> is shorthand for the path to the current user's home directory.
 
If you want a detailed summary of how much disk space is used by each item (i.e., subdirectory or file) in a directory you own, use command
 
<pre style="color: lightgrey; background: black;">
du -h /path/to/dir/
</pre>
 
For instance, for user gfontana the output of
 
<pre style="color: lightgrey; background: black;">
du -h ~
</pre>
 
may be similar to the following
 
<pre style="color: lightgrey; background: black;">
gfontana@rk018445:~$ du -h ~
12K /home/gfontana/.ssh
356K /home/gfontana/.cache/gstreamer-1.0
5.0M /home/gfontana/.cache/tracker
5.3M /home/gfontana/.cache
  [...other similar lines...]
4.0K /home/gfontana/.config/htop
32K /home/gfontana/.config
8.0K /home/gfontana/.slurm
6.3M /home/gfontana</pre>


=== Hidden files and directories ===


In Linux, directories and files with a leading "." in their name are ''hidden''. These do not appear in listings, such as the output of the <code>ls</code> command, to avoid cluttering them up: however, they still occupy disk space.


The output of command <code>du</code>, however, also considers hidden elements and provides their size: therefore it can help you understand why the quota system says that you are using more disk space than reported by <code>ls</code>.
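
To look only at the hidden items, <code>du</code> can be pointed at them directly. The following is a minimal sketch; the helper name <code>hidden_usage</code> is hypothetical, and the shell pattern <code>.[!.]*</code> (a dot followed by any character other than a dot) is used so that the special entries <code>.</code> and <code>..</code> are skipped.

```shell
# List the disk usage (in KiB) of the hidden entries directly inside a
# directory, smallest first.
# Hypothetical helper. Argument: <directory>
hidden_usage() {
    # If there are no hidden entries the pattern matches nothing and
    # the resulting error from du is discarded.
    du -sk "$1"/.[!.]* 2>/dev/null | sort -n
}
```

For example, <code>hidden_usage ~</code> lists the hidden files and directories in your home directory, with the largest ones at the bottom.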


== Changing file/directory ownership and permissions ==

Every file or directory in a Linux system is owned by both a user and a group. User and group ownerships are not connected, so a file can have as group owner a group that its user owner does not belong to.

Being able to manipulate who owns a file and what permissions any user has on that file is often important in a multi-user system such as Mufasa. This is a recap of the main Linux commands to manipulate file ownership and permissions. Key commands are

:'''<code>chown</code>''' to change ownership - user part
:'''<code>chgrp</code>''' to change ownership - group part
:'''<code>chmod</code>''' to change access permissions

All three accept option <code>-R</code> (uppercase) for recursive operation, so, if needed, you can change the ownership and/or permissions of all the contents of a directory and its subdirectories with a single command.
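
As an illustration of recursive operation, the following minimal sketch hands a whole directory tree over to a group. The helper name <code>share_with_group</code> is hypothetical. The symbolic permission syntax it uses is described later in this section, except for the uppercase <code>X</code>, a standard <code>chmod</code> feature not otherwise covered on this page: it adds execute permission only to directories and to files that are already executable.

```shell
# Make a whole directory tree readable and writable by a group.
# Hypothetical helper. Arguments: <group> <directory>
share_with_group() {
    # Change the group part of the ownership, recursively.
    chgrp -R "$1" "$2" || return 1
    # Let the group read and write everything; X lets the group enter
    # directories without making ordinary files executable.
    chmod -R g+rwX "$2"
}
```

You can only assign a file to a group that you are a member of (unless you are <code>root</code>), so in practice <code><group></code> will be one of your own groups.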


The syntax of <code>chown</code> commands is

<pre style="color: lightgrey; background: black;">
chown <new_owner> <path/to/file>
</pre>

where <code><new_owner></code> is the user part of the new file ownership.

The syntax of <code>chgrp</code> commands is

<pre style="color: lightgrey; background: black;">
chgrp <new_group> <path/to/file>
</pre>

where <code><new_group></code> is the group part of the new file ownership.

User and group ownership for a file can also both be changed at the same time with

<pre style="color: lightgrey; background: black;">
chown <new_owner>:<new_group> <path/to/file>
</pre>

As for <code>chmod</code>, the easiest way to use it relies on symbolic descriptions of the permissions. The format for this is

<pre style="color: lightgrey; background: black;">
chmod [users]<+|-><permissions> <path/to/file>
</pre>

where

:<code><path/to/file></code> is the file or directory that the change is applied to


:<code>[users]</code> is '''<code>ugo</code>''' or a subset of it; the three letters correspond respectively:
:::to the '''u'''ser who owns <code><path/to/file></code>
:::to the '''g'''roup that owns <code><path/to/file></code>
:::to everyone else ('''o'''thers)
::(if <code>[users]</code> is not specified, the change applies to all three, limited by the user's ''umask'')
:'''<code>+</code>''' or '''<code>-</code>''' correspond to adding or removing permissions
:<code><permissions></code> is '''<code>rwx</code>''' or a subset, corresponding to '''r'''ead, '''w'''rite and e'''x'''ecute permissions

Note that <code>r</code>, <code>w</code> and <code>x</code> permissions have a different meaning for files and for directories.

;For files:
: permission <code>r</code> allows reading the contents of the file
: permission <code>w</code> allows changing the contents of the file
: permission <code>x</code> allows executing the file (provided that it is a program: e.g., a shell script)

;For directories:
: permission <code>r</code> allows listing the files within the directory
: permission <code>w</code> allows creating, renaming, or deleting files within the directory
: permission <code>x</code> allows entering the directory (i.e., <code>cd</code> into it) and accessing its files

For instance

<pre style="color: lightgrey; background: black;">
chmod g+rwx myfile.txt
</pre>

adds permission to read, write and execute <code>myfile.txt</code> to all the Linux users who belong to the group that owns the file;

<pre style="color: lightgrey; background: black;">
chmod go-x mydir
</pre>

takes away permission to enter directory <code>mydir</code> from everyone except the user who owns the directory.

If you want additional information about how file and directory permissions work in a Linux system, [https://www.redhat.com/sysadmin/linux-file-permissions-explained this is a good online guide].

= Docker containers =

'''As a general rule, all computation performed on Mufasa must occur within''' [https://www.docker.com/ '''Docker containers''']. This allows every user to configure their own execution environment without any risk of interfering with everyone else's.

From [https://docs.docker.com/get-started/ Docker's documentation]:

<blockquote>“''Docker is an open platform for developing, shipping, and running applications. Docker enables you to separate your applications from your infrastructure.''
</blockquote>
<blockquote>Docker provides the ability to package and run an application in a loosely isolated environment called a container. The isolation and security allow you to run many containers simultaneously on a given host. Containers are lightweight and contain everything needed to run the application, so you do not need to rely on what is currently installed on the host.
</blockquote>
<blockquote>''A container is a sandboxed process on your machine that is isolated from all other processes on the host machine. When running a container, it uses an isolated filesystem. [containing] everything needed to run an application - all dependencies, configuration, scripts, binaries, etc. The image also contains other configuration for the container, such as environment variables, a default command to run, and other metadata.''”
</blockquote>
Using Docker allows each user of Mufasa to build the software environment that their job(s) require. In particular, using Docker containers enables users to configure their own (containerized) system and install any required libraries on their own, without needing to ask administrators to modify the configuration of Mufasa. As a consequence, users can freely experiment with their (containerized) system without risk to the work of other users or to the stability and reliability of Mufasa. Containers also allow users to run jobs that require multiple and/or obsolete versions of the same library.

A large number of preconfigured Docker containers are already available, so users do not usually need to start from scratch in preparing the environment where their jobs will run on Mufasa. The official Docker container repository is [https://hub.docker.com/search?q=&type=image dockerhub].

How to run Docker containers on Mufasa will be explained in Part 2 of this document.

== <span id="anchor-6"></span>The SLURM job scheduling system ==

Mufasa uses [https://slurm.schedmd.com/overview.html SLURM] to manage shared access to its resources. '''Users of Mufasa must use SLURM to run and manage the jobs they run on the machine'''<ref>It is possible for users to run jobs without using SLURM; however, running jobs this way is only intended for “housekeeping” activities and only provides access to a small subset of Mufasa's resources. For instance, jobs run outside SLURM cannot access the GPUs, can only use a few processor cores, and can only access a small portion of RAM. Using SLURM is therefore necessary for any resource-intensive job.
</ref>. From [https://slurm.schedmd.com/documentation.html SLURM's documentation]:


<blockquote>“''Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. Slurm has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.''”
= Docker containers =
</blockquote>
The use of a job scheduling system ensures that Mufasa's resources are exploited in an efficient way. However, the fact that a schedule exists means that usually a job does not get immediately executed as soon as it is launched: instead, the job gets ''queued'' and will be executed as soon as possible, according to the availability of resources in the machine.


Useful references for SLURM users are the [https://slurm.schedmd.com/man_index.html collected man pages] and the [https://slurm.schedmd.com/pdfs/summary.pdf command overview].
[[File:262px-docker_logo_cropped.jpg|right|262px]]
'''As a general rule, all computation performed on Mufasa must occur within [https://www.docker.com/ Docker containers]'''. From [https://docs.docker.com/get-started/ Docker's documentation]:


In order to let SLURM schedule job execution, before launching a job a user must specify what resources (such as RAM, processor cores, GPUs, ...) it requires. While managing process queues, SLURM will consider such requirements and match them with the available resources. As a consequence, resource-heavy jobs generally take longer to get executed, while less demanding jobs are usually put into execution quickly. On the other hand, processes that -while running- try to use more resources than they requested get killed by SLURM to avoid damaging other jobs.
<blockquote>
“''Docker is an open platform for developing, shipping, and running applications. Docker enables you to separate your applications from your infrastructure.


All in all, the take-away message is: ''consider carefully how much resources to ask for your job''.
Docker provides the ability to package and run an application in a loosely isolated environment called a container. The isolation and security allow you to run many containers simultaneously on a given host. Containers are lightweight and contain everything needed to run the application, so you do not need to rely on what is currently installed on the host.


In Part 2 of this document it will be explained how resource requests can be greatly simplified by making use of predefined resource sets called ''SLURM partitions''.
A container is a sandboxed process on your machine that is isolated from all other processes on the host machine. When running a container, it uses an isolated filesystem. [containing] everything needed to run an application - all dependencies, configuration, scripts, binaries, etc. The image also contains other configuration for the container, such as environment variables, a default command to run, and other metadata.''”
</blockquote>
Using Docker allows each user of Mufasa to build the software environment that their job(s) require. In particular, using Docker containers enables users to configure their own (containerized) system and install any required libraries on their own, without need to ask administrators to modify the configuration of Mufasa. As a consequence, users can freely experiment with their (containerized) system without risk to the work of other users and to the stability and reliability of Mufasa. In particular, containers allow users to run jobs that require multiple and/or obsolete versions of the same library.


A large number of preconfigured Docker containers are already available, so users do not usually need to start from scratch in preparing the environment where their jobs will run on Mufasa. The official Docker container repository is [https://hub.docker.com/search?q=&type=image dockerhub].


How to run Docker containers on Mufasa is explained in [[User Jobs|User Jobs]]. There is also a page of this wiki [[Docker|dedicated to the preparation of Docker containers]].


Latest revision as of 09:34, 1 October 2024

Mufasa is a Linux server located in a server room managed by the System Administrators.

Job Users and Job Administrators can only access Mufasa remotely.

Remote access to Mufasa is performed using the SSH protocol for the execution of commands and the SFTP protocol for the exchange of files. Once logged in, a user interacts with Mufasa via a terminal (text-based) interface.

Hardware


Mufasa is a server for massively parallel computation. It has been set up and configured by E4 Computer Engineering with the support of the Biomechanics Group, the CartCasLab laboratory and the NearLab laboratory.

Mufasa's main hardware components are:

  • 2 AMD Epyc 7542 32-core processors (64 CPU cores total)
  • 1 TB RAM
  • 9 TB of SSDs (for OS and job caching)
  • 28TB of HDDs (for user /home directories)
  • 5 Nvidia A100 GPUs [based on the Ampere architecture]
  • Ubuntu Linux operating system

Usually each of these resources (e.g., a GPU) is not fully assigned to a single user or a single job. On the contrary, resources are shared among different users and processes in order to optimise their usage and availability. Most of the management of this sharing is done by SLURM.

CPUs and GPUs

Mufasa is fitted with two 32-core CPUs, so the system has a total of 64 physical CPU cores (each of which can run 2 threads). Of these 64 cores, 2 are reserved for jobs run outside the SLURM job scheduling system (i.e., for low-power "housekeeping" tasks) while the remaining 62 are reserved for jobs run via SLURM.

As for GPUs, some of the 5 physical A100 processing cards (i.e., GPUs) are subdivided into “virtual” GPUs with different capabilities using Nvidia's MIG system. Command

nvidia-smi -L

provides an overview of the physical and virtual GPUs available to users in a system. On Mufasa, this command may need to be launched in a bash shell via the SLURM job scheduling system (as explained in Section 2 of this document) in order to access the GPUs. The output of nvidia-smi -L is similar to the following:

GPU 0: NVIDIA A100-PCIE-40GB (UUID: GPU-a9f6e4f2-2877-8642-1802-5eeb3518d415)
  MIG 3g.20gb     Device  0: (UUID: MIG-dd1ccc27-d106-5cd9-80f1-b6291f0d682d)
  MIG 3g.20gb     Device  1: (UUID: MIG-abe13a42-013b-5bef-aa5e-bbd268d72447)
GPU 1: NVIDIA A100-PCIE-40GB (UUID: GPU-5f28ca0a-5b2c-bfc7-5b9f-581b5ca1d110)
  MIG 3g.20gb     Device  0: (UUID: MIG-07372a92-2e37-5ad6-b334-add0100cf5e3)
  MIG 3g.20gb     Device  1: (UUID: MIG-a704d927-7303-5077-ab7c-6ead57329233)
GPU 2: NVIDIA A100-PCIE-40GB (UUID: GPU-fb86701b-5781-b63c-5cda-911cff3a5edb)
GPU 3: NVIDIA A100-PCIE-40GB (UUID: GPU-bbeed512-ab4c-e984-cfea-8067c009a600)
  MIG 3g.20gb     Device  0: (UUID: MIG-0d1232cd-6b37-5ac7-b00f-a9fdf6997b72)
  MIG 3g.20gb     Device  1: (UUID: MIG-bdbcf24a-a0aa-56fb-a7e4-fc18f17b7f24)
GPU 4: NVIDIA A100-PCIE-40GB (UUID: GPU-a9511357-2476-7ddf-c4c5-c90feb68acfd)

This output shows that the physical Nvidia A100 GPUs installed on Mufasa have been subdivided as follows:

  • two of the physical GPUs (GPU 2 and GPU 4) have not been subdivided at all
  • three of the physical GPUs (GPU 0, GPU 1 and GPU 3) have been subdivided into 2 virtual GPUs with 20 GB of RAM each

Thanks to MIG, users can use all the GPUs listed above as if they were all physical devices installed on Mufasa, without having to worry (or even know) which actually are and which instead are virtual GPUs.

All in all, then, users of Mufasa are provided with the following set of 8 GPUs:

2 GPUs with 40 GB of RAM each
6 GPUs with 20 GB of RAM each
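
The tally above can be reproduced from the command output with a small shell helper. This is only a sketch: the function name summarise_gpus is made up for illustration, and it assumes that the output has the same layout as the listing shown above (one "GPU n:" line per physical card, followed by indented "MIG" lines for its virtual GPUs, if any).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not provided by Mufasa): reads `nvidia-smi -L`-style
# text from standard input and counts the devices a user can actually use.
summarise_gpus() {
  awk '
    # A new physical GPU starts: if the previous one had no MIG slices,
    # it counts as a whole (undivided) device.
    /^GPU [0-9]+:/ {
      if (seen && !has_mig) whole++
      seen = 1; has_mig = 0
      next
    }
    # An indented MIG line is one virtual GPU of the current physical card.
    /^[[:space:]]+MIG / { mig++; has_mig = 1 }
    END {
      if (seen && !has_mig) whole++
      printf "whole=%d mig=%d total=%d\n", whole, mig, whole + mig
    }
  '
}
```

It would be used as nvidia-smi -L | summarise_gpus. Applied to the listing shown above, it reports 2 whole GPUs and 6 MIG devices, i.e. 8 usable GPUs in total.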

How these devices are made available to Mufasa users is explained in User Jobs.

Accessing Mufasa

User access to Mufasa is always remote and exploits the SSH (Secure SHell) protocol.

To open a remote connection to Mufasa, open a local terminal on your computer and, in it, run command

ssh <username>@<IP_address>

where <username> is the user's username on Mufasa and <IP_address> is one of the IP addresses of Mufasa, i.e. either 10.79.23.96 or 10.79.23.97.

For example, user mrossi may access Mufasa with command

ssh mrossi@10.79.23.97

Access via SSH works with Linux, macOS and Windows 10 (and later) terminals. For Windows users, a handy alternative tool (which also includes an X server, required to run Linux programs with a graphical user interface on Mufasa) is MobaXterm.

If you don't have a user account on Mufasa, you first have to ask your supervisor for one. See Users for more information about Mufasa's users.

As soon as you launch the ssh command, you will be asked to type the password (i.e., the one of your user account on Mufasa). Once you provide the password, the local terminal on your computer becomes a remote terminal (a “remote shell”) through which you interact with Mufasa. The remote shell sports a command prompt such as

<username>@rk018445:~$

(rk018445 is the Linux hostname of Mufasa). For instance, user mrossi will see a prompt similar to this:

mrossi@rk018445:~$

In the remote shell, you can issue commands to Mufasa by typing them after the prompt, then pressing the enter key. Since Mufasa is a Linux server, it responds to all the standard Linux system commands such as pwd (which prints the path to the current directory) or cd <destination_dir> (which changes the current directory). On the internet you can find many tutorials about the Linux command line, such as this one.

To close the SSH session run

exit

from the command prompt of the remote shell.

VPN

To be able to connect to Mufasa, your computer must belong to Polimi's LAN. This happens either because the computer is physically located at Politecnico di Milano and connected via ethernet, or because you are using Polimi's VPN (Virtual Private Network) to connect to its LAN from somewhere else (such as your home). In particular, using the VPN is the only way to use Mufasa from outside Polimi. See this DEIB webpage for instructions about how to activate VPN access.

SSH timeout

SSH sessions to Mufasa may be subject to an inactivity timeout: i.e., after a given period of inactivity the ssh session is closed automatically. Users who need to be able to reconnect to the very same shell where they launched a program (for instance because their program is interactive or because it provides progress update messages) should use the screen command.
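
A typical screen workflow can be sketched as follows. The wrapper function names are hypothetical and only serve to label the steps; the screen options they use (-dmS to start a detached named session, -ls to list sessions, -r to reattach) are standard. Whether screen is configured in exactly this way on Mufasa is an assumption.

```shell
#!/usr/bin/env bash
# Sketch of a screen workflow; the function names are illustrative only.

# Start a detached session called <name> that runs the given command.
session_start() {
  local name=$1
  shift
  screen -dmS "$name" "$@"
}

# List the screen sessions of the current user.
session_list() {
  screen -ls
}

# Reattach to the session called <name>.
# Once attached, press Ctrl-a followed by d to detach again
# without stopping the program running inside the session.
session_attach() {
  screen -r "$1"
}
```

For example, session_start train ./run_experiment.sh (both names are placeholders) starts the program inside a session named train; after the SSH connection has been closed and reopened, session_attach train brings back the same shell.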

SSH and graphics

The standard form of the ssh command, i.e. the one described at the beginning of Accessing Mufasa, should always be preferred. However, it only allows text communication with Mufasa. In special cases it may be necessary to remotely run (on Mufasa) Linux programs that have a graphical user interface. These programs require interaction with the X server of the remote user's machine (which must use Linux as well). A special mode of operation of ssh is needed to enable this. This mode is engaged by running command ssh like this:

ssh -X <your username on Mufasa>@<Mufasa's IP address>

File transfer

Uploading files from local machine to Mufasa and downloading files from Mufasa onto local machines is done using the SFTP protocol (Secure File Transfer Protocol).

Linux and macOS users can directly use the sftp package, as explained (for instance) by this guide. Windows users can interact with Mufasa via the SFTP protocol using the MobaXterm software package. macOS users can also use the Cyberduck software package to interact with Mufasa via SFTP.

For Linux and macOS users, file transfer to/from Mufasa occurs via an interactive sftp shell, i.e. a remote shell very similar to the one described in Accessing Mufasa. The first thing to do is to open a terminal and run the following command (note the similarity to SSH connections):

sftp <username>@<IP_address>

where <username> is the user's username on Mufasa, and <IP_address> is either 10.79.23.96 or 10.79.23.97.

You will be asked for your password. Once you provide it, you access an interactive sftp shell, where the command prompt takes the form

sftp>

From this shell you can run the commands to exchange files. Most of these commands have two forms: one to act on the remote machine (in this case, Mufasa) and one to act on the local machine (i.e. your own computer). To differentiate, the “local” versions usually have names that start with the letter “l” (lowercase L).

cd <path>

to change directory to <path> on the remote machine.

lcd <path>

to change directory to <path> on the local machine.

get <filename>

to download (i.e. copy) <filename> from the current directory of the remote machine to the current directory of the local machine.

put <filename>

to upload (i.e. copy) <filename> from the current directory of the local machine to the current directory of the remote machine.

Naturally, a user can only upload files to directories where they have write permission (usually only their own /home directory and its subdirectories). Also, users can only download files from directories where they have read permission. (File permissions on Mufasa follow the standard Linux rules.)
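
The same four commands can also be collected in a file and executed in one go with the -b (batch) option of sftp. The sketch below only generates such a file; the function name make_sftp_batch and all paths and file names are placeholders. Note that batch mode does not prompt for input, so it is normally used together with non-interactive authentication (e.g. SSH keys); whether this is available on Mufasa is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an sftp batch script on standard output.
# Arguments: <local_dir> <remote_dir> <file_to_download> <file_to_upload>
# (paths containing spaces would need additional quoting)
make_sftp_batch() {
  printf 'lcd %s\n' "$1"   # change directory on the local machine
  printf 'cd %s\n'  "$2"   # change directory on the remote machine
  printf 'get %s\n' "$3"   # download from remote to local
  printf 'put %s\n' "$4"   # upload from local to remote
}
```

For instance, make_sftp_batch ~/work experiments results.csv input.dat > batch.txt writes the commands to batch.txt, which can then be run with sftp -b batch.txt <username>@<IP_address>.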

In addition to the terminal interface, users of Linux distributions based on Gnome (such as Ubuntu) can use a handy graphical tool to exchange files with Mufasa. In Gnome's Nautilus file manager, write

sftp://<username>@<IP_address>

in the address bar of Nautilus, where username is your username on Mufasa and <IP_address> is either 10.79.23.96 or 10.79.23.97. Nautilus becomes a graphical interface to Mufasa's remote filesystem.

Using Mufasa

This section provides a brief guide for Mufasa users (especially those who are not experienced in the use of Linux and/or remote servers) about interacting with the system.

Storage spaces

User jobs require storage of programs and data files. On Mufasa, the space available to users for data storage is the /home/ directory. /home/ contains three types of directories:

Personal directories
Each user has a personal home directory where they can store their own files. The home directory is the one with the same name as the user. By default, only the owner of a home directory can access its contents.
Group directories
Each research group has a common group directory where group members can store files that they share with other group members. The group directory is the one called shared-<groupname>, where <groupname> is the corresponding user group. The owner of the group directory is user root, while group ownership is assigned to <groupname>. On Mufasa, group directories have the setgid (SGID) bit set. This means that any file or directory created inside shared-<groupname> has group ownership assigned to <groupname>, so editing permissions on the new file or directory extend to all group members.
The shared-public directory
This is a shared directory common to all users of Mufasa. Users that share files but do not belong to the same research group can use it to store their shared files.
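
The setgid bit of a group directory shows up as an "s" in the group part of its permission string. The sketch below reproduces this on a throw-away directory, so it can be tried without touching any real group directory; the helper name perm_string is hypothetical, and the exact permissions of the group directories on Mufasa are not stated here and may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the 10-character permission string of a path.
perm_string() {
  ls -ld -- "$1" | cut -c1-10
}

demo_dir=$(mktemp -d)      # throw-away directory used only for this demo
chmod 770 "$demo_dir"      # rwx for owner and group, nothing for others
chmod g+s "$demo_dir"      # set the setgid bit on the directory
perm_string "$demo_dir"    # the group execute slot now shows "s"
rmdir "$demo_dir"
```

On Mufasa, perm_string /home/shared-<groupname> can be used in the same way to inspect a real group directory.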

Disk quotas

On Mufasa, the directories in /home/ must be used as a temporary storage area for user programs and their data, limited to the execution period of the jobs that use the data. They are not intended for long-term storage. For this reason, disk usage is subject to a quota system.

User quotas

Each user is assigned a disk quota, i.e. an amount of space that they can use before the user is blocked by the quota system. Note that the quota applies not only to the data created and/or uploaded by you as a user, but also to data created by programs run by your user.

The quotas assigned to your user and the amount of them that you are currently using can be inspected with command

quota -s

The output of quota -s is similar to the following:

Filesystem   space   quota   limit   grace   files   quota   limit   grace
 /dev/sdb1  11104K    100G    150G               1       0       0        
 /dev/sdc2   5552K    100G    150G              60       0       0        

Here is a simple guide to the output of quota -s.

Column "Filesystem"
identifies the filesystems where the user has been assigned a disk quota. On Mufasa, /dev/sdb1 is the SSD disk space used as cache space, while /dev/sdc2 is the HDD space used for the /home directories.
Columns titled "space" and "files"
tell the user how much of their quota they are actually using: the first in terms of bytes, the second in terms of number of files (more precisely, of inodes).
Columns titled "quota"
tell the user what their soft limit is, in terms of bytes and files respectively. If the value is 0, it means there is no limit.
Columns titled "limit"
tell the user what their hard limit is, in terms of bytes and files respectively. If the value is 0, it means there is no limit.
Columns titled "grace"
tell the user how long they are allowed to stay above their soft limit, in terms of bytes and files respectively. When these columns are empty (as in the example above) the user is not over quota.

The meaning of soft limit and hard limit is the following.

The hard limit cannot be exceeded. When a user reaches their hard limit, they cannot use any more disk space: for them, the filesystem behaves as if the disks are out of space. Disk writes will fail, temporary files will fail to be created, and the user will start to see warnings and errors while performing common tasks. The only disk operation allowed is file deletion.

The soft limit is, as the name suggests, softer. When a user exceeds it, they are not immediately prevented from using more disk space (provided that they stay below the hard limit). However, as soon as the user goes beyond the soft limit, their grace period begins: i.e. a period within which the user must reduce their amount of data back to below the soft limit. During the grace period, the "grace" column(s) of the output of quota show how much of the grace period remains to the user. If the user is still above their soft limit at the end of the grace period, the quota system will treat the soft limit as a hard limit: i.e. it will force the user to delete data until they are below the soft limit before they can write on disk again.

In the output of quota -s, the grace columns are blank except when a soft limit has been exceeded.
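
To see at a glance how close you are to the soft limit, the "space" and "quota" columns can be turned into a percentage. The following is a rough sketch, not a tool installed on Mufasa: the function name quota_usage is hypothetical, and it assumes the column layout shown above, sizes expressed with the K, M, G or T suffixes (a bare number is treated as KBytes), and that a trailing "*" may be appended to a value when the user is over quota.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `quota -s`-style text from standard input and
# prints, for each filesystem, the used space as a percentage of the soft limit.
quota_usage() {
  awk '
    # Convert a size such as 11104K or 100G into KBytes.
    function kib(v,    n, u, m) {
      sub(/\*$/, "", v)            # drop the over-quota marker, if present
      n = v + 0                    # numeric part of the value
      u = substr(v, length(v))     # last character is the unit
      m = 1
      if (u == "M") m = 1024
      else if (u == "G") m = 1024 * 1024
      else if (u == "T") m = 1024 * 1024 * 1024
      return n * m
    }
    # Data rows start with a device path; header rows are skipped.
    $1 ~ /^\// && NF >= 4 {
      soft = kib($3)
      if (soft > 0) printf "%s %.1f%%\n", $1, 100 * kib($2) / soft
      else          printf "%s no-limit\n", $1
    }
  '
}
```

It would be used as quota -s | quota_usage. A soft limit of 0 is reported as no-limit, consistently with the meaning of 0 described above.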

Group and project quotas

While on Mufasa disk quotas are usually assigned per-user, the quota system also enables the setup of per-group quotas (i.e., limits to the disk space that, collectively, a group of users can use) and per-project quotas (i.e., limits to the amount of data that a specific directory and all its subdirectories can contain).

A comprehensive view of the quota situation for one's user and user groups is provided by command

quotainfo

As for project quotas, on Mufasa they are applied to group directories in /home/.

Finding out how much disk space you are using

If your user is the owner of directory /path/to/dir/ you can find out how much disk space is used by the directory with command du like this:

du -sh /path/to/dir/

The -sh flag is used to ask for options -s (which provides the overall size of the directory) and -h (which provides human-readable values using measurement units such as K (KBytes), M (MBytes), G (GBytes)).

In particular, you can find out how much disk space is used by your home directory with command

du -sh ~

In fact, in Linux the symbol ~ is shorthand for the path to the current user's home directory.

If you want a detailed summary of how much disk space is used by each item (i.e., subdirectory or file) in a directory you own, use command

du -h /path/to/dir/

For instance, for user gfontana the output of

du -h ~

may be similar to the following

gfontana@rk018445:~$ du -h ~
12K	/home/gfontana/.ssh
356K	/home/gfontana/.cache/gstreamer-1.0
5.0M	/home/gfontana/.cache/tracker
5.3M	/home/gfontana/.cache
  [...other similar lines...]
4.0K	/home/gfontana/.config/htop
32K	/home/gfontana/.config
8.0K	/home/gfontana/.slurm
6.3M	/home/gfontana
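
When the listing is long, it is often more useful to see only the largest items. A possible sketch is given below; the function name largest_items is hypothetical, and it relies on the -d option of du and the -h option of sort as provided by the GNU tools found on typical Linux systems.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the N largest first-level items of a directory.
# Usage: largest_items [dir] [N]   (defaults: home directory, 10 items)
largest_items() {
  local dir=${1:-$HOME} count=${2:-10}
  # -a also lists files, -d 1 stops at the first level, -h prints readable sizes;
  # sort -rh orders human-readable sizes from largest to smallest.
  du -a -h -d 1 "$dir" 2>/dev/null | sort -rh | head -n "$count"
}
```

For example, largest_items ~ 5 prints the five largest entries of your home directory; one of them is the total for the directory itself, which normally appears first.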

Hidden files and directories

In Linux, directories and files with a leading "." in their name are hidden. These do not appear in listings, such as the output of the ls command, to avoid cluttering them up: however, they still occupy disk space.

The output of command du, however, also considers hidden elements and provides their size: therefore it can help you understand why the quota system says that you are using more disk space than reported by ls.
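
To look only at the hidden items of a directory, du can be restricted to names that begin with a dot. This is a minimal sketch with a hypothetical function name (hidden_usage); it only considers items directly inside the given directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: disk usage of each hidden item directly inside a directory.
# Usage: hidden_usage [dir]   (default: home directory)
hidden_usage() {
  local dir=${1:-$HOME}
  # .[!.]* matches names that start with a dot, excluding "." and ".."
  du -sh "$dir"/.[!.]* 2>/dev/null | sort -h
}
```

Running hidden_usage ~ lists the hidden files and directories of your home directory, smallest first, each with its size.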

Changing file/directory ownership and permissions

Every file or directory in a Linux system is owned by both a user and a group. User and group ownerships are not connected, so a file can have as group owner a group that its user owner does not belong to.

Being able to manipulate who owns a file and what permissions any user has on that file is often important in a multi-user system such as Mufasa. This is a recapitulation of the main Linux commands to manipulate file permissions. Key commands are

chown to change ownership - user part
chgrp to change ownership - group part
chmod to change access permissions

All three accept option -R (uppercase) for recursive operation, so, if needed, you can change ownership and/or permissions of all contents of a directory and its subdirectories with a single command.

The syntax of chown commands is

chown <new_owner> <path/to/file>

where <new_owner> is the user part of the new file ownership.

The syntax of chgrp commands is

chgrp <new_group> <path/to/file>

where <new_group> is the group part of the new file ownership.

User and group ownership for a file can also be both changed at the same time with

chown <new_owner>:<new_group> <path/to/file>

As for chmod, the easiest way to use it is with symbolic descriptions of the permissions. The format for this is

chmod [users]<+|-><permissions> <path/to/file>

where

<path/to/file> is the file or directory that the change is applied to
[users] is ugo or a subset of it; the three letters correspond respectively:
to the user who owns <path/to/file>
to the group that owns <path/to/file>
to everyone else (others)
if [users] is not specified, the change is applied to all three, except for the permission bits masked by the user's umask
+ or - correspond to adding or removing permissions
<permissions> is rwx or a subset, corresponding to read, write and execute permissions

Note that r, w and x permissions have a different meaning for files and for directories.

For files
permission r allows reading the contents of the file
permission w allows changing the contents of the file
permission x allows executing the file (provided that it is a program: e.g., a shell script)
For directories
permission r allows listing the files within the directory
permission w allows creating, renaming, or deleting files within the directory
permission x allows entering the directory (i.e., cd into it) and accessing its files

For instance

chmod g+rwx myfile.txt

adds permission to read, write and execute myfile.txt to all the Linux users belonging to the group that owns the file;

chmod go-x mydir

takes away permission to enter directory mydir from everyone except the user who owns the directory.
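
The effect of the two commands can be checked safely on throw-away files. The sketch below works in a temporary directory and prints the permission string after each change; the helper name perm_string and the starting permissions (600 for the file, 755 for the directory) are choices made only for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the 10-character permission string of a path.
perm_string() {
  ls -ld -- "$1" | cut -c1-10
}

work=$(mktemp -d)                 # throw-away working directory
touch "$work/myfile.txt"
mkdir "$work/mydir"
chmod 600 "$work/myfile.txt"      # start from -rw-------
chmod 755 "$work/mydir"           # start from drwxr-xr-x

chmod g+rwx "$work/myfile.txt"    # add read, write, execute for the group
perm_string "$work/myfile.txt"    # prints -rw-rwx---

chmod go-x "$work/mydir"          # remove "enter" permission for group and others
perm_string "$work/mydir"         # prints drwxr--r--

rm -rf "$work"
```

Note that only the letters named in the command change: the permissions of the owning user are the same before and after both commands.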

If you want additional information about how file and directory permissions work in a Linux system, this is a good online guide.

Docker containers


As a general rule, all computation performed on Mufasa must occur within Docker containers. From Docker's documentation:

Docker is an open platform for developing, shipping, and running applications. Docker enables you to separate your applications from your infrastructure.

Docker provides the ability to package and run an application in a loosely isolated environment called a container. The isolation and security allow you to run many containers simultaneously on a given host. Containers are lightweight and contain everything needed to run the application, so you do not need to rely on what is currently installed on the host.

A container is a sandboxed process on your machine that is isolated from all other processes on the host machine. When running a container, it uses an isolated filesystem. [containing] everything needed to run an application - all dependencies, configuration, scripts, binaries, etc. The image also contains other configuration for the container, such as environment variables, a default command to run, and other metadata.

Using Docker allows each user of Mufasa to build the software environment that their job(s) require. In particular, using Docker containers enables users to configure their own (containerized) system and install any required libraries on their own, without need to ask administrators to modify the configuration of Mufasa. As a consequence, users can freely experiment with their (containerized) system without risk to the work of other users and to the stability and reliability of Mufasa. In particular, containers allow users to run jobs that require multiple and/or obsolete versions of the same library.

A large number of preconfigured Docker containers are already available, so users do not usually need to start from scratch when preparing the environment where their jobs will run on Mufasa. The official Docker container repository is Docker Hub.

How to run Docker containers on Mufasa is explained in User Jobs. There is also a page of this wiki dedicated to the preparation of Docker containers.

The SLURM job scheduling system


Mufasa uses SLURM (Slurm Workload Manager, formerly known as Simple Linux Utility for Resource Management) to manage shared access to its resources.

Users of Mufasa must use SLURM to run and manage all processing-heavy jobs they run on the machine. It is possible for users to run jobs without using SLURM; however, running jobs this way is only intended for “housekeeping” activities and only provides access to a small subset of Mufasa's resources. For instance, jobs run outside SLURM cannot access the GPUs, can only use a few processor cores, and can only access a small portion of RAM. Using SLURM is therefore necessary for any resource-intensive job.

From SLURM's documentation:

Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. Slurm has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.

The use of a job scheduling system such as SLURM ensures that Mufasa's resources are exploited in an efficient way. Because jobs are scheduled, a job is usually not executed immediately when it is launched: instead, the job gets queued and will be executed as soon as possible, according to the availability of resources in the machine.

Useful references for SLURM users are the collected man pages and the command overview.

SLURM is capable of managing complex computing systems composed of multiple clusters (i.e. sets) of machines, each comprising one node (i.e. machine) or more. The case of Mufasa is the simplest of all: Mufasa is the single node (called gn01) of a SLURM computing cluster composed of that single machine.
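The single-node layout can be inspected with SLURM's own client tools. The sketch below assumes that `scontrol` and `sinfo` are available on the machine where it is run (as they should be on Mufasa) and simply reports that they are missing elsewhere; the variable names `node_name` and `status` are illustrative.

```shell
# Name of Mufasa's only SLURM node, as stated above.
node_name="gn01"

if command -v scontrol >/dev/null 2>&1 && command -v sinfo >/dev/null 2>&1; then
    # Detailed description of the node: cores, memory, generic resources, state.
    scontrol show node "$node_name"
    # Node-oriented summary of the whole cluster, one line per node and partition.
    sinfo --Node --long
    status="queried"
else
    # Not on a machine with the SLURM client tools installed.
    echo "SLURM client tools not found: run this after logging into Mufasa."
    status="skipped"
fi
```

On a single-node cluster such as Mufasa, every line printed by `sinfo --Node` is expected to refer to the same node.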

In order to let SLURM schedule job execution, before launching a job a user must specify what resources (such as RAM, processor cores, GPUs, ...) it requires. In managing process queues, SLURM considers such requirements and matches them with available resources. As a consequence, resource-heavy jobs generally wait longer before they get executed, while less demanding jobs are usually put into execution quickly. Processes that, while they are running, try to use more resources than they requested at launch time get killed by SLURM.
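As an illustration of what such a request can look like, the sketch below assembles an `srun` command line from a set of hypothetical requirements and prints it instead of running it. The options `--cpus-per-task`, `--mem`, `--gres` and `--time` are standard SLURM options; the values, the GPU resource name `gpu` and the command `hostname` are assumptions made for this example, and the names actually configured on Mufasa may differ.

```shell
# Hypothetical requirements for one job: adjust them to what the job really needs.
cpus=4                  # processor cores
mem="16G"               # RAM
gpus=1                  # number of GPUs
time_limit="02:00:00"   # maximum run time (hours:minutes:seconds)
job_command="hostname"  # placeholder for the actual program to run

# Assemble the resource request. It is printed, not executed (dry run).
request="srun --cpus-per-task=$cpus --mem=$mem --gres=gpu:$gpus --time=$time_limit $job_command"
echo "$request"
```

Asking for less than the job needs risks having it killed while running, and asking for more than it needs keeps it in the queue longer, so the values are worth estimating with some care.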

All in all, the take-away message is: consider carefully how much of each resource to request for your job.

User Jobs explains how the process of requesting resources is greatly simplified by making use of process queues with predefined resource allocations, called partitions.