<aside> 💡 Welcome to the Quick-Start Guide for Using the Abacus Cluster!

Please note that this guide is not exhaustive. For more detailed information, please refer to the main page and explore the introductory section for an overview of a typical workflow. This guide aims to get you up and running quickly with the essential steps and best practices for using the Abacus cluster.

</aside>

Cluster Overview

This cluster is organized as follows:

  1. Compute nodes: at the moment we have three GPU-capable nodes:

    1. two nodes with 8xL40S (maryam and fermi)
    2. one node with 8xH100 (gauss)

    Users are not allowed to log in directly to the compute nodes.

    In addition, each compute node has plenty of very fast NVMe storage, to optimize GPU usage.

  2. Login node: a dedicated node that serves as the user’s entry point for requesting compute resources

  3. Shared storage: a storage server with fast SSD disks on which we create and share data across compute nodes

Resource access and management are handled by the SLURM workload manager.

In order to improve the reproducibility of your work and maintainability of the cluster, your code must be run inside containers.

Connecting to the cluster

To access the cluster, follow these steps:

  1. Set Up the Jumphost

    First, configure the jumphost by following the instructions provided on this page.

  2. Request Access

    Contact one of the system administrators to request access to the server. Ask them to add you to the server and provide you with your username and password.

  3. Log in to the Cluster

    Once you have your credentials, you can log in to the login node (a virtual machine from which you will launch your code on the compute nodes) via SSH using the following command:

    ssh "username"@abacus-login.fbk.eu

    Replace "username" with the username provided by the administrator.

  4. Change Your Password

    On your first login, you will be prompted to enter the password provided by the admins. After logging in, change your password using the following command:

    passwd "username"

  5. Set Up Passwordless SSH

    To avoid entering your password every time you connect, you can set up SSH key-based authentication. The steps differ slightly depending on whether you're using Windows or Linux:

    For Linux:

    a. Generate an SSH Key Pair (if you don’t already have one):

    ssh-keygen -t rsa -b 4096 -C "comment-for-remembering-its-purpose/remote-location"
    

    Follow the prompts, and when asked where to save the key, press Enter to accept the default location (~/.ssh/id_rsa). You should also set a passphrase for extra security.

    b. Copy Your Public Key to the Cluster:

    ssh-copy-id "username"@abacus-login.fbk.eu
    

    You will be prompted for your password one last time during this step. The command will copy your public key to the appropriate location on the server, enabling passwordless login in the future.

    c. Verify Passwordless Login:

    After copying the key, you can test the connection:

    ssh "username"@abacus-login.fbk.eu
    

    If everything is set up correctly, you should log in without being prompted for your account password; you will be asked only for your SSH key passphrase (if you set one, as you should).

    <aside> 💡 HINT:

    Here is an example config file for your .ssh directory (~/.ssh/config) that lets you SSH into the frontend (login node) via FBK’s jumphost:

    Host fbkjumphost
      HostName jump.fbk.eu
      User <your user>
      IdentityFile <path/to/your/PRIVATE/key>
        
    Host login
      HostName abacus-login.fbk.eu
      User <your user>
      IdentityFile  <path/to/your/PRIVATE/key>
      LocalForward <choose-a-port> localhost:<choose-a-port>
      ProxyJump fbkjumphost
    

    Now you can connect by running ssh login.

    </aside>
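Before connecting for real, it can help to check how ssh resolves the login alias. The sketch below is only an illustration, not part of the official setup: it writes a throwaway config to a temporary file (the user jdoe is a made-up placeholder, and the IdentityFile and LocalForward lines are left out for brevity) and uses ssh -G, which prints the options ssh would apply to a host without opening a connection.

```shell
# Throwaway copy of the config, so the real ~/.ssh/config stays untouched.
# "jdoe" is a hypothetical username; replace it with your own.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
Host fbkjumphost
  HostName jump.fbk.eu
  User jdoe

Host login
  HostName abacus-login.fbk.eu
  User jdoe
  ProxyJump fbkjumphost
EOF

# ssh -G prints the resolved options for a host and exits without connecting.
if command -v ssh >/dev/null 2>&1; then
  resolved="$(ssh -G -F "$cfg" login | grep -E '^(hostname|user|proxyjump) ')"
  printf '%s\n' "$resolved"
fi
```

To check your real configuration, drop the -F option and run ssh -G login. The hostname line should show abacus-login.fbk.eu and the proxyjump line should show fbkjumphost.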

    For Windows:

    a. Install OpenSSH (if not already installed):

    Recent versions of Windows 10 and 11 include OpenSSH by default. If it’s not installed, you can add it through the "Optional Features" in the Windows settings.

    b. Generate an SSH Key Pair:

    Open a PowerShell terminal and run:

    ssh-keygen -t rsa -b 4096 -C "<your email address>"
    

    Follow the prompts, and when asked where to save the key, press Enter to accept the default location (C:\Users\YourUsername\.ssh\id_rsa).

    c. Copy Your Public Key to the Cluster:

    Use the following command in PowerShell:

    ssh-copy-id "username"@abacus-login.fbk.eu
    

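    If ssh-copy-id is not available, as is often the case on Windows, you can copy the public key manually by appending the contents of your id_rsa.pub file to ~/.ssh/authorized_keys on the login node.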

    d. Verify Passwordless Login:

    After copying the key, test the connection:

    ssh "username"@abacus-login.fbk.eu
    

    If everything is set up correctly, you should log in without being prompted for your account password; you will be asked only for your SSH key passphrase (if you set one, as you should).
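If ssh-copy-id is missing on Windows (step c above), the key can be installed by hand. A minimal sketch, assuming you can still log in with your password: print your public key locally (for example with type $env:USERPROFILE\.ssh\id_rsa.pub in PowerShell), log in to the login node, and append that single line to ~/.ssh/authorized_keys. The helper name install_pubkey and its optional second argument are hypothetical, introduced only for illustration.

```shell
# Hypothetical helper: append one public key to authorized_keys.
# Run it on the login node after logging in with your password.
install_pubkey() {
  pubkey="$1"                    # full contents of id_rsa.pub (a single line)
  ssh_dir="${2:-$HOME/.ssh}"     # target directory, defaults to ~/.ssh
  auth_file="$ssh_dir/authorized_keys"

  # Conventional permissions: only the owner may access these files.
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  touch "$auth_file" && chmod 600 "$auth_file"

  # Append the key only if this exact line is not already present.
  grep -qxF -- "$pubkey" "$auth_file" || printf '%s\n' "$pubkey" >> "$auth_file"
}

# Example call (paste your own key between the quotes):
#   install_pubkey "<contents of id_rsa.pub>"
```

Calling the helper twice with the same key leaves a single entry, so it is safe to re-run if you are unsure whether the key was already added.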

Working with your data

The cluster is equipped with a storage node named laplace, which is mounted across all compute nodes at /storage. To work with your data on the cluster, you must first copy your local data to this storage location.

To copy your local data to the appropriate folder on the storage node, use the following scp command (assuming you’ve created the ssh config file):