Leonardo HPC — Environment Setup Guide

1. Setting Up an Account

Create your account at the CINECA UserDB portal: https://userdb.hpc.cineca.it/

Once you have set up your UserDB profile and obtained your CINECA account name, follow the steps below.


2. SSH Access (macOS)

Install step CLI

brew install step

If Homebrew is not yet installed, run the following first:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Bootstrap the CA

step ca bootstrap \
  --ca-url=https://sshproxy.hpc.cineca.it \
  --fingerprint 2ae1543202304d3f434bdc1a2c92eff2cd2b02110206ef06317e70c1c1735ecd

Generate an SSH Certificate

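Replace your.name@email.com with the email you used when registering on UserDB, and key_filename with the name of the key file to create. The key and its certificate are written relative to the current directory, so run the command from ~/.ssh (or pass a full path) so that the files match the IdentityFile path used in the SSH configuration: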

step ssh certificate your.name@email.com --provisioner cineca-hpc key_filename
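
To confirm that a certificate was issued and to see its validity window, you can inspect it with ssh-keygen. The following is a minimal sketch, assuming step wrote the certificate next to the private key as key_filename-cert.pub; show_cert is a hypothetical helper name and is not part of step or OpenSSH.

```shell
# Hypothetical helper: print the details of the SSH certificate that belongs
# to a private key. Assumes the certificate sits beside the key as
# <key>-cert.pub.
show_cert() {
    local cert="$1-cert.pub"
    if [ ! -f "$cert" ]; then
        echo "no certificate found at $cert" >&2
        return 1
    fi
    # -L prints the certificate contents, including its "Valid:" interval
    ssh-keygen -L -f "$cert"
}
```

For example, show_cert ~/.ssh/key_filename prints the certificate details, or an error if no certificate exists at that path.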

Configure SSH

Add the following to ~/.ssh/config (if the file doesn't exist, create it, for example with nano ~/.ssh/config):

Host leonardo
    HostName login07-ext.leonardo.cineca.it
    User your_cineca_username
    IdentityFile /Users/your_mac_username/.ssh/key_filename

Set correct permissions:

chmod 600 ~/.ssh/config

You can now connect simply with:

ssh leonardo
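
If ssh complains about file permissions, a quick check of the config and key files can help. This is a sketch only: file_mode and check_private are hypothetical helper names, and the check applies the same owner-only rule as the chmod 600 step above rather than reproducing OpenSSH's own checks.

```shell
# Hypothetical helper: print a file's permission bits in octal (e.g. 600).
# Tries the GNU form of stat first, then falls back to the BSD/macOS form.
file_mode() {
    stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1"
}

# Hypothetical helper: succeed only if group and others have no access.
check_private() {
    local mode
    mode=$(file_mode "$1") || return 1
    case "$mode" in
        *00|0) echo "ok: $1 ($mode)" ;;
        *)     echo "too open: $1 ($mode), run: chmod 600 $1" >&2
               return 1 ;;
    esac
}
```

For example, check_private ~/.ssh/config and check_private ~/.ssh/key_filename each report whether the file is restricted to its owner.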

3. Setting Up the Python Environment on the Cluster

Load Python module

module purge
module load python/3.11.7

Install uv via a bootstrap environment

Because uv is not available on the cluster by default and packages cannot be installed from PyPI outside a virtual environment, first create a temporary environment and install uv into it:

python -m venv "$HOME/elliot-env"
source "$HOME/elliot-env/bin/activate"
pip install uv

Make uv permanently available

Copy the uv binary to ~/.local/bin so it persists across all environments:

mkdir -p ~/.local/bin
cp $HOME/elliot-env/bin/uv ~/.local/bin/uv
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

Verify:

uv --version
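
Note that the echo ... >> ~/.bashrc line above appends a new copy of the export every time it is run. If you expect to repeat these steps, an idempotent variant could look like the sketch below; append_once is a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so re-running the setup does not duplicate it.
append_once() {
    local file="$1"
    local line="$2"
    touch "$file"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example, append_once ~/.bashrc 'export PATH="$HOME/.local/bin:$PATH"' can replace the echo line and is safe to run more than once.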

Install Python 3.12 via uv

The project requires Python 3.12, which is not available as a cluster module. uv can fetch it directly:

uv python install 3.12

Create the project environment

uv venv -p 3.12 elliot-venv
source elliot-venv/bin/activate

4. Install the Project

Clone the repository

git clone https://github.com/elliot-project/elliot-cli.git

Install in editable mode

uv pip install -e ./elliot-cli

5. Set Hugging Face Cache Directory

Compute nodes have no internet access, so all models and datasets must be pre-downloaded. Set HF_HOME to point to a storage area large enough to hold your models and datasets.

Warning: $HOME on Leonardo has a quota of only 50 GB — far too small for most LLMs and multimodal datasets. Do not use $HOME as your HF cache directory.

Leonardo provides several larger storage areas (see the CINECA documentation for full details):

| Area | Quota | Purge policy | Notes |
|---|---|---|---|
| $WORK | 1 TB | 6 months post-project | Persistent, parallel I/O, recommended |
| $FAST | 1 TB | 6 months post-project | Faster I/O than $WORK, no extension option |
| $SCRATCH | 20 TB | Files purged after 40 days of inactivity | Temporary only |

Recommended: use $WORK for persistent caches (models you reuse across runs):

mkdir -p $WORK/hf_cache
export HF_HOME="$WORK/hf_cache"
echo 'export HF_HOME="$WORK/hf_cache"' >> ~/.bashrc
source ~/.bashrc
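
Before downloading large models, it can be worth checking that HF_HOME really points outside $HOME. The sketch below does that and reports the current cache size; check_hf_home is a hypothetical helper name, and the check is only a plain path-prefix comparison.

```shell
# Hypothetical helper: reject an HF_HOME that is unset or lives under $HOME,
# otherwise report how much space the cache currently uses.
check_hf_home() {
    if [ -z "${HF_HOME:-}" ]; then
        echo "HF_HOME is not set" >&2
        return 1
    fi
    case "$HF_HOME" in
        "$HOME"|"$HOME"/*)
            echo "HF_HOME ($HF_HOME) is inside \$HOME, which has a small quota" >&2
            return 1 ;;
    esac
    # -s summarize the whole directory, -h human-readable size
    du -sh "$HF_HOME"
}
```

Running check_hf_home in a new shell also confirms that the export in ~/.bashrc took effect.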

If you need more space for a single campaign and don't need the cache long-term, $SCRATCH (up to 20 TB) is an option. Files there are automatically deleted after 40 days of inactivity, and you must not keep them alive artificially with touch:

mkdir -p $SCRATCH/hf_cache
export HF_HOME="$SCRATCH/hf_cache"
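
To see which cached files are approaching the purge window, so that anything still needed can be copied to $WORK rather than touched, a sketch along these lines may help. list_stale is a hypothetical helper name, and it assumes that "inactivity" can be approximated by file access time; the cluster's purge may use a different criterion.

```shell
# Hypothetical helper: list files under a directory whose last access time
# is more than N days old (default 30).
# Assumption: access time is a reasonable proxy for the purge criterion.
list_stale() {
    local dir="$1"
    local days="${2:-30}"
    find "$dir" -type f -atime +"$days"
}
```

For example, list_stale "$SCRATCH/hf_cache" 30 prints the files that have not been read for more than 30 days.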

6. Running Evaluations

# Run evaluations using a task group (recommended)
oellm schedule-eval \
    --models "microsoft/DialoGPT-medium,EleutherAI/pythia-160m" \
    --task-groups "open-sci-0.01"

# Or specify individual tasks
oellm schedule-eval \
    --models "EleutherAI/pythia-160m" \
    --tasks "hellaswag,mmlu" \
    --n-shot 5

Additional Resources

Full instructions are also available here: https://iffmd.fz-juelich.de/e-hu5RBHRXG6DTgD9NVjig#Leonardo-Access-and-Usage-LAION-Open-Psi-open-sci-Ontocord-AI-openEuroLLM