flexi-framework/saexi

About

Python scripts to generate meshes and execute queue jobs to evaluate the scaling performance of the FLEXI-family of physics codes.

Requirements

These scripts are written with the limited set of modules available on HPC systems in mind. In general, the only requirements are a working Python environment, a HOPR executable, the executable for one of the FLEXI-family of physics codes, and access to the HPC queueing system.

Python modules

The following Python modules are required and can be installed using the included requirements.txt file.

pip install --user -r requirements.txt

⚠️ The minimum required Python version is 3.9, since NumPy changed its boolean data format!

⚠️ Make sure to check the policies of the HPC system you are using before installing pip packages!
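
The minimum Python version noted above can be checked before running any of the scripts. The following is a minimal sketch, not part of SÆXI itself; the function names `python_is_supported` and `check_python` are hypothetical.

```python
import sys

# Minimum interpreter version stated in this README.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def check_python():
    """Raise RuntimeError with a readable message on an unsupported interpreter."""
    if not python_is_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        raise RuntimeError(f"Python 3.9 or newer is required, found {found}")
```

Calling `check_python()` at the top of a job script fails early instead of partway through a run.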

Usage

General

To run SÆXI, invoke the main script with:

saexi.py [mode] [-h] [-v]

SÆXI has 5 modes of operation, each intended for a different stage of the scaling test process.

| Mode | Description |
| --- | --- |
| prepare | Generate the meshes for the scaling jobs |
| submit | Generate the job data and submit scaling jobs to the queue |
| analyze | Parse the performance information from the scaling jobs |
| archive | Archive the scaling results in the scaling database |
| fetch | Pull latest results for this system from the scaling database and plot |

The specific command and arguments for each mode are given in the subsequent sections.
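
A mode-based command line like this one is commonly built with `argparse` subparsers. The sketch below only illustrates that pattern using the five documented mode names; it is an assumption about structure, not SÆXI's actual parser, and `build_parser` is a hypothetical name.

```python
import argparse

# The five mode names and descriptions come from the README;
# the parser layout itself is illustrative only.
MODES = {
    "prepare": "Generate the meshes for the scaling jobs",
    "submit": "Generate the job data and submit scaling jobs to the queue",
    "analyze": "Parse the performance information from the scaling jobs",
    "archive": "Archive the scaling results in the scaling database",
    "fetch": "Pull latest results from the scaling database and plot",
}


def build_parser():
    """Build a parser that requires exactly one of the documented modes."""
    parser = argparse.ArgumentParser(prog="saexi.py")
    subparsers = parser.add_subparsers(dest="mode", required=True)
    for name, description in MODES.items():
        subparsers.add_parser(name, help=description)
    return parser


args = build_parser().parse_args(["analyze"])
print(args.mode)  # analyze
```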

1. PREPARE mode - Mesh generation

The prepare mode generates the required mesh files in the scaling directory. It uses the mesh generation software HOPR to generate a series of simple box grids.

saexi.py prepare [hopr] [-m MESH_PREFIX] [-f PARAM_FILE] [-s START] [-e END] [-x XSIZE] [-y YSIZE] [-z ZSIZE] [-i CONFIGFILE]

with the arguments:

| Argument | Description |
| --- | --- |
| hopr | Path to HOPR executable (REQUIRED) |
| -m MESH_PREFIX, --mesh-prefix MESH_PREFIX | Path to the directory meshes will be written to (REQUIRED) |
| -f PARAM_FILE, --parameterfile PARAM_FILE | Path to the template parameter file that will be used to generate HOPR input files (default: included parameter_hopr.org) |
| -s START, --start START | Start power of 2 for scaling meshes and core counts (default: 1, i.e., 2^START) |
| -e END, --end END | End power of 2 for scaling meshes and core counts (default: 1, i.e., 2^END) |
| -x XSIZE, --xsize XSIZE | Number of elements in x-direction of first mesh (default: 16) |
| -y YSIZE, --ysize YSIZE | Number of elements in y-direction of all meshes (default: 16) |
| -z ZSIZE, --zsize ZSIZE | Number of elements in z-direction of all meshes (default: 16) |
| -i CONFIGFILE, --configfile CONFIGFILE | Path to which a SÆXI configuration save file will be written (YAML format). If no path is supplied, no file is written. |
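
To see how START, END and the mesh sizes could interact, the sketch below lists one possible mesh series. It rests on an assumption not confirmed here: that the x-direction element count doubles with each power of 2 while y and z stay fixed, so the elements per core remain constant. The function name `weak_scaling_series` is hypothetical.

```python
def weak_scaling_series(start=1, end=1, xsize=16, ysize=16, zsize=16):
    """List (cores, (nx, ny, nz), total_elements) for powers start..end.

    ASSUMPTION: the x-size doubles with every step; y and z stay fixed.
    """
    series = []
    for step, power in enumerate(range(start, end + 1)):
        cores = 2 ** power
        nx = xsize * 2 ** step  # first mesh uses xsize unchanged
        series.append((cores, (nx, ysize, zsize), nx * ysize * zsize))
    return series


for cores, dims, total in weak_scaling_series(start=1, end=3):
    print(f"{cores:4d} cores: {dims[0]}x{dims[1]}x{dims[2]} = {total} elements")
```

Under this assumption, every entry has the same number of elements per core, which is the defining property of a weak scaling series.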

2. SUBMIT mode - Generate scaling jobs and submit

The submit mode generates the results directory structure, symlinks the mesh files, and generates the parameter files.

If a supported HPC system is detected, the script also automatically submits the scaling jobs. Additional HPC systems can be defined in the .queues directory in the project root.

saexi.py submit [flexi] [-p PREFIX] [-m MESH_PATH] [-a ACCOUNT] [-q QUEUE] [-c SCALING] [-r RUNS] [-t TIME] [-f PARAM_FILE] [-s START] [-e END] [-x XSIZE] [-y YSIZE] [-z ZSIZE] [-i CONFIGFILE]

with the arguments:

| Argument | Description |
| --- | --- |
| flexi | Path to FLEXI-family code executable (REQUIRED) |
| -m MESH_PREFIX, --mesh-prefix MESH_PREFIX | Path to the directory meshes will be written to (REQUIRED) |
| -p PREFIX, --prefix PREFIX | Path to the scaling results directory (REQUIRED) |
| -a ACCOUNT, --account ACCOUNT | Account name/ID on the current HPC system (REQUIRED) |
| -q QUEUE, --queue QUEUE | Desired queue to submit the scaling runs to (default: from .queues file for current HPC system) |
| -c SCALING, --scaling SCALING | Scaling strategy, strong or weak (default: weak) |
| -r RUNS, --runs RUNS | Number of runs per configuration (default: 3) |
| -t TIME, --time TIME | Estimated runtime for each case in minutes (default: 25) |
| -f PARAM_FILE, --parameterfile PARAM_FILE | Path to the template parameter file that will be used to generate FLEXI input files (default: included parameter_flexi.org) |
| -s START, --start START | Start power of 2 for scaling meshes and core counts (default: 1, i.e., 2^START) |
| -e END, --end END | End power of 2 for scaling meshes and core counts (default: 1, i.e., 2^END) |
| -x XSIZE, --xsize XSIZE | Number of elements in x-direction of mesh for strong scaling or first mesh for weak scaling (default: 16, REQUIRED for strong scaling) |
| -y YSIZE, --ysize YSIZE | Number of elements in y-direction of mesh for strong scaling or all meshes for weak scaling (default: 16) |
| -z ZSIZE, --zsize ZSIZE | Number of elements in z-direction of mesh for strong scaling or all meshes for weak scaling (default: 16) |
| -i CONFIGFILE, --configfile CONFIGFILE | Path from/to which a SÆXI configuration save file will be read/written (YAML format). If no path is supplied, no file is read and a default file is written in PREFIX. Values from the read file are overwritten by command line arguments. |
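
The precedence rule for CONFIGFILE (command line arguments overwrite values read from the file) can be sketched as a simple merge. This is an illustration under assumptions: plain dictionaries stand in for the parsed YAML file, the keys shown are hypothetical, and `None` is assumed to mark an argument that was not given on the command line.

```python
def merge_config(file_values, cli_values):
    """Return settings where explicitly given CLI values override file values."""
    merged = dict(file_values)
    for key, value in cli_values.items():
        if value is not None:  # None marks "not given on the command line"
            merged[key] = value
    return merged


# Hypothetical keys; the real configuration file layout may differ.
saved = {"scaling": "weak", "runs": 3, "time": 25}
cli = {"scaling": "strong", "runs": None, "time": None}
print(merge_config(saved, cli))
# {'scaling': 'strong', 'runs': 3, 'time': 25}
```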

3. ANALYZE mode - Parse, process and write performance data

The analyze mode traverses the scaling directory, assembles the output information from each run, and generates the final CSV output.

saexi.py analyze [-p PREFIX] [-i CONFIGFILE]

with the arguments:

| Argument | Description |
| --- | --- |
| -p PREFIX, --prefix PREFIX | Path to the scaling results directory (REQUIRED) |
| -i CONFIGFILE, --configfile CONFIGFILE | Path to a SÆXI configuration save file (YAML format). If no path is supplied, the default file written in PREFIX by the submit step is used. |

The final scaling results are written to a file called results.csv, which is placed in the scaling results directory given in PREFIX.
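
Once timing data is available in CSV form, a parallel efficiency can be derived from it. The sketch below computes strong scaling efficiency relative to the smallest run, E = (T_ref * p_ref) / (T * p). The column names `cores` and `walltime` and the sample numbers are made up for illustration; the actual layout of results.csv is not described here and may differ.

```python
import csv
import io

# Made-up sample data with HYPOTHETICAL column names, for illustration only.
SAMPLE = """cores,walltime
2,100.0
4,52.0
8,28.0
"""


def read_rows(handle):
    """Read (cores, walltime) pairs from a CSV file-like object."""
    return [(int(r["cores"]), float(r["walltime"])) for r in csv.DictReader(handle)]


def strong_scaling_efficiency(rows):
    """Map core count to efficiency relative to the run with the fewest cores."""
    ref_cores, ref_time = min(rows, key=lambda row: row[0])
    return {cores: (ref_time * ref_cores) / (time * cores) for cores, time in rows}


rows = read_rows(io.StringIO(SAMPLE))
for cores, efficiency in strong_scaling_efficiency(rows).items():
    print(f"{cores:3d} cores: efficiency {efficiency:.2f}")
```

For weak scaling, where the work per core is constant, the corresponding measure would be E = T_ref / T.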

4. ARCHIVE mode - Store scaling results in database

The archive mode stores the scaling data from a run in the FLEXI scaling database.

⚠️ Running this mode requires access to the IAG GitLab and is intended for INTERNAL USE ONLY.

saexi.py archive [-p PREFIX] [-g GIT_REPO] [--mpi MPI] [--compiler COMPILER]

with the arguments:

| Argument | Description |
| --- | --- |
| -p PREFIX, --prefix PREFIX | Path to the scaling results directory (REQUIRED) |
| -g GIT_REPO, --gitrepo GIT_REPO | Path to the LOCAL clone of the Git repository for the FLEXI-family solver used for the scaling tests (REQUIRED) |
| --mpi MPI | Name and version of the MPI library used to build the FLEXI-family solver for the scaling runs being archived. Format as name/version (e.g., openmpi/5.0.3) (REQUIRED) |
| --compiler COMPILER | Name and version of the compiler used to build the FLEXI-family solver for the scaling runs being archived. Format as name/version (e.g., gnu/13.1.0) (REQUIRED) |

This mode takes the results.csv file from the given PREFIX, generates a YAML file with information about the run and system configuration used for the cases and then commits both to the scaling database.
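
The examples openmpi/5.0.3 and gnu/13.1.0 suggest that the --mpi and --compiler values follow a name/version pattern. A value could be checked before archiving with a small helper like the one below; `parse_name_version` is a hypothetical function, not part of SÆXI, and the strictness of the check is an assumption.

```python
def parse_name_version(value):
    """Split 'name/version' into (name, version); raise ValueError if malformed."""
    name, separator, version = value.partition("/")
    if not separator or not name or not version or "/" in version:
        raise ValueError(f"expected name/version, got {value!r}")
    return name, version


print(parse_name_version("openmpi/5.0.3"))  # ('openmpi', '5.0.3')
print(parse_name_version("gnu/13.1.0"))     # ('gnu', '13.1.0')
```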

5. FETCH mode - Retrieve and plot scaling data from database

The fetch mode pulls the scaling data for the latest run on your current HPC system from the FLEXI scaling database and generates a scaling plot from it.

The generated plot is placed in the directory saexi.py is called from.

⚠️ Running this mode requires access to the IAG GitLab and is intended for INTERNAL USE ONLY.

saexi.py fetch [-g GIT_REPO] [-c SCALING]

with the arguments:

| Argument | Description |
| --- | --- |
| -g GIT_REPO, --gitrepo GIT_REPO | Path to the LOCAL clone of the Git repository for the FLEXI-family solver used for the scaling tests (REQUIRED) |
| -c SCALING, --scaling SCALING | Scaling strategy, strong or weak (default: weak) |

Adding new clusters

This script uses pysqa to abstract the cluster details and perform job submission. To add a new cluster, add its name to .queues/cluster.yaml. Then, place a matching queue description in .queues/cluster_queues.yaml and provide the necessary job submission script cluster.sh.
