Skip to content

Move the regression suite from Julia into Catch2#735

Open
hughcars wants to merge 45 commits into
mainfrom
hughcars/regression-test-refactor
Open

Move the regression suite from Julia into Catch2#735
hughcars wants to merge 45 commits into
mainfrom
hughcars/regression-test-refactor

Conversation

@hughcars

Copy link
Copy Markdown
Collaborator

Move the regression suite from Julia into Catch2

Replaces the Julia harness (test/examples/runtests.jl + helpers) with regression cases written in Catch2 and built into palace-unit-tests. Reference CSVs and per-case tolerances are unchanged.

Architecture

  • palace::Run — library entry point lifted out of main.cpp's solve block. Production main() and the test binary both call it.
  • test/unit/regression_helpers.{hpp,cpp}RunRegressionCase stages each case under $TMPDIR/palace-regression/<case>/ (symlinks of examples/<case>/*), runs palace::Run, then on rank 0 only diffs every reference CSV column-wise via Catch2's WithinRel || WithinAbs matchers. Source tree stays read-only.
  • test/unit/regression/cases.cpp — 27 TEST_CASEs tagged [Regression]. Two shared check factories cover the magnitude-only patterns (CompareComplexMagnitudes for paired Re{X}/Im{X} columns, TestFarfield for the 3-component vector magnitude on farfield-rE.csv).

CTest entries register as regression-<case> (-np 2 by default; CMake var PALACE_REGRESSION_NUMPROC overrides). CI bypasses CTest and invokes mpirun -np $NUM_PROC_TEST palace-unit-tests "[Regression]" directly so rank count tracks the runner.

See test/unit/README.md for the filter model, override precedence, and how to add a case.

Validation

ctest -R "^regression-" -j 2 → 27/27 pass (~39m on a 2026 M-series). Tampering with a reference CSV fails the suite with Catch2's matcher diagnostic naming row, column, ref/actual, and tolerance.

Notes for reviewers

  • Stacked on hughcars/boundarymode-2d; the 2D cases (cavity2d/*, cpw2d/*, cpw/wave_2dmode) only exist on that base.
  • macOS workflow drops julia-actions/setup-julia (only runtests.jl used it). Linux keeps it for plot_farfield.jl.
  • adapter/slp and transmon/* are not ported — both were dormant in Julia's default case list (slp had no fixture; transmon was cost-gated).
  • Re-baselining is now palace -np N <config> && cp postpro/...csv test/examples/ref/....

@hughcars hughcars force-pushed the hughcars/regression-test-refactor branch 4 times, most recently from 5f279e1 to f5f2715 Compare May 18, 2026 22:10
Base automatically changed from hughcars/boundarymode-2d to main May 19, 2026 21:24
@hughcars hughcars force-pushed the hughcars/regression-test-refactor branch from f5f2715 to 71be6df Compare May 19, 2026 21:25
@hughcars hughcars requested a review from Sbozzolo May 19, 2026 21:26
@hughcars hughcars marked this pull request as ready for review May 19, 2026 21:26
@hughcars hughcars removed the request for review from Sbozzolo May 19, 2026 21:26
@hughcars hughcars force-pushed the hughcars/regression-test-refactor branch 9 times, most recently from 97c82ec to 2e28389 Compare May 29, 2026 20:04
@hughcars hughcars added the trigger-long-tests Trigger running long tests for this PR now label May 29, 2026
@hughcars hughcars force-pushed the hughcars/regression-test-refactor branch 4 times, most recently from 05a0418 to c400e4b Compare June 2, 2026 17:46
@hughcars hughcars requested review from Sbozzolo and simlapointe June 8, 2026 13:37
@hughcars hughcars force-pushed the hughcars/regression-test-refactor branch from 6683962 to 6fa1bb4 Compare June 9, 2026 13:58

@simlapointe simlapointe left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, tested it on my end and things are mostly working. Flagged a few potential issues below.

Comment thread test/unit/regression/cases.cpp Outdated
// fixtures are added.

// ===========================================================================
// 2D cases. Live on hughcars/boundarymode-2d; not yet on origin/main.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outdated comment


if [[ "${{ inputs.variant }}" == *"+cuda"* ]]; then
export PALACE_DEVICE=GPU
SOLVER_OVERRIDE+=(--palace-device GPU)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think --palace-device GPU is sufficient to run on GPU. I think --device cuda is required to run on GPU because otherwise the actual MFEM device is still on CPU. It's a bit unclear to me what --palace-device GPU does.

I tried on my end with a GPU build and with --palace-device GPU it uses the cpu:

user:~/palace/build$ mpirun -n 1 bin/palace-unit-tests  --palace-device GPU cpw2d_thin
Device configuration: cpu
Memory configuration: host-umpire
libCEED backend: /cpu/self/xsmm/blocked
Filters: "cpw2d_thin"
...

while with --device cuda it uses the GPU:

user:~/palace/build$ mpirun -n 1 bin/palace-unit-tests --device cuda cpw2d_thin
Device configuration: cuda,cpu
Memory configuration: host-umpire,cuda-umpire
Use GPU-aware MPI:    no
libCEED backend: /gpu/cuda/magma
Filters: "cpw2d_thin" [GPU]
...

Comment thread test/unit/regression/cases.cpp Outdated
auto mag_sq = [&](palace::Table &t)
{
auto sq = [](double x, double y) { return x * x + y * y; };
return sq(t[3].data[i], t[4].data[i]) + sq(t[5].data[i], t[6].data[i]) +

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This hard codes the farfield to be columns 3-8, which is true for driven cases but NOT for eigenmode cases so the wrong columns will be compared. See test/examples/ref/cpw/lumped_eigen/farfield-rE.csv, farfield columns are 5-10. The previous julia script counted the six complex E-field columns from the right.

opts.rtol = 2.0e-2;
opts.atol = 1.0e-11;
opts.excluded_columns = {"Maximum", "Minimum"};
palace::test::RunRegressionCase("cpw", "cpw_lumped_uniform.json", "lumped_uniform", opts);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cpw_lumped_uniform test case has a farfield-rE.csv file but we are not using TestFarfield... This is maybe ok if the test passes, but we should double check that this is intended. I think the old julia script used the custom farfield test.

Comment thread palace/driver.cpp Outdated

void Run(IoData &iodata, MPI_Comm comm, int omp_threads, const char *git_tag)
{
MakeOutputFolder(iodata, comm);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hasn't MakeOutputFolder already been called at palace/main.cpp:228?

Comment thread test/unit/main.cpp
// entry point; long regression cases are tagged [Regression][Long]
// so they're caught here too). With any explicit user selector,
// run exactly what was asked for.
if (!user_selected_tests)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this changes explicit tag/name selection semantics for normal unit tests. For example, on a single-rank run, bin/palace-unit-tests "[vector]" now includes [Parallel]-only tests such as Vector Sum - Real - Asymmetric Sizes, because the runtime [Serial] filter is skipped whenever the user provided any selector.

Comment thread test/unit/main.cpp Outdated
Comment on lines +136 to +144
// Whether the user passed any explicit test selector on the command
// line (name, wildcard, or tag spec). Bare invocation leaves this
// empty and triggers our convenience defaults below; anything
// explicit runs exactly what was asked for. This matters for
// regression cases in particular: they're tagged `[Regression]`
// only, so auto-adding `[Serial]` or `~[Regression]` would either
// fold in unrelated unit tests (separate positive filters are
// OR'd in Catch2) or silently exclude the very test the user
// selected by name.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this comment is still accurate

while IFS= read -r case; do
[[ -z "$case" ]] && continue
echo "=== Running $case ==="
mpirun -np $NUM_PROC_TEST palace-unit-tests \

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This relies on the palace-unit-tests binary’s compile-time PALACE_EXAMPLES_DIR_DEFAULT / PALACE_REGRESSION_REF_DIR_DEFAULT paths. Could this be fragile for the split Spack test jobs, because the binary may have been built/installed from a different checkout or Spack stage than this runner’s $GITHUB_WORKSPACE?

@Sbozzolo Sbozzolo left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • We should also remove the .gitlab-ci.yml (it uses the Julia harness, and we don't really use it)
  • I wonder if we should just take the plunge and use this as an opportunity to decouple the notion of "example" from "regression test"

Comment thread test/unit/regression_helpers.cpp Outdated
}

const std::size_t n_cols = std::min(actual.n_cols(), reference.n_cols());
const std::size_t n_rows = std::min(actual.n_rows(), reference.n_rows());

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When actual.n_rows() = 0 and skip_rowcount = true, no check is performed, even when reference.n_rows != 0 (which should lead to a failure)

Comment on lines +280 to +295
if (Mpi::Root(comm))
{
std::error_code ec;
std::filesystem::remove_all(stage_, ec);
std::filesystem::create_directories(stage_);
for (const auto &entry : std::filesystem::directory_iterator(example_dir))
{
const auto name = entry.path().filename().string();
// postpro/ and log/ are Palace outputs; don't stage stale copies.
if (name == "postpro" || name == "log")
{
continue;
}
std::filesystem::create_symlink(entry.path(), stage_ / name);
}
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a risk of deadlock here. If rook throws an exception, all the other ranks will indefinitely wait at the Mpi::Barrier below.

# version of spack-packages
spack-version: 'develop'
julia-version: 'release'
julia-version: 'dont-install'

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just remove julia from the setup for self-hosted runners.

Comment thread palace/main.cpp Outdated
mfem::Device device(ConfigureDevice(iodata.solver.device),
utils::GetDeviceId(world_comm, ngpu));
ConfigureCeedBackend(iodata.solver.ceed_backend);
// Initialise device + numerics. The BlockTimer is scoped to this block so it

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Initialise device + numerics. The BlockTimer is scoped to this block so it
// Initialize device + numerics. The BlockTimer is scoped to this block so it

Comment thread CHANGELOG.md
`palace-unit-tests` binary. Run with
`mpirun -np $NUM_PROC_TEST palace-unit-tests "[Regression]"` or via
CTest as `ctest -R "^regression-"`. Reference CSVs under
`test/examples/ref/` are unchanged. Re-baselining is now

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was using the baseline script. It was a nice way to quickly re-baseline one or multiple experiments at the same time. Maybe we can extract the basic functionalities in a copy&pastable bash command that we add to the docs?

@Sbozzolo

Copy link
Copy Markdown
Member

Sorry, I clicked "submit too early", I am not done yet :)

@Sbozzolo Sbozzolo left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really great feature!

Important:

  • The spack test system doesn't seem to work and only spheres get tested (see https://github.com/awslabs/palace/actions/runs/27211376257/job/80384686984?pr=735)
  • I think the why we set MPI ranks conflicts with the slot count in the ctest job server, so that we might end up getting oversubscribed count of CPUs. Fixing this, would also unify the two tests categories under ctest, which I think would be very nice.

Comment on lines +109 to +143
# Translate the action's `test-cases` input to a list of Catch2
# case names. "default" / empty -> the [Regression] tag (the
# always-on suite, enumerated dynamically); "long" -> the
# [Long] tag (transmon eigenmodes, ~10 min each); anything else
# is taken as a space-separated list of Catch2 TEST_CASE names
# (e.g. "transmon_coarse cylinder_cavity_pec"), passed through
# verbatim.
translate_cases() {
local raw="$1"
if [[ -z "$raw" || "$raw" == "default" ]]; then
palace-unit-tests "[Regression]~[Long]" --list-tests | awk '/^ [a-z]/ {print $1}'
return
fi
if [[ "$raw" == "long" ]]; then
palace-unit-tests "[Long]" --list-tests | awk '/^ [a-z]/ {print $1}'
return
fi
echo "$raw" | tr ' ' '\n'
}

# One mpirun per case. Each case runs in a fresh process so
# long-lived library state (libCEED kernel cache, MFEM /
# HYPRE singletons, PETSc heap) doesn't accumulate across the
# whole suite.
run_cases() {
while IFS= read -r case; do
[[ -z "$case" ]] && continue
echo "=== Running $case ==="
mpirun -np $NUM_PROC_TEST palace-unit-tests \
"${SOLVER_OVERRIDE[@]}" \
"$case"
done
}

julia --project=test/examples --color=yes test/examples/runtests.jl \
--palace-solver ${{ inputs.solver }} \
${TEST_CASES:+--test-cases "$TEST_CASES"}
translate_cases "${{ inputs.test-cases }}" | run_cases

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace all of this with using ctest instead of manually calling the binary?

It requires some reworking on run_regreesion_test.sh, but this code relies on string parsing and seems pretty fargile.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think that this is not working at he moment.

If I look at the test suite in this PR, I only see the sphere example being run.

E.g.: https://github.com/awslabs/palace/actions/runs/27211376257/job/80384686984?pr=735

# ctest -j N controls how many cases run concurrently.
timeout-minutes: 40
env:
NUM_PROC_TEST_MAX: '8'

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can just set PALACE_REGRESSION_NUMPROC instead of NUM_PROC is all we do is reassigning it

Comment on lines +554 to +559
# ctest fans the [Regression] cases out, one fresh
# palace-unit-tests process per case (label "regression"). Each
# process runs at $PALACE_REGRESSION_NUMPROC MPI ranks (read by
# the test/unit/run_regression_test.sh wrapper at run time);
# ctest -j N controls how many cases run concurrently. Total
# ranks in flight = NUM_PROC_TEST * N, kept <= NUM_PROC_TEST_MAX.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important comment below.

I think we should think about this a little bit more.

This is different compared to non-regression tests. For non-regression tests, -j M means that up to M slots are used (typically, one slot = CPU). This is not the number of jobs (because CMake knows that there are Parallel jobs, which use 2 slots each). So, if you run a full non-regression + regression test, I think that we would end up oversubscribing CPUs.

Regression tests do not set PROCESSORS, so ctest assigns them one slot, even if they use multiple CPUs.

It is also counter-intuitive that -j has different meaning for different classes of tests.

Comment thread palace/driver.hpp Outdated
// Run a Palace solve end-to-end on an already-parsed `iodata`.
//
// Preconditions:
// - MPI has been initialised.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// - MPI has been initialised.
// - MPI has been initialized.

Comment thread palace/driver.hpp
// - libCEED, HYPRE, and (if built in) SLEPc/PETSc are initialised.
// - `iodata.problem.output` is set; relative mesh paths in the config
// are resolvable from the current working directory (or have been
// rewritten to absolute paths by the caller).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a list of things to need to happen doesn't seem very robust. Instead, I wonder if we should have our Initialize class/function to set these things up.

I've always found difficult to understand what needs to be initialized and how (and ran into lots of problems with setting up the unit test framwrok).

Comment on lines +184 to +196
- `[Regression]` tests are end-to-end Palace solves diffed against the
reference data under `test/examples/ref/`. Slow enough to deserve
their own ctest registration (`regression-*`, label `regression`)
and skipped from the default unit-test sweep. Opt in with
`ctest -L "^regression$"` or
`palace-unit-tests "[Regression]~[Long]"`. Each case lives in
`test/unit/regression/cases.cpp`.
- `[Long]` is a *modifier* on a regression case for solves that take
long enough (~10 minutes) that they shouldn't run on every PR.
These cases are tagged `[Regression][Long]` and registered as
`long-*` ctest entries (label `long`); the long-tests CI workflow
runs them via `ctest -L "^long$"` when the `trigger-long-tests`
PR label is applied.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Long only apply for regression?

I think it should also apply to regular unit tests, so that the two axes are truly orthogonal. This would also allow us to add long unit tests, which is something we'd get for free.

Comment thread test/unit/regression/cases.cpp Outdated
Comment on lines +4 to +9
// Regression cases: full Palace solves diffed against test/examples/ref/.
// Each case carries [Serial][Parallel][GPU][Regression] (the category
// tag is orthogonal to execution style — every case is valid at any
// rank count and on either CPU or GPU). [Long] modifies a regression
// case to be skipped from the default sweep and run only under the
// long-tests CI workflow.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before, we excluded some regression tests from GPU runs because they took too long. I don't see that here and I am not sure if the intersection GPU with Regression works

Comment thread test/unit/CMakeLists.txt Outdated
Comment on lines +112 to +118
# Source-tree roots used by the regression suite. Baking these into the
# binary keeps the common path ("run the built binary") free of env
# plumbing. RunRegressionCase honours a PALACE_EXAMPLES_DIR /
# PALACE_REGRESSION_REF_DIR environment override and a Catch2 CLI flag
# for relocated installs; these are the last-resort defaults.
PRIVATE "PALACE_EXAMPLES_DIR_DEFAULT=\"${CMAKE_SOURCE_DIR}/../examples\""
PRIVATE "PALACE_REGRESSION_REF_DIR_DEFAULT=\"${CMAKE_CURRENT_SOURCE_DIR}/../examples/ref\""

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably won't work with space because of relocation of the source code. Everything that is needed for tests has to be in PALACE_TEST_DATA_DIR. I think we should move the regression data there.

Comment thread test/unit/CMakeLists.txt
TEST_SPEC "[Regression]~[Long]"
TEST_EXECUTOR ${CMAKE_CURRENT_BINARY_DIR}/run_regression_test.sh
EXTRA_ARGS ${MPIEXEC_POSTFLAGS}
PROPERTIES

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment on PROCESSORS and job server slots

Comment thread docs/src/developer/testing.md Outdated
Comment on lines +498 to +504
The regression machinery reads the source-tree `examples/` and
`test/examples/ref/` roots via a two-level override chain:

```bash
julia --project runtests.jl --num-proc-test 4
```
| Precedence | Mechanism |
|:---------- |:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| 1 (high) | `--examples-dir` / `--regression-ref-dir` / `--regression-run-dir` CLI flags |
| 2 (low) | Compile-time `PALACE_EXAMPLES_DIR_DEFAULT` / `PALACE_REGRESSION_REF_DIR_DEFAULT` (wired from CMake); run-dir falls back to `std::filesystem::temp_directory_path() / "palace-regression"` |

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should just treat regression test data as any other test data (possibly moving the test/unit/data folder to a test/data folder.

hughcars added 22 commits June 22, 2026 15:37
… check

TestFarfield hardcoded columns 3-8, which are wrong for both driven
output (excitation column shifts them to 4-9) and eigenmode output
(5-10/6-11). Index the six complex E-field columns from the right, as
the Julia harness did. Also restore the farfield custom check on
cpw_lumped_uniform, which the port had dropped.
- palace::Run no longer calls MakeOutputFolder (main.cpp already does it
  before WriteResolvedConfig); the regression harness now calls it
  itself. Removes the double call and its spurious overwrite warning.
- ScopedExampleStage broadcasts the rank-0 staging result and fails
  collectively, instead of blocking other ranks at the barrier if root
  errors.
- Empty actual output against a non-empty reference now fails even under
  skip_rowcount.
--palace-device only set Solver.Device in each case's config, which the
test harness never consumed (the MFEM device is built once at startup
from --device), so the CUDA CI path silently ran on CPU. Remove the dead
flag and pass --device cuda, which actually switches the device and adds
the [GPU] tag.
Add PALACE_EXAMPLES_DIR / PALACE_REGRESSION_REF_DIR env overrides (CLI
flag > env > compile-time default) and REQUIRE the config, its mesh, and
the reference directory up front. On a miss the test fails on every rank
with a message pointing at the --examples-dir / --regression-ref-dir
flags and the env vars, so a suite can be run only once the data is
vended.
The spack test jobs install palace from the binary cache, so there is no
build tree for catch_discover_tests. Add GenerateRegressionCTest.cmake,
which lists the installed binary's [Regression] cases (--list-tests
--reporter xml) and writes a relocatable CTestTestfile.cmake referencing
the binary by path, with PROCESSORS set. The run-regression-tests action
now vends examples/ref from the checkout via env, generates the manifest,
and runs ctest -- replacing the fragile --list-tests | awk | per-case
mpirun loop. Linear solver and GPU device are baked into the generated
commands so they vary per job without rebuilding. Non-spack workflows
and the catch_discover_tests registration are unchanged.
Nothing installs Julia on the self-hosted runners anymore (the regression
harness is retired and docs use julia-actions/setup-julia), so remove the
julia-version input/output and the Install Julia step from setup-runner,
and the now-redundant 'dont-install' args.
Catch2 intersects multiple specs, so the pre-PR code appended [Serial]/
[Parallel] unconditionally to keep a user selector confined to the
runnable execution style. The port skipped that whenever a selector was
given, so e.g. palace-unit-tests "[vector]" on one rank wrongly ran
[Parallel]-only cases. Restore the unconditional rank/device tags and
only add ~[Regression] on a bare invocation (regression stays reachable
by name or tag since specs AND together).
Drop [GPU] from cpw_wave_adaptive so the GPU selection (which intersects
[GPU]) skips it, matching the old harness's device != GPU guard. Pass the
device through the generator's enumeration so the GPU manifest excludes
it too, rather than only the per-case run.
Exclude [Long] from the serial/mpi/gpu unit sweeps and the bare
invocation default (not just regression), so a [Long] tag keeps any test
-- unit or regression -- out of the default runs and routes it to the
long-* registration. Enables long unit tests without further plumbing.
Main added dielectric_grating/uniform (Floquet ports). Add the Catch2
case plus a port-floquet-S.csv check that compares only |S[...]| (dB)
magnitudes, skipping phase columns, NaN evanescent modes, and signals
below -200 dB, matching the Julia test_floquet_sparams.
The old Julia harness imposed no per-case timeout; the generated manifest's
600s/1800s caps killed legitimately-slow matrix combos (transmon_amr,
cpw_lumped_adaptive on GPU, cylinder_floquet on STRUMPACK+OpenMP). Rely on
the job timeout-minutes instead, as before.
Regression cases run palace::Run in-process, but test/unit/main.cpp
initialised MPI at the default thread level while the standalone driver
requests MPI_THREAD_MULTIPLE when STRUMPACK is built with PtScotch /
SLATE-ScaLAPACK. Those backends call MPI from OpenMP threads, so the
under-initialised harness segfaulted on STRUMPACK + OMP_NUM_THREADS>=2 +
np>=2 (test-x64-gcc-openmp). Mirror the driver's guard. Reproduced and
verified fixed on an x64 box (antenna_halfwave_dipole, cpw_lumped_eigen).
Port the Julia harness's abs_columns option: gauge-dependent signed
columns are compared by magnitude only. transmon_coarse/transmon_amr
use it for kappa_ext, whose sign is not reproducible for near-zero
(high-Q) external coupling. The old long-tests never actually ran the
transmon cases, so the dropped option went unnoticed until ctest ran
transmon_amr and its iteration-2 kappa_ext sign-flip failed the 1% rtol.
Use the build-tree CTest registrations for macOS unit tests too, set
PROCESSORS on regression/long cases, and let the CMake CI jobs run the
regression label with ctest -j instead of a single sequential pass. Keep
rank count at the default 2; CTest controls occupancy.

Also prefer checkout-neighbor examples/ref paths when running from a
local build directory, after explicit CLI/env overrides and before the
compiled-in fallback.
The CMake CI jobs now let ctest schedule regression cases, but the first
version left the build-tree registration at its default np=2. That changed
the old CMake CI rank count and exposed ARPACK tail-mode differences in
cylinder_cavity_pec. Configure PALACE_REGRESSION_NUMPROC_DEFAULT from the
same capped runner core count used before, so the wrapper rank count and
CTest PROCESSORS reservation agree.
Pass PALACE_REGRESSION_NUMPROC_DEFAULT through the superbuild to the inner
Palace test registration; the previous commit set it only on the outer
project, so the CMake CI rows still ran regression cases at np=2 and
exposed ARPACK tail-mode sensitivity in cylinder_cavity_pec.

Also restore the Julia harness behavior of comparing only the requested
Eigenmode.N leading rows when a solve reports extra eigenpairs.
@hughcars hughcars force-pushed the hughcars/regression-test-refactor branch from 0940fa0 to 3d0c6a8 Compare June 22, 2026 19:37
@hughcars hughcars requested review from Sbozzolo and simlapointe June 23, 2026 00:22

@Sbozzolo Sbozzolo left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the changes do not address the problem with the job slots.

Adding PALACE_REGRESSION_PROCESSORS adds some notion of parallelism, but there's no connection with MPIEXEC_NUMPROC_FLAG, so you will still have oversubscribed runs.

I looked at it and I didn't find any simple way to make the number of MPI ranks a runtime variable.

I see two options:

  • we make the number of MPI ranks for regerssion tests a build time option (just pass PALACE_REGERSSIN_PROCESSORS`). This is the simplest and would remove the need for a run_regression_test.sh
  • we fold this into GenerateRegressionCTest.cmake and do generation at test time instead of build

I personally would prefer going with the much simpler case of fixing the number of ranks at build time. This puts unit and regression tests on the same footing, so that one can just run ctest -j $(nproc) to run everything fully utilizing all the CPUs.

The default value could be 2, but on our machines it could be nproc/4 or so. This would speed CI up because many of our tests do not benefit from running with 64 cores.

Comment thread test/unit/CMakeLists.txt Outdated
Comment on lines +114 to +120
# Source-tree roots used by the regression suite. Baking these into the
# binary keeps the common path ("run the built binary") free of env
# plumbing. RunRegressionCase honours a PALACE_EXAMPLES_DIR /
# PALACE_REGRESSION_REF_DIR environment override and a Catch2 CLI flag
# for relocated installs; these are the last-resort defaults.
PRIVATE "PALACE_EXAMPLES_DIR_DEFAULT=\"${CMAKE_SOURCE_DIR}/../examples\""
PRIVATE "PALACE_REGRESSION_REF_DIR_DEFAULT=\"${CMAKE_CURRENT_SOURCE_DIR}/../examples/ref\""

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should just move the regression test data in TEST_DATA_DIR. If we do so, we don't need to do any of the GenerateRegerssionCTest.cmake dance because I already took care of implmenting relocation of test data.

This also reinforces the split bteween exampes and regerssion tests

Comment thread test/unit/CMakeLists.txt Outdated
Comment on lines +208 to +210
if(PALACE_WITH_OPENMP)
math(EXPR PALACE_REGRESSION_PROCESSORS "${PALACE_REGRESSION_PROCESSORS} * 2")
endif()

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the build system should know about the details of how of CI is set up. We should just divide by 2 in CI.

Comment thread docs/src/developer/testing.md Outdated
Comment on lines +500 to +501
The regression machinery reads the source-tree `examples/` and
`test/examples/ref/` roots via the following override chain:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This system of override is pretty complex, as I say elsewhere, I think we should treat regression data as test data and get rid of this override.

hughcars added 4 commits June 23, 2026 19:57
Move regression configs, meshes, and references under test/unit/data so the
Catch2 harness reads only PALACE_TEST_DATA_DIR/regression. Drop the
examples/ref override chain and the runtime mpirun wrapper; regression ctest
entries now use a build-time MPI rank count with an explicit PROCESSORS slot
reservation. CI passes those CMake knobs instead of reassigning runtime env.

Spack split jobs still generate a small ctest manifest for the installed
binary, but it no longer vends source-tree data paths.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

trigger-long-tests Trigger running long tests for this PR now

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants