
Benchmark Results: C11 vs Reference Performance #8

@jamestjsp


SLICOT C11 Benchmark Report

System Configuration

| Component | Value |
|---|---|
| CPU | Apple M1 Pro |
| RAM | 16 GB |
| OS | Darwin 25.2.0 arm64 |
| Compiler | Apple clang 17.0.0 |
| BLAS/LAPACK | Accelerate.framework |
| Build | debugoptimized (-O2 -g) |

Methodology

  • Warmup: 3 iterations (cache priming)
  • Timed runs: 10 iterations per benchmark
  • Timer: mach_absolute_time(), reported in nanoseconds (the smallest measurable interval on this machine is about 40 ns; see Observations)
  • Statistics: min, max, mean, stddev
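
The methodology above can be sketched as a small harness. This is an illustrative Python version, not the C harness used for the report: the function name `bench` is hypothetical, `time.perf_counter_ns()` stands in for `mach_absolute_time()`, and the population standard deviation is an assumption, since the report does not say which variant was used.

```python
import statistics
import time


def bench(fn, warmup=3, runs=10):
    """Time fn() following the report's scheme: warmup calls, then timed runs.

    Returns min, max, mean and standard deviation in microseconds.
    """
    # Warmup iterations prime caches and are not measured.
    for _ in range(warmup):
        fn()

    samples_us = []
    for _ in range(runs):
        t0 = time.perf_counter_ns()
        fn()
        t1 = time.perf_counter_ns()
        samples_us.append((t1 - t0) / 1000.0)  # ns -> μs

    return {
        "min": min(samples_us),
        "max": max(samples_us),
        "mean": statistics.fmean(samples_us),
        # Assumption: population stddev; the report does not specify.
        "stddev": statistics.pstdev(samples_us),
    }


stats = bench(lambda: sum(range(1000)))
print({k: round(v, 3) for k, v in stats.items()})
```

With only 10 timed runs, a single outlier moves the mean noticeably, which is one reason to report min and max alongside it.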

SB02MD — Continuous-time Algebraic Riccati Equation Solver

Solves Q + A'X + XA - XGX = 0 using Laub's Schur vector method.

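| Dataset | N | Mean (μs) | Min (μs) | Max (μs) | σ (μs) | Info |
|---|---|---|---|---|---|---|
| BB01103 | 4 | 9.66 | 9.58 | 10.04 | 0.14 | |
| BB01104 | 8 | 32.73 | 32.17 | 36.21 | 1.24 | |
| BB01105 | 9 | 20.37 | 20.00 | 21.38 | 0.49 | |
| BB01404 | 21 | 164.11 | 162.67 | 166.46 | 1.49 | |
| BB01106 | 30 | 210.43 | 208.92 | 216.00 | 2.17 | |
| BB02107 | 4 | 4.68 | 4.62 | 4.79 | 0.05 | |
| BB02108 | 4 | 6.53 | 6.46 | 6.67 | 0.06 | |
| BB02110 | 4 | 10.17 | 10.08 | 10.33 | 0.08 | info=3 |
| BB02111 | 4 | 0.54 | 0.50 | 0.58 | 0.02 | info=4 |
| BB02113 | 4 | 4.65 | 4.58 | 4.79 | 0.07 | info=3 |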

Notes:

  • info=3: Schur reordering failed (ill-conditioned problem)
  • info=4: Fewer than N stable eigenvalues (expected for some benchmark cases)
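
Rows with a nonzero info code did not complete a full solve, so their timings are not comparable with the others (BB02111 at 0.54 μs is the fastest row but exits early). A minimal sketch of separating the two groups, using the means from the table above; treating a blank Info cell as info=0 is an assumption, and the variable names are illustrative only:

```python
# (dataset, n, mean_us, info) taken from the SB02MD table.
# Assumption: a blank Info cell means info=0 (success).
ROWS = [
    ("BB01103", 4, 9.66, 0),
    ("BB01104", 8, 32.73, 0),
    ("BB01105", 9, 20.37, 0),
    ("BB01404", 21, 164.11, 0),
    ("BB01106", 30, 210.43, 0),
    ("BB02107", 4, 4.68, 0),
    ("BB02108", 4, 6.53, 0),
    ("BB02110", 4, 10.17, 3),
    ("BB02111", 4, 0.54, 4),
    ("BB02113", 4, 4.65, 3),
]

# Meanings as given in the notes above.
INFO_NOTES = {
    3: "Schur reordering failed",
    4: "fewer than N stable eigenvalues",
}

succeeded = [r for r in ROWS if r[3] == 0]
failed = [r for r in ROWS if r[3] != 0]

best = min(succeeded, key=lambda r: r[2])
worst = max(succeeded, key=lambda r: r[2])
print("best :", best[0], best[2], "μs")
print("worst:", worst[0], worst[2], "μs")
for name, _, mean_us, info in failed:
    print(f"excluded {name}: info={info} ({INFO_NOTES[info]}), {mean_us} μs")
```

Filtering this way gives 4.68 μs and 210.43 μs as the extremes over successful solves, which matches the SB02MD row of the Performance Summary.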

Scaling Analysis

n=4:   ~10 μs
n=8:   ~33 μs   (3.3x for 2x n, expect 8x for O(n³))
n=9:   ~20 μs
n=21: ~164 μs
n=30: ~210 μs   (21x for 7.5x n, expect 422x for O(n³))

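Observed scaling is sub-cubic: from n=4 to n=30 the runtime grows roughly as n^1.5 (log 21.8 / log 7.5 ≈ 1.53). This is likely dominated by BLAS L3 efficiency on M1, although at these small sizes fixed per-call overhead may also flatten the curve. Note that n=9 runs faster than n=8, so runtime depends on the problem data as well as on n.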
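
To put a number on the scaling, one can estimate the apparent exponent p in t ∝ n^p from the measured means. The sketch below does this two ways: from the two endpoints, and with a least-squares fit in log-log space over the five BB01 datasets. It is a rough illustration only; five points from different problems do not make a rigorous complexity measurement, and the variable names are hypothetical.

```python
import math

# (n, mean_us) for the BB01 datasets in the SB02MD table.
POINTS = [(4, 9.66), (8, 32.73), (9, 20.37), (21, 164.11), (30, 210.43)]

# Endpoint estimate: p = log(t2 / t1) / log(n2 / n1).
(n1, t1), (n2, t2) = POINTS[0], POINTS[-1]
p_endpoints = math.log(t2 / t1) / math.log(n2 / n1)

# Least-squares slope of log(t) against log(n).
xs = [math.log(n) for n, _ in POINTS]
ys = [math.log(t) for _, t in POINTS]
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
sxx = sum((x - x_mean) ** 2 for x in xs)
p_fit = sxy / sxx

print(f"endpoint exponent: {p_endpoints:.2f}")
print(f"fitted exponent:   {p_fit:.2f}")
```

Both estimates land well below 3, which is what "sub-cubic" means here.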


BB01AD — CAREX Benchmark Generator

Generates continuous-time algebraic Riccati equation test problems.

Group 1: Fixed-Size Examples (Literature Problems)

| Example | N | Mean (μs) | Description | Info |
|---|---|---|---|---|
| 1.1 | 2 | 0.15 | Laub 1979, Ex.1 | |
| 1.2 | 2 | 0.16 | Laub 1979, Ex.2 (uncontrollable) | |
| 1.3 | | 0.05 | L-1011 aircraft model | needs data |
| 1.4 | | 0.09 | Binary distillation column | needs data |
| 1.5 | | 0.12 | Tubular ammonia reactor | needs data |
| 1.6 | | 0.42 | J-100 jet engine | needs data |

Group 2: Parameter-Dependent Examples

| Example | N | Mean (μs) | Description |
|---|---|---|---|
| 2.1 | 2 | 0.17 | Arnold/Laub Ex.1 (stabilizability limit) |
| 2.2 | 2 | 0.35 | Arnold/Laub Ex.3 (singular R) |
| 2.3 | 2 | 0.17 | Kenney/Laub/Wette Ex.2 |
| 2.4 | 2 | 0.14 | Bai/Qian (ill-conditioned H) |
| 2.5 | 2 | 0.16 | H∞ problem |
| 2.6 | 3 | 0.77 | Petkov (badly scaled) |
| 2.7 | 4 | 0.30 | Magnetic tape control |
| 2.8 | 4 | 0.25 | Arnold/Laub Ex.2 |
| 2.9 | | 1.21 | Boeing B-767 flutter |

Group 3: Scalable Examples

| Example | N | Mean (μs) | σ (μs) | Description |
|---|---|---|---|---|
| 3.1 | 39 | 18.51 | 0.36 | String of high-speed vehicles |
| 3.2 | 64 | 32.82 | 0.27 | Circulant matrices |

BD01AD — CTDSX Descriptor System Generator

Generates continuous-time dynamical system benchmark examples.

Group 1 & 2: Fixed/Parameter-Dependent

| Example | N | Mean (μs) | Info |
|---|---|---|---|
| 1.1 | 2 | 0.02 | |
| 1.2 | 2 | 0.02 | |
| 2.1 | 4 | 0.05 | |
| 2.2 | 4 | 0.05 | |
| 2.4 | 3 | 0.05 | |

Group 3: Scalable

| Example | N | Mean (μs) | σ (μs) | Throughput |
|---|---|---|---|---|
| 3.1 | 39 | 1.67 | 0.02 | 23.4 M elem/s |
| 3.2 | 100 | 9.10 | 0.03 | 11.0 M elem/s |

Performance Summary

| Routine | Best Case | Worst Case | Typical |
|---|---|---|---|
| SB02MD (Riccati) | 4.7 μs (n=4) | 210 μs (n=30) | ~33 μs (n=8) |
| BB01AD (CAREX gen) | 0.15 μs (n=2) | 33 μs (n=64) | ~0.3 μs |
| BD01AD (CTDSX gen) | 0.02 μs (n=2) | 9.1 μs (n=100) | ~0.05 μs |

Throughput Estimates

For Riccati solver (2n × 2n Schur decomposition):

n=30:  210 μs → 4,762 solves/sec
       Matrix ops: ~54,000 elements → 257 M elem/s
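
The arithmetic behind these figures is short enough to spell out. In the sketch below the "~54,000 elements" count is assumed to be 2n³ for n=30, since that is the expression that reproduces it; the report does not state the formula. The BD01AD throughput column is likewise consistent with N divided by the mean time. The helper name is hypothetical.

```python
def m_elem_per_sec(elements, mean_us):
    """Elements per microsecond, numerically equal to millions of elements per second."""
    return elements / mean_us


# SB02MD at n=30, mean 210 μs.
n = 30
mean_us = 210.0
solves_per_sec = 1e6 / mean_us   # one solve every 210 μs
work = 2 * n**3                  # assumption: "~54,000 elements" = 2n^3
print(f"{solves_per_sec:,.0f} solves/sec")
print(f"{work:,} elements -> {m_elem_per_sec(work, mean_us):.0f} M elem/s")

# BD01AD scalable examples: throughput taken as N / mean time.
for example, size, t_us in [("3.1", 39, 1.67), ("3.2", 100, 9.10)]:
    print(f"BD01AD {example}: {m_elem_per_sec(size, t_us):.1f} M elem/s")
```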

How to Run

# Build
meson setup build && ninja -C build

# Meson benchmark suite
meson test -C build --benchmark

# Individual runs
./build/benchmarks/bench_sb02md SLICOT-Reference/benchmark_data/BB01*.dat
./build/benchmarks/bench_bb01ad
./build/benchmarks/bench_bd01ad

# Python runner
python scripts/run_benchmarks.py

Observations

  1. Sub-cubic scaling: SB02MD shows better-than-expected scaling on M1, likely due to Accelerate's optimized BLAS L3 routines
  2. Generator routines are fast: BB01AD/BD01AD are dominated by setup cost; the actual matrix generation is memory-bound
  3. Timer resolution: about 40 ns is the smallest measurable interval on macOS, so some BD01AD results (0.02 to 0.05 μs) show quantization
  4. Ill-conditioned cases: BB02110/111/113 correctly report numerical difficulties (info=3,4)
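
The quantization mentioned in observation 3 can be illustrated with a toy model of a counter that only advances once per tick. The tick length used here, 125/3 ns (a 24 MHz counter), is an assumption based on the timebase commonly reported on Apple Silicon; the report itself only states "~40ns". The function name is hypothetical.

```python
from fractions import Fraction

# Assumption: 24 MHz counter, i.e. one tick every 125/3 ns (about 41.7 ns).
TICK_NS = Fraction(125, 3)


def measured_ns(start_ns, duration_ns, tick=TICK_NS):
    """Elapsed time as seen through a counter that advances once per tick."""
    start_ticks = start_ns // tick
    end_ticks = (start_ns + duration_ns) // tick
    return (end_ticks - start_ticks) * tick


# A true 20 ns interval, started at different offsets within a tick,
# reads as either 0 ticks or 1 tick, never as 20 ns.
readings = [measured_ns(start, 20) for start in (0, 10, 20, 30, 40)]
print([float(r) for r in readings])
```

Averaging many such readings recovers a value between 0 and one tick, which is how sub-tick means such as 0.02 μs can appear in the BD01AD tables even though no single run can resolve them.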

Benchmark infrastructure added in commit dd608f3.
