An in-silico validation suite for the Homeodynamic Remediation (HDR) Framework v7.4, a multi-mode adaptive control system for constrained stochastic linear dynamical systems (SLDS). The suite validates mathematical properties, implementation correctness, and empirical performance across synthetic physiological scenarios, covering 32 claims spanning v5.0, v7.0, and v7.1 of the framework.
```
HDR/
├── hdr_validation/          # Python package
│   ├── control/             # Control policies (LQR, MPC, Mode B/C, tube-MPC, MI-MPC, supervisor)
│   ├── inference/           # Estimation (ICI, IMM, Kalman, particle filter, variational, population)
│   ├── model/               # System models (SLDS, HSMM, coherence, safety, target set, extensions)
│   ├── identification/      # v7.0 identification (hierarchical, BOED, committor, transition rates)
│   └── stages/              # Stage scripts for stages 08–20
├── results/                 # Per-stage result artifacts (auto-generated)
├── run_all.py               # Orchestration script (stages 01–20)
├── smoke_runner.py          # Smoke profile runner
├── standard_runner.py       # Standard profile runner
├── extended_runner.py       # Extended profile runner
├── extended_512_runner.py   # Extended profile (T=512)
├── validation_runner.py     # Validation profile runner
├── highpower_runner.py      # Benchmark A (20 seeds × 30 ep/seed)
├── test_*.py                # 31 pytest test files (307 tests)
├── config.json              # Master configuration
└── paper_defaults.json      # Reference parameter values from paper
```
- Python >= 3.10
- numpy >= 1.24
- scipy >= 1.10
- pandas >= 1.5 (for result aggregation)
- pytest >= 7.4 (for testing)
- matplotlib >= 3.7 (optional, for plotting)
```bash
python run_all.py --full-validation
```

This runs four phases:
| Phase | What runs | Claims covered |
|---|---|---|
| 1 | Extended profile, stages 01–03c + 05–07 | 3–14 |
| 2 | Highpower benchmark (20 seeds × 30 ep/seed) | 1–2 |
| 3 | Stages 08–20 at production scale | 9, 13, 15–36 |
| 4 | Full pytest suite (31 files, 307 tests) | 15–32 |
```bash
python run_all.py --profiles smoke                               # Fastest (1 seed, 8 episodes)
python run_all.py --profiles smoke standard extended validation
python run_all.py --resume --skip-done                           # Resume interrupted runs
```

```bash
python run_all.py --stages 08 08b 09 10 11 --force
python run_all.py --stages 12 13 14 15 16 17 18 19 20
```

```bash
export OPENBLAS_NUM_THREADS=1
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
python highpower_runner.py
```

Outputs to `results/stage_04/highpower/`.
```bash
pytest                   # All 31 test files
pytest test_ici.py -v    # A specific test file
```

All result artifacts are regenerated by re-running the appropriate stages:
```bash
# Stages 08–20 (profile-independent)
python run_all.py --stages 08 08b 09 10 11 12 13 14 15 16 17 18 19 20 --force

# Stage 04 highpower benchmark
python highpower_runner.py

# Full regeneration of all stages across all profiles
python run_all.py --full-validation --force
```

Artifacts are written to `results/stage_{id}/` with atomic file I/O. Each artifact carries an `hdr_version` field and a `generated_at` ISO timestamp. Use `--resume --skip-done` to skip already-completed stages when resuming long runs.
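The artifact convention above can be sketched as follows. `write_artifact` is a hypothetical helper, not the suite's actual function, but the `hdr_version` / `generated_at` fields and the temp-file-plus-rename atomicity follow the description:

```python
import json
import os
import tempfile
from datetime import datetime, timezone

HDR_VERSION = "7.4"  # illustrative; the real value would come from the package


def write_artifact(path: str, payload: dict) -> None:
    """Write a result artifact atomically: temp file in the same dir + os.replace."""
    record = {
        "hdr_version": HDR_VERSION,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        **payload,
    }
    dirname = os.path.dirname(path) or "."
    os.makedirs(dirname, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(record, f, indent=2)
    os.replace(tmp, path)  # atomic on POSIX: readers never see a partial file
```

Writing to a temp file in the target directory (rather than `/tmp`) matters: `os.replace` is only atomic within a single filesystem.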
| ID | Description |
|---|---|
| 01 | Mathematical validation (tau, committor) |
| 02 | Synthetic dataset generation |
| 03 | Mode identification and calibration |
| 03b | ICI calibration and regime boundaries |
| 03c | Mode C validation |
| 04 | Mode A performance vs baselines |
| 05 | Mode B structured exploration |
| 06 | State coherence checks |
| 07 | Robustness across parameter sweeps |
| 08 | Ablation study |
| 08b | Multi-axis asymmetric ablation |
| 09 | Baseline comparison |
| 10 | Mode B FP/FN sweep |
| 11 | Riccati invariant set verification |
| 12 | Hierarchical coupling estimation (v7.0) |
| 13 | Inference backbone benchmark (v7.0) |
| 14 | Population planning (v7.0) |
| 15 | Proxy composite estimation (v7.0) |
| 16 | Model-failure extension integration (v7.1) |
| 17 | Gompertz mortality & complexity collapse (v7.5) |
| 18 | Closed-loop ICI benchmark (v7.5) |
| 19 | Out-of-family stress tests (v7.5) |
| 20 | Structured vs unstructured identification (v7.5) |
```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

# TargetSet and tau_tilde are provided by the package's model module.


def lyapunov_cost(A: np.ndarray, Q: np.ndarray, x: np.ndarray) -> tuple[float, np.ndarray]:
    """Quadratic Lyapunov cost x' P x, with P solving the discrete Lyapunov equation."""
    P = solve_discrete_lyapunov(A, Q)
    x = np.asarray(x, dtype=float)
    return float(x.T @ P @ x), P


def dare_terminal_cost(A: np.ndarray, B: np.ndarray, Q: np.ndarray,
                       R: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Terminal cost P and LQR gain K from the discrete algebraic Riccati equation."""
    P = solve_discrete_are(A, B, Q, R)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


def tau_sandwich(A: np.ndarray, Q: np.ndarray, x: np.ndarray,
                 target: TargetSet, rho: float) -> dict[str, float]:
    """Sandwich bounds relating tau_tilde to the Lyapunov cost tau_L."""
    proj = target.project_box(x)
    xbar = np.asarray(x) - proj
    tau_h = tau_tilde(x, target, Q, rho, method="box")
    tau_L, P = lyapunov_cost(A, Q, xbar)
    eigvals = np.linalg.eigvalsh(np.sqrt(Q) @ P @ np.linalg.pinv(np.sqrt(Q)))
    eigvals = np.real(eigvals)
    return {
        "tau_tilde": float(tau_h),
        "tau_L": float(tau_L),
        "lower_coeff": float(np.min(eigvals)),
        "upper_coeff": float(np.max(eigvals)),
    }
```
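A minimal standalone check of the DARE-based terminal cost on a toy 2-state system (the matrices here are illustrative, not taken from the suite's configs):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Toy 2-state system (illustrative values)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

# Same computation as dare_terminal_cost, inlined for self-containment
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Sanity checks one can run on the result: P is symmetric and the
# closed loop A - B K is Schur stable (spectral radius < 1).
assert np.allclose(P, P.T)
rho = max(abs(np.linalg.eigvals(A - B @ K)))
print(f"spectral radius of A - BK: {rho:.3f}")
```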
The headline Benchmark A result (20 seeds × 30 episodes per seed) is produced by a standalone script that is NOT part of run_all.py:
```bash
export OPENBLAS_NUM_THREADS=1
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
python highpower_runner.py
```
Outputs are written to `results/stage_04/highpower/`:

- `highpower_summary.json`: machine-readable metrics and per-seed gains
- `highpower_table.txt`: human-readable results table
- `manuscript_language.txt`: recommended manuscript wording
Expected values (fixed seeds 101–2020, 30 ep/seed):

```
N_maladaptive : 179
Mean gain     : +0.037  (95% CI [+0.031, +0.042])
Win rate      : 0.838
Safety delta  : -0.0001
```
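A hedged sketch of how a percentile bootstrap 95% CI on the mean gain can be computed from per-episode gains. The array below is synthetic stand-in data; the real per-seed gains live in `highpower_summary.json`:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for per-episode gains (real values come from
# results/stage_04/highpower/highpower_summary.json).
gains = rng.normal(loc=0.037, scale=0.05, size=600)

# Percentile bootstrap: resample episodes with replacement, take the
# 2.5th / 97.5th percentiles of the resampled means.
n_boot = 10_000
idx = rng.integers(0, len(gains), size=(n_boot, len(gains)))
boot_means = gains[idx].mean(axis=1)
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean gain {gains.mean():+.3f}, 95% CI [{lo:+.3f}, {hi:+.3f}]")
```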
To verify robustness to within-seed correlation, run the 100-seed cluster bootstrap analysis:
```bash
python cluster_bootstrap_runner.py
```
This runs Stage 04 with 100 seeds × 30 episodes (3,000 total), then computes:
- Episode-level and seed-cluster bootstrap 95% CIs
- ICC (one-way random effects, seed as grouping factor)
- Design effect (DEFF) and effective N
- Multi-seed Stage 10 and Stage 15 sweeps
Outputs written to:

- `results/stage_04/cluster_ci_report.json`
- `results/stage_04/threshold_claims_audit.txt`
- `results/stage_10/multiseed_sweep.json`
- `results/stage_15/multiseed_results.json`
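The seed-cluster bootstrap, one-way random-effects ICC, and design effect above can be sketched as follows on synthetic data (the gain values and seed effects are invented; only the formulas track the analysis described):

```python
import numpy as np

rng = np.random.default_rng(42)
n_seeds, eps_per_seed = 100, 30
# Synthetic per-episode gains with a seed-level random effect, standing in
# for the Stage 04 output (100 seeds × 30 episodes).
seed_effects = rng.normal(0.0, 0.02, size=n_seeds)
gains = rng.normal(0.037 + seed_effects[:, None], 0.05,
                   size=(n_seeds, eps_per_seed))

# Seed-cluster bootstrap: resample whole seeds, so episodes within a seed
# stay together and within-seed correlation is respected.
boot = np.array([
    gains[rng.integers(0, n_seeds, size=n_seeds)].mean()
    for _ in range(5000)
])
lo, hi = np.percentile(boot, [2.5, 97.5])

# One-way random-effects ICC from between/within mean squares, then the
# design effect DEFF = 1 + (m - 1) * ICC and effective N = N / DEFF.
seed_means = gains.mean(axis=1)
msb = eps_per_seed * seed_means.var(ddof=1)   # between-seed mean square
msw = gains.var(axis=1, ddof=1).mean()        # within-seed mean square
icc = (msb - msw) / (msb + (eps_per_seed - 1) * msw)
deff = 1 + (eps_per_seed - 1) * icc
n_eff = gains.size / deff
print(f"cluster CI [{lo:+.3f}, {hi:+.3f}], ICC {icc:.3f}, "
      f"DEFF {deff:.2f}, N_eff {n_eff:.0f}")
```

With nonzero within-seed correlation, the cluster CI is wider than a naive episode-level CI, which is exactly what the DEFF quantifies.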
Before running any script or pytest, pin BLAS threads to prevent non-determinism and hangs on shared compute nodes:
```bash
export OPENBLAS_NUM_THREADS=1
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
```
Add these lines to your shell profile or CI environment.
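If the shell environment cannot be controlled, the same pinning can be done in-process, provided it runs before numpy/scipy are first imported (a sketch; a runner script would put this at its very top):

```python
import os

# Force single-threaded BLAS. This must execute before the first
# `import numpy` anywhere in the process, since the BLAS library
# reads these variables once at load time.
for var in ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ[var] = "1"

import numpy as np  # safe to import now: BLAS sees the pinned thread counts
```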