An in-silico validation suite for the Homeodynamic Remediation (HDR) Framework v7.4, a multi-mode adaptive control system for constrained stochastic linear dynamical systems (SLDS). The suite validates mathematical properties, implementation correctness, and empirical performance across synthetic physiological scenarios, covering 32 claims spanning v5.0, v7.0, and v7.1 of the framework.
```
HDR/
├── hdr_validation/          # Python package
│   ├── control/             # Control policies (LQR, MPC, Mode B/C, tube-MPC, MI-MPC, supervisor)
│   ├── inference/           # Estimation (ICI, IMM, Kalman, particle filter, variational, population)
│   ├── model/               # System models (SLDS, HSMM, coherence, safety, target set, extensions)
│   ├── identification/      # v7.0 identification (hierarchical, BOED, committor, transition rates)
│   └── stages/              # Stage scripts for stages 08–20
├── results/                 # Per-stage result artifacts (auto-generated)
├── run_all.py               # Orchestration script (stages 01–20)
├── smoke_runner.py          # Smoke profile runner
├── standard_runner.py       # Standard profile runner
├── extended_runner.py       # Extended profile runner
├── extended_512_runner.py   # Extended profile (T=512)
├── validation_runner.py     # Validation profile runner
├── highpower_runner.py      # Benchmark A (20 seeds × 30 ep/seed)
├── test_*.py                # 31 pytest test files (307 tests)
├── config.json              # Master configuration
└── paper_defaults.json      # Reference parameter values from paper
```
- Python >= 3.10
- numpy >= 1.24
- scipy >= 1.10
- pandas >= 1.5 (for result aggregation)
- pytest >= 7.4 (for testing)
- matplotlib >= 3.7 (optional, for plotting)
```bash
python run_all.py --full-validation
```

This runs four phases:
| Phase | What runs | Claims covered |
|---|---|---|
| 1 | Extended profile, stages 01–03c + 05–07 | 3–14 |
| 2 | Highpower benchmark (20 seeds × 30 ep/seed) | 1–2 |
| 3 | Stages 08–20 at production scale | 9, 13, 15–36 |
| 4 | Full pytest suite (31 files, 307 tests) | 15–32 |
```bash
python run_all.py --profiles smoke                               # Fastest (1 seed, 8 episodes)
python run_all.py --profiles smoke standard extended validation
python run_all.py --resume --skip-done                           # Resume interrupted runs
```

```bash
python run_all.py --stages 08 08b 09 10 11 --force
python run_all.py --stages 12 13 14 15 16 17 18 19 20
```

```bash
export OPENBLAS_NUM_THREADS=1
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
python highpower_runner.py
```

Outputs to `results/stage_04/highpower/`.
```bash
pytest                   # All 31 test files
pytest test_ici.py -v    # A specific test file
```

All result artifacts are regenerated by re-running the appropriate stages:
```bash
# Stages 08–20 (profile-independent)
python run_all.py --stages 08 08b 09 10 11 12 13 14 15 16 17 18 19 20 --force

# Stage 04 highpower benchmark
python highpower_runner.py

# Full regeneration of all stages across all profiles
python run_all.py --full-validation --force
```

Artifacts are written to `results/stage_{id}/` with atomic file I/O. Each artifact carries an `hdr_version` field and a `generated_at` ISO timestamp. Use `--resume --skip-done` to skip already-completed stages when resuming long runs.
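The artifact convention above can be sketched as follows. `write_artifact` is a hypothetical helper, not the suite's actual function, but the `hdr_version` / `generated_at` fields and the temp-file-plus-rename atomicity follow the description:

```python
import json
import os
import tempfile
from datetime import datetime, timezone

HDR_VERSION = "7.4"  # illustrative; the real value would come from the package


def write_artifact(path: str, payload: dict) -> None:
    """Write a result artifact atomically: temp file in the same dir + os.replace."""
    record = {
        "hdr_version": HDR_VERSION,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        **payload,
    }
    dirname = os.path.dirname(path) or "."
    os.makedirs(dirname, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(record, f, indent=2)
    os.replace(tmp, path)  # atomic on POSIX: readers never see a partial file
```

Writing to a temp file in the target directory (rather than `/tmp`) matters: `os.replace` is only atomic within a single filesystem.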
| ID | Description |
|---|---|
| 01 | Mathematical validation (tau, committor) |
| 02 | Synthetic dataset generation |
| 03 | Mode identification and calibration |
| 03b | ICI calibration and regime boundaries |
| 03c | Mode C validation |
| 04 | Mode A performance vs baselines |
| 05 | Mode B structured exploration |
| 06 | State coherence checks |
| 07 | Robustness across parameter sweeps |
| 08 | Ablation study |
| 08b | Multi-axis asymmetric ablation |
| 09 | Baseline comparison |
| 10 | Mode B FP/FN sweep |
| 11 | Riccati invariant set verification |
| 12 | Hierarchical coupling estimation (v7.0) |
| 13 | Inference backbone benchmark (v7.0) |
| 14 | Population planning (v7.0) |
| 15 | Proxy composite estimation (v7.0) |
| 16 | Model-failure extension integration (v7.1) |
| 17 | Gompertz mortality & complexity collapse (v7.5) |
| 18 | Closed-loop ICI benchmark (v7.5) |
| 19 | Out-of-family stress tests (v7.5) |
| 20 | Structured vs unstructured identification (v7.5) |
```python
import numpy as np
from scipy.linalg import solve_discrete_are, solve_discrete_lyapunov

# TargetSet and tau_tilde are provided by the package's model module.


def lyapunov_cost(A: np.ndarray, Q: np.ndarray, x: np.ndarray) -> tuple[float, np.ndarray]:
    """Quadratic Lyapunov cost x' P x, with P solving the discrete Lyapunov equation."""
    P = solve_discrete_lyapunov(A, Q)
    x = np.asarray(x, dtype=float)
    return float(x.T @ P @ x), P


def dare_terminal_cost(A: np.ndarray, B: np.ndarray, Q: np.ndarray,
                       R: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Terminal cost P and LQR gain K from the discrete algebraic Riccati equation."""
    P = solve_discrete_are(A, B, Q, R)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


def tau_sandwich(A: np.ndarray, Q: np.ndarray, x: np.ndarray,
                 target: TargetSet, rho: float) -> dict[str, float]:
    """Sandwich bounds relating tau_tilde to the Lyapunov cost tau_L."""
    proj = target.project_box(x)
    xbar = np.asarray(x) - proj
    tau_h = tau_tilde(x, target, Q, rho, method="box")
    tau_L, P = lyapunov_cost(A, Q, xbar)
    eigvals = np.linalg.eigvalsh(np.sqrt(Q) @ P @ np.linalg.pinv(np.sqrt(Q)))
    eigvals = np.real(eigvals)
    return {
        "tau_tilde": float(tau_h),
        "tau_L": float(tau_L),
        "lower_coeff": float(np.min(eigvals)),
        "upper_coeff": float(np.max(eigvals)),
    }
```
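A minimal standalone check of the DARE-based terminal cost on a toy 2-state system (the matrices here are illustrative, not taken from the suite's configs):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Toy 2-state system (illustrative values)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

# Same computation as dare_terminal_cost, inlined for self-containment
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Sanity checks one can run on the result: P is symmetric and the
# closed loop A - B K is Schur stable (spectral radius < 1).
assert np.allclose(P, P.T)
rho = max(abs(np.linalg.eigvals(A - B @ K)))
print(f"spectral radius of A - BK: {rho:.3f}")
```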
The headline Benchmark A result (20 seeds × 30 episodes per seed) is produced by a standalone script that is NOT part of run_all.py:
```bash
export OPENBLAS_NUM_THREADS=1
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
python highpower_runner.py
```
Outputs are written to `results/stage_04/highpower/`:

- `highpower_summary.json`: machine-readable metrics and per-seed gains
- `highpower_table.txt`: human-readable results table
- `manuscript_language.txt`: recommended manuscript wording
Expected values (fixed seeds 101–2020, 30 ep/seed):

```
N_maladaptive : 179
Mean gain     : +0.037  (95% CI [+0.031, +0.042])
Win rate      : 0.838
Safety delta  : -0.0001
```
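A hedged sketch of how a percentile bootstrap 95% CI on the mean gain can be computed from per-episode gains. The array below is synthetic stand-in data; the real per-seed gains live in `highpower_summary.json`:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for per-episode gains (real values come from
# results/stage_04/highpower/highpower_summary.json).
gains = rng.normal(loc=0.037, scale=0.05, size=600)

# Percentile bootstrap: resample episodes with replacement, take the
# 2.5th / 97.5th percentiles of the resampled means.
n_boot = 10_000
idx = rng.integers(0, len(gains), size=(n_boot, len(gains)))
boot_means = gains[idx].mean(axis=1)
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean gain {gains.mean():+.3f}, 95% CI [{lo:+.3f}, {hi:+.3f}]")
```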
To verify robustness to within-seed correlation, run the 100-seed cluster bootstrap analysis:
```bash
python cluster_bootstrap_runner.py
```
This runs Stage 04 with 100 seeds × 30 episodes (3,000 total), then computes:
- Episode-level and seed-cluster bootstrap 95% CIs
- ICC (one-way random effects, seed as grouping factor)
- Design effect (DEFF) and effective N
- Multi-seed Stage 10 and Stage 15 sweeps
Outputs written to:

- `results/stage_04/cluster_ci_report.json`
- `results/stage_04/threshold_claims_audit.txt`
- `results/stage_10/multiseed_sweep.json`
- `results/stage_15/multiseed_results.json`
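The seed-cluster bootstrap, one-way random-effects ICC, and design effect above can be sketched as follows on synthetic data (the gain values and seed effects are invented; only the formulas track the analysis described):

```python
import numpy as np

rng = np.random.default_rng(42)
n_seeds, eps_per_seed = 100, 30
# Synthetic per-episode gains with a seed-level random effect, standing in
# for the Stage 04 output (100 seeds × 30 episodes).
seed_effects = rng.normal(0.0, 0.02, size=n_seeds)
gains = rng.normal(0.037 + seed_effects[:, None], 0.05,
                   size=(n_seeds, eps_per_seed))

# Seed-cluster bootstrap: resample whole seeds, so episodes within a seed
# stay together and within-seed correlation is respected.
boot = np.array([
    gains[rng.integers(0, n_seeds, size=n_seeds)].mean()
    for _ in range(5000)
])
lo, hi = np.percentile(boot, [2.5, 97.5])

# One-way random-effects ICC from between/within mean squares, then the
# design effect DEFF = 1 + (m - 1) * ICC and effective N = N / DEFF.
seed_means = gains.mean(axis=1)
msb = eps_per_seed * seed_means.var(ddof=1)   # between-seed mean square
msw = gains.var(axis=1, ddof=1).mean()        # within-seed mean square
icc = (msb - msw) / (msb + (eps_per_seed - 1) * msw)
deff = 1 + (eps_per_seed - 1) * icc
n_eff = gains.size / deff
print(f"cluster CI [{lo:+.3f}, {hi:+.3f}], ICC {icc:.3f}, "
      f"DEFF {deff:.2f}, N_eff {n_eff:.0f}")
```

With nonzero within-seed correlation, the cluster CI is wider than a naive episode-level CI, which is exactly what the DEFF quantifies.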
Before running any script or pytest, pin BLAS threads to prevent non-determinism and hangs on shared compute nodes:
```bash
export OPENBLAS_NUM_THREADS=1
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
```
Add these lines to your shell profile or CI environment.
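If the shell environment cannot be controlled, the same pinning can be done in-process, provided it runs before numpy/scipy are first imported (a sketch; a runner script would put this at its very top):

```python
import os

# Force single-threaded BLAS. This must execute before the first
# `import numpy` anywhere in the process, since the BLAS library
# reads these variables once at load time.
for var in ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ[var] = "1"

import numpy as np  # safe to import now: BLAS sees the pinned thread counts
```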