feat: add simulate subcommand for E2E upload test suite by Rohit-Ekbote · Pull Request #797 · runwhen-contrib/runwhen-local

Rohit-Ekbote · 2026-05-05T20:46:45Z

Summary

Adds a simulate subcommand to runwhen-local that produces deterministic workspace uploads from a YAML test config. Intended for use by an external test suite verifying platform-side ingestion behavior — not for end-user use.
A new test_synth INDEXER component synthesizes resources from a YAML config under the existing kubernetes platform; a bundled synthetic codecollection at src/simulator-codecollection/ provides the passthrough generation rule and minimal runbook/sli/slo templates; the CLI subcommand reuses the existing /run/ REST flow and upload code path.
No core production code paths modified. src/enrichers/generation_rules.py, src/workspace_builder/views.py, etc. are untouched. Integration is via the existing component framework, the existing codeCollections request setting, and a small additive change to run.py.
Design spec at docs/superpowers/specs/2026-05-05-e2e-upload-test-suite-design.md, implementation plan at docs/superpowers/plans/2026-05-05-e2e-upload-test-suite.md, user-guide at docs/user-guide/features/simulator.md.

Known caveats (documented in user-guide)

SLX directory names are shortened/hashed by the rule engine — not the verbatim slug. Test assertions should not hard-code them.
SLI/SLO templates use a deliberate render-time exception to skip emission when the corresponding subdict is absent; this produces benign entries in skipped_templates_report.md.
The simulator pretends its inventory is a Kubernetes inventory (synthesized resources are typed deployment under platform kubernetes) to avoid registering a new platform handler.

Test Plan

All 11 simulator integration tests pass (tests_simulator, tests_simulator_cli, tests_simulator_envelope)
Existing test suite unaffected — no production paths touched
Verify end-to-end against a real PAPI by the external test suite repo

🤖 Generated with Claude Code

Captures the brainstorming output for a runwhen-local-side simulator that lets a separate test suite repo verify PAPI upload behavior without live clusters or cloud APIs. Architecture A2: new simulate subcommand on run.py + test_synth component + passthrough generation rule + minimal stub templates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Eight TDD-friendly tasks decomposing the simulator design spec into bite-sized increments: scaffold and register the test_synth indexer, synthesize TestResource instances from a YAML config, add the passthrough generation rule, build minimal templates with conditional SLI/SLO rendering, wire up the simulate CLI subcommand on run.py, and emit the task_id JSON envelope on stdout. Notes deviation from the spec's src/simulator/ layout — the existing component framework requires components to live in src/indexers/. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ernetes platform

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The generation-rule engine de-duplicates SLXs by full_name across rules, so per-rule predicate gating won't accumulate output items into a single SLX directory. Instead, the passthrough rule unconditionally schedules runbook/sli/slo, and the SLI/SLO templates raise a deliberate Jinja ZeroDivisionError when match_resource.sli or match_resource.slo is empty, which render_output_items catches and records as a skip. test_synth also tags each resource with has_sli/has_slo flags ("yes"/"no") for diagnostics. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
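The skip-by-exception pattern described in this commit can be illustrated without Jinja. The sketch below is only a plain-Python stand-in: `render_sli` and `render_items_sketch` are hypothetical names, and the real `render_output_items` and its templates may be structured differently.

```python
def render_sli(match_resource: dict) -> str:
    """Stand-in for the SLI template (hypothetical, not the real Jinja file)."""
    if not match_resource.get("sli"):
        # Mimics the deliberate render-time failure when the sli subdict is empty.
        1 / 0
    return f"sli for {match_resource['name']}"


def render_items_sketch(resources: list) -> tuple:
    """Render every resource, recording deliberate failures as skips."""
    rendered, skipped = [], []
    for resource in resources:
        try:
            rendered.append(render_sli(resource))
        except ZeroDivisionError:
            # The failure is expected: note the skip and continue with the rest.
            skipped.append(resource["name"])
    return rendered, skipped
```

The point of the design is that one rule can always schedule all three output items, and absence of the subdict decides at render time which ones are emitted.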

Adds a `simulate` subcommand to `src/run.py` that drives the simulator pipeline end-to-end against a YAML test config. The subcommand materializes the bundled `simulator-codecollection/` as a temp git repo (so the codecollection loader can clone it), forces the components list to test_synth,generation_rules,render_output_items, and overrides the request data with `testConfig` and `codeCollections` before reusing the existing RUN_COMMAND dispatch. Extracts the codecollection materialization helper into utils.py so it can be shared between run.py and tests_simulator.py, and adds a CLI test that verifies argparse accepts `simulate` and the CLI reaches the REST POST attempt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a pure helper build_simulate_envelope() to construct a single-line JSON object {"task_id", "workspace_name"} and wires it into run.py's upload success path. Emitted on stdout only when the original command was simulate, allowing the external e2e test suite to capture task_id for polling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
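A minimal sketch of what such a helper could look like. The function name and the two keys come from the commit message; the parameter names and the use of `json.dumps` are assumptions, not the actual implementation.

```python
import json


def build_simulate_envelope(task_id: str, workspace_name: str) -> str:
    """Return a single-line JSON object for the calling test suite to parse.

    Assumed signature; the real helper in run.py may take different arguments.
    """
    # json.dumps without indent never emits newlines, so the result stays on one line.
    return json.dumps({"task_id": task_id, "workspace_name": workspace_name})
```

A caller capturing stdout could then recover the task id with `json.loads(line)["task_id"]` and use it for polling.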

Documents invocation, test-config schema, output envelope, and the simulator's deliberate scope boundaries (workspace lifecycle and post-upload verification belong to the calling test suite). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The CLI test was invoking the system interpreter from PATH, which doesn't share the test runner's installed dependencies (e.g., requests). Switching to sys.executable ensures the subprocess uses the same interpreter and package set as the test process. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
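The fix can be sketched as follows. `run_in_same_interpreter` is a hypothetical helper used only for illustration; the actual test invokes `run.py` rather than an inline snippet.

```python
import subprocess
import sys


def run_in_same_interpreter(code: str) -> str:
    """Run a Python snippet in a child process using the current interpreter."""
    # sys.executable is the interpreter running this process, so the child
    # sees the same installed packages (unlike a bare "python" found on PATH).
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```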

The simulator doesn't do live discovery, so it has no need for a cloudConfig section. Previously, running simulate without a workspaceInfo.yaml hit the "cloudConfig is missing" guard and failed. Now the simulate command path silently defaults cloud_config to {} when none is supplied, while preserving the strict guard for the regular run command. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
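The command-dependent guard might look roughly like this. The function name, the `workspace_info` shape, and the use of `ValueError` in place of the CLI's fatal exit are all assumptions made for the sketch.

```python
from typing import Optional


def resolve_cloud_config(command: str, workspace_info: Optional[dict]) -> dict:
    """Return the cloudConfig section, defaulting to {} only for simulate."""
    cloud_config = (workspace_info or {}).get("cloudConfig")
    if cloud_config is not None:
        return cloud_config
    if command == "simulate":
        # The simulator does no live discovery, so an empty config is enough.
        return {}
    # Every other command keeps the strict guard (stand-in for the fatal exit).
    raise ValueError("cloudConfig is missing")
```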

The fatal() call on line 870 was a plain string instead of an f-string, so the literal text "{e}" was emitted instead of the real exception. This made every connection-level upload failure indistinguishable, including TLS verification failures, hostname resolution errors, and timeouts. Changing to an f-string surfaces the real cause. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
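The difference is easy to reproduce in isolation. The message text below is illustrative only and is not the wording used in `run.py`.

```python
def message_plain(e: Exception) -> str:
    # Bug: without the f prefix, the braces are emitted literally.
    return "Upload failed: {e}"


def message_fstring(e: Exception) -> str:
    # Fix: the f-string interpolates the exception text.
    return f"Upload failed: {e}"
```

With `e = TimeoutError("timed out")`, the first function returns `Upload failed: {e}` and the second returns `Upload failed: timed out`.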

Templates now produce the same shape as a real workspace builder run: metadata.name uses {workspace}--{slx_name}, labels include the standard slx/workspace/locationId/locationName via common-labels.yaml, annotations include fullSlxName + sourceGenerationRule* + qualifiers + generated-by, and spec contents match the platform schema:

- slx.yaml: alias, asMeasuredBy, imageURL, statement, owners, configProvided, tags, additionalContext
- runbook.yaml: location + codeBundle.{repoUrl, ref, pathToRobot} + configProvided + secretsProvided
- sli.yaml: location + locations + codeBundle + description + displayUnits + intervalStrategy + intervalSeconds + alertConfig
- slo.yaml: location + codeBundle + target + configProvided

Test config schema gained optional fields (repoURL, ref, alias, owners, statement, tags, configProvided, additionalContext per SLX; pathToRobot, configProvided, secretsProvided per output item).

The previous "passthrough variables" model couldn't produce a runnable workspace because spec.codeBundle was missing: uploads succeeded, but the platform created the workspace alone with 0 SLXs/SLIs/SLOs/runbooks. After this change the upload archive carries the full structural shape the platform's UploadProcessor expects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Major schema extension to the simulator's testConfig YAML, all backwards compatible: old test configs keep working with the existing single synthetic deployment per SLX behavior.

New top-level sections:

- inventory.{clusters, resources}: declare multi-cluster topology with K8s resources of any kind (Deployment, StatefulSet, Service, Ingress, ...). Resources are referenced by id from SLX entries.
- slxGroups: workspace-level groupings rendered into workspace.yaml's spec.slxGroups. Group SLXs by theme, cluster, ownership, etc.
- slxRelationships: SLX-to-SLX dependencies rendered into workspace.yaml's spec.slxRelationships.
- slxs[*].resources: list of inventory.resources ids this SLX targets. 1:1 (one SLX per resource), 1:N (multiple SLXs share a resource), and N:1 (one SLX aggregates multiple resources via additionalContext.childResources) are all supported.

Auto-derived SLX fields: When an SLX binds to inventory resources, the simulator auto-fills tags (platform, cluster, namespace, kind, resource_name, resource_type, plus [k8s]<label-key> for each K8s label on the resource), additionalContext (hierarchy, qualified_name, resourcePath, childResources for N:1), and qualifiers. User-supplied tags / additionalContext on the SLX entry override the auto-derived versions.

New plumbing:

- src/enrichers/test_groups.py: post-generation_rules enricher that reads testConfig and populates GROUPS_PROPERTY / SLX_RELATIONSHIPS_PROPERTY with the right workspace-prefixed SLX names.
- src/component.py: registers test_groups in the ENRICHER stage.
- src/run.py: simulate dispatch includes test_groups in the components list.
- 6 new tests in tests_simulator.py covering inventory parsing, K8s kind variety, k8s-prefixed label tags, N:1 child resources, slxGroups rendering, slxRelationships rendering.
- Updated user-guide doc with the full schema.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
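The tag auto-derivation described in this commit might be approximated as below. This is a rough sketch under several assumptions: tags are modelled as a flat dict, the resource field names (`cluster`, `namespace`, `kind`, `name`, `labels`) are guesses at the inventory schema, the `resource_type` tag is omitted because its value is not specified here, and "override" is interpreted as a per-key override.

```python
from typing import Optional


def derive_tags(resource: dict, user_tags: Optional[dict] = None) -> dict:
    """Build auto-derived tags for one inventory resource (hypothetical shape)."""
    tags = {
        "platform": "kubernetes",
        "cluster": resource["cluster"],
        "namespace": resource["namespace"],
        "kind": resource["kind"],
        "resource_name": resource["name"],
    }
    # Each K8s label becomes a tag whose key carries the [k8s] prefix.
    for label_key, label_value in resource.get("labels", {}).items():
        tags[f"[k8s]{label_key}"] = label_value
    # User-supplied tags win over auto-derived ones (assumed to be per key).
    tags.update(user_tags or {})
    return tags
```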

Adds examples/simulator/test.yaml as a complete working example covering the major schema features (multi-cluster inventory, kind variety, 1:N and N:1 cardinality, slxGroups, slxRelationships) so users can copy and adapt. The user-guide doc now links to it. Also commits src/poetry.lock for reproducible installs. The Poetry project has a pyproject.toml but was missing a lockfile, so each poetry install resolved fresh and could pull different versions across developers / CI. Committing the lock follows Poetry's documented recommendation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reduces test config verbosity. Two related conveniences:

(A) Top-level `defaults` section. Per-SLX fields are inherited when the entry doesn't supply the field. Top-level scalars are simple replacement; runbook/sli/slo subdicts deep-merge (per-SLX keys win). Defaults alone don't trigger sli/slo rendering; the SLX must still opt in by having the key.

(B) `pathToRobot` auto-derivation. When a runbook/sli/slo subdict is non-empty but lacks pathToRobot, the simulator fills in the conventional codebundles/<codeBundle>/<runbook|sli|slo>.robot path.

Together these eliminate ~40% of the per-SLX boilerplate. The example config went from repeating repoURL/ref/secretsProvided/pathToRobot on every SLX to declaring them once at the top.

3 new tests verify defaults inheritance, pathToRobot auto-derivation, and deep-merge semantics for the runbook subdict.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
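The inheritance and auto-derivation rules can be sketched in a few lines. The helper names are hypothetical, the `codeBundle` key is assumed to sit on the SLX entry itself, and the derivation is assumed to run after defaults are merged; the real code may differ on each point.

```python
import copy

OUTPUT_KEYS = ("runbook", "sli", "slo")


def deep_merge(base: dict, override: dict) -> dict:
    """Recursively merge two dicts; keys from override win."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = deep_merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def apply_defaults(slx: dict, defaults: dict) -> dict:
    """Fill an SLX entry from defaults without opting it in to new outputs."""
    merged = copy.deepcopy(slx)
    for key, default_value in defaults.items():
        if key in OUTPUT_KEYS:
            # Defaults alone never trigger rendering: merge only if the SLX has the key.
            if key in merged:
                merged[key] = deep_merge(default_value, merged[key] or {})
        elif key not in merged:
            merged[key] = copy.deepcopy(default_value)
    return merged


def derive_path_to_robot(slx: dict) -> dict:
    """Fill in the conventional robot path on non-empty subdicts that lack one."""
    for key in OUTPUT_KEYS:
        sub = slx.get(key)
        if sub and "pathToRobot" not in sub:
            sub["pathToRobot"] = f"codebundles/{slx['codeBundle']}/{key}.robot"
    return slx
```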

The simulate CLI is a client that POSTs to the workspace builder's Django REST service; the consumer must have the service running before invoking simulate. Adds a Prerequisites section with two run shapes:

- Local dev (manage.py runserver in a second terminal): good for iterating on the simulator code itself.
- Containerized (docker run runwhen-local + docker exec simulate): the shape an external test suite repo should adopt, pinning a specific image tag and mounting test fixtures into /shared. References .test/k8s/upload/Taskfile.yaml for a concrete reference.

Also documents the --rest-service-host CLI flag for non-default hosts.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ervice

When simulate is invoked without a user-supplied --rest-service-host, the CLI probes localhost:8000. If reachable, it uses the existing service. If not, it spawns an embedded Django dev server on a free port, waits for it to bind, runs the request through it, and tears it down via atexit.

This makes the simulator self-contained for standalone consumers, with no need to start manage.py runserver in another terminal first. The existing run / upload commands and the existing Docker / containerized flow are untouched: when something is already serving on localhost:8000 (or --rest-service-host is supplied), no embedding happens.

The embedded server binds on 127.0.0.1 but the CLI hits it through "localhost" so Django's default ALLOWED_HOSTS=["localhost"] config accepts the requests.

Verified end-to-end on a cold invocation: 5-SLX example uploaded to a real PAPI without any pre-running REST service. All 20 simulator unit tests still pass (they POST directly to the Django test client and don't touch the embedded path).

Doc updated: Prerequisites section now leads with the standalone shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
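The probe-then-embed decision can be sketched with the standard library alone. The function names and the `host:port` string format are assumptions, and actually starting the embedded Django server is left out of the sketch.

```python
import socket
from typing import Optional, Tuple


def is_reachable(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def choose_rest_service(user_host: Optional[str] = None) -> Tuple[str, bool]:
    """Return (host:port, needs_embedded_server)."""
    if user_host:
        return user_host, False  # an explicit host always wins
    if is_reachable("localhost", 8000):
        return "localhost:8000", False  # reuse the running service
    # Nothing is listening: the caller would spawn an embedded server here.
    return f"localhost:{find_free_port()}", True
```

Note that a port found this way can in principle be taken by another process before the server binds to it, which is one reason to wait for the server to bind before sending the request.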

Rohit-Ekbote and others added 11 commits May 5, 2026 23:23

feat(simulator): scaffold test_synth indexer as no-op

d75eb30

feat(simulator): synthesize TestResource instances from testConfig YAML

e1ba328

feat(simulator): passthrough rule via synthetic codecollection on kub…

e50ac0d

…ernetes platform

feat(simulator): runbook template + render verification

b3e931e

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Rohit-Ekbote requested a review from stewartshea as a code owner May 5, 2026 20:46

Rohit-Ekbote and others added 8 commits May 6, 2026 12:56

Rohit-Ekbote wants to merge 19 commits into main from emdash/qna-6ux
