Skip to content

Codegen audit: strict request/response validation + spec-evolution coverage#11

Merged
caballeto merged 6 commits into
mainfrom
chore/codegen-audit-phase-0-and-1
Apr 21, 2026
Merged

Codegen audit: strict request/response validation + spec-evolution coverage#11
caballeto merged 6 commits into
mainfrom
chore/codegen-audit-phase-0-and-1

Conversation

@caballeto
Copy link
Copy Markdown
Member

Part of the cross-repo codegen audit (devhelmhq/mono#246). Brings the CLI in line with policies P1–P5 from cowork/design/040-codegen-policies.md and adds spec-evolution coverage so future regressions surface in CI.

Summary

Codegen pipeline

  • scripts/generate-zod.mjs: switch to shouldExportAllSchemas: true + strictObjects: true. Was silently using a non-existent exportSchemas flag, so single-use DTOs were inlined and lost. Schema count: 136 → 340.
  • Generated spec-facts.generated.ts enum tuples now power both CLI flag enumerations and Zod constraints — single source of truth.
  • Removed hand-rolled src/lib/response-schemas.ts (was drifting from the spec); all consumers now import from api-zod.generated.ts.
  • TS strictness: enabled noUncheckedIndexedAccess.

Request-side validation (P2)

  • ResourceConfig<T> no longer has a T = unknown default; every call site commits to a concrete DTO.
  • New CreatableResource<T> / UpdatableResource<T> interfaces require bodyBuilder, createRequestSchema, updateRequestSchema (typed ZodType<object> so primitive schemas are a compile error).
  • Removed every fallback ternary (?? unwrapData(...), bodyBuilder ? ... : raw, responseSchema ? ... : ...) in crud-commands.ts — paths are now unconditionally schema-validated.
  • Per-resource createRequestSchema / updateRequestSchema wired up for monitors, alert-channels, notification-policies, environments, secrets, tags, resource-groups, webhooks, status-pages, incidents, api-keys.

Response-side validation (P1)

  • Every responseSchema is wired through parseSingle / fetchPaginatedValidated with a .strict() envelope, so unknown top-level fields surface as a typed ValidationError.

Type cleanup

  • Dropped per-variant as CreateMonitorRequest['config'], as CreateAlertChannelRequest['config'], as components['schemas']['StatusPageBranding'] casts in buildMonitorConfig, parseAlertChannelConfigFlag, and loadBrandingFile — outer schema parse enforces shape at runtime.
  • Replaced as IncidentSeverity / as boolean with String(...) / Boolean(...) coercions backstopped by the request-schema parse.

Resource-level fixes that fell out of strict validation

  • Removed dead SECRETS.--environment flag (no API field).
  • Added ENVIRONMENTS.--default for the now-required isDefault field.
  • Marked RESOURCE_GROUPS.update.--name required (matches UpdateResourceGroupRequest which is not .partial()).

Spec-evolution coverage (in monorepo PR)

The new tests/surfaces/evolution/surfaces/test_cli.py scenarios in mono#246 exercise this branch's build to assert P1 (response strict envelope), P4 (error taxonomy + exit codes), and forward-compat (add_property on response/request DTO, add_inline_enum_value on INCIDENT_SEVERITIES, unknown-flag rejection).

Test plan

  • npm run build — all generators + tsc clean
  • npm test — 846 unit tests passing
  • npm run lint — clean
  • tests/surfaces/cli/ integration suite passes against this branch
  • tests/surfaces/evolution/surfaces/test_cli.py — 15 passing + 2 documented xfails

Made with Cursor

…s hook

Phase 0/1 of the cross-repo codegen audit. Aligns with the policy doc
landing in monorepo at `cowork/design/040-codegen-policies.md`.

- `scripts/generate-zod.mjs`: when an enum referenced by the
  spec-facts module is missing from the spec (e.g. after a
  drop_inline_enum_value mutation in the evolution harness), emit
  `export const HTTP_METHODS = [] as const` instead of dropping the
  export entirely. This keeps handwritten code that imports the
  constant compilable; if the constant becomes empty at runtime the
  surrounding command-validation logic still rejects every value as
  expected.
- `scripts/regen-from.sh`: standardised entrypoint the
  spec-evolution harness drives. Forces `tsc -b --force` so the
  incremental cache never masks a stale build between mutations.

Two known limitations are tracked as `xfail` in the harness suite:
  * dropping a `oneOf` variant referenced by name in handwritten
    apiSchemas.ts re-exports
  * dropping a property referenced by name on a typed Zod helper
Those require a follow-up to detach handwritten code from generated
identifiers and will be addressed in Phase 2.

Made-with: Cursor
…s from spec, enable noUncheckedIndexedAccess

Phase 2 cleanup across the CLI codegen pipeline:

- Wire `parseSingle`/`parsePage`/`parseCursorPage` response validation
  through `apiGetSingle` so commands no longer rely on `data as T` casts
  in `checkedFetch` (Bug #5, P1 strict-fail on response envelopes).
- Generate `MATCH_RULE_TYPES`, `WEBHOOK_EVENT_TYPES`, and
  `STATUS_PAGE_VISIBILITIES` from the OpenAPI spec via
  `spec-facts.generated.ts` instead of hardcoding them in
  `yaml/zod-schemas.ts` and command flags (Bug #14).
- Drive the imperative `--visibility` flag for `status-pages` from the
  spec-derived `STATUS_PAGE_VISIBILITIES` tuple instead of a hardcoded
  one-element array (Bugs #1, #2).
- Drop the `@ts-nocheck` pragma on `api-zod.generated.ts` (now
  preprocessed for Zod-3 compatibility during generation) and fix the
  type fallout in command files.
- Enable `noUncheckedIndexedAccess: true` in `tsconfig.json` and refactor
  index-loop sites in YAML resolver / validator / interpolation /
  child-reconciler / parser / state show to satisfy it.

Made-with: Cursor
…s.ts

The CLI's openapi-zod-client invocation passed `exportSchemas: true` —
which isn't a valid option and was silently ignored — so single-use
response DTOs (MonitorDto, IncidentDto, AuthMeResponse, …) were inlined
into endpoint definitions and stripped by extractSchemas(). To fill the
gap, response-schemas.ts hand-rolled `.passthrough()` Zod schemas for a
handful of DTOs, which then drifted from the spec (missing fields,
wrong nullability, no enum narrowing).

Changes:
  * Replace `exportSchemas` with the actual option name
    `shouldExportAllSchemas: true` (matches sdk-js). The schema count
    jumps 136 → 340 and `apiSchemas.MonitorDto` etc. are now first-class
    exports.
  * Delete src/lib/response-schemas.ts entirely.
  * auth/me, status, yaml/entitlements switch to apiSchemas.AuthMeResponse
    / DashboardOverviewDto from the generated file. AuthMeResponse type
    now flows from `z.infer<typeof apiSchemas.AuthMeResponse>` so spec
    drift surfaces as a typed ValidationError at the API boundary.
  * Wire `responseSchema: apiSchemas.<Dto>` into every CRUD ResourceConfig
    in resources.ts (monitors, incidents, alert-channels, notification-
    policies, environments, secrets, tags, resource-groups, webhooks,
    api-keys, dependencies, status-pages). Update ResourceConfig docs to
    reflect that schemas are no longer opt-in.
  * Update entitlements.test fixture to match the strict generated schema
    (KeyInfo.createdAt now requires offset, EntitlementDto requires
    defaultValue + overridden, PlanInfo requires trialActive). Tests
    previously passed against the lenient hand-rolled passthrough.

Pre-existing build breakage on this branch (introduced by the latest
spec sync that narrowed subscribedEvents and MatchRule.type to literal
unions) is fixed in passing:

  * Tighten YamlMatchRule.type and YamlWebhook.subscribedEvents to the
    spec-facts narrowed types (MatchRuleTypes, WebhookEventTypes).
  * Make sortedIds<T extends string>() generic so it preserves the
    narrowed item type through the snapshot pipeline.
  * Add parseWebhookEvents() in resources.ts to validate --events flag
    values against WEBHOOK_EVENT_TYPES with a clear single-shot error
    listing all unknown values.
  * Cast WebhookEndpointDto.subscribedEvents → WebhookSnapshot enum at
    one boundary point with a comment flagging the spec asymmetry
    (DTO emits string[]; CreateRequest narrows to enum) for follow-up
    in the API repo.

All 846 unit tests + lint pass.

Made-with: Cursor
Tightens the CRUD factory's `ResourceConfig` so every code path is
schema-backed at runtime — no more `?? unwrapData(...)` fall-throughs:

- `responseSchema`, `bodyBuilder`, `createRequestSchema`,
  `updateRequestSchema` are now non-optional on the
  `CreatableResource` / `UpdatableResource` subtypes.
- `T` on `ResourceConfig<T>` no longer defaults to `unknown`; every
  call site must commit to a concrete DTO.
- `ZodType<object>` (rather than `ZodType<unknown>`) on request schemas
  so a primitive schema like `z.string()` is a compile error.
- Added explicit `bodyBuilder`s for `ENVIRONMENTS`, `SECRETS`, `TAGS`,
  `RESOURCE_GROUPS` (previously leaked raw flag dicts into `apiPost`).
- Per-request runtime validation via `parseSchema(createRequestSchema,
  ...)` / `updateRequestSchema` — invalid input now fails locally with
  a `ValidationError` and a path into the offending field instead of a
  generic 400 from the API.
- Dropped per-variant `as CreateMonitorRequest['config']` /
  `as CreateAlertChannelRequest['config']` /
  `as components['schemas']['StatusPageBranding']` casts in
  `buildMonitorConfig`, `parseAlertChannelConfigFlag`, and
  `loadBrandingFile` — the outer schema parse enforces the union /
  shape at runtime, so the inner casts were noise.

Resource-level fixes that fell out of strict request validation:

- `SECRETS.--environment` flag was dead (no API field) — removed.
- `ENVIRONMENTS.--default` flag added; `CreateEnvironmentRequest.isDefault`
  is required (not nullish) so omitting it would now reject locally.
- `RESOURCE_GROUPS.updateFlags.name` marked `required: true` to match
  `UpdateResourceGroupRequest` (no `.partial()`).

Build, lint, and 846 unit tests all pass.

Made-with: Cursor
Adds three integration tests that exercise the deploy state machine's
recovery path when an API call mid-apply succeeds for a parent resource
but fails for one of its dependents. Each test runs deploy once with a
failure injected at a specific call site, asserts the partial state
file contents (parent persisted, missing dependent absent), then
re-runs deploy against the same fake API and asserts convergence.

Coverage:
- (a) resource group created + monitor created, but membership add
  fails on first deploy; second deploy adds the missing membership.
- (b) status page created, but a child component create fails;
  second deploy creates the missing component.
- (c) status page created, but a child component group create fails;
  second deploy creates the missing group.

These pin the post-fix behavior of the historical "phantom success"
bug where the entire desired state was written even when individual
operations had failed mid-run.

Made-with: Cursor
Adds a `PartialApplyError` carrying `{id?, children?}` that handlers raise
when an apply step fails after partially mutating the API. Wired through
four layers so any kind of partial failure now persists what succeeded:

- applyChildDiff continues through deletes/creates/updates on individual
  failure, accumulates surviving childState, and throws PartialApplyError
  with it. Failed deletes carry the orphan forward; failed updates record
  an empty-attrs marker so the next diff still sees drift and retries.
- reconcileStatusPageChildren catches errors from the groups phase, drains
  partial children, and still runs the components phase so a single broken
  group doesn't block independent siblings.
- statusPage applyCreate/applyUpdate re-throw with the parent id attached
  so the applier can record it in state alongside surviving children.
- applier.ts catch branches detect PartialApplyError, persist partial
  state via stateEntries + refs.set, then record the failure.

Mirrors the partial-state-with-warning contract used by the Terraform
provider for monitor.Update / webhook.Create.

Test coverage: 10 partial-failure convergence scenarios (a-j) covering
both code paths (create + update), both child types (groups +
components), both top-level (RG memberships) and child-level (page
children), all op types (creates/updates/deletes failing), cross-phase
isolation, and out-of-band drift recovery via state-pinned PUT.

Made-with: Cursor
@caballeto caballeto merged commit 184e16b into main Apr 21, 2026
3 checks passed
@caballeto caballeto deleted the chore/codegen-audit-phase-0-and-1 branch April 21, 2026 12:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant