Skip to content

feat(cli): bundle agentic-CLI improvements (swamp-club#235)#1305

Merged
stack72 merged 3 commits intomainfrom
worktree-235
May 5, 2026
Merged

feat(cli): bundle agentic-CLI improvements (swamp-club#235)#1305
stack72 merged 3 commits intomainfrom
worktree-235

Conversation

@stack72
Copy link
Copy Markdown
Contributor

@stack72 stack72 commented May 5, 2026

Summary

Closes swamp-club#235.

Bundles 6 agentic-CLI ergonomics fixes that make swamp friendlier for AI-agent integrations and JSON-piping workflows.

What's in the bundle

  1. JSON-mode logger isolationparentSinks: "override" on the run/workflow/logtape-meta category loggers in jsonMode. Closes the leak path where info-level records could inherit root sinks under `--json`.
  2. Single-emitter `renderError` — in JSON mode, error JSON lands on stdout exactly once. `createJsonErrorSink` deleted (only 2 `logger.fatal` callers in src/, both inside `renderError`). `buildErrorJson` gains an optional `code` field; `UserError` gains a `code` constructor parameter.
  3. `--input key:json=` suffix — JSON-typed values for arrays/objects, with leaf-segment-only suffix stripping for nested keys (`server.config:json={...}`). Auto-detection deferred to swamp-club follow-up.
  4. `--repo-dir` parity — added to `model type search` and top-level `doctor` so agents passing the flag uniformly stop hitting "Unknown option" failures.
  5. `data gc --json` discoverability — gate already correct (no prompt in JSON mode); added explicit help-text example.
  6. `--timeout` flag — `model method run` and `workflow run` accept `30s`/`5m`/`1h` etc. Uses libswamp's existing `LibSwampContext.withTimeout(ms)` and `parseDuration`. Surfaces `code: "cancelled"` on abort. Documented as cooperative — see swamp-club#247 for built-in models honoring AbortSignal.

Design + skills

  • `design/rendering.md` — new "JSON Mode Output Contract" section with four normative guarantees.
  • `design/inputs.md` — `:json` suffix documentation with flat/nested examples.
  • `.claude/skills/swamp-model` and `swamp-workflow` — `:json` example added.

Regression guard

  • `integration/json_isolation_test.ts` exercises both directions: stdout-only-JSON under `--json` for version/data list/model method run, and confirms log mode still emits records.

Out of scope, tracked separately

Test plan

  • `deno fmt`
  • `deno lint`
  • `deno check`
  • `deno run test` — 5429 passed, 0 failed
  • `deno run compile` — binary builds
  • Manual smoke against fresh scratch repo:
    • `model method run --json` stdout is parseable JSON, no log leaks
    • `--input keywords:json=["a","b"]` and `--input server.config:json={...}` work
    • `swamp model type search --repo-dir ` outside a repo
    • `swamp doctor --repo-dir ` accepts the flag
    • `data gc --json` non-interactive
    • `--timeout 1s` parses and threads signal (cooperative cancellation; see swamp type get should be replaced by swamp model type get #247)

🤖 Generated with Claude Code

…:json, --timeout, --repo-dir parity (swamp-club#235)

Bundle 6 agentic-CLI ergonomics fixes (#235):

1. JSON-mode logger isolation — set parentSinks: 'override' on
   ["model","method","run"], ["workflow","run"], and ["logtape","meta"]
   category loggers in jsonMode so they cannot inherit the root sink. Clear
   logtape.meta's own sinks too. Closes the leak path identified in #235.

2. Single-emitter renderError — in jsonMode, write JSON to stdout exactly
   once and skip logger.fatal. Delete createJsonErrorSink (audit confirmed
   only 2 logger.fatal calls in src/, both inside renderError). Extend
   buildErrorJson with optional `code` field; UserError gains a code
   parameter. Updated existing integration tests to assert against the
   combined stdout+stderr stream rather than stderr alone.

3. --input :json suffix — keys ending :json (on the leaf segment of any
   dot-notation path) parse the value as JSON. Bypasses the @file shorthand.
   Hard error on parse failure. Test matrix covers flat array/object, nested
   keys, escape interactions, no-suffix fallback, parse failures, and
   precedence with --input-file.

4. --repo-dir on `model type search` and top-level `doctor` — closes the
   "Unknown option --repo-dir" failure mode for agentic flows that pass
   --repo-dir uniformly.

5. data gc --json — already non-interactive (gate at outputMode === 'log');
   add an explicit JSON example to help text so agents can discover the
   bypass without reading source.

6. --timeout flag for `model method run` and `workflow run` — wires
   AbortController via libswamp's existing LibSwampContext.withTimeout(ms).
   Reuses libswamp's parseDuration; adds parseTimeout helper for second-
   level granularity. Surfaces SwampError code "cancelled" via the new
   buildErrorJson code field. Documented as cooperative cancellation —
   built-in models that don't honor AbortSignal are tracked separately
   (swamp-club#247).

Design docs:
- design/rendering.md gains a normative "JSON Mode Output Contract" section
  with four guarantees: stdout = single JSON document; stderr free for logs;
  errors emit { error, stack?, code? } on stdout with non-zero exit;
  no interactive prompts in JSON mode.
- design/inputs.md gains a :json suffix section with examples.
- swamp-model and swamp-workflow skills updated with :json examples.

Regression test:
- integration/json_isolation_test.ts asserts BOTH directions across version,
  data list, and model method run: --json stdout-only-JSON; log mode still
  emits.

Out of scope, tracked separately:
- swamp-club#246 — Reader-lock or lock-free read path for data list/get/search/query (original #235 item 5).
- swamp-club#247 — Built-in models must honor AbortSignal so --timeout works in practice.
- swamp-uat#185 — UAT (adversarial): --timeout cancellation must release model locks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
github-actions[bot]
github-actions Bot previously approved these changes May 5, 2026
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. --timeout description is too long (src/cli/commands/model_method_run.ts:137, src/cli/commands/workflow_run.ts:128). At ~220 characters, it'll produce multi-line wrapped text that breaks the visual alignment of the options table in --help. Compare to --driver: "Override execution driver (e.g. raw, docker)". Suggested trim:

    "Cancellation deadline (e.g. 30s, 5m, 1h). Cooperative — only honored by methods that check AbortSignal."
    

    The "long-running model methods that ignore the signal" caveat is already documented in design/inputs.md and the PR description.

  2. --repo-dir description on type_search is inconsistent (src/cli/commands/type_search.ts:84). Every other command that accepts --repo-dir uses "Repository directory (env: SWAMP_REPO_DIR)". This one says "Repository directory (informational; type search does not require an initialized repo)" — dropping the env var hint and adding a parenthetical that reads as if the flag is ignored. Users scripting via SWAMP_REPO_DIR won't see the familiar pattern, and the word "informational" implies the flag does nothing. Consider: "Repository directory (env: SWAMP_REPO_DIR; not required for type search)".

  3. :json suffix is not mentioned in --input help text (model_method_run.ts, workflow_run.ts). The --input description still says "Input values (key=value or JSON)" — the new :json suffix is entirely invisible from --help. The PR added a discoverability example for data gc --json (item 5); the same treatment would help item 3. Consider adding one example per command, e.g.:

    .example("Pass an array input", "swamp model method run my-model search --input 'keywords:json=[\"a\",\"b\"]'")
    
  4. Minor wording inconsistency in --timeout description between the two commands: model_method_run.ts says "...long-running model methods that ignore the signal will not be interrupted" while workflow_run.ts says "...steps whose model methods honor AbortSignal." Same concept, different length. Worth aligning once the description is trimmed per suggestion 1.

Verdict

PASS — no blocking issues. All five ergonomic improvements work correctly: JSON-mode logger isolation, single-emitter renderError, :json suffix, --repo-dir parity, data gc --json discoverability, and --timeout. Integration test coverage is thorough. The suggestions above are polish items that don't block merge.

CLI UX review (PR #1305 review feedback):
- Trim --timeout description on model method run + workflow run from ~220
  chars to ~100; align wording across both commands.
- Restore familiar --repo-dir wording on type_search ("env: SWAMP_REPO_DIR;
  not required for type search") so users scripting via SWAMP_REPO_DIR see
  the standard pattern.
- Add a --input :json discoverability example to model method run + workflow
  run help text (parallel treatment to the data gc --json example).

Skill review:
- swamp-data-query content score was 0.7999... (just below the 0.90 strict
  threshold from scripts/review_skills.ts) due to LLM-judge rounding.
  Address the two judge suggestions: add an "Empty results and errors"
  section covering empty matches, malformed CEL, missing-field semantics,
  and binary content; clarify the references/fields.md path. Score now 94%.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

  1. --timeout format inconsistencysrc/cli/commands/model_method_run.ts:140 and workflow_run.ts:127 vs datastore_sync.ts:102

    datastore sync already ships --timeout <seconds:integer> where users pass a plain integer: swamp datastore sync --timeout 1800. The new --timeout on model method run and workflow run uses a different format: --timeout <duration:string> requiring a unit suffix (30s, 5m, 1h).

    A user who transfers their knowledge of --timeout from datastore sync and types swamp model method run my-model exec --timeout 1800 gets:

    Invalid duration format: "1800". Expected format like 1h, 1d, 7d, 1w, 1mo
    

    The error doesn't even mention s (the unit they'd reach for first), because parseDuration predates this feature. The two --timeout conventions on two commands in the same binary will reliably trip users.

    Suggested fix (pick one):

    • Option A (preferred): Update duration_parser.ts:parseTimeout to also accept a bare integer and treat it as seconds, so --timeout 1800 works on all three commands. Update the help text to say "Cancellation deadline — seconds (e.g. 30, 1800) or duration string (e.g. 30s, 5m, 1h)." and add an example for bare seconds.
    • Option B: Change the new --timeout to <seconds:integer> matching datastore sync, and document 30s notation as a convenience alias that parseTimeout accepts by stripping the s suffix.

    Either option closes the inconsistency; Option A is more ergonomic.

Suggestions

  1. --timeout help text uses jargonmodel_method_run.ts:141 and workflow_run.ts:128: "Cooperative — only honored by methods that check AbortSignal." AbortSignal is a V8/Deno implementation term; most CLI users won't know it. Consider "Cooperative — only honored by models that support cancellation (see swamp-club#247)." or simply dropping the parenthetical from the help text since the limitation is noted in the design docs.

  2. parseDuration error hint doesn't list s — if a user somehow reaches the parseDuration fallback with an invalid string, the error says "Expected format like 1h, 1d, 7d, 1w, 1mo" — missing s. Low impact given parseTimeout catches the s case first, but worth a note in duration_parser.ts that the fallback error is inherited and incomplete.

Verdict

NEEDS CHANGES--timeout on the two new commands uses a different format than the existing datastore sync --timeout, which will silently break user expectations for anyone who already uses that flag.

github-actions[bot]
github-actions Bot previously approved these changes May 5, 2026
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Well-structured bundle of agentic-CLI improvements. The JSON-mode logger isolation fix is architecturally sound — severing sink inheritance with parentSinks: "override" and consolidating fatal output into a single renderError emitter is the right design. The :json input suffix, --timeout flag, and --repo-dir parity changes are all clean CLI-layer additions with proper test coverage.

Blocking Issues

None.

Suggestions

  1. Stale comment in logger.ts:108-110 — The comment says "otherwise a child logger emitting an info record would also emit through the root's jsonError sink" but the jsonError sink was deleted in this PR. The root logger in JSON mode now has sinks: []. The comment should reference the previous bug rather than a current sink name.

  2. Missing unit test for doctor.ts --repo-dir — Other commands that gained --repo-dir in this PR (e.g. type_search) got a corresponding test. doctor.ts didn't, though its top-level action only shows help so the risk is minimal.

  3. Integration tests now check stderr + stdout — The pattern assertStringIncludes(result.stderr + result.stdout, ...) is correct given the single-emitter change, but it no longer asserts which stream the error appears on. Fine for content-checking tests, but if you later want to enforce the JSON Mode Output Contract at the integration level, consider separate assertions on stdout (structured JSON) vs stderr (log records).

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

None found. The code paths are well-structured and the changes are internally consistent.

Medium

  1. src/cli/input_parser.ts:131-153:json suffix is a silent semantic change for keys containing literal :json.
    If any existing model definition has input keys named like foo:json, the command --input foo:json=bar would now attempt to parse bar as JSON instead of setting the key foo:json to the string "bar". This is a deliberate design choice and is unlikely in practice (nobody names keys *:json), but it is technically a breaking change to the input parsing contract with no opt-out.
    Mitigation: Already mitigated by the leaf.length > jsonSuffix.length guard (a bare :json key is correctly rejected). The remaining risk is real keys ending in :json — document this in a migration note if any extensions use such key names.

  2. src/cli/duration_parser.ts:33parseInt on extremely large numeric strings produces imprecise results.
    parseTimeout("99999999999999999999s")parseInt yields 1e+20, then * 1000 produces 1e+23 milliseconds. AbortSignal.timeout(1e+23) behavior is implementation-defined for values exceeding Number.MAX_SAFE_INTEGER. A practical user would never type this, but an agent constructing the flag from user input could.
    Impact: Theoretical — the timeout would just be "very large" rather than precise, effectively meaning "no timeout."

Low

  1. src/cli/input_parser.ts:43-62setNestedValue traverses __proto__ / constructor paths (pre-existing).
    --input __proto__.polluted=yes would walk into Object.prototype via the __proto__ getter and set Object.prototype.polluted = "yes". This is prototype pollution. However: (a) this is pre-existing code not introduced by this PR, (b) CLI argument control requires shell access, (c) the agentic use case is the only plausible vector and even then the pollution only affects the current short-lived process. Not blocking — pre-existing concern for a separate hardening PR.

  2. src/cli/commands/workflow_run.ts:295+303flushModelLocks() can be called twice (pre-existing).
    On the success path (line 295), if flushModelLocks() throws, the catch block (line 303) calls it again. Whether double-release is safe depends on the lock implementation. Again, pre-existing and not introduced by this PR.

Verdict

PASS — The six bundled changes are well-implemented. The JSON isolation fix correctly severs sink inheritance. The :json suffix parser handles edge cases (empty leaf, nested keys, @file bypass). The parseTimeout function correctly delegates to parseDuration while adding second-level granularity. Error output now has a clean single-emitter contract in JSON mode. Integration tests are updated consistently. The --repo-dir additions are trivially correct (accepted but unused where not needed). No critical or high issues found.

…stency

The new --timeout on `model method run` and `workflow run` originally
required a unit-suffixed string ("30s", "5m", "1h"). But `swamp datastore
sync --timeout` already ships as `<seconds:integer>` — users who transfer
their knowledge type `--timeout 1800` and get a confusing parseDuration
error that doesn't even mention `s` as a valid unit.

Update parseTimeout to accept a bare integer as seconds (matching
datastore sync). Suffixed forms still work: 30s, 5m, 1h, 1d, 1w. Update
help text to call out both forms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CLI UX Review

Blocking

None.

Suggestions

  1. --input description doesn't mention the :json suffix (src/cli/commands/model_method_run.ts:94, workflow_run.ts:85). The current text is "Input values (key=value or JSON)" — this still reads as only two syntaxes. The new :json suffix is covered by the example added in this PR, which is good for discoverability, but a user reading only the flag descriptions (e.g. via swamp model method run --help | grep -A1 input) won't see the new form. Suggested description update: "Input values (key=value, key:json=, or JSON object)".

  2. No --timeout example in help text (model_method_run.ts, workflow_run.ts). The flag description is clear and self-explanatory, but a concrete example (e.g. swamp model method run my-server deploy --input env=prod --timeout 5m) would make the feature more discoverable via --help. Low priority since the description includes inline examples.

  3. type_search --repo-dir is accepted but has no functional effect (src/cli/commands/type_search.ts:83). The description "not required for type search" is good but doesn't make explicit that passing the flag is a no-op for this command. This is intentional (agentic flag-uniformity), but a user who passes --repo-dir /alt-repo might expect extensions from that repo to appear in results. Consider: "(accepted for flag uniformity with other commands; type search reads only the global extension catalog)".

Verdict

PASS — All six agentic-ergonomics fixes are clean from a user-facing perspective. The --timeout description honestly discloses cooperative cancellation semantics, data gc --json discoverability is improved, the :json suffix is shown in examples, and the JSON output contract is solid (error JSON on stdout, no log leaks). None of the three suggestions above block merge.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Well-structured bundle of agentic-CLI improvements. The changes follow clean DDD layering: parseTimeout and :json suffix parsing live correctly in the CLI layer, UserError.code is a natural domain-level extension that aligns with SwampError.code in libswamp, and the renderer error-propagation bridge (e.error.codeUserError constructor) is the right way to cross the libswamp→presentation boundary. All libswamp imports go through mod.ts. License headers are present on all new files. Test coverage is solid across unit and integration levels.

Blocking Issues

None.

Suggestions

  1. Stale comment in logger.ts:108-110 — The comment still references "the root's jsonError sink" but createJsonErrorSink was deleted and the root logger's sinks are now [] in JSON mode. The parentSinks: 'override' is still correct behavior, but the comment's "why" is misleading. Consider updating to reflect the current design (preventing any future sink additions from leaking).

  2. workflow_run_test.ts missing models barrel import — CLAUDE.md specifies CLI command tests should include import "../../domain/models/models.ts" (see data_get_test.ts for the canonical pattern). The new workflow_run_test.ts omits it. This test works today because it only checks option names via dynamic import, but including the barrel import would be more consistent and future-proof. (Pre-existing model_method_run_test.ts has the same gap, which isn't this PR's fault.)

  3. Minor: unreachable negative-integer path in parseTimeout — The seconds <= 0 guard after parseInt on a ^\d+$-matched string can only trigger for "0" (since the regex excludes negative numbers). The guard is fine as defense-in-depth, just noting it's not a separate code path to worry about testing.

Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adversarial Review

Critical / High

No critical or high-severity issues found. The changes are well-structured and correctly implemented across all six features.

Medium

  1. src/infrastructure/logging/logger.ts:109 — Stale comment references deleted jsonError sink. The comment reads "otherwise a child logger emitting an info record would also emit through the root's jsonError sink, polluting stderr with malformed JSON." The jsonError sink was deleted in this PR. The explanation of why parentSinks: 'override' is needed is still accurate, but referencing a non-existent sink could confuse future readers. Suggest rewording to: "otherwise a child logger emitting an info record would also inherit the root logger's sinks."

  2. src/cli/duration_parser.ts:40parseTimeout("0") throws but the test only covers the 0 case, not the bare-integer overflow boundary. parseInt("999999999999999999", 10) returns 1000000000000000000 due to IEEE 754 precision loss. 1000000000000000000 * 1000 is 1e21 ms (~31,709 years). Not a realistic user input, but the function doc says "Returns milliseconds, always > 0" — a Number.isSafeInteger guard would be belt-and-suspenders. Not blocking — this is within normal CLI tolerance.

  3. Integration test assertions now check stderr + stdout instead of just stderr. For example integration/inputs_test.ts:234: assertStringIncludes(result.stderr + result.stdout, "must be one of"). While this correctly accommodates the new JSON-mode error-on-stdout path, it means the tests would also pass if log-mode errors accidentally moved from stderr to stdout. This reduces the diagnostic precision of the test suite. Not blocking — the JSON isolation integration test (json_isolation_test.ts) separately guards the channel contract.

Low

  1. src/cli/input_parser.ts:138 — A key named literally :json (e.g. :json=value) does NOT trigger JSON parsing because of the leaf.length > jsonSuffix.length guard (5 > 5 is false). This is correct and intentional, but undocumented. A user who types --input ':json=["a"]' gets the raw string ["a"] stored under the key :json, which may be surprising. Not worth fixing — the edge case is extremely unlikely.

  2. src/cli/duration_parser.ts:52"1.5s" is silently rejected by the /^(\d+)s$/ regex (the dot is not \d). It falls through to parseDuration("1.5s") which also rejects it with a UserError about format. The error message from parseDuration says "Expected format like 1h, 1d, 7d" — it doesn't mention s as valid, which may confuse users who are close to valid input. Minor UX gap.

  3. src/libswamp/context.ts:49AbortSignal.timeout(ms) creates an internal timer with no cancellation handle. If --timeout 999999999 is passed, the timer lives for the process lifetime. No real leak since the process ends, but AbortSignal.any() + AbortSignal.timeout() together keep references alive until both fire. Purely theoretical.

Verdict

PASS — Solid implementation across all six features. The JSON-mode logger isolation is correctly designed with parentSinks: 'override', the single-emitter renderError pattern is clean, the :json suffix parser handles edge cases well (nested keys, @file bypass, empty leaves), and parseTimeout correctly unifies the bare-integer and suffixed forms. Test coverage is thorough. The medium findings are cosmetic/diagnostic — none represent incorrect behavior.

@stack72 stack72 merged commit dbe5588 into main May 5, 2026
11 checks passed
@stack72 stack72 deleted the worktree-235 branch May 5, 2026 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant