Skip to content

fix(coreutils-port): accept let-bound Command chain in uu_app validator#1624

Open
chaliy wants to merge 1 commit into
mainfrom
claude/keen-ride-QHE6a
Open

fix(coreutils-port): accept let-bound Command chain in uu_app validator#1624
chaliy wants to merge 1 commit into
mainfrom
claude/keen-ride-QHE6a

Conversation

@chaliy
Copy link
Copy Markdown
Contributor

@chaliy chaliy commented May 11, 2026

What

Extend the args-mode uu_app() validator in bashkit-coreutils-port to accept the safe two-statement shape

let <ident> = Command::new(...)<chain>;
<ident>.<method>(...)<chain>

in addition to the existing single tail-expression shape.

Why

The weekly Coreutils Args Drift workflow is currently red: upstream uutils refactored truncate's uu_app() into

let cmd = Command::new(...)...;
uucore::clap_localization::configure_localized_command(cmd).arg(...)

The Rewriter already folds configure_localized_command(cmd) to just cmd, leaving a clean let + tail-chain body. The strict single-expression validator (added under TM-INF-025) rejected that shape, so the codegen aborts with:

error: unsafe uu_app body: expected a single clap Command builder expression, found 2 statements

This pattern is structurally just a chain split across a binding — no executable side effect — and is the natural way upstream is going to keep writing uu_app(). Permanently rejecting it would mean every drift run fails until someone hand-edits.

How

  • Split validate_uu_app_body into validate_command_chain_expr (shared) and a new validate_let_bound_command_body.
  • The let-binding helper requires:
    • a simple let <ident> pattern (no destructuring, no let-else, no attrs/ref/subpat),
    • an initializer that itself satisfies the existing Command-chain validator (no closures, no disallowed methods, rooted at Command::new),
    • a tail expression that is a method chain whose innermost receiver is exactly the let-bound identifier — i.e. a structural continuation of the same chain, not a fresh Command::new or unrelated expression.
  • Both shapes run through the same UuAppExprValidator visitor, so the TM-INF-025 trust boundary on closures, macros, blocks, async/unsafe, loops/matches, and side-effecting methods (spawn/status/unwrap/...) is unchanged.

Tests

Five new tests in crates/bashkit-coreutils-port/src/args.rs:

Test Asserts
accepts_let_bound_command_with_tail_chain Happy path with the real configure_localized_command(cmd) wrapper — Rewriter folds it, validator accepts the result, generated body uses cmd.arg(...).
rejects_let_with_non_command_initializer let cmd = std::process::Command::new(...).status().unwrap(); fails codegen.
rejects_tail_not_chained_off_let_binding let cmd = Command::new(...); Command::new(...).arg(...) fails (tail must chain on the binding).
rejects_three_statement_body let cmd = ...; std::process::abort(); cmd.arg(...) fails (only 2 stmts allowed).
Existing rejects_executable_statements_in_uu_app, rejects_non_clap_command_root_chain, rejects_unexpected_macro_in_builder_chain Still green.

End-to-end verification:

  • Regenerated all 11 ported utils (cat, ls, mktemp, od, readlink, realpath, shuf, stat, tac, tee, truncate) against the pinned rev 39364b6byte-identical to repo after cargo fmt.
  • Regenerated truncate against current uutils HEAD (where the multi-stmt body lives) — now produces a valid file instead of aborting.

Specs

  • specs/coreutils-args-port.md — documents the two accepted body shapes.
  • specs/threat-model.md TM-INF-025 — updated to describe both shapes and lists all four positive/negative regression tests.

Risk

  • Low. Codegen-only change; no runtime bashkit code paths touched. The validator still rejects every disallowed primitive it rejected before — just now it understands one extra structural shape.

Generated by Claude Code

Upstream uutils refactored truncate's uu_app() to
`let cmd = Command::new(...); configure_localized_command(cmd).arg(...)...`.
The Rewriter folds the wrapper call to `cmd`, leaving a two-statement
`let <ident> = <chain>; <ident>.<method>(...)` body — which the strict
single-expression validator was rejecting, breaking the weekly Coreutils
Args Drift workflow on every regen.

Decompose `validate_uu_app_body` into `validate_command_chain_expr` and a
new `validate_let_bound_command_body` that requires a simple `let <ident>`
pattern, a Command::new-rooted initializer, and a tail method-chain whose
innermost receiver is exactly the let-bound identifier. Both branches
share the existing `UuAppExprValidator` (disallowed methods, closures,
macros, blocks) so TM-INF-025's trust boundary is unchanged.

New regression tests cover the happy path (with the real
`configure_localized_command` wrapper the Rewriter strips), a non-command
initializer, a tail not chained off the binding, and a 3-statement body.

Verified by regenerating all 11 ported utils against the pinned rev
(byte-identical to repo after cargo fmt) and against upstream HEAD
(truncate now regenerates cleanly).

Specs: updated coreutils-args-port.md and threat-model.md TM-INF-025 to
document the broadened accepted shapes and new regression tests.
@cloudflare-workers-and-pages
Copy link
Copy Markdown

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Preview URL Updated (UTC)
✅ Deployment successful!
View logs
bashkit 92d6ab6 Commit Preview URL

Branch Preview URL
May 11 2026, 09:34 AM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant