From 50dd48967b71da79c69bf082f2724aa5f3dd45c0 Mon Sep 17 00:00:00 2001 From: Cannon07 Date: Wed, 3 Jun 2026 16:36:31 +0530 Subject: [PATCH 1/6] docs: handoff for Windows shell write/delete detection (#46 follow-up) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Task handoff for the Windows session: extend bash_detect to PowerShell commands (Remove-Item etc.) + Windows path resolution so neo-tree change indicators fire for shell deletes/writes on Windows. Includes the STEP 0 debug-sample gathering (only doable on Windows), the grammar requirements, the diff.lua git portability fix, and a note to coordinate the bash_detect -> shell_detect structure decision with the improve-architecture pass. Scratchpad — to be removed before this follow-up PR merges. Co-Authored-By: Claude Opus 4.8 --- HANDOFF-shell-detect.md | 99 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 99 insertions(+) create mode 100644 HANDOFF-shell-detect.md diff --git a/HANDOFF-shell-detect.md b/HANDOFF-shell-detect.md new file mode 100644 index 0000000..5da3121 --- /dev/null +++ b/HANDOFF-shell-detect.md @@ -0,0 +1,99 @@ +# Handoff — Windows shell write/delete detection (issue #46 follow-up) + +**For: Claude Code on the Windows machine.** You can run PowerShell, named pipes, +and Windows nvim — which is exactly why this task is yours. The prior work was +done on macOS and could not see what Claude Code actually emits on Windows. + +This is a **follow-up to the merged claudecode Windows slice** (`#73`, squash +commit `ead0c2c` on `main`). Branch: `feat/shell-detect-windows`. + +## The bug + +On Windows, deleting a file (and other shell writes) does **not** mark neo-tree. +Claude Code's `Bash` tool emits **PowerShell** (`Remove-Item …`) instead of `rm`, +and our detector only understands Unix commands + Unix paths, so it finds nothing. + +The preview path (Edit/Write/MultiEdit) already works on Windows — this is the +**Tier-1 change-indicator** path for shell proposals only (see CONTEXT.md +[Tier 1 / Tier 2], [Status], [Origin prefix]). + +## Read first + +- `lua/code-preview/pre_tool/bash_detect.lua` — the whole detector. Unix-only on + two axes: **command vocabulary** (`rm`, `>`/`>>`, `mv`, `cp`, `tee`, `sed -i`) + and **path resolution** (`resolve()`/`looks_like_path()` only handle `/`-absolute + and `~/`; no `C:\`, backslashes, or `%USERPROFILE%`). +- `lua/code-preview/pre_tool/normalisers.lua` — confirms routing: `tool_name="Bash"` + → `handle_bash` → `bash_detect`. Routing is fine; the detector is the gap. +- `tests/plugin/pre_tool_bash_detect_spec.lua` (CI-excluded on Windows) and the + `bash_modified` case in `pre_tool_handle_spec.lua` (marked `pending` on Windows) + — both are this task's to re-enable with Windows-path awareness. +- `docs/adr/0007-...` and `CONTEXT.md` for the cross-OS principles. + +## STEP 0 — gather ground truth (only you can do this) + +With `debug = true` (tail `%LOCALAPPDATA%\nvim-data\code-preview.log`), capture the +exact `tool_name` and raw command string Claude Code emits on Windows for: +- a **delete** (expect `Remove-Item …`?), +- a **file write/redirect** (`Set-Content`? `Out-File`? `Add-Content`? `>`?), +- a **move** and a **copy** if the agent uses them. + +Confirm the `tool_name` is still `"Bash"` (if not, the normaliser tool-map needs a +Windows entry too). **Paste these samples into the PR** — they are the spec inputs. + +## Design requirements (grammar) + +Extend detection to PowerShell, case-insensitively, alongside the existing Unix +grammar (don't break Unix): +- **Delete** → `deleted`: `Remove-Item`/`ri`/`rm`/`del`/`erase`/`rd`/`rmdir` (+ `-Path`, + `-Recurse`, `-Force`, positional). +- **Write** → `bash_created`/`bash_modified`: `Set-Content`/`sc`, `Out-File`, + `Add-Content`/`ac`, `Tee-Object`/`tee`, and `>`/`>>` (PowerShell supports these too). +- **Move/Copy** → write target: `Move-Item`/`mi`/`move`/`ren`/`rni`, `Copy-Item`/`cpi`/`copy`/`cp`. +- **Separators**: PS uses `;`, `|`, newlines; `&&`/`||` exist in PS7 but **not 5.1** + (the installed hook runs under 5.1). `-Path` may take a comma-list and wildcards. +- **Paths**: drive-letter absolute (`C:\…`, `C:/…`), UNC (`\\…`), backslash or + forward slash, `%USERPROFILE%` / `$HOME` / `$env:USERPROFILE`. Keep the `bash_*` + origin-prefix semantics from [Status]/[Origin prefix] unchanged. + +## ⚠ Structure decision is NOT yours to make alone + +A separate session is running `improve-codebase-architecture` (scoped to the +cross-OS shim/discovery/**detection** layer) to decide whether `bash_detect` +should grow into a per-OS `shell_detect` with grammar tables, or just gain a +Windows branch. **Coordinate on that result before committing to a file split** — +get the grammar working behind the current `M.detect(cmd, cwd)` interface first +(callers shouldn't care), so the structure can be refactored independently. + +## Also in this PR (small, decided) + +`lua/code-preview/diff.lua` (~line 60): the `reveal_root = "git"` path runs a +string shell command with `2>/dev/null` (POSIX-only; misbehaves under Windows +cmd). Fix to list-form, no redirect: +`vim.fn.systemlist({"git","-C",parent,"rev-parse","--show-toplevel"})` + the +existing `shell_error` check. Verify on both OSes. + +## Testing bar + +- Re-enable the excluded/pending bash specs with **portable** paths (no hardcoded + `/tmp`, no `os.getenv("HOME")` assumptions — guard or branch per OS). +- Add Windows-grammar specs from your STEP 0 samples. +- macOS/Linux specs must stay green (the detector must remain correct on Unix). +- The CI `windows-test` job (now on `main`) will exercise the Lua specs per-file; + remove `pre_tool_bash_detect_spec.lua` from its exclusion list once it passes on + Windows. + +## Out of scope + +opencode's `vim.fn.system({"cp",…})` (Windows blocker) and the codex/copilot +Windows shims belong to the per-agent rollout PRs, not this one. + +## Suggested skills + +- `tdd` — STEP 0 samples → failing specs → grammar. Good fit here. +- `verify` — confirm a real delete marks neo-tree on Windows end-to-end. + +## Before merge + +Remove this file, and remove `pre_tool_bash_detect_spec.lua` from the CI Windows +exclusion if it now passes. From 96aed6559628c6dc409cfb6f1df69721d397d5f4 Mon Sep 17 00:00:00 2001 From: Cannon07 Date: Thu, 4 Jun 2026 01:33:51 +0530 Subject: [PATCH 2/6] docs: fold architecture review into shell_detect handoff (#46 follow-up) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replaces the "structure decision is not yours" placeholder with the now- decided shape from the improve-codebase-architecture pass: split the two independent axes (path conventions vs command grammar — they don't align to OS), rename bash_detect -> shell_detect behind the same M.detect interface, extract a path-convention adapter, make command matchers table-shaped, and add the PowerShell grammar only after STEP 0 confirms it. Registry/engine and any per-OS file split are explicitly out of scope (deferred by the review). Co-Authored-By: Claude Opus 4.8 --- HANDOFF-shell-detect.md | 71 ++++++++++++++++++++++++++--------------- 1 file changed, 45 insertions(+), 26 deletions(-) diff --git a/HANDOFF-shell-detect.md b/HANDOFF-shell-detect.md index 5da3121..c1a8836 100644 --- a/HANDOFF-shell-detect.md +++ b/HANDOFF-shell-detect.md @@ -41,29 +41,48 @@ exact `tool_name` and raw command string Claude Code emits on Windows for: Confirm the `tool_name` is still `"Bash"` (if not, the normaliser tool-map needs a Windows entry too). **Paste these samples into the PR** — they are the spec inputs. -## Design requirements (grammar) - -Extend detection to PowerShell, case-insensitively, alongside the existing Unix -grammar (don't break Unix): -- **Delete** → `deleted`: `Remove-Item`/`ri`/`rm`/`del`/`erase`/`rd`/`rmdir` (+ `-Path`, - `-Recurse`, `-Force`, positional). -- **Write** → `bash_created`/`bash_modified`: `Set-Content`/`sc`, `Out-File`, - `Add-Content`/`ac`, `Tee-Object`/`tee`, and `>`/`>>` (PowerShell supports these too). -- **Move/Copy** → write target: `Move-Item`/`mi`/`move`/`ren`/`rni`, `Copy-Item`/`cpi`/`copy`/`cp`. -- **Separators**: PS uses `;`, `|`, newlines; `&&`/`||` exist in PS7 but **not 5.1** - (the installed hook runs under 5.1). `-Path` may take a comma-list and wildcards. -- **Paths**: drive-letter absolute (`C:\…`, `C:/…`), UNC (`\\…`), backslash or - forward slash, `%USERPROFILE%` / `$HOME` / `$env:USERPROFILE`. Keep the `bash_*` - origin-prefix semantics from [Status]/[Origin prefix] unchanged. - -## ⚠ Structure decision is NOT yours to make alone - -A separate session is running `improve-codebase-architecture` (scoped to the -cross-OS shim/discovery/**detection** layer) to decide whether `bash_detect` -should grow into a per-OS `shell_detect` with grammar tables, or just gain a -Windows branch. **Coordinate on that result before committing to a file split** — -get the grammar working behind the current `M.detect(cmd, cwd)` interface first -(callers shouldn't care), so the structure can be refactored independently. +## Design — split two independent axes (decided; per the architecture review) + +The detector entangles two axes, and they are **not** the same as "OS" — a +git-bash shell on Windows has POSIX grammar with Windows-ish paths. So do **not** +add a blanket "Windows branch" and do **not** cut the module by OS: + +1. **Path conventions** — `resolve()` / `looks_like_path()` / `is_transient()`: + today only `/`-absolute, `~/`, `$HOME`, `/dev/`. Windows needs `C:\` / `C:/`, + UNC `\\…`, backslash or forward slash, `%TEMP%` / `%USERPROFILE%`. +2. **Command grammar** — which verbs/operators write or delete: POSIX + (`rm`, `>`/`>>`, `mv`, `cp`, `tee`, `sed -i`) vs PowerShell. + +**The shape to build (this is decided — but keep it all behind the current +`M.detect(cmd, cwd)` signature; callers must not change):** + +- **Rename `bash_detect` → `shell_detect`** so the interface stops lying. Update + the require site (`pre_tool/init.lua`), the spec filename, and the CI Windows + exclusion name. +- **Extract a path-convention seam** (`resolve`/`looks_like_path`/`is_transient` + → a small adapter): a Unix impl (as today) + a Windows impl. This is the + highest-value, lowest-risk extraction and is needed *regardless* of which shell + the agent uses. +- **Make the command matchers table-shaped** (verb / redirect / mv-cp-tee-sed) so + a grammar slots in as data, not as scattered `if` branches. Keep the POSIX + grammar **byte-identical** in behaviour (its edge cases come from real bugs). +- **Add a PowerShell grammar adapter — but only after STEP 0 confirms** the + Windows Bash tool emits PowerShell (your `Remove-Item` observation says it + does; STEP 0 nails the exact cmdlets/aliases/params). Starting vocabulary to + confirm/expand from real samples, case-insensitive: + - Delete → `deleted`: `Remove-Item`/`ri`/`rm`/`del`/`erase`/`rd`/`rmdir` + (+ `-Path`, `-Recurse`, `-Force`, positional). + - Write → `bash_created`/`bash_modified`: `Set-Content`/`sc`, `Out-File`, + `Add-Content`/`ac`, `Tee-Object`/`tee`, and `>`/`>>`. + - Move/Copy → write target: `Move-Item`/`mi`/`move`/`ren`, `Copy-Item`/`cpi`/`copy`/`cp`. + - Separators: `;`, `|`, newlines; `&&`/`||` are PS7-only (the hook runs under + **5.1**). `-Path` may take a comma-list / wildcards. + - Keep `bash_*` origin-prefix semantics from [Status]/[Origin prefix] unchanged. + +**Out of bounds for this PR:** do **not** introduce a per-OS file split, an +integration registry, or an install engine. The architecture review **deferred** +all of that as uncoupled cleanup; your scope is the rename + the two seams + the +confirmed grammar, nothing wider. ## Also in this PR (small, decided) @@ -79,9 +98,9 @@ existing `shell_error` check. Verify on both OSes. `/tmp`, no `os.getenv("HOME")` assumptions — guard or branch per OS). - Add Windows-grammar specs from your STEP 0 samples. - macOS/Linux specs must stay green (the detector must remain correct on Unix). -- The CI `windows-test` job (now on `main`) will exercise the Lua specs per-file; - remove `pre_tool_bash_detect_spec.lua` from its exclusion list once it passes on - Windows. +- The CI `windows-test` job (now on `main`) excludes `pre_tool_bash_detect_spec.lua` + by filename. After the rename, update that exclusion to the new spec name, then + drop it entirely once the spec passes on Windows. ## Out of scope From 3c7ca9182128bf36e802d8fd8b9692f909b7a5bc Mon Sep 17 00:00:00 2001 From: Cannon07 Date: Thu, 4 Jun 2026 01:45:55 +0530 Subject: [PATCH 3/6] =?UTF-8?q?docs:=20ADR-0009=20=E2=80=94=20shell=5Fdete?= =?UTF-8?q?ct=20splits=20axes,=20defers=20PowerShell=20grammar=20(#46)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Records the fork-(b) decision from the architecture review: the shell write/delete detector splits path-convention from command-grammar (the two do not align onto OS — git-bash on Windows is POSIX grammar + Windows paths), renames bash_detect -> shell_detect behind the same M.detect interface, extracts a path-convention adapter, table-shapes the matchers, and defers the PowerShell grammar until an empirical finding confirms which shell the Windows Bash tool emits. Backs the shell_detect handoff on this branch. 0008 is intentionally reserved for the hook-entry consolidation ADR, which lands with its own PR. Co-Authored-By: Claude Opus 4.8 --- ...ell-detect-split-axes-and-defer-grammar.md | 21 +++++++++++++++++++ 1 file changed, 21 insertions(+) create mode 100644 docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md diff --git a/docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md b/docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md new file mode 100644 index 0000000..2a2e5b3 --- /dev/null +++ b/docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md @@ -0,0 +1,21 @@ +# Shell write/delete detection splits path-convention from command-grammar; the PowerShell grammar is deferred + +Status: accepted + +Windows support (issue #46) needs the Tier-1 shell-write/delete detector (`pre_tool/bash_detect.lua`, which feeds the [change](../../CONTEXT.md#change) [status](../../CONTEXT.md#status) values `deleted` / `bash_modified` / `bash_created`) to handle Windows. The obvious framings — "make it per-OS" or "add a Windows branch" — are both wrong, because the detector entangles **two independent axes that do not align onto OS**: *path conventions* (`/`-absolute, `~/`, `/dev/`) and *command grammar* (which verbs/operators write or delete: `rm`, `>`, `mv`, `cp`, `tee`, `sed -i`). A git-bash shell on Windows has POSIX grammar with Windows-shaped paths; a PowerShell shell has PowerShell grammar with Windows paths — so OS is not the seam. + +We therefore: rename `bash_detect` → `shell_detect` (behind the unchanged `M.detect(cmd, cwd)` interface); extract a **path-convention adapter** (Unix today, Windows added now); make the command matchers **table-shaped** so a second grammar slots in as data; and **defer building the PowerShell grammar** until an empirical finding confirms which shell a given agent's Bash tool actually invokes on Windows (the same kind of spike [ADR-0007](0007-windows-shim-via-shared-powershell-discovery.md) forced for the RPC layer). + +## Considered Options + +- **Per-OS `shell_detect` (grammar tables keyed by OS)** — rejected: cuts along the wrong axis. OS conflates path conventions and command grammar, and the git-bash-on-Windows case (POSIX grammar, Windows paths) breaks the mapping outright. +- **"Just a Windows branch"** — rejected: scatters OS conditionals through a delicate, edge-case-heavy module (its cases come from real historical bugs) along that same wrong axis. +- **Split the two axes; ship the path adapter; table-shape the matchers; defer the PowerShell grammar** *(chosen)* — isolates the one extraction that is needed regardless of shell (path conventions), and avoids speculative grammar work until the requirement is confirmed. + +## Consequences + +- The path-convention adapter is the highest-value, lowest-risk extraction and is needed whatever shell the agent uses (Windows paths show up even under git-bash). +- The PowerShell grammar is built only once the empirical "what shell does the Windows Bash tool emit" finding lands (gathered as STEP 0 of the shell_detect work). Observed evidence already points at PowerShell (`Remove-Item`), but the exact cmdlets/aliases/params come from real samples, not guesses. +- The existing POSIX grammar's behaviour must stay byte-identical through the restructure — `pre_tool_bash_detect_spec.lua` (renamed alongside the module) is the safety net. +- `M.detect(cmd, cwd)` stays stable, so callers (`pre_tool.handle`) are untouched. The broader integration-registry / install-engine consolidation considered in the same architecture review is **out of scope** here and deferred. +- This refines [ADR-0007](0007-windows-shim-via-shared-powershell-discovery.md)'s "one implementation per OS" principle for the *detection* layer: detection is split by *capability axis*, not by OS. From b8f454ba23cd335f0d9a9b8e726faba8942cff0a Mon Sep 17 00:00:00 2001 From: Jay Shitre Date: Thu, 4 Jun 2026 02:39:46 +0530 Subject: [PATCH 4/6] feat: Windows PowerShell shell-detect for delete/write indicators (#46) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit On Windows, Claude Code routes shell file ops through a distinct `PowerShell` tool (the Haiku model deletes via `Remove-Item`, moves via `Move-Item`, writes via `Set-Content`/`Out-File`), and the detector understood only POSIX commands + Unix paths — so shell deletes/writes never marked neo-tree. STEP 0 ground truth captured on the Windows box corrected the original macOS-written premise: it is a SEPARATE tool, not the Bash tool emitting PowerShell, and the hook matcher never fired for it. Three layers: - matcher: add `PowerShell` to the claudecode Pre/PostToolUse matchers so the hook fires for those proposals at all. - normaliser: fold tool_name="PowerShell" onto canonical Bash (shell proposal, Tier-1 indicator-only, bash_* origin prefix) — no dispatcher/emitter change, no permission gate. - shell_detect (renamed from bash_detect): split the two axes the architecture review identified — * OS-selected path adapter: Unix (byte-identical) + Windows (C:\, C:/, UNC, backslash, relative-vs-backslash-cwd). Emits canonical backslash to match the registry keys Edit/Write use, since fnamemodify(":p") is a no-op on an already-absolute Windows path. * additive PowerShell grammar: Remove-Item/Set-Content/Out-File/Add-Content/ Move-Item/Copy-Item (+ aliases), -Path/-Destination/positional, comma-list arrays (incl. spaces after commas) and here-strings. POSIX behaviour is unchanged (PS verbs never collide with POSIX-handled rm/cp/mv/tee). Also: diff.lua git reveal -> list-form systemlist (drop POSIX-only 2>/dev/null). Tests: spec renamed to pre_tool_shell_detect_spec.lua with OS-branched rows (POSIX pending on Windows; Windows + PowerShell rows from real samples, incl. multi-delete comma-list regression); pending bash_modified case re-enabled portably; normaliser + install specs updated; CI windows-test exclusion dropped. Verified end-to-end on Windows: single and multi-file PowerShell deletes mark neo-tree with correct backslash paths. Co-Authored-By: Claude Opus 4.8 --- .github/workflows/e2e-tests.yml | 10 +- HANDOFF-shell-detect.md | 45 +- lua/code-preview/backends/claudecode.lua | 14 +- lua/code-preview/diff.lua | 7 +- lua/code-preview/pre_tool/bash_detect.lua | 254 --------- lua/code-preview/pre_tool/init.lua | 4 +- lua/code-preview/pre_tool/normalisers.lua | 23 +- lua/code-preview/pre_tool/shell_detect.lua | 602 ++++++++++++++++++++ tests/backends/claudecode/test_install.sh | 6 +- tests/plugin/pre_tool_bash_detect_spec.lua | 82 --- tests/plugin/pre_tool_handle_spec.lua | 27 +- tests/plugin/pre_tool_normaliser_spec.lua | 21 + tests/plugin/pre_tool_shell_detect_spec.lua | 177 ++++++ 13 files changed, 907 insertions(+), 365 deletions(-) delete mode 100644 lua/code-preview/pre_tool/bash_detect.lua create mode 100644 lua/code-preview/pre_tool/shell_detect.lua delete mode 100644 tests/plugin/pre_tool_bash_detect_spec.lua create mode 100644 tests/plugin/pre_tool_shell_detect_spec.lua diff --git a/.github/workflows/e2e-tests.yml b/.github/workflows/e2e-tests.yml index 0e6b864..7ad68f6 100644 --- a/.github/workflows/e2e-tests.yml +++ b/.github/workflows/e2e-tests.yml @@ -107,11 +107,11 @@ jobs: # with -u tests/minimal_init.lua sidesteps both, and it sets the exit # code (0 = pass, non-zero = fail/error) via :cq, so no output parsing. # - # bash_detect specs are Unix-path-only today (issue #46 handoff item 3): - # looks_like_path rejects backslashes and resolve() only treats - # /-prefixed strings as absolute. Excluded until ported to Windows paths. - $specs = Get-ChildItem tests/plugin -Filter *_spec.lua | - Where-Object { $_.Name -ne 'pre_tool_bash_detect_spec.lua' } + # The shell detector is now Windows-aware (issue #46 follow-up): + # pre_tool_shell_detect_spec.lua (renamed from *_bash_detect_spec) runs + # its Windows-path + PowerShell rows here and marks the POSIX rows + # pending, so the whole tests/plugin directory runs on Windows. + $specs = Get-ChildItem tests/plugin -Filter *_spec.lua $failed = @() foreach ($spec in $specs) { Write-Host "-- $($spec.Name) --" diff --git a/HANDOFF-shell-detect.md b/HANDOFF-shell-detect.md index c1a8836..5b10bd9 100644 --- a/HANDOFF-shell-detect.md +++ b/HANDOFF-shell-detect.md @@ -7,10 +7,46 @@ done on macOS and could not see what Claude Code actually emits on Windows. This is a **follow-up to the merged claudecode Windows slice** (`#73`, squash commit `ead0c2c` on `main`). Branch: `feat/shell-detect-windows`. -## The bug +## ⚠ STEP 0 results — premise corrected (this doc was written on macOS) + +Ground truth captured on the Windows box (teeing raw hook stdin + reading the +Claude Code session transcripts) refined the bug in two ways the original +premise got wrong: + +1. **It is a separate tool, not the Bash tool emitting PowerShell.** Claude Code + on Windows exposes a distinct **`PowerShell`** tool (alongside `Bash`) and + routes shell file ops through it — the Haiku model deletes via + `Remove-Item`, moves via `Move-Item`, writes via `Set-Content`/`Out-File`. + `tool_name` is `"PowerShell"`, payload shape `{command, description}` (same as + Bash). The PreToolUse hook matcher was `Edit|Write|MultiEdit|Bash`, so the + hook **never fired** for those proposals — the command never reached the + detector at all. (Opus, by contrast, uses git-bash and emits POSIX `rm`.) +2. **Path resolution was the bug even for git-bash POSIX commands.** The hook + `cwd` is a Windows backslash path; `rm "D:\proj\x"` had its cwd prepended to + an already-absolute path (garbage), and the redirect/`looks_like_path` path + rejected backslashes outright. + +So the fix is three layers (all in this PR): **matcher** (`+PowerShell`), +**normaliser** (fold `PowerShell` → canonical `Bash`), and **shell_detect** +(PowerShell grammar + a Windows path adapter). Real samples used as spec inputs: + +``` +Remove-Item -Path ".step0\ps-delete.txt" -Force +Move-Item -Path ".step0\ps-move-src.txt" -Destination ".step0\ps-move-dst.txt" -Force +Copy-Item -Path ".step0\ps-copy-src.txt" -Destination ".step0\ps-copy-dst.txt" -Force +Set-Content -Path ".step0\ps-pswrite.txt" -Value @"…here-string…"@ +Add-Content -Path ".step0\ps-existing.txt" -Value "Appended via PowerShell Add-Content" +rm -f "D:\other\cp-src.txt" # git-bash POSIX, Windows-absolute path +``` + +The sections below are the original macOS-written plan; read them through the +correction above (notably: routing was NOT fine — the normaliser + matcher both +needed a Windows entry, which STEP 0 explicitly flagged as the fallback case). + +## The bug (original framing — see correction above) On Windows, deleting a file (and other shell writes) does **not** mark neo-tree. -Claude Code's `Bash` tool emits **PowerShell** (`Remove-Item …`) instead of `rm`, +A shell delete/write emits **PowerShell** (`Remove-Item …`) instead of `rm`, and our detector only understands Unix commands + Unix paths, so it finds nothing. The preview path (Edit/Write/MultiEdit) already works on Windows — this is the @@ -23,8 +59,9 @@ The preview path (Edit/Write/MultiEdit) already works on Windows — this is the two axes: **command vocabulary** (`rm`, `>`/`>>`, `mv`, `cp`, `tee`, `sed -i`) and **path resolution** (`resolve()`/`looks_like_path()` only handle `/`-absolute and `~/`; no `C:\`, backslashes, or `%USERPROFILE%`). -- `lua/code-preview/pre_tool/normalisers.lua` — confirms routing: `tool_name="Bash"` - → `handle_bash` → `bash_detect`. Routing is fine; the detector is the gap. +- `lua/code-preview/pre_tool/normalisers.lua` — routing. **Correction (STEP 0):** + routing was NOT fine. The `PowerShell` tool needed a matcher entry to fire at + all, and the claudecode normaliser now folds `tool_name="PowerShell"` → `Bash`. - `tests/plugin/pre_tool_bash_detect_spec.lua` (CI-excluded on Windows) and the `bash_modified` case in `pre_tool_handle_spec.lua` (marked `pending` on Windows) — both are this task's to re-enable with Windows-path awareness. diff --git a/lua/code-preview/backends/claudecode.lua b/lua/code-preview/backends/claudecode.lua index e5192b5..5f3756f 100644 --- a/lua/code-preview/backends/claudecode.lua +++ b/lua/code-preview/backends/claudecode.lua @@ -25,6 +25,16 @@ end local HOOK_MARKER = "code-preview" local LEGACY_HOOK_MARKER = "claude-preview" -- match old entries during transition +-- Tools whose proposals we intercept. On Windows, Claude Code exposes a +-- distinct `PowerShell` tool alongside `Bash` and routes shell file ops +-- (Remove-Item / Move-Item / Set-Content …) through it — observed with the +-- Haiku model, which deletes via `Remove-Item`. Without `PowerShell` in the +-- matcher the PreToolUse hook never fires for those proposals, so a shell +-- delete/write never marks neo-tree (issue #46 follow-up). The normaliser +-- folds `PowerShell` into the canonical `Bash` path; here we just make sure +-- the hook fires. Harmless on Unix (no such tool is ever emitted there). +local TOOL_MATCHER = "Edit|Write|MultiEdit|Bash|PowerShell" + -- The hook entry is per-OS (issue #46 / ADR-0007): a .sh shim on Unix, a .ps1 -- shim on Windows invoked through PowerShell. The installer writes the -- interpreter explicitly into Claude Code's `command` field, since the file is @@ -104,11 +114,11 @@ function M.install() -- Add our entries. On Windows the command invokes PowerShell explicitly -- against the .ps1 shim; on Unix it's the bare .sh path. See ADR-0007. table.insert(data.hooks.PreToolUse, { - matcher = "Edit|Write|MultiEdit|Bash", + matcher = TOOL_MATCHER, hooks = { { type = "command", command = hook_command(preview) } }, }) table.insert(data.hooks.PostToolUse, { - matcher = "Edit|Write|MultiEdit|Bash", + matcher = TOOL_MATCHER, hooks = { { type = "command", command = hook_command(close) } }, }) diff --git a/lua/code-preview/diff.lua b/lua/code-preview/diff.lua index d12043d..79a6ccb 100644 --- a/lua/code-preview/diff.lua +++ b/lua/code-preview/diff.lua @@ -57,9 +57,10 @@ local function mark_change_and_reveal(abs_file_path, action) local reveal_dir = nil if cfg.neo_tree.reveal_root == "git" then local parent = vim.fn.fnamemodify(abs_file_path, ":h") - local git_out = vim.fn.systemlist( - "git -C " .. vim.fn.shellescape(parent) .. " rev-parse --show-toplevel 2>/dev/null" - ) + -- List-form (no shell): avoids the POSIX-only `2>/dev/null` redirect, which + -- misbehaves under Windows cmd. `shell_error` still gates the result, so a + -- non-repo parent (git's stderr + non-zero exit) simply leaves reveal_dir nil. + local git_out = vim.fn.systemlist({ "git", "-C", parent, "rev-parse", "--show-toplevel" }) if vim.v.shell_error == 0 and git_out[1] and git_out[1] ~= "" then reveal_dir = git_out[1] end diff --git a/lua/code-preview/pre_tool/bash_detect.lua b/lua/code-preview/pre_tool/bash_detect.lua deleted file mode 100644 index 3eafcd2..0000000 --- a/lua/code-preview/pre_tool/bash_detect.lua +++ /dev/null @@ -1,254 +0,0 @@ --- pre_tool/bash_detect.lua — Tier 1 shell-write + rm detection. --- --- Inputs: a Bash command string and the project cwd. --- Output: a structured table { rm_paths = {...}, write_paths = {...} }. --- --- The edge cases here all come from real bugs in the historical bash --- pre-tool detection logic; resist "obvious simplifications" without first --- reading bash_detect_spec.lua. - -local M = {} - --- Resolve a raw token (possibly relative, possibly ~-prefixed) to an absolute --- path. Caller is responsible for unquoting first. -local function resolve(p, cwd) - if p == "" then return nil end - if p:sub(1, 1) == "/" then return p end - if p == "~" then return os.getenv("HOME") or "~" end - if p:sub(1, 2) == "~/" then - return (os.getenv("HOME") or "~") .. "/" .. p:sub(3) - end - return cwd .. "/" .. p -end - --- Strip a single matching pair of surrounding quotes. -local function unquote(s) - if #s >= 2 then - local first, last = s:sub(1, 1), s:sub(-1) - if (first == '"' and last == '"') or (first == "'" and last == "'") then - return s:sub(2, -2) - end - end - return s -end - --- Undo shell backslash-escapes that bash actually requires for filenames: --- \', \", \\, \ , \$, \&. Pseudo-escapes like \n / \t are NOT touched — --- those typically appear inside quoted strings (e.g. `printf '\n\n'`) --- and undoing them here would turn the leak into a fake-looking path. -local SHELL_ESCAPED = { ["'"]=true, ['"']=true, ["\\"]=true, [" "]=true, ["$"]=true, ["&"]=true } -local function shell_unescape(s) - return (s:gsub("\\(.)", function(c) return SHELL_ESCAPED[c] and c or ("\\" .. c) end)) -end - --- True if the string looks like a transient file we should skip. -local function is_transient(p) - if p:match("%.tmp$") or p:match("%.bak$") or p:match("%.swp$") then return true end - if p:sub(-1) == "~" then return true end - if p:sub(1, 5) == "/dev/" then return true end - return false -end - --- Reject obvious non-paths (leaked escape sequences, stray quotes, etc.). --- This catches cases like `printf '\n\n'` where the regex picks --- up `\n\n'` from inside a quoted string. -local function looks_like_path(p) - if p == "" then return false end - -- Reject backslash and double-quote: those typically leak from inside - -- quoted strings (e.g. `printf '\n\n'`). Apostrophe is allowed - -- because shell_unescape produces it legitimately from `\'`. - if p:find('[\\"]') then return false end - local first = p:sub(1, 1) - if first == "/" then return true end - if p:sub(1, 2) == "./" or p:sub(1, 3) == "../" then return true end - if p:sub(1, 2) == "~/" then return true end - if first:match("[%w_]") then return true end - return false -end - --- ── rm detection ───────────────────────────────────────────────── - --- Iterate sub-commands separated by ; && || (single or double). -local function each_subcommand(cmd) - local parts = {} - local buf = {} - local i = 1 - while i <= #cmd do - local c = cmd:sub(i, i) - local n = cmd:sub(i + 1, i + 1) - if c == ";" then - table.insert(parts, table.concat(buf)); buf = {}; i = i + 1 - elseif (c == "&" and n == "&") or (c == "|" and n == "|") then - table.insert(parts, table.concat(buf)); buf = {}; i = i + 2 - elseif c == "&" or c == "|" then - table.insert(parts, table.concat(buf)); buf = {}; i = i + 1 - else - table.insert(buf, c); i = i + 1 - end - end - if #buf > 0 then table.insert(parts, table.concat(buf)) end - return parts -end - -local function detect_rm_in(subcmd, cwd) - local cmd = subcmd:gsub("^%s+", "") - -- Optional `sudo `, then `rm` as standalone command. - local body = cmd:match("^sudo%s+rm%s+(.*)$") or cmd:match("^rm%s+(.*)$") - if not body then return {} end - - local paths = {} - for token in body:gmatch("%S+") do - -- Skip flags (-rf, --force, etc.). Single-dash followed by alpha; or `--`. - if not token:match("^%-") then - local p = shell_unescape(unquote(token)) - -- Strip trailing CR (Windows-style payloads). - p = p:gsub("\r$", "") - if p ~= "" then - local abs = resolve(p, cwd) - if abs then table.insert(paths, abs) end - end - end - end - return paths -end - -function M.detect_rm_paths(cmd, cwd) - local out = {} - local seen = {} - for _, sub in ipairs(each_subcommand(cmd)) do - for _, p in ipairs(detect_rm_in(sub, cwd)) do - if not seen[p] then - seen[p] = true - table.insert(out, p) - end - end - end - return out -end - --- ── write-path detection ──────────────────────────────────────── - --- Match output redirections (`>`, `>>`, `&>`, `&>>`) and capture the filename. --- Excludes FD redirections (`2>&1`) by requiring no digit prefix on bare `>`. -local function detect_redirects(cmd) - local out = {} - -- Walk character by character so we can disambiguate `2>&1` (skip) from - -- `> file` (capture). Cheap and explicit beats a clever regex here. - local i = 1 - while i <= #cmd do - local c = cmd:sub(i, i) - local prev = i > 1 and cmd:sub(i - 1, i - 1) or "" - if c == ">" then - -- Skip if prev is digit (FD redirection like 2>) — unless prev is &. - if prev:match("%d") then - i = i + 1 - else - -- Skip ">>" doubling - if cmd:sub(i + 1, i + 1) == ">" then i = i + 1 end - i = i + 1 - -- Skip whitespace - while i <= #cmd and cmd:sub(i, i):match("%s") do i = i + 1 end - -- Collect token until terminator. Special: `>&1` etc. emit empty. - if cmd:sub(i, i) == "&" then - -- It's an FD dup like `>&1`; skip the dup target. - i = i + 1 - while i <= #cmd and cmd:sub(i, i):match("[%w_]") do i = i + 1 end - else - local start = i - while i <= #cmd do - local ch = cmd:sub(i, i) - if ch:match("[%s&;|<>(){}`]") then break end - i = i + 1 - end - if i > start then - local tok = cmd:sub(start, i - 1) - if tok ~= "" and not tok:match("^/dev/(null|stdout|stderr|tty)$") then - table.insert(out, tok) - end - end - end - end - else - i = i + 1 - end - end - return out -end - --- `mv SRC DST` / `cp SRC DST` — emit DST. Greedy last token; misses quoted --- paths with spaces and the GNU `-t DST SRC...` flag (acceptable for Tier 1). -local function detect_mv_cp(cmd) - local out = {} - for _, sub in ipairs(each_subcommand(cmd)) do - local body = sub:match("^%s*mv%s+(.+)$") or sub:match("^%s*cp%s+(.+)$") - if body then - local tokens = {} - for tok in body:gmatch("%S+") do table.insert(tokens, tok) end - if #tokens > 0 then table.insert(out, tokens[#tokens]) end - end - end - return out -end - --- `tee [-a] FILE` — captures only the first target. -local function detect_tee(cmd) - local out = {} - -- Match "tee" optionally followed by "-a", then the next token. - local body = cmd:match("tee%s+%-a%s+([^%s&;|<>(){}`]+)") - if body then - table.insert(out, body) - else - body = cmd:match("tee%s+([^%s&;|<>(){}`]+)") - if body and body ~= "-a" then table.insert(out, body) end - end - return out -end - --- `sed -i ... FILE` — last token wins; BSD's backup-suffix would be flagged --- too (acceptable for Tier 1). -local function detect_sed_i(cmd) - local out = {} - -- Match `sed` then a flag-token containing `i`, then capture trailing args - -- up to the next pipe/semicolon. - for body in cmd:gmatch("sed%s+%-[%a]*i[%a]*%s+([^|&;]+)") do - local tokens = {} - for tok in body:gmatch("%S+") do table.insert(tokens, tok) end - if #tokens > 0 then table.insert(out, tokens[#tokens]) end - end - return out -end - -function M.detect_write_paths(cmd, cwd) - local raw = {} - for _, p in ipairs(detect_redirects(cmd)) do table.insert(raw, p) end - for _, p in ipairs(detect_mv_cp(cmd)) do table.insert(raw, p) end - for _, p in ipairs(detect_tee(cmd)) do table.insert(raw, p) end - for _, p in ipairs(detect_sed_i(cmd)) do table.insert(raw, p) end - - local out = {} - local seen = {} - for _, r in ipairs(raw) do - local p = shell_unescape(unquote(r)) - if looks_like_path(p) then - local abs = resolve(p, cwd) - if abs and not is_transient(abs) and not seen[abs] then - seen[abs] = true - table.insert(out, abs) - end - end - end - return out -end - ---- Combined entry point used by pre_tool.handle for Bash proposals. ---- @param cmd string the raw Bash command ---- @param cwd string project cwd for relative-path resolution ---- @return table { rm_paths = {...absolute...}, write_paths = {...absolute...} } -function M.detect(cmd, cwd) - return { - rm_paths = M.detect_rm_paths(cmd, cwd), - write_paths = M.detect_write_paths(cmd, cwd), - } -end - -return M diff --git a/lua/code-preview/pre_tool/init.lua b/lua/code-preview/pre_tool/init.lua index 3bdf9ce..6889d6d 100644 --- a/lua/code-preview/pre_tool/init.lua +++ b/lua/code-preview/pre_tool/init.lua @@ -11,7 +11,7 @@ local M = {} local normalisers = require("code-preview.pre_tool.normalisers") local emitters = require("code-preview.pre_tool.emitters") -local bash_detect = require("code-preview.pre_tool.bash_detect") +local shell_detect = require("code-preview.pre_tool.shell_detect") local apply_edit = require("code-preview.apply.edit") local apply_multi_edit = require("code-preview.apply.multi_edit") @@ -167,7 +167,7 @@ end local function handle_bash(input) local cmd = input.tool_input.command or "" - local detected = bash_detect.detect(cmd, input.cwd or "") + local detected = shell_detect.detect(cmd, input.cwd or "") local touched = false diff --git a/lua/code-preview/pre_tool/normalisers.lua b/lua/code-preview/pre_tool/normalisers.lua index 78d961b..26ce41c 100644 --- a/lua/code-preview/pre_tool/normalisers.lua +++ b/lua/code-preview/pre_tool/normalisers.lua @@ -19,6 +19,27 @@ local function identity(raw) return raw end +-- Claude Code's hook payload is already canonical ({tool_name, cwd, tool_input}), +-- so its normaliser is almost identity. The one translation is Windows-only: in +-- addition to `Bash`, Claude Code exposes a distinct `PowerShell` tool and routes +-- shell file ops through it (e.g. the Haiku model deletes via `Remove-Item …`, +-- moves via `Move-Item`, writes via `Set-Content`/`Out-File`). Semantically those +-- are shell proposals — Tier-1 indicator-only, `bash_*` origin prefix — identical +-- in handling to `Bash`. Folding `PowerShell` onto the canonical `Bash` tool here +-- means the dispatcher, emitters, and shell_detect need no awareness of a separate +-- tool: shell_detect's grammar is what tells PowerShell and POSIX commands apart. +-- (STEP 0 of the shell-detect work confirmed tool_name="PowerShell"; the hook +-- matcher gained `PowerShell` so the proposal reaches us at all.) +local function claudecode(raw) + if raw and raw.tool_name == "PowerShell" then + local out = {} + for k, v in pairs(raw) do out[k] = v end + out.tool_name = "Bash" + return out + end + return raw +end + -- OpenCode tools as of 2026-05-19: edit, write, multiedit, bash, apply_patch -- (plus read, glob, grep, which the TS-side allowlist filters out before they -- ever reach this normaliser). Update this map when OpenCode adds a tool the @@ -234,7 +255,7 @@ local function codex(raw) end M.normalisers = { - claudecode = identity, + claudecode = claudecode, opencode = opencode, copilot = copilot, codex = codex, diff --git a/lua/code-preview/pre_tool/shell_detect.lua b/lua/code-preview/pre_tool/shell_detect.lua new file mode 100644 index 0000000..838f227 --- /dev/null +++ b/lua/code-preview/pre_tool/shell_detect.lua @@ -0,0 +1,602 @@ +-- pre_tool/shell_detect.lua — Tier 1 shell-write + delete detection. +-- +-- (Renamed from bash_detect.lua: the detector handles more than bash now.) +-- +-- Inputs: a shell command string and the project cwd. +-- Output: a structured table { rm_paths = {...}, write_paths = {...} }. +-- +-- The detector entangles two INDEPENDENT axes, and the architecture review +-- (issue #46 follow-up) split them on purpose — they are NOT the same as "OS": +-- +-- 1. Path conventions — how a raw token resolves to an absolute path, and +-- what counts as a path at all. POSIX (`/`-absolute, `~/`, `/dev/`) vs +-- Windows (`C:\`/`C:/`, UNC `\\…`, backslash separators, relative against +-- a backslash cwd). Selected by OS via `path_adapter()`. +-- 2. Command grammar — which verbs/operators write or delete. POSIX +-- (`rm`, `>`/`>>`, `mv`, `cp`, `tee`, `sed -i`) and PowerShell +-- (`Remove-Item`, `Set-Content`, `Out-File`, `Move-Item`, …). Both +-- grammars run on every command regardless of OS; they share the redirect +-- operator and differ only in verbs, so adding PowerShell cannot change a +-- POSIX result (verified by the POSIX rows in shell_detect_spec.lua). +-- +-- Why PowerShell at all: on Windows, Claude Code routes shell file ops through +-- a distinct `PowerShell` tool (the Haiku model deletes via `Remove-Item …`). +-- The claudecode normaliser folds that onto the canonical Bash path, so those +-- commands arrive here as PowerShell syntax. See STEP 0 of the handoff. +-- +-- The POSIX edge cases here all come from real bugs in the historical bash +-- pre-tool detection logic; resist "obvious simplifications" without first +-- reading shell_detect_spec.lua. + +local M = {} + +local function is_windows() + return vim.fn.has("win32") == 1 +end + +-- ── Quoting / escaping primitives ──────────────────────────────── + +-- Strip a single matching pair of surrounding quotes. +local function unquote(s) + if #s >= 2 then + local first, last = s:sub(1, 1), s:sub(-1) + if (first == '"' and last == '"') or (first == "'" and last == "'") then + return s:sub(2, -2) + end + end + return s +end + +-- Undo shell backslash-escapes that bash actually requires for filenames: +-- \', \", \\, \ , \$, \&. Pseudo-escapes like \n / \t are NOT touched — +-- those typically appear inside quoted strings (e.g. `printf '\n\n'`) +-- and undoing them here would turn the leak into a fake-looking path. +-- +-- POSIX ONLY. On Windows a backslash is a path separator, not an escape, so the +-- Windows adapter must not run this (it would eat `C:\new`'s separator or fold +-- a UNC `\\` prefix). That difference is exactly why cleaning lives in the path +-- adapter rather than being shared. +local SHELL_ESCAPED = { ["'"]=true, ['"']=true, ["\\"]=true, [" "]=true, ["$"]=true, ["&"]=true } +local function shell_unescape(s) + return (s:gsub("\\(.)", function(c) return SHELL_ESCAPED[c] and c or ("\\" .. c) end)) +end + +-- ── POSIX path adapter (behaviour byte-identical to the historical detector) ── + +local unix_paths = {} + +-- Resolve a raw token (possibly relative, possibly ~-prefixed) to an absolute +-- path. Caller is responsible for cleaning (unquote/unescape) first. +function unix_paths.resolve(p, cwd) + if p == "" then return nil end + if p:sub(1, 1) == "/" then return p end + if p == "~" then return os.getenv("HOME") or "~" end + if p:sub(1, 2) == "~/" then + return (os.getenv("HOME") or "~") .. "/" .. p:sub(3) + end + return cwd .. "/" .. p +end + +-- Unquote, undo bash escapes, drop a trailing CR (Windows-style payloads). +function unix_paths.clean(s) + local p = shell_unescape(unquote(s)) + return (p:gsub("\r$", "")) +end + +-- Reject obvious non-paths (leaked escape sequences, stray quotes, etc.). +-- This catches cases like `printf '\n\n'` where the redirect +-- regex picks up `\n\n'` from inside a quoted string. +function unix_paths.looks_like_path(p) + if p == "" then return false end + -- Reject backslash and double-quote: those typically leak from inside + -- quoted strings (e.g. `printf '\n\n'`). Apostrophe is allowed + -- because shell_unescape produces it legitimately from `\'`. + if p:find('[\\"]') then return false end + local first = p:sub(1, 1) + if first == "/" then return true end + if p:sub(1, 2) == "./" or p:sub(1, 3) == "../" then return true end + if p:sub(1, 2) == "~/" then return true end + if first:match("[%w_]") then return true end + return false +end + +-- True if the string looks like a transient file we should skip. +function unix_paths.is_transient(p) + if p:match("%.tmp$") or p:match("%.bak$") or p:match("%.swp$") then return true end + if p:sub(-1) == "~" then return true end + if p:sub(1, 5) == "/dev/" then return true end + return false +end + +-- ── Windows path adapter ───────────────────────────────────────── +-- +-- A Windows box runs BOTH git-bash (POSIX grammar, `/c/…` or relative paths) +-- and PowerShell (Windows paths), so this adapter is a superset: it accepts the +-- Unix forms too. Output is canonicalised to BACKSLASH separators with `..` +-- collapsed, because that is the exact form Claude Code's Edit/Write proposals +-- key the changes registry with (e.g. `D:\proj\file.lua`), and `fnamemodify` +-- `:p` is a no-op on an already-absolute path — it neither flips separators nor +-- collapses `..`. Emitting any other form would key a different registry entry +-- than the editor path for the same file, and the neo-tree indicator would miss. + +local win_paths = {} + +local function win_is_absolute(p) + return p:match("^%a:[\\/]") ~= nil -- drive-letter: C:\ or C:/ + or p:match("^\\\\") ~= nil -- UNC: \\server\share + or p:sub(1, 1) == "/" -- unix-rooted (git-bash) +end + +-- Canonicalise to backslash separators with ./.. collapsed. +local function win_canon(p) + return (vim.fs.normalize(p):gsub("/", "\\")) +end + +function win_paths.resolve(p, cwd) + if p == "" then return nil end + if p == "~" then return os.getenv("USERPROFILE") or os.getenv("HOME") or "~" end + if p:match("^~[\\/]") then + local home = os.getenv("USERPROFILE") or os.getenv("HOME") or "~" + return win_canon(home .. "/" .. p:sub(3)) + end + if win_is_absolute(p) then + -- UNC must keep its leading "\\"; vim.fs.normalize preserves it. + return win_canon(p) + end + return win_canon((cwd or "") .. "/" .. p) +end + +-- Unquote and drop a trailing CR. No bash unescaping: backslash is a separator. +function win_paths.clean(s) + local p = unquote(s) + return (p:gsub("\r$", "")) +end + +-- Positive-shape match: accept things that look like a real Windows/Unix path, +-- reject leaked escape sequences. We can't blanket-reject backslash like the +-- POSIX adapter does (backslash is a legitimate separator here), so instead we +-- only accept recognised path shapes. The html-comment leak `\n\n'` matches +-- none of them (a single leading `\` is neither a drive, a UNC `\\`, nor a +-- relative/absolute prefix), so it is still rejected. +function win_paths.looks_like_path(p) + if p == "" then return false end + if p:find('"') then return false end -- stray quote leak + if p:match("^%a:[\\/]") then return true end -- C:\ or C:/ + if p:match("^\\\\") then return true end -- UNC \\server + if p:sub(1, 1) == "/" then return true end -- /unix/abs (git-bash) + if p:match("^~[\\/]?") then return true end -- ~ or ~/ or ~\ + if p:match("^%.%.?[\\/]") then return true end -- ./ ../ .\ ..\ + if p:match("^%.[%w_]") then return true end -- .step0 .config dotdirs + if p:sub(1, 1):match("[%w_]") then return true end -- bare / relative name + return false +end + +function win_paths.is_transient(p) + if p:match("%.[Tt][Mm][Pp]$") or p:match("%.[Bb][Aa][Kk]$") or p:match("%.[Ss][Ww][Pp]$") then + return true + end + if p:sub(-1) == "~" then return true end + return false +end + +local function path_adapter() + return is_windows() and win_paths or unix_paths +end + +-- ── Sub-command splitting (shared) ─────────────────────────────── + +-- Iterate sub-commands separated by ; && || (single or double) and newlines. +-- PowerShell uses `;`, `|`, and newlines; `&&`/`||` exist in PS7 but not 5.1 +-- (the installed hook runs under 5.1). POSIX uses all of these. Splitting on +-- the union is safe: a separator absent from a given shell simply never occurs. +local function each_subcommand(cmd) + local parts = {} + local buf = {} + local i = 1 + while i <= #cmd do + local c = cmd:sub(i, i) + local n = cmd:sub(i + 1, i + 1) + if c == ";" or c == "\n" or c == "\r" then + table.insert(parts, table.concat(buf)); buf = {}; i = i + 1 + elseif (c == "&" and n == "&") or (c == "|" and n == "|") then + table.insert(parts, table.concat(buf)); buf = {}; i = i + 2 + elseif c == "&" or c == "|" then + table.insert(parts, table.concat(buf)); buf = {}; i = i + 1 + else + table.insert(buf, c); i = i + 1 + end + end + if #buf > 0 then table.insert(parts, table.concat(buf)) end + return parts +end + +-- ── POSIX command grammar ──────────────────────────────────────── + +local function detect_rm_in(subcmd, cwd, adapter) + local cmd = subcmd:gsub("^%s+", "") + -- Optional `sudo `, then `rm` as standalone command. + local body = cmd:match("^sudo%s+rm%s+(.*)$") or cmd:match("^rm%s+(.*)$") + if not body then return {} end + + local paths = {} + for token in body:gmatch("%S+") do + -- Skip flags (-rf, --force, etc.). Single-dash followed by alpha; or `--`. + if not token:match("^%-") then + local p = adapter.clean(token) + if p ~= "" then + local abs = adapter.resolve(p, cwd) + if abs then table.insert(paths, abs) end + end + end + end + return paths +end + +-- Match output redirections (`>`, `>>`, `&>`, `&>>`) and capture the filename. +-- Excludes FD redirections (`2>&1`) by requiring no digit prefix on bare `>`. +-- Shared by POSIX and PowerShell (both use `>` / `>>`). +local function detect_redirects(cmd) + local out = {} + -- Walk character by character so we can disambiguate `2>&1` (skip) from + -- `> file` (capture). Cheap and explicit beats a clever regex here. + local i = 1 + while i <= #cmd do + local c = cmd:sub(i, i) + local prev = i > 1 and cmd:sub(i - 1, i - 1) or "" + if c == ">" then + -- Skip if prev is digit (FD redirection like 2>) — unless prev is &. + if prev:match("%d") then + i = i + 1 + else + -- Skip ">>" doubling + if cmd:sub(i + 1, i + 1) == ">" then i = i + 1 end + i = i + 1 + -- Skip whitespace + while i <= #cmd and cmd:sub(i, i):match("%s") do i = i + 1 end + -- Collect token until terminator. Special: `>&1` etc. emit empty. + if cmd:sub(i, i) == "&" then + -- It's an FD dup like `>&1`; skip the dup target. + i = i + 1 + while i <= #cmd and cmd:sub(i, i):match("[%w_]") do i = i + 1 end + else + local start = i + while i <= #cmd do + local ch = cmd:sub(i, i) + if ch:match("[%s&;|<>(){}`]") then break end + i = i + 1 + end + if i > start then + local tok = cmd:sub(start, i - 1) + if tok ~= "" and not tok:match("^/dev/(null|stdout|stderr|tty)$") then + table.insert(out, tok) + end + end + end + end + else + i = i + 1 + end + end + return out +end + +-- `mv SRC DST` / `cp SRC DST` — emit DST. Greedy last token; misses quoted +-- paths with spaces and the GNU `-t DST SRC...` flag (acceptable for Tier 1). +local function detect_mv_cp(cmd) + local out = {} + for _, sub in ipairs(each_subcommand(cmd)) do + local body = sub:match("^%s*mv%s+(.+)$") or sub:match("^%s*cp%s+(.+)$") + if body then + local tokens = {} + for tok in body:gmatch("%S+") do table.insert(tokens, tok) end + if #tokens > 0 then table.insert(out, tokens[#tokens]) end + end + end + return out +end + +-- `tee [-a] FILE` — captures only the first target. +local function detect_tee(cmd) + local out = {} + -- Match "tee" optionally followed by "-a", then the next token. + local body = cmd:match("tee%s+%-a%s+([^%s&;|<>(){}`]+)") + if body then + table.insert(out, body) + else + body = cmd:match("tee%s+([^%s&;|<>(){}`]+)") + if body and body ~= "-a" then table.insert(out, body) end + end + return out +end + +-- `sed -i ... FILE` — last token wins; BSD's backup-suffix would be flagged +-- too (acceptable for Tier 1). +local function detect_sed_i(cmd) + local out = {} + -- Match `sed` then a flag-token containing `i`, then capture trailing args + -- up to the next pipe/semicolon. + for body in cmd:gmatch("sed%s+%-[%a]*i[%a]*%s+([^|&;]+)") do + local tokens = {} + for tok in body:gmatch("%S+") do table.insert(tokens, tok) end + if #tokens > 0 then table.insert(out, tokens[#tokens]) end + end + return out +end + +-- ── PowerShell command grammar ─────────────────────────────────── +-- +-- PowerShell cmdlets are PascalCase Verb-Noun (`Remove-Item`) with aliases +-- (`ri`, `del`, `rd`), named parameters (`-Path`, `-Destination`), switches +-- (`-Force`, `-Recurse`), positional args, comma-lists, and here-strings +-- (`@"…"@`) for `-Value`. We deliberately keep the bash aliases (`rm`, `cp`, +-- `mv`, `tee`) OUT of these tables — those are handled by the POSIX grammar +-- above, and excluding them keeps the POSIX behaviour provably unchanged. + +-- Tokenise a PowerShell sub-command. Quoted regions (including here-strings) +-- span whitespace and become a single token with their quotes preserved, so a +-- here-string `-Value @"a\nb"@` never leaks its body words as positional paths. +local function ps_tokenise(s) + local tokens, i, n = {}, 1, #s + while i <= n do + local c = s:sub(i, i) + if c:match("%s") then + i = i + 1 + else + local start = i + while i <= n do + local ch = s:sub(i, i) + if ch == "'" or ch == '"' then + -- Here-string @' / @" (quote preceded by @ then a newline). + local hs_quote = ch + if s:sub(i - 1, i - 1) == "@" then + -- consume to the closing "@ / '@ + i = i + 1 + while i <= n do + if s:sub(i, i) == hs_quote and s:sub(i + 1, i + 1) == "@" then + i = i + 2; break + end + i = i + 1 + end + else + -- ordinary quoted region + i = i + 1 + while i <= n and s:sub(i, i) ~= hs_quote do i = i + 1 end + i = i + 1 -- skip closing quote + end + elseif ch:match("%s") then + break + else + i = i + 1 + end + end + table.insert(tokens, s:sub(start, i - 1)) + end + end + return tokens +end + +-- Split a parameter value on top-level commas (PowerShell `-Path a,b,c`), +-- ignoring commas inside quotes. +local function ps_split_commas(tok) + local parts, buf, i, n = {}, {}, 1, #tok + local q = nil + while i <= n do + local c = tok:sub(i, i) + if q then + buf[#buf + 1] = c + if c == q then q = nil end + elseif c == "'" or c == '"' then + q = c; buf[#buf + 1] = c + elseif c == "," then + parts[#parts + 1] = table.concat(buf); buf = {} + else + buf[#buf + 1] = c + end + i = i + 1 + end + parts[#parts + 1] = table.concat(buf) + return parts +end + +-- Parameters that take a following value (lowercased, leading dash stripped). +local PS_VALUE_PARAMS = { + path = true, literalpath = true, filepath = true, destination = true, + newname = true, target = true, value = true, encoding = true, width = true, +} +-- Switch parameters that take no value. +local PS_SWITCHES = { + force = true, recurse = true, confirm = true, whatif = true, verbose = true, + nonewline = true, noclobber = true, passthru = true, append = true, +} + +-- Collect a parameter value starting at token `i`, joining comma-connected +-- tokens so an array literal written with spaces (`-Path "a", "b"`) is captured +-- as a single value `"a","b"` instead of leaking "b" as a stray positional. +-- Stops at a flag (`-Force`) even mid-comma, since that is malformed anyway. +local function ps_gather_value(tokens, i) + if i > #tokens then return nil, i end + local parts = { tokens[i] } + i = i + 1 + while i <= #tokens do + local prev, cur = parts[#parts], tokens[i] + if (prev:match(",%s*$") or cur:match("^,")) and not cur:match("^%-[%a]") then + parts[#parts + 1] = cur + i = i + 1 + else + break + end + end + return table.concat(parts), i +end + +-- Parse `Verb-Noun` style args into { named = {param=value}, positional = {...} }. +local function ps_parse_args(tokens) + local named, positional = {}, {} + local i = 1 + while i <= #tokens do + local t = tokens[i] + local pname = t:match("^%-([%a][%w]*)$") + if pname then + local key = pname:lower() + if PS_SWITCHES[key] then + i = i + 1 + elseif PS_VALUE_PARAMS[key] then + -- Known value parameter: consume its (possibly comma-list) value. Also + -- swallows `-Value "x"` so the content never lands in `positional`. + named[key], i = ps_gather_value(tokens, i + 1) + else + -- Unknown parameter: consume a following value only if it isn't a flag; + -- otherwise treat as a standalone switch. + local nxt = tokens[i + 1] + if nxt and not nxt:match("^%-[%a]") then + named[key], i = ps_gather_value(tokens, i + 1) + else + i = i + 1 + end + end + else + -- Positional — also gather a comma-list (`Remove-Item a, b`). + positional[#positional + 1], i = ps_gather_value(tokens, i) + end + end + return named, positional +end + +-- Cmdlet/alias → category. Bash aliases (rm/cp/mv/tee) intentionally excluded. +local PS_DELETE = { + ["remove-item"] = true, ri = true, del = true, erase = true, + rd = true, rmdir = true, +} +local PS_WRITE = { + ["set-content"] = true, sc = true, ["add-content"] = true, ac = true, + ["out-file"] = true, ["tee-object"] = true, +} +local PS_MOVE = { + ["move-item"] = true, mi = true, move = true, ["rename-item"] = true, + ren = true, rni = true, +} +local PS_COPY = { + ["copy-item"] = true, cpi = true, copy = true, +} + +-- Pull the path value(s) for a category out of parsed args. `named_keys` lists +-- the value-params to check in priority order; `pos_index` is the positional +-- fallback (1-based). Returns a list of raw tokens (pre-clean), comma-expanded. +local function ps_targets(named, positional, named_keys, pos_index) + for _, k in ipairs(named_keys) do + if named[k] then return ps_split_commas(named[k]) end + end + local p = positional[pos_index] + if p then return ps_split_commas(p) end + return {} +end + +local function detect_ps(subcmd) + local cmd = subcmd:gsub("^%s+", "") + local verb = cmd:match("^([%a][%w%-]*)") + if not verb then return {}, {} end + local key = verb:lower() + + local category + if PS_DELETE[key] then category = "delete" + elseif PS_WRITE[key] then category = "write" + elseif PS_MOVE[key] then category = "move" + elseif PS_COPY[key] then category = "copy" + else return {}, {} end + + local rest = cmd:sub(#verb + 1) + local named, positional = ps_parse_args(ps_tokenise(rest)) + + local rm_raw, write_raw = {}, {} + if category == "delete" then + -- Delete can target many files: `-Path "a","b"` (array) and/or several bare + -- positionals (`Remove-Item a b`). Collect the named path array AND every + -- positional, each comma-expanded. + for _, k in ipairs({ "path", "literalpath" }) do + if named[k] then + for _, x in ipairs(ps_split_commas(named[k])) do rm_raw[#rm_raw + 1] = x end + end + end + for _, pos in ipairs(positional) do + for _, x in ipairs(ps_split_commas(pos)) do rm_raw[#rm_raw + 1] = x end + end + elseif category == "write" then + -- Out-File uses -FilePath; Set/Add-Content use -Path; positional 1 either way. + write_raw = ps_targets(named, positional, { "path", "literalpath", "filepath" }, 1) + else -- move / copy → the destination is the write target + write_raw = ps_targets(named, positional, { "destination", "newname" }, 2) + end + return rm_raw, write_raw +end + +-- ── Aggregation ────────────────────────────────────────────────── + +-- Clean → resolve → filter a list of raw path tokens into absolute paths. +local function finalise(raw_list, cwd, adapter, opts) + opts = opts or {} + local out, seen = {}, {} + for _, r in ipairs(raw_list) do + local p = adapter.clean(r) + -- rm/delete trust their explicit targets and skip the looks_like_path + -- heuristic (matches the historical `rm` behaviour); writes/redirects run + -- it to reject leaked escape sequences and stray tokens. + if opts.skip_path_check or adapter.looks_like_path(p) then + local abs = adapter.resolve(p, cwd) + if abs and (opts.skip_transient or not adapter.is_transient(abs)) and not seen[abs] then + seen[abs] = true + out[#out + 1] = abs + end + end + end + return out +end + +-- POSIX `rm`/`sudo rm` returns already-resolved absolute paths; PowerShell +-- delete returns raw tokens that still need clean/resolve. Keep the two sources +-- explicit so the POSIX ones are never double-resolved. +function M.detect_rm_paths(cmd, cwd) + local adapter = path_adapter() + local out, seen = {}, {} + local function add(abs) + if abs and not seen[abs] then seen[abs] = true; out[#out + 1] = abs end + end + for _, sub in ipairs(each_subcommand(cmd)) do + -- POSIX rm: detect_rm_in returns resolved absolute paths. + for _, abs in ipairs(detect_rm_in(sub, cwd, adapter)) do add(abs) end + -- PowerShell delete: raw tokens → clean → resolve. + local ps_rm = select(1, detect_ps(sub)) + for _, abs in ipairs(finalise(ps_rm, cwd, adapter, { skip_path_check = true, skip_transient = true })) do + add(abs) + end + end + return out +end + +function M.detect_write_paths(cmd, cwd) + local adapter = path_adapter() + local raw = {} + -- Shared redirects + POSIX verbs (these return raw tokens). + for _, p in ipairs(detect_redirects(cmd)) do raw[#raw + 1] = p end + for _, p in ipairs(detect_mv_cp(cmd)) do raw[#raw + 1] = p end + for _, p in ipairs(detect_tee(cmd)) do raw[#raw + 1] = p end + for _, p in ipairs(detect_sed_i(cmd)) do raw[#raw + 1] = p end + -- PowerShell write / move / copy targets (raw tokens). + for _, sub in ipairs(each_subcommand(cmd)) do + local _, ps_write = detect_ps(sub) + for _, p in ipairs(ps_write) do raw[#raw + 1] = p end + end + return finalise(raw, cwd, adapter) +end + +--- Combined entry point used by pre_tool.handle for shell proposals. +--- @param cmd string the raw shell command (POSIX or PowerShell) +--- @param cwd string project cwd for relative-path resolution +--- @return table { rm_paths = {...absolute...}, write_paths = {...absolute...} } +function M.detect(cmd, cwd) + return { + rm_paths = M.detect_rm_paths(cmd, cwd), + write_paths = M.detect_write_paths(cmd, cwd), + } +end + +return M diff --git a/tests/backends/claudecode/test_install.sh b/tests/backends/claudecode/test_install.sh index 34065a7..4a210dd 100644 --- a/tests/backends/claudecode/test_install.sh +++ b/tests/backends/claudecode/test_install.sh @@ -26,7 +26,11 @@ test_install_claude_hooks() { assert_contains "$content" "PostToolUse" "should have PostToolUse hook" || return 1 assert_contains "$content" "code-preview-diff.sh" "should reference diff script" || return 1 assert_contains "$content" "code-close-diff.sh" "should reference close script" || return 1 - assert_contains "$content" "Edit|Write|MultiEdit|Bash" "should match Edit/Write/MultiEdit/Bash tools" || return 1 + # PowerShell is matched too: on Windows Claude Code routes shell file ops + # (Remove-Item / Set-Content …) through a distinct PowerShell tool (issue #46 + # follow-up). The matcher is the same on every OS; the normaliser folds + # PowerShell onto Bash. + assert_contains "$content" "Edit|Write|MultiEdit|Bash|PowerShell" "should match Edit/Write/MultiEdit/Bash/PowerShell tools" || return 1 } # ── Test: Uninstall Claude Code hooks ──────────────────────────── diff --git a/tests/plugin/pre_tool_bash_detect_spec.lua b/tests/plugin/pre_tool_bash_detect_spec.lua deleted file mode 100644 index eb9560f..0000000 --- a/tests/plugin/pre_tool_bash_detect_spec.lua +++ /dev/null @@ -1,82 +0,0 @@ --- pre_tool_bash_detect_spec.lua — Tier 1 shell-write + rm detection. --- --- Table-driven. Each row pins a documented edge case from the historical --- bash pre-tool detection logic. New rows go at the bottom with a short --- comment explaining the case. Resist "obvious simplifications" to --- bash_detect.lua without first reading these rows. - -local bash_detect = require("code-preview.pre_tool.bash_detect") - -local CWD = "/proj" -local HOME = os.getenv("HOME") or "/root" - -local function sorted(t) - local copy = {} - for _, v in ipairs(t) do table.insert(copy, v) end - table.sort(copy) - return copy -end - -describe("bash_detect.detect_rm_paths", function() - local cases = { - { name = "plain rm", cmd = "rm foo.txt", expect = { CWD .. "/foo.txt" } }, - { name = "rm with flags", cmd = "rm -rf build", expect = { CWD .. "/build" } }, - { name = "sudo rm absolute", cmd = "sudo rm /etc/foo", expect = { "/etc/foo" } }, - { name = "rm with home tilde", cmd = "rm ~/.config/x", expect = { HOME .. "/.config/x" } }, - { name = "apostrophe in filename", cmd = "rm \"it's-mine.txt\"", expect = { CWD .. "/it's-mine.txt" } }, - { name = "compound with &&", cmd = "rm a && rm b", expect = { CWD .. "/a", CWD .. "/b" } }, - { name = "compound with ;", cmd = "rm a ; rm b", expect = { CWD .. "/a", CWD .. "/b" } }, - { name = "/tmp not filtered for rm", cmd = "rm /tmp/foo", expect = { "/tmp/foo" } }, - { name = "trailing CR stripped", cmd = "rm a.txt\r", expect = { CWD .. "/a.txt" } }, - { name = "non-rm command ignored", cmd = "echo rm foo", expect = {} }, - -- Real-world case: Claude Code escapes apostrophes in unquoted filenames. - { name = "backslash-escaped apostrophe", cmd = "rm it\\'s-mine.txt", expect = { CWD .. "/it's-mine.txt" } }, - -- Backslash-escaped spaces (`rm my\ file.txt`) would need a real shell - -- tokeniser to split correctly; documented as a Tier 1 limitation. - } - for _, c in ipairs(cases) do - it(c.name, function() - assert.are.same(sorted(c.expect), sorted(bash_detect.detect_rm_paths(c.cmd, CWD))) - end) - end -end) - -describe("bash_detect.detect_write_paths", function() - local cases = { - { name = "single > redirect", cmd = "echo x > foo.txt", expect = { CWD .. "/foo.txt" } }, - { name = "append >> redirect", cmd = "echo x >> log.txt", expect = { CWD .. "/log.txt" } }, - { name = "absolute redirect", cmd = "echo x > /var/log/y", expect = { "/var/log/y" } }, - { name = "/tmp not filtered for writes", cmd = "echo x > /tmp/real", expect = { "/tmp/real" } }, - { name = "FD redirection 2>&1 skipped",cmd = "cmd 2>&1", expect = {} }, - { name = "/dev/null filtered", cmd = "cmd > /dev/null", expect = {} }, - { name = "/dev/stderr filtered", cmd = "cmd > /dev/stderr", expect = {} }, - { name = "mv writes destination", cmd = "mv a.tmp b", expect = { CWD .. "/b" } }, - { name = "cp writes destination", cmd = "cp a b", expect = { CWD .. "/b" } }, - { name = "tee target", cmd = "echo x | tee foo.txt", expect = { CWD .. "/foo.txt" } }, - { name = "tee -a target", cmd = "echo x | tee -a foo.txt", expect = { CWD .. "/foo.txt" } }, - { name = "sed -i target", cmd = "sed -i 's/x/y/' foo.txt", expect = { CWD .. "/foo.txt" } }, - { name = "transient .tmp filtered", cmd = "echo x > scratch.tmp", expect = {} }, - { name = "transient ~ filtered", cmd = "echo x > foo~", expect = {} }, - { name = "home tilde expanded", cmd = "echo x > ~/notes.md", expect = { HOME .. "/notes.md" } }, - -- The HTML-comment false positive: the redirect regex picks up the `>` - -- in `-->`, but looks_like_path rejects the resulting token because it - -- contains a backslash (leaked from inside a quoted string). - { name = "html comment false positive", cmd = "printf '\\n\\n'", expect = {} }, - { name = "dedup duplicates", cmd = "echo a > x.txt && echo b > x.txt", expect = { CWD .. "/x.txt" } }, - -- Apostrophe-escaped paths in redirects (companion to the rm case). - { name = "backslash-escaped apostrophe in redirect", cmd = "echo x > it\\'s.txt", expect = { CWD .. "/it's.txt" } }, - } - for _, c in ipairs(cases) do - it(c.name, function() - assert.are.same(sorted(c.expect), sorted(bash_detect.detect_write_paths(c.cmd, CWD))) - end) - end -end) - -describe("bash_detect.detect (combined)", function() - it("returns both rm and write paths", function() - local r = bash_detect.detect("rm old.txt && echo x > new.txt", CWD) - assert.are.same({ CWD .. "/old.txt" }, r.rm_paths) - assert.are.same({ CWD .. "/new.txt" }, r.write_paths) - end) -end) diff --git a/tests/plugin/pre_tool_handle_spec.lua b/tests/plugin/pre_tool_handle_spec.lua index a575278..31b272d 100644 --- a/tests/plugin/pre_tool_handle_spec.lua +++ b/tests/plugin/pre_tool_handle_spec.lua @@ -32,18 +32,23 @@ describe("pre_tool.handle (Bash)", function() assert.equals("bash_created", changes.get(p)) end) - it("redirect to existing file marks bash_modified", function() - -- Skipped on Windows: this case needs to create the file on disk (io.open - -- of a /tmp path fails on Windows) AND have bash_detect resolve a Windows - -- path, which is Unix-path-only today (issue #46, handoff item 3). The - -- bash-on-Windows work will re-enable this. The new-file/rm cases above use - -- forward-slash paths that don't need an on-disk file, so they still run. - if vim.fn.has("win32") == 1 then - return pending("bash_detect is Unix-path-only on Windows (issue #46)") - end - local p = "/tmp/code-preview-test-existing-" .. tostring(vim.loop.hrtime()) + it("shell write to an existing file marks bash_modified", function() + -- Portable across OSes: use the OS temp dir (no hardcoded /tmp) so the file + -- can actually be created on disk on Windows too. Drive the same shell-write + -- the agent would: a redirect on Unix, and a PowerShell Set-Content on + -- Windows (Claude Code's PowerShell tool, folded onto Bash by the + -- normaliser). Both route through handle_bash → shell_detect. + local dir = vim.fn.fnamemodify(vim.fn.tempname(), ":h") + -- Join with the native separator: the changes registry keys on + -- fnamemodify(":p"), which does not flip separators on Windows, so a mixed + -- path here would not match the detector's canonical backslash output. + local sep = (vim.fn.has("win32") == 1) and "\\" or "/" + local p = dir .. sep .. "code-preview-test-existing-" .. tostring(vim.loop.hrtime()) .. ".txt" local fh = assert(io.open(p, "w")); fh:write("hi"); fh:close() - pre_tool.handle(payload("Bash", { command = "echo x > " .. p }), "claudecode") + local cmd = (vim.fn.has("win32") == 1) + and ('Set-Content -Path "' .. p .. '" -Value "x"') + or ("echo x > " .. p) + pre_tool.handle(payload("Bash", { command = cmd }), "claudecode") assert.equals("bash_modified", changes.get(p)) os.remove(p) end) diff --git a/tests/plugin/pre_tool_normaliser_spec.lua b/tests/plugin/pre_tool_normaliser_spec.lua index dd6d1b1..e728a8d 100644 --- a/tests/plugin/pre_tool_normaliser_spec.lua +++ b/tests/plugin/pre_tool_normaliser_spec.lua @@ -20,6 +20,27 @@ describe("normalisers.normalise (claudecode)", function() it("unknown backend falls back to identity", function() assert.are.same(canonical, normalisers.normalise(canonical, "future-agent")) end) + + -- Windows: Claude Code routes shell file ops through a distinct `PowerShell` + -- tool (Haiku deletes via Remove-Item). It carries the same {command} shape as + -- Bash and is a shell proposal, so the normaliser folds it onto canonical Bash. + it("folds the PowerShell tool onto canonical Bash", function() + local raw = { + tool_name = "PowerShell", + cwd = "/proj", + tool_input = { command = 'Remove-Item -Path "foo.txt" -Force' }, + } + local out = normalisers.normalise(raw, "claudecode") + assert.equals("Bash", out.tool_name) + assert.equals('Remove-Item -Path "foo.txt" -Force', out.tool_input.command) + assert.equals("/proj", out.cwd) + end) + + it("does not mutate the input payload when folding PowerShell", function() + local raw = { tool_name = "PowerShell", cwd = "/proj", tool_input = { command = "ls" } } + normalisers.normalise(raw, "claudecode") + assert.equals("PowerShell", raw.tool_name) + end) end) describe("normalisers.normalise (opencode)", function() diff --git a/tests/plugin/pre_tool_shell_detect_spec.lua b/tests/plugin/pre_tool_shell_detect_spec.lua new file mode 100644 index 0000000..a14f9c1 --- /dev/null +++ b/tests/plugin/pre_tool_shell_detect_spec.lua @@ -0,0 +1,177 @@ +-- pre_tool_shell_detect_spec.lua — Tier 1 shell-write + delete detection. +-- +-- (Renamed from pre_tool_bash_detect_spec.lua when bash_detect → shell_detect.) +-- +-- Two axes, tested as such (see shell_detect.lua's header): +-- +-- * POSIX block — the historical bash rows, byte-for-byte. They pin +-- documented edge cases from real bugs; resist "obvious simplifications" +-- without reading them. They assume Unix path semantics (`/`-absolute, +-- $HOME, /dev/), so on Windows they are marked `pending` — the Unix CI +-- runners are the regression guard for POSIX behaviour. +-- * Windows block — git-bash-on-Windows POSIX commands AND PowerShell +-- commands, both with Windows paths. Inputs are the real STEP 0 samples +-- captured from Claude Code (the Haiku model emits `Remove-Item …` etc. +-- through a `PowerShell` tool). Marked `pending` off Windows. +-- +-- New rows go at the bottom of the relevant block with a short comment. + +local shell_detect = require("code-preview.pre_tool.shell_detect") + +local IS_WIN = vim.fn.has("win32") == 1 +local CWD = "/proj" +local HOME = os.getenv("HOME") or "/root" + +local function sorted(t) + local copy = {} + for _, v in ipairs(t) do table.insert(copy, v) end + table.sort(copy) + return copy +end + +-- ── POSIX grammar (Unix path semantics) ────────────────────────── + +describe("shell_detect.detect_rm_paths (POSIX)", function() + local cases = { + { name = "plain rm", cmd = "rm foo.txt", expect = { CWD .. "/foo.txt" } }, + { name = "rm with flags", cmd = "rm -rf build", expect = { CWD .. "/build" } }, + { name = "sudo rm absolute", cmd = "sudo rm /etc/foo", expect = { "/etc/foo" } }, + { name = "rm with home tilde", cmd = "rm ~/.config/x", expect = { HOME .. "/.config/x" } }, + { name = "apostrophe in filename", cmd = "rm \"it's-mine.txt\"", expect = { CWD .. "/it's-mine.txt" } }, + { name = "compound with &&", cmd = "rm a && rm b", expect = { CWD .. "/a", CWD .. "/b" } }, + { name = "compound with ;", cmd = "rm a ; rm b", expect = { CWD .. "/a", CWD .. "/b" } }, + { name = "/tmp not filtered for rm", cmd = "rm /tmp/foo", expect = { "/tmp/foo" } }, + { name = "trailing CR stripped", cmd = "rm a.txt\r", expect = { CWD .. "/a.txt" } }, + { name = "non-rm command ignored", cmd = "echo rm foo", expect = {} }, + -- Real-world case: Claude Code escapes apostrophes in unquoted filenames. + { name = "backslash-escaped apostrophe", cmd = "rm it\\'s-mine.txt", expect = { CWD .. "/it's-mine.txt" } }, + -- Backslash-escaped spaces (`rm my\ file.txt`) would need a real shell + -- tokeniser to split correctly; documented as a Tier 1 limitation. + } + for _, c in ipairs(cases) do + it(c.name, function() + if IS_WIN then return pending("POSIX path semantics: Unix-only") end + assert.are.same(sorted(c.expect), sorted(shell_detect.detect_rm_paths(c.cmd, CWD))) + end) + end +end) + +describe("shell_detect.detect_write_paths (POSIX)", function() + local cases = { + { name = "single > redirect", cmd = "echo x > foo.txt", expect = { CWD .. "/foo.txt" } }, + { name = "append >> redirect", cmd = "echo x >> log.txt", expect = { CWD .. "/log.txt" } }, + { name = "absolute redirect", cmd = "echo x > /var/log/y", expect = { "/var/log/y" } }, + { name = "/tmp not filtered for writes", cmd = "echo x > /tmp/real", expect = { "/tmp/real" } }, + { name = "FD redirection 2>&1 skipped",cmd = "cmd 2>&1", expect = {} }, + { name = "/dev/null filtered", cmd = "cmd > /dev/null", expect = {} }, + { name = "/dev/stderr filtered", cmd = "cmd > /dev/stderr", expect = {} }, + { name = "mv writes destination", cmd = "mv a.tmp b", expect = { CWD .. "/b" } }, + { name = "cp writes destination", cmd = "cp a b", expect = { CWD .. "/b" } }, + { name = "tee target", cmd = "echo x | tee foo.txt", expect = { CWD .. "/foo.txt" } }, + { name = "tee -a target", cmd = "echo x | tee -a foo.txt", expect = { CWD .. "/foo.txt" } }, + { name = "sed -i target", cmd = "sed -i 's/x/y/' foo.txt", expect = { CWD .. "/foo.txt" } }, + { name = "transient .tmp filtered", cmd = "echo x > scratch.tmp", expect = {} }, + { name = "transient ~ filtered", cmd = "echo x > foo~", expect = {} }, + { name = "home tilde expanded", cmd = "echo x > ~/notes.md", expect = { HOME .. "/notes.md" } }, + -- The HTML-comment false positive: the redirect regex picks up the `>` + -- in `-->`, but looks_like_path rejects the resulting token because it + -- contains a backslash (leaked from inside a quoted string). + { name = "html comment false positive", cmd = "printf '\\n\\n'", expect = {} }, + { name = "dedup duplicates", cmd = "echo a > x.txt && echo b > x.txt", expect = { CWD .. "/x.txt" } }, + -- Apostrophe-escaped paths in redirects (companion to the rm case). + { name = "backslash-escaped apostrophe in redirect", cmd = "echo x > it\\'s.txt", expect = { CWD .. "/it's.txt" } }, + } + for _, c in ipairs(cases) do + it(c.name, function() + if IS_WIN then return pending("POSIX path semantics: Unix-only") end + assert.are.same(sorted(c.expect), sorted(shell_detect.detect_write_paths(c.cmd, CWD))) + end) + end +end) + +describe("shell_detect.detect combined (POSIX)", function() + it("returns both rm and write paths", function() + if IS_WIN then return pending("POSIX path semantics: Unix-only") end + local r = shell_detect.detect("rm old.txt && echo x > new.txt", CWD) + assert.are.same({ CWD .. "/old.txt" }, r.rm_paths) + assert.are.same({ CWD .. "/new.txt" }, r.write_paths) + end) +end) + +-- ── Windows paths: git-bash POSIX commands + PowerShell commands ── +-- +-- cwd is a backslash drive path (Claude Code's hook payload form on Windows). +-- Expected outputs are canonical backslash paths — the form the changes +-- registry keys on (matching Edit/Write), so the neo-tree indicator lands. + +local WCWD = [[C:\proj]] + +describe("shell_detect.detect_rm_paths (Windows)", function() + local cases = { + -- git-bash POSIX rm with a relative path → resolved against the backslash cwd. + { name = "bash rm relative", cmd = [[rm foo.txt]], expect = { [[C:\proj\foo.txt]] } }, + -- The original STEP 0 bug: a Windows-absolute path was treated as relative + -- and the cwd was prepended, producing garbage. It must stay absolute now. + { name = "bash rm windows-absolute", cmd = [[rm -f "D:\other\cp-src.txt"]], expect = { [[D:\other\cp-src.txt]] } }, + -- PowerShell delete — the real Haiku sample. + { name = "Remove-Item -Path relative", cmd = [[Remove-Item -Path ".step0\del.txt" -Force]], expect = { [[C:\proj\.step0\del.txt]] } }, + { name = "Remove-Item windows-absolute",cmd = [[Remove-Item -Path "D:\proj\x.txt"]], expect = { [[D:\proj\x.txt]] } }, + { name = "Remove-Item comma-list", cmd = [[Remove-Item -Path "a.txt","b.txt" -Force]], expect = { [[C:\proj\a.txt]], [[C:\proj\b.txt]] } }, + -- Real Haiku multi-delete: a comma-list with a SPACE after the comma, which + -- naive tokenising splits into "a", + a stray "b" positional. Both files + -- must register (regression: only the first was marked). + { name = "Remove-Item comma-list with spaces", cmd = [[Remove-Item -Path "D:\proj\temp1.txt", "D:\proj\temp2.txt" -Force]], expect = { [[D:\proj\temp1.txt]], [[D:\proj\temp2.txt]] } }, + { name = "Remove-Item multiple positionals", cmd = [[Remove-Item a.txt b.txt]], expect = { [[C:\proj\a.txt]], [[C:\proj\b.txt]] } }, + { name = "Remove-Item switches before positional", cmd = [[Remove-Item -Recurse -Force build]], expect = { [[C:\proj\build]] } }, + { name = "del alias positional", cmd = [[del bad.txt]], expect = { [[C:\proj\bad.txt]] } }, + { name = "ri alias", cmd = [[ri ".step0\gone.log"]], expect = { [[C:\proj\.step0\gone.log]] } }, + -- Non-delete cmdlets must not register. + { name = "Get-Content ignored", cmd = [[Get-Content "x.txt"]], expect = {} }, + } + for _, c in ipairs(cases) do + it(c.name, function() + if not IS_WIN then return pending("Windows path semantics: Windows-only") end + assert.are.same(sorted(c.expect), sorted(shell_detect.detect_rm_paths(c.cmd, WCWD))) + end) + end +end) + +describe("shell_detect.detect_write_paths (Windows)", function() + local cases = { + -- git-bash POSIX redirect (forward slash relative) resolved to backslash. + { name = "bash redirect relative", cmd = [[echo x > out.txt]], expect = { [[C:\proj\out.txt]] } }, + { name = "bash redirect dotdir", cmd = [[echo x > .step0/new.txt]], expect = { [[C:\proj\.step0\new.txt]] } }, + { name = "bash redirect windows-abs", cmd = [[echo x > "D:\logs\y.txt"]], expect = { [[D:\logs\y.txt]] } }, + -- PowerShell writes — real samples. + { name = "Set-Content -Path", cmd = [[Set-Content -Path ".step0\new.txt" -Value "hi"]], expect = { [[C:\proj\.step0\new.txt]] } }, + { name = "Set-Content here-string", cmd = 'Set-Content -Path ".step0\\ps-pswrite.txt" -Value @"\nLine 1\nLine 2\n"@', expect = { [[C:\proj\.step0\ps-pswrite.txt]] } }, + { name = "Add-Content", cmd = [[Add-Content -Path ".step0\ps-existing.txt" -Value "appended"]], expect = { [[C:\proj\.step0\ps-existing.txt]] } }, + { name = "Out-File -FilePath", cmd = [[Out-File -FilePath "log.txt"]], expect = { [[C:\proj\log.txt]] } }, + { name = "Out-File via pipeline positional", cmd = [[Get-ChildItem | Out-File "list.txt"]], expect = { [[C:\proj\list.txt]] } }, + { name = "Move-Item destination", cmd = [[Move-Item -Path ".step0\a" -Destination ".step0\b.txt" -Force]], expect = { [[C:\proj\.step0\b.txt]] } }, + { name = "Copy-Item destination", cmd = [[Copy-Item -Path ".step0\a" -Destination ".step0\c.txt" -Force]], expect = { [[C:\proj\.step0\c.txt]] } }, + { name = "Move-Item windows-abs dest", cmd = [[Move-Item -Path "a" -Destination "D:\dst\b.txt"]], expect = { [[D:\dst\b.txt]] } }, + { name = "UNC destination", cmd = [[Set-Content -Path "\\srv\share\f.txt" -Value x]], expect = { [[\\srv\share\f.txt]] } }, + -- Look-alikes that must NOT be flagged as writes. + { name = "Out-Null not a write", cmd = [[New-Item -ItemType Directory d | Out-Null]], expect = {} }, + { name = "Out-String not a write", cmd = [[Get-Process | Out-String]], expect = {} }, + { name = "transient .tmp filtered", cmd = [[Set-Content -Path "scratch.tmp" -Value x]], expect = {} }, + { name = "FD redirect skipped", cmd = [[some-exe 2>&1]], expect = {} }, + } + for _, c in ipairs(cases) do + it(c.name, function() + if not IS_WIN then return pending("Windows path semantics: Windows-only") end + assert.are.same(sorted(c.expect), sorted(shell_detect.detect_write_paths(c.cmd, WCWD))) + end) + end +end) + +describe("shell_detect.detect combined (Windows)", function() + it("PowerShell delete + write in one ;-separated command", function() + if not IS_WIN then return pending("Windows path semantics: Windows-only") end + local r = shell_detect.detect( + [[Remove-Item -Path "old.txt" -Force; Set-Content -Path "new.txt" -Value "x"]], WCWD) + assert.are.same({ [[C:\proj\old.txt]] }, r.rm_paths) + assert.are.same({ [[C:\proj\new.txt]] }, r.write_paths) + end) +end) From b8c2cea49799764c5263120fdb2e215db7247925 Mon Sep 17 00:00:00 2001 From: Jay Shitre Date: Thu, 4 Jun 2026 02:46:01 +0530 Subject: [PATCH 5/6] docs(adr-0009): record shell-detect grammar resolved + premise correction (#46) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The deferral in ADR-0009 is closed: STEP 0 landed and the PowerShell grammar shipped in the same PR as the axis split. Append a resolution note correcting the two premises the original text got wrong — it is a separate `PowerShell` tool (not the Bash tool emitting PowerShell), so routing was NOT fine and a matcher + normaliser layer were needed on top of the detection split — and record what shipped for the detection layer, including the honest scope note that the POSIX matchers were not refactored into a unified grammar table. Co-Authored-By: Claude Opus 4.8 --- ...shell-detect-split-axes-and-defer-grammar.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md b/docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md index 2a2e5b3..3bf61d4 100644 --- a/docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md +++ b/docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md @@ -19,3 +19,20 @@ We therefore: rename `bash_detect` → `shell_detect` (behind the unchanged `M.d - The existing POSIX grammar's behaviour must stay byte-identical through the restructure — `pre_tool_bash_detect_spec.lua` (renamed alongside the module) is the safety net. - `M.detect(cmd, cwd)` stays stable, so callers (`pre_tool.handle`) are untouched. The broader integration-registry / install-engine consolidation considered in the same architecture review is **out of scope** here and deferred. - This refines [ADR-0007](0007-windows-shim-via-shared-powershell-discovery.md)'s "one implementation per OS" principle for the *detection* layer: detection is split by *capability axis*, not by OS. + +## Update — resolved in the #46 follow-up + +The deferral above is closed: STEP 0 was gathered and the PowerShell grammar shipped **in the same PR** as the split, so the title's "defer grammar" no longer describes reality. The implementation also corrected two premises in the original text: + +- **It is a separate tool, not the Bash tool emitting PowerShell.** STEP 0 (raw hook stdin teed on the Windows box, cross-checked against the Claude Code session transcripts) showed Claude Code on Windows exposes a distinct **`PowerShell`** tool alongside `Bash`, and routes shell file ops through it (the Haiku model deletes via `Remove-Item`, moves via `Move-Item`, writes via `Set-Content`/`Out-File`/`Add-Content`). `tool_name` is `"PowerShell"` with a Bash-shaped `{command}` payload. (Opus, by contrast, drives git-bash and emits POSIX `rm` with Windows-shaped paths — exactly the cross-cutting case this ADR predicted.) +- **Routing was therefore NOT fine** — contrary to the assumption that the detector was the only gap. The PreToolUse matcher was `Edit|Write|MultiEdit|Bash`, so the hook **never fired** for the `PowerShell` tool; the command never reached the detector. The fix needed two layers this ADR did not anticipate, on top of the detection split: + 1. **Matcher** — add `PowerShell` to the claudecode Pre/PostToolUse matchers so the hook fires at all. + 2. **Normaliser** — fold `tool_name="PowerShell"` onto canonical `Bash` (a shell proposal, Tier-1 indicator-only, `bash_*` origin prefix), so the dispatcher/emitter and `shell_detect` need no awareness of a separate tool. + +What shipped for the detection layer itself, per this ADR: + +- `bash_detect.lua` → `shell_detect.lua` (and `pre_tool_bash_detect_spec.lua` → `pre_tool_shell_detect_spec.lua`), behind the unchanged `M.detect(cmd, cwd)`. +- An OS-selected **path-convention adapter**: Unix (byte-identical) and Windows (`C:\`/`C:/`, UNC `\\…`, backslash, relative-against-a-backslash-cwd). The Windows adapter emits canonical backslash paths because `fnamemodify(":p")` is a no-op on an already-absolute Windows path, so the detector's output must already match the form Edit/Write key the changes registry with. +- The **PowerShell grammar**: `Remove-Item`/`Set-Content`/`Out-File`/`Add-Content`/`Move-Item`/`Copy-Item` (+ aliases), `-Path`/`-Destination`/positional, comma-list arrays (including spaces after commas) and here-strings. POSIX stays byte-identical because the PowerShell verbs deliberately exclude the bash aliases (`rm`/`cp`/`mv`/`tee`), so the two vocabularies never collide. + +Scope honesty: the command matchers are **table-shaped for the PowerShell vocabulary** (verb→category tables) but the POSIX matchers remain the original per-verb functions rather than a single shared grammar table. The ADR's "a second grammar slots in as data" intent is met functionally (the PowerShell grammar was added as data without disturbing POSIX), not as one unified table; a full grammar-table refactor of POSIX was not needed to land Windows support and stays out of scope. From bd9694fb12214947615c35a44d235cec0220ed89 Mon Sep 17 00:00:00 2001 From: Cannon07 Date: Thu, 4 Jun 2026 02:54:17 +0530 Subject: [PATCH 6/6] docs: finalize ADR-0009 + remove shell_detect handoff (#46) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Retitle ADR-0009 to "splits path-convention from command-grammar" (drop "defer grammar": the grammar shipped with the split, so the deferral framing never described a real shipped state). Rewrite as one coherent decision — fold in the STEP 0 premise corrections (distinct PowerShell tool; the matcher + normaliser layers) and the scope-honesty note. Fix stale bash_detect references to shell_detect. - Remove HANDOFF-shell-detect.md: a cross-machine scratchpad, now consumed (findings live in the ADR, the spec's real-sample rows, and code comments). Co-Authored-By: Claude Opus 4.8 --- HANDOFF-shell-detect.md | 155 ------------------ ...ell-detect-split-axes-and-defer-grammar.md | 38 ----- docs/adr/0009-shell-detect-split-axes.md | 26 +++ 3 files changed, 26 insertions(+), 193 deletions(-) delete mode 100644 HANDOFF-shell-detect.md delete mode 100644 docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md create mode 100644 docs/adr/0009-shell-detect-split-axes.md diff --git a/HANDOFF-shell-detect.md b/HANDOFF-shell-detect.md deleted file mode 100644 index 5b10bd9..0000000 --- a/HANDOFF-shell-detect.md +++ /dev/null @@ -1,155 +0,0 @@ -# Handoff — Windows shell write/delete detection (issue #46 follow-up) - -**For: Claude Code on the Windows machine.** You can run PowerShell, named pipes, -and Windows nvim — which is exactly why this task is yours. The prior work was -done on macOS and could not see what Claude Code actually emits on Windows. - -This is a **follow-up to the merged claudecode Windows slice** (`#73`, squash -commit `ead0c2c` on `main`). Branch: `feat/shell-detect-windows`. - -## ⚠ STEP 0 results — premise corrected (this doc was written on macOS) - -Ground truth captured on the Windows box (teeing raw hook stdin + reading the -Claude Code session transcripts) refined the bug in two ways the original -premise got wrong: - -1. **It is a separate tool, not the Bash tool emitting PowerShell.** Claude Code - on Windows exposes a distinct **`PowerShell`** tool (alongside `Bash`) and - routes shell file ops through it — the Haiku model deletes via - `Remove-Item`, moves via `Move-Item`, writes via `Set-Content`/`Out-File`. - `tool_name` is `"PowerShell"`, payload shape `{command, description}` (same as - Bash). The PreToolUse hook matcher was `Edit|Write|MultiEdit|Bash`, so the - hook **never fired** for those proposals — the command never reached the - detector at all. (Opus, by contrast, uses git-bash and emits POSIX `rm`.) -2. **Path resolution was the bug even for git-bash POSIX commands.** The hook - `cwd` is a Windows backslash path; `rm "D:\proj\x"` had its cwd prepended to - an already-absolute path (garbage), and the redirect/`looks_like_path` path - rejected backslashes outright. - -So the fix is three layers (all in this PR): **matcher** (`+PowerShell`), -**normaliser** (fold `PowerShell` → canonical `Bash`), and **shell_detect** -(PowerShell grammar + a Windows path adapter). Real samples used as spec inputs: - -``` -Remove-Item -Path ".step0\ps-delete.txt" -Force -Move-Item -Path ".step0\ps-move-src.txt" -Destination ".step0\ps-move-dst.txt" -Force -Copy-Item -Path ".step0\ps-copy-src.txt" -Destination ".step0\ps-copy-dst.txt" -Force -Set-Content -Path ".step0\ps-pswrite.txt" -Value @"…here-string…"@ -Add-Content -Path ".step0\ps-existing.txt" -Value "Appended via PowerShell Add-Content" -rm -f "D:\other\cp-src.txt" # git-bash POSIX, Windows-absolute path -``` - -The sections below are the original macOS-written plan; read them through the -correction above (notably: routing was NOT fine — the normaliser + matcher both -needed a Windows entry, which STEP 0 explicitly flagged as the fallback case). - -## The bug (original framing — see correction above) - -On Windows, deleting a file (and other shell writes) does **not** mark neo-tree. -A shell delete/write emits **PowerShell** (`Remove-Item …`) instead of `rm`, -and our detector only understands Unix commands + Unix paths, so it finds nothing. - -The preview path (Edit/Write/MultiEdit) already works on Windows — this is the -**Tier-1 change-indicator** path for shell proposals only (see CONTEXT.md -[Tier 1 / Tier 2], [Status], [Origin prefix]). - -## Read first - -- `lua/code-preview/pre_tool/bash_detect.lua` — the whole detector. Unix-only on - two axes: **command vocabulary** (`rm`, `>`/`>>`, `mv`, `cp`, `tee`, `sed -i`) - and **path resolution** (`resolve()`/`looks_like_path()` only handle `/`-absolute - and `~/`; no `C:\`, backslashes, or `%USERPROFILE%`). -- `lua/code-preview/pre_tool/normalisers.lua` — routing. **Correction (STEP 0):** - routing was NOT fine. The `PowerShell` tool needed a matcher entry to fire at - all, and the claudecode normaliser now folds `tool_name="PowerShell"` → `Bash`. -- `tests/plugin/pre_tool_bash_detect_spec.lua` (CI-excluded on Windows) and the - `bash_modified` case in `pre_tool_handle_spec.lua` (marked `pending` on Windows) - — both are this task's to re-enable with Windows-path awareness. -- `docs/adr/0007-...` and `CONTEXT.md` for the cross-OS principles. - -## STEP 0 — gather ground truth (only you can do this) - -With `debug = true` (tail `%LOCALAPPDATA%\nvim-data\code-preview.log`), capture the -exact `tool_name` and raw command string Claude Code emits on Windows for: -- a **delete** (expect `Remove-Item …`?), -- a **file write/redirect** (`Set-Content`? `Out-File`? `Add-Content`? `>`?), -- a **move** and a **copy** if the agent uses them. - -Confirm the `tool_name` is still `"Bash"` (if not, the normaliser tool-map needs a -Windows entry too). **Paste these samples into the PR** — they are the spec inputs. - -## Design — split two independent axes (decided; per the architecture review) - -The detector entangles two axes, and they are **not** the same as "OS" — a -git-bash shell on Windows has POSIX grammar with Windows-ish paths. So do **not** -add a blanket "Windows branch" and do **not** cut the module by OS: - -1. **Path conventions** — `resolve()` / `looks_like_path()` / `is_transient()`: - today only `/`-absolute, `~/`, `$HOME`, `/dev/`. Windows needs `C:\` / `C:/`, - UNC `\\…`, backslash or forward slash, `%TEMP%` / `%USERPROFILE%`. -2. **Command grammar** — which verbs/operators write or delete: POSIX - (`rm`, `>`/`>>`, `mv`, `cp`, `tee`, `sed -i`) vs PowerShell. - -**The shape to build (this is decided — but keep it all behind the current -`M.detect(cmd, cwd)` signature; callers must not change):** - -- **Rename `bash_detect` → `shell_detect`** so the interface stops lying. Update - the require site (`pre_tool/init.lua`), the spec filename, and the CI Windows - exclusion name. -- **Extract a path-convention seam** (`resolve`/`looks_like_path`/`is_transient` - → a small adapter): a Unix impl (as today) + a Windows impl. This is the - highest-value, lowest-risk extraction and is needed *regardless* of which shell - the agent uses. -- **Make the command matchers table-shaped** (verb / redirect / mv-cp-tee-sed) so - a grammar slots in as data, not as scattered `if` branches. Keep the POSIX - grammar **byte-identical** in behaviour (its edge cases come from real bugs). -- **Add a PowerShell grammar adapter — but only after STEP 0 confirms** the - Windows Bash tool emits PowerShell (your `Remove-Item` observation says it - does; STEP 0 nails the exact cmdlets/aliases/params). Starting vocabulary to - confirm/expand from real samples, case-insensitive: - - Delete → `deleted`: `Remove-Item`/`ri`/`rm`/`del`/`erase`/`rd`/`rmdir` - (+ `-Path`, `-Recurse`, `-Force`, positional). - - Write → `bash_created`/`bash_modified`: `Set-Content`/`sc`, `Out-File`, - `Add-Content`/`ac`, `Tee-Object`/`tee`, and `>`/`>>`. - - Move/Copy → write target: `Move-Item`/`mi`/`move`/`ren`, `Copy-Item`/`cpi`/`copy`/`cp`. - - Separators: `;`, `|`, newlines; `&&`/`||` are PS7-only (the hook runs under - **5.1**). `-Path` may take a comma-list / wildcards. - - Keep `bash_*` origin-prefix semantics from [Status]/[Origin prefix] unchanged. - -**Out of bounds for this PR:** do **not** introduce a per-OS file split, an -integration registry, or an install engine. The architecture review **deferred** -all of that as uncoupled cleanup; your scope is the rename + the two seams + the -confirmed grammar, nothing wider. - -## Also in this PR (small, decided) - -`lua/code-preview/diff.lua` (~line 60): the `reveal_root = "git"` path runs a -string shell command with `2>/dev/null` (POSIX-only; misbehaves under Windows -cmd). Fix to list-form, no redirect: -`vim.fn.systemlist({"git","-C",parent,"rev-parse","--show-toplevel"})` + the -existing `shell_error` check. Verify on both OSes. - -## Testing bar - -- Re-enable the excluded/pending bash specs with **portable** paths (no hardcoded - `/tmp`, no `os.getenv("HOME")` assumptions — guard or branch per OS). -- Add Windows-grammar specs from your STEP 0 samples. -- macOS/Linux specs must stay green (the detector must remain correct on Unix). -- The CI `windows-test` job (now on `main`) excludes `pre_tool_bash_detect_spec.lua` - by filename. After the rename, update that exclusion to the new spec name, then - drop it entirely once the spec passes on Windows. - -## Out of scope - -opencode's `vim.fn.system({"cp",…})` (Windows blocker) and the codex/copilot -Windows shims belong to the per-agent rollout PRs, not this one. - -## Suggested skills - -- `tdd` — STEP 0 samples → failing specs → grammar. Good fit here. -- `verify` — confirm a real delete marks neo-tree on Windows end-to-end. - -## Before merge - -Remove this file, and remove `pre_tool_bash_detect_spec.lua` from the CI Windows -exclusion if it now passes. diff --git a/docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md b/docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md deleted file mode 100644 index 3bf61d4..0000000 --- a/docs/adr/0009-shell-detect-split-axes-and-defer-grammar.md +++ /dev/null @@ -1,38 +0,0 @@ -# Shell write/delete detection splits path-convention from command-grammar; the PowerShell grammar is deferred - -Status: accepted - -Windows support (issue #46) needs the Tier-1 shell-write/delete detector (`pre_tool/bash_detect.lua`, which feeds the [change](../../CONTEXT.md#change) [status](../../CONTEXT.md#status) values `deleted` / `bash_modified` / `bash_created`) to handle Windows. The obvious framings — "make it per-OS" or "add a Windows branch" — are both wrong, because the detector entangles **two independent axes that do not align onto OS**: *path conventions* (`/`-absolute, `~/`, `/dev/`) and *command grammar* (which verbs/operators write or delete: `rm`, `>`, `mv`, `cp`, `tee`, `sed -i`). A git-bash shell on Windows has POSIX grammar with Windows-shaped paths; a PowerShell shell has PowerShell grammar with Windows paths — so OS is not the seam. - -We therefore: rename `bash_detect` → `shell_detect` (behind the unchanged `M.detect(cmd, cwd)` interface); extract a **path-convention adapter** (Unix today, Windows added now); make the command matchers **table-shaped** so a second grammar slots in as data; and **defer building the PowerShell grammar** until an empirical finding confirms which shell a given agent's Bash tool actually invokes on Windows (the same kind of spike [ADR-0007](0007-windows-shim-via-shared-powershell-discovery.md) forced for the RPC layer). - -## Considered Options - -- **Per-OS `shell_detect` (grammar tables keyed by OS)** — rejected: cuts along the wrong axis. OS conflates path conventions and command grammar, and the git-bash-on-Windows case (POSIX grammar, Windows paths) breaks the mapping outright. -- **"Just a Windows branch"** — rejected: scatters OS conditionals through a delicate, edge-case-heavy module (its cases come from real historical bugs) along that same wrong axis. -- **Split the two axes; ship the path adapter; table-shape the matchers; defer the PowerShell grammar** *(chosen)* — isolates the one extraction that is needed regardless of shell (path conventions), and avoids speculative grammar work until the requirement is confirmed. - -## Consequences - -- The path-convention adapter is the highest-value, lowest-risk extraction and is needed whatever shell the agent uses (Windows paths show up even under git-bash). -- The PowerShell grammar is built only once the empirical "what shell does the Windows Bash tool emit" finding lands (gathered as STEP 0 of the shell_detect work). Observed evidence already points at PowerShell (`Remove-Item`), but the exact cmdlets/aliases/params come from real samples, not guesses. -- The existing POSIX grammar's behaviour must stay byte-identical through the restructure — `pre_tool_bash_detect_spec.lua` (renamed alongside the module) is the safety net. -- `M.detect(cmd, cwd)` stays stable, so callers (`pre_tool.handle`) are untouched. The broader integration-registry / install-engine consolidation considered in the same architecture review is **out of scope** here and deferred. -- This refines [ADR-0007](0007-windows-shim-via-shared-powershell-discovery.md)'s "one implementation per OS" principle for the *detection* layer: detection is split by *capability axis*, not by OS. - -## Update — resolved in the #46 follow-up - -The deferral above is closed: STEP 0 was gathered and the PowerShell grammar shipped **in the same PR** as the split, so the title's "defer grammar" no longer describes reality. The implementation also corrected two premises in the original text: - -- **It is a separate tool, not the Bash tool emitting PowerShell.** STEP 0 (raw hook stdin teed on the Windows box, cross-checked against the Claude Code session transcripts) showed Claude Code on Windows exposes a distinct **`PowerShell`** tool alongside `Bash`, and routes shell file ops through it (the Haiku model deletes via `Remove-Item`, moves via `Move-Item`, writes via `Set-Content`/`Out-File`/`Add-Content`). `tool_name` is `"PowerShell"` with a Bash-shaped `{command}` payload. (Opus, by contrast, drives git-bash and emits POSIX `rm` with Windows-shaped paths — exactly the cross-cutting case this ADR predicted.) -- **Routing was therefore NOT fine** — contrary to the assumption that the detector was the only gap. The PreToolUse matcher was `Edit|Write|MultiEdit|Bash`, so the hook **never fired** for the `PowerShell` tool; the command never reached the detector. The fix needed two layers this ADR did not anticipate, on top of the detection split: - 1. **Matcher** — add `PowerShell` to the claudecode Pre/PostToolUse matchers so the hook fires at all. - 2. **Normaliser** — fold `tool_name="PowerShell"` onto canonical `Bash` (a shell proposal, Tier-1 indicator-only, `bash_*` origin prefix), so the dispatcher/emitter and `shell_detect` need no awareness of a separate tool. - -What shipped for the detection layer itself, per this ADR: - -- `bash_detect.lua` → `shell_detect.lua` (and `pre_tool_bash_detect_spec.lua` → `pre_tool_shell_detect_spec.lua`), behind the unchanged `M.detect(cmd, cwd)`. -- An OS-selected **path-convention adapter**: Unix (byte-identical) and Windows (`C:\`/`C:/`, UNC `\\…`, backslash, relative-against-a-backslash-cwd). The Windows adapter emits canonical backslash paths because `fnamemodify(":p")` is a no-op on an already-absolute Windows path, so the detector's output must already match the form Edit/Write key the changes registry with. -- The **PowerShell grammar**: `Remove-Item`/`Set-Content`/`Out-File`/`Add-Content`/`Move-Item`/`Copy-Item` (+ aliases), `-Path`/`-Destination`/positional, comma-list arrays (including spaces after commas) and here-strings. POSIX stays byte-identical because the PowerShell verbs deliberately exclude the bash aliases (`rm`/`cp`/`mv`/`tee`), so the two vocabularies never collide. - -Scope honesty: the command matchers are **table-shaped for the PowerShell vocabulary** (verb→category tables) but the POSIX matchers remain the original per-verb functions rather than a single shared grammar table. The ADR's "a second grammar slots in as data" intent is met functionally (the PowerShell grammar was added as data without disturbing POSIX), not as one unified table; a full grammar-table refactor of POSIX was not needed to land Windows support and stays out of scope. diff --git a/docs/adr/0009-shell-detect-split-axes.md b/docs/adr/0009-shell-detect-split-axes.md new file mode 100644 index 0000000..6c45dba --- /dev/null +++ b/docs/adr/0009-shell-detect-split-axes.md @@ -0,0 +1,26 @@ +# Shell write/delete detection splits path-convention from command-grammar + +Status: accepted + +Windows support (issue #46) needs the Tier-1 shell-write/delete detector (`pre_tool/shell_detect.lua`, renamed from `bash_detect.lua`; it feeds the [change](../../CONTEXT.md#change) [status](../../CONTEXT.md#status) values `deleted` / `bash_modified` / `bash_created`) to work on Windows. The obvious framings — "make it per-OS" or "add a Windows branch" — are both wrong, because the detector entangles **two independent axes that do not align onto OS**: *path conventions* (`/`-absolute, `~/`, `/dev/`) and *command grammar* (which verbs/operators write or delete). A git-bash shell on Windows has POSIX grammar with Windows-shaped paths; a PowerShell shell has PowerShell grammar with Windows paths — so OS is not the seam. + +We therefore split the two axes: rename `bash_detect` → `shell_detect` behind the unchanged `M.detect(cmd, cwd)` interface; extract an OS-selected **path-convention adapter** (Unix, byte-identical to the historical behaviour, plus Windows: `C:\`/`C:/`, UNC `\\…`, backslash separators, relative-against-a-backslash-cwd); and add a **PowerShell command grammar** alongside the POSIX one. Both grammars run on every command — they share the redirect operator and differ only in verbs, and the PowerShell tables deliberately exclude the bash aliases (`rm`/`cp`/`mv`/`tee`), so adding PowerShell provably cannot change a POSIX result. + +The PowerShell grammar was **informed by an empirical finding, not guessed** — the same discipline [ADR-0007](0007-windows-shim-via-shared-powershell-discovery.md) forced for the RPC layer. Ground truth (raw hook stdin teed on a Windows box, cross-checked against Claude Code session transcripts) corrected two premises we held going in: + +- **It is a separate tool, not the Bash tool emitting PowerShell.** Claude Code on Windows exposes a distinct **`PowerShell`** tool alongside `Bash` and routes shell file ops through it (the Haiku model deletes via `Remove-Item`, moves via `Move-Item`, writes via `Set-Content`/`Out-File`/`Add-Content`); `tool_name` is `"PowerShell"` with a Bash-shaped `{command}` payload. (Opus, by contrast, drives git-bash and emits POSIX `rm` with Windows-shaped paths — exactly the cross-cutting case the axis split predicted.) +- **Routing was therefore not "fine"** — the detector was not the only gap. The PreToolUse matcher was `Edit|Write|MultiEdit|Bash`, so the hook never fired for the `PowerShell` tool. The full fix is three layers: **matcher** (add `PowerShell` to the claudecode Pre/PostToolUse matchers so the hook fires), **normaliser** (fold `tool_name="PowerShell"` onto canonical `Bash`, so the dispatcher/emitters and `shell_detect` need no awareness of a separate tool), and the **detection split** itself. + +## Considered Options + +- **Per-OS `shell_detect` (grammar tables keyed by OS)** — rejected: cuts along the wrong axis. OS conflates path conventions and command grammar, and the git-bash-on-Windows case (POSIX grammar, Windows paths) breaks the mapping outright. +- **"Just a Windows branch"** — rejected: scatters OS conditionals through a delicate, edge-case-heavy module (its cases come from real historical bugs) along that same wrong axis. +- **Split the two axes** *(chosen)* — path conventions become an OS-selected adapter; command grammars are added by vocabulary, independent of OS. + +## Consequences + +- The path-convention adapter is the highest-value extraction and is needed whatever shell the agent uses (Windows paths show up even under git-bash). The Windows adapter emits canonical backslash paths, because `fnamemodify(":p")` is a no-op on an already-absolute Windows path, so the detector's output must already match the form Edit/Write key the changes registry with — otherwise the neo-tree indicator keys a different entry and misses. +- POSIX behaviour stays byte-identical through the restructure; `pre_tool_shell_detect_spec.lua` (renamed alongside the module) is the safety net, with POSIX rows and Windows rows each marked `pending` on the other OS. +- `M.detect(cmd, cwd)` stays stable, so callers (`pre_tool.handle`) are untouched. +- **Scope honesty:** the PowerShell vocabulary is table-shaped (verb→category tables), but the POSIX matchers remain the original per-verb functions rather than one unified grammar table. "A second grammar slots in as data" is met functionally — PowerShell was added without disturbing POSIX — not as a single shared table; a full POSIX grammar-table refactor was not needed to land Windows and stays out of scope. The broader integration-registry / install-engine consolidation from the same architecture review is likewise deferred. +- This refines [ADR-0007](0007-windows-shim-via-shared-powershell-discovery.md)'s "one implementation per OS" principle for the *detection* layer: detection is split by *capability axis*, not by OS.