Windows/Mac compatibility + surface all startup/execution errors in /bonsai:status#1
Draft
ferdinandobons wants to merge 9 commits into
Draft
Windows/Mac compatibility + surface all startup/execution errors in /bonsai:status#1ferdinandobons wants to merge 9 commits into
ferdinandobons wants to merge 9 commits into
Conversation
Cross-platform: - Add .gitattributes pinning every text file to LF. On Windows, Git's default autocrlf=true would rewrite the shell scripts to CRLF on checkout and break bash (CR in shebang / [[ ]] compare) as soon as a hook fired. - Expand CI from Linux-only to a Linux/macOS/Windows matrix (Git Bash on Windows). shellcheck + unit tests run on all three; the timing-sensitive integration suite runs on the POSIX platforms. This catches BSD-vs-GNU tool differences and Windows path/line-ending regressions that a Linux-only pipeline missed before. Error reporting in /bonsai:status: - New telemetry.sh helper bonsai_telemetry_gardener_errors extracts recent errored gardener runs (subtype + flattened, truncated message), including unparseable/partial logs from a claude that died mid-write. - status.sh now has an Errors section reporting 24h counts of hook errors (stop.sh/remind.sh startup+execution failures from bonsai-errors.log) and gardener execution errors, plus the most recent lines of each with their messages. Previously only the single last error line was shown and gardener failures had no surfaced detail. Docs + tests: - README: Requirements section (bash/jq; Windows via Git Bash) and note that errors now surface in /bonsai:status. CHANGELOG entries. - Unit tests for the new telemetry helper and the status Errors block. https://claude.ai/code/session_01YHfJpxnPbycBwjt8znjjPV
Under `set -e`, the recent-errors block ended in `[ -n "$gardener_errs" ] && printf …`. When there were hook errors but no gardener errors, that trailing AND-list's test failed and became the script's last command, so `/bonsai:status` exited 1 even though it ran fine — failing the new "surfaces a recent hook error" unit test. Use full `if` blocks for the two conditional prints (an untaken `if` exits 0) and add a defensive final `exit 0`. status must never report failure just because it summarized errors. https://claude.ai/code/session_01YHfJpxnPbycBwjt8znjjPV
Two failures surfaced by the new macOS/Windows matrix jobs:
- Shellcheck ran from the repo root, where `source "${CLAUDE_PLUGIN_ROOT}/
lib/*.sh"` resolves to a non-existent ./lib/*.sh. shellcheck 0.11.0
(brew/choco on macOS/Windows) then emits SC1091 and fails the step; the
older 0.9.0 (apt) was lenient, so the Linux-only pipeline never caught
it. Run the step from plugins/bonsai with relative paths so the sources
resolve and shellcheck follows them — verified clean (rc=0) on both
0.9.0 and 0.11.0.
- `shell: bash` enables `pipefail`, so `bash --version | head -1` in the
Tool versions step could fail on SIGPIPE when head closes the pipe
early. Drop the truncating pipes and print full versions.
https://claude.ai/code/session_01YHfJpxnPbycBwjt8znjjPV
…n POSIX The bats suite fails on Windows at the test-harness level, not in the plugin: tests stub Unix executables (jq, claude) onto PATH and assume coreutils temp-path semantics, neither of which hold under Git Bash/MSYS. Those are properties of the harness, not the shipped code. So: run the full behavioral suite (bats unit + integration) on Linux and macOS, and on Windows verify what is both meaningful and reliable — the scripts lint cleanly under the Windows shellcheck toolchain and the JSON manifests are valid. Combined with .gitattributes (LF endings guaranteed on a Windows checkout, the real #1 Git Bash breaker) this keeps a genuine Windows gate without depending on porting the Linux test harness to Git Bash. macOS already runs the full suite, covering the BSD-tool paths. https://claude.ai/code/session_01YHfJpxnPbycBwjt8znjjPV
Make the Linux/macOS behavioral coverage actually run on Windows instead of scoping Windows down to lint-only. Two Git Bash blockers fixed: - bats was unfindable on Windows: the install step added "$RUNNER_TEMP/bats-core/bin" to GITHUB_PATH, but on Windows $RUNNER_TEMP is a native path (D:\a\_temp) whose drive colon splits Git Bash's PATH. Normalize the entry with cygpath on Windows. - test sandbox temp dir: setup.bash built it from $TMPDIR, a native Windows path on the runner (backslashes + drive colon break the mktemp template). Base it on bats' own $BATS_TEST_TMPDIR, a clean POSIX path on every OS, with $TMPDIR only as last-resort fallback. Unit suite verified on Linux (197 ok; the 1 miss is a pre-existing timing flake in the concurrent-lock test, green in isolation). Integration stays POSIX-only (detached gardeners + timing-based polling). https://claude.ai/code/session_01YHfJpxnPbycBwjt8znjjPV
The Windows Actions log isn't readable from the dev environment, so capture the bats unit output and, on a Windows failure, post the failing test lines as a PR comment to diagnose the Git Bash incompatibilities precisely. This debug step is removed once Windows is green. https://claude.ai/code/session_01YHfJpxnPbycBwjt8znjjPV
|
Windows bats unit failures (debug): |
Diagnosed from the Windows runner (surfaced via the temporary PR-comment debug step). Three distinct Git Bash issues, all fixed without affecting Linux/macOS (198/198 unit still green, shellcheck clean): - MSYS path conversion (whitelist + quota path tests): jq is a native Windows binary, so Git Bash rewrites POSIX-looking argument paths (`jq --arg p "/foo"`) into Windows paths before jq sees them. Keys written one way never matched on read. Export MSYS_NO_PATHCONV=1 / MSYS2_ARG_CONV_EXCL='*' from common.sh (runtime) and setup.bash (tests); both are inert on macOS/Linux. - jq.exe CRLF (judge multi-verdict parse): text-mode stdout adds \r, leaving a stray CR on each parsed verdict line. Strip it with `tr -d '\r'`, wrapped in a pipefail subshell so the function's fail-open exit code still reflects jq, not tr. - Non-ASCII @test names (3 tests "unknown test name", never executed under Git Bash): replace the U+2192/U+2014 glyphs with ASCII -> / --. https://claude.ai/code/session_01YHfJpxnPbycBwjt8znjjPV
|
Windows bats unit failures (debug): |
The previous attempt exported MSYS_NO_PATHCONV/MSYS2_ARG_CONV_EXCL to stop Git Bash from rewriting POSIX-looking `jq --arg` values. That backfired: jq (choco) is a native Windows binary, so disabling conversion also stopped it from opening file-path arguments (/tmp/...x.json -> "Could not open file"), turning 9 failures into ~50. Revert the env exports (so jq file I/O works under the default conversion) and instead make the 5 affected tests use slash-free keys. MSYS only rewrites leading-slash POSIX paths, so a key like "foo"/"p" is immune: the fixture literal and the lib's `jq --arg` lookup stay in sync. These are opaque keys, so the change is equally valid on Linux/macOS (198/198 still green). whitelist "add" now asserts via --arg too, symmetric with the lib. Keeps the earlier correct fixes (ASCII @test names; judge `tr -d '\r'` for jq.exe's CRLF). The temporary PR-comment debug step stays for one more run to confirm Windows is green, then it's removed. https://claude.ai/code/session_01YHfJpxnPbycBwjt8znjjPV
Windows now runs the full bats unit suite green, so drop the debug step that posted failing tests as a PR comment, along with the permissions block it required. https://claude.ai/code/session_01YHfJpxnPbycBwjt8znjjPV
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two related goals: make the plugin run reliably on Windows and macOS (not just Linux), and make any startup or execution error visible from
/bonsai:status.The bash library was already carefully cross-platform aware (BSD-vs-GNU
date/stat,shasum/sha256sum,mkdirmutex instead offlock,timeout/gtimeoutfallback). The real gaps were: no line-ending protection for Windows checkouts, and no CI on anything but Linux — which is how an earlier release shipped a platform regression undetected.Cross-platform
.gitattributes— pins every text file toLF. On Windows, Git's defaultcore.autocrlf=truerewrites the shell scripts to CRLF on checkout, which breaks bash (a CR in a shebang, or in a[[ "$x" == "$y" ]]compare) the moment a hook fires.git add --renormalizeproduced no churn — the tree is already LF.ubuntu-latestonly toubuntu-latest,macos-latest,windows-latest(Git Bash,shell: bash). shellcheck + unit tests run on all three; the timing-sensitive integration suite (detached background gardeners + polling) runs on the POSIX platforms.fail-fast: falseso one red OS doesn't hide the others.Errors in
/bonsai:status(follow-up request)Errors come from two independent places, and only the single last
bonsai-errors.logline was previously shown:stop.sh,remind.sh— i.e. Stop/SessionStart/UserPromptSubmit) recorded inbonsai-errors.logby their ERR traps.gardener-*.logresult JSON and never reachbonsai-errors.log.Changes:
bonsai_telemetry_gardener_errorshelper — lists recent errored gardener runs with subtype + a flattened, truncated message, including unparseable/partial logs from aclaudethat died mid-write.status.shnow has an Errors section: 24h counts of hook errors and gardener errors, plus the most recent few lines of each with their messages.Docs
/bonsai:status.[Unreleased].Tests
test_telemetry.bats: coverage forbonsai_telemetry_gardener_errors(subtype+message listing, unparseable logs, cutoff + max-lines cap, empty/clean/missing-dir cases).test_commands.bats: status renders the Errors block clean, surfaces a recent hook error, and surfaces a gardener execution error.Verification
bats isn't installable in this sandbox (network/sudo denied), so the new bash logic was validated directly: the telemetry helper and a full
status.shrun were exercised end-to-end (both error and clean cases exit 0 underset -e),.gitattributesrenormalization showed no churn, and the workflow YAML parses. The matrix CI on this PR is the real cross-platform check.https://claude.ai/code/session_01YHfJpxnPbycBwjt8znjjPV
Generated by Claude Code