fix: security audit fixes round 1 — Cursor, Sparkle, Watch, socket, binary locator by 8676311081 · Pull Request #2 · 8676311081/multi-agent-console

8676311081 · 2026-05-03T10:05:04Z

Summary

CRITICAL: Require user approval for Cursor beforeShellExecution and beforeMCPExecution (was auto-approved)
HIGH: Restrict Sparkle allowedChannels to ["release"] + explicit appcast URL
HIGH: Expand Watch pairing code from 4→6 digits, 1-hour token expiry, anonymize Bonjour name
MEDIUM: Gate .build/ binary search behind #if DEBUG
MEDIUM: fchmod 0o600 on Unix bridge socket after bind
MEDIUM: Reject HTTP requests >128KB on Watch endpoint
MEDIUM: Validate settings.json size (1MB cap) and hook group count (256 max)
FIX: Remove stale BridgeServerAutoResponseTests.swift (205 tests pass)

🤖 Generated with Claude Code

Rebrand as enhanced fork of open-vibe-island, highlighting the instant terminal detection and manual refresh features unique to this version. Bilingual (Chinese + English), with proper attribution to upstream. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The test referenced AutoResponseRule, RuleConditions, and methods that were never added to BridgeServer.swift. 205 remaining tests now pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Previously handleCursorHook auto-approved beforeShellExecution and beforeMCPExecution with permission: .allow, bypassing the user completely. Now these events emit .permissionRequested and queue a PendingCursorInteraction, routing through the same resolution flow as Claude Code permissions. The user must explicitly approve in the UI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Sparkle: - Restrict allowedChannels to ["release"] (was empty = any channel) - Add explicit feedURLString returning the expected appcast URL - Note: EdDSA signing key (SUPublicEDKey) should be added to Info.plist before release builds ship Watch: - Expand pairing code from 4 to 6 digits (1M -> 1B combinations) - Add 1-hour token expiry (was permanent) - Anonymize Bonjour service name (was Host.current().localizedName) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- HooksBinaryLocator: gate .build/ search behind #if DEBUG (release builds only use the managed install directory) - BridgeServer: fchmod 0o600 on Unix socket after bind - WatchHTTPEndpoint: reject HTTP requests >128KB (413 Payload Too Large) - ClaudeHookInstaller: validate settings.json size (1MB cap) and hook group count (256 max per event) before deserializing Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Addresses findings from independent security review: HIGH: - /pair brute-force protection: regenerate code + 2s block after 5 failed attempts per code - Rate-limit state resets on successful pair or code regeneration MEDIUM: - Bind Watch HTTP listener to WiFi interface only (was all interfaces, including Tailscale/VPN) LOW: - Proactive expired token pruning on each code regeneration Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The fork's SUFeedURL and SUPublicEDKey point to upstream Octane0411/open-vibe-island. Enabling auto-updates would silently overwrite fork-specific enhancements with the upstream release. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Added createdAt timestamps to PendingApproval, PendingClaudeToolContext, PendingClaudeInteraction, PendingOpenCodeInteraction, and PendingCursorInteraction. A DispatchSourceTimer fires every 2 minutes to sweep entries older than 10 minutes, preventing unbounded growth when sessions crash without cleanup events. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- TerminalTextSender: escape \n \r \t in AppleScript strings (was only \\ and \") - BridgeTransport.decodeLines: cap buffer at 8 MiB when no newline arrives, throw malformedEnvelope to prevent OOM DoS - HooksBinaryLocator: gate OPEN_ISLAND_HOOKS_BINARY env var behind #if DEBUG (was accepted from any source in release builds) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The same function in TerminalJumpService was missed in fa60355 -- only TerminalTextSender was updated. Both copies now escape newline, carriage return, and tab in addition to backslash and quote. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… tests - BridgeServer: guard against pending dict overwrite by a second client with the same sessionID (8 write sites now skip if key exists) - SettingsView: "Check for Updates" now opens the fork GitHub releases page instead of calling Sparkle (which would pull upstream) - Tests: 9 new security regression tests across 2 new suites: - BridgeServerSecurityTests: Cursor blocking hooks, TTL sweep timer, BridgeServer lifecycle - WatchHTTPEndpointSecurityTests: pairing code length, regeneration, token expiry, body size limit Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- UpdateChecker.releasesURL now points to 8676311081/open-island (was Octane0411/open-vibe-island) - Removed sweepTimerIsConfigured (hardcoded constants, always passes) - Replaced httpBodySizeLimitIsEnforced with endpointIntegratesWithoutCrashing that tests actual start/stop lifecycle Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ClaudeUsageSnapshot.cachedAt and CodexUsageSnapshot.capturedAt were already populated by the loaders but never reached the UI. When Claude Code CLI is idle (e.g. user works only in Claude Desktop), the statusline-fed cache freezes at the last snapshot, making the island percentages diverge silently from the actual account-level rate limits. Pipe both timestamps into UsageWindowPresentation and apply visible decay: - < 5 min: percentage at full opacity (fresh) - 5–30 min: percentage at 60% opacity (aging) - > 30 min: percentage at 35% opacity (stale) In .full layout, append a yellow "Nm ago" / "Nh ago" tag once the data crosses the 5-minute threshold, with an accessibility label so VoiceOver reads "Cached 6h ago". Co-Authored-By: claude-flow <ruv@ruv.net>

After implementing stale-percentage opacity (634b16a), users still have no way to learn *why* the number is stale. This adds a hover tooltip on the "5h ago" / "Nh ago" badge that explains the data source and why Claude Desktop usage may differ. Confirmed via two PTY experiments that Claude Code's statusline hook can't be triggered from outside a real terminal (even with TERM_PROGRAM=iTerm.app + ITERM_SESSION_ID + a real prompt over an expect-managed PTY, the statusline script is never invoked). So the statusline-fed cache is the upper bound of what we can refresh within Claude Code's public surface; the tooltip makes that boundary visible to the user. Tooltip text: - Claude windows: "Cached Nh ago. Claude usage updates only when Claude Code's interactive statusline receives fresh rate_limits. Claude Desktop and web usage may already be newer." - Codex windows: "Cached Nh ago. Codex usage is read from local rollout files; numbers refresh on the next assistant turn." Co-Authored-By: claude-flow <ruv@ruv.net>

Codify what we learned over the four-experiment investigation so the next worker (human or AI) doesn't re-run the same dead ends. The full record lives in docs/usage-freshness-investigation.md: - statusline is the only public-surface writer of used_percentage - claude -p / stream-json / hooks / debug-file / transcript: none contain equivalent fields - PTY + faked iTerm env (TERM_PROGRAM, ITERM_SESSION_ID, COLORTERM, FORCE_HYPERLINK) + real prompt + 30 s wait does NOT trigger statusline — Claude Code's terminal check is deeper - staleness UI (opacity decay + "Nh ago" tag + tooltip) is the ceiling under Claude Code's public surface Real-time parity with Claude Desktop is possible only via private Anthropic endpoint + Keychain OAuth + feature flag + fail-closed — explicitly scoped as a separate multi-day project. A pointer comment at the top of ClaudeUsage.swift redirects future readers to the doc before they try a hidden poller again. Co-Authored-By: claude-flow <ruv@ruv.net>

Pulls Max-plan 5h/7d utilization directly from https://claude.ai/api/organizations/<org-id>/usage so the island panel matches Claude Desktop's Settings → Usage page even when the user works only in Claude Desktop and Claude Code's interactive statusline never fires. Investigation closed in three independent passes (me + Codex + DS). Empirical findings (see docs/usage-freshness-investigation.md): - statusline is the only public surface that exposes used_percentage; -p / stream-json / hooks / debug-file all do not - PTY-wrapped interactive claude does not trigger statusline either, even with TERM_PROGRAM=iTerm.app + a real prompt - the web API endpoint Claude Desktop calls is reachable from any HTTPS client given the user's session cookie; schema (`utilization` field) was already supported by ClaudeUsageLoader as a fallback, so the cache file path needs zero translation Implementation (all opt-in, default off, fail-closed): OpenIslandCore: - ClaudeWebUsageCookieStore: Keychain wrapper using kSecClassInternetPassword (per DeepSeek review), server="claude.ai", AfterFirstUnlockThisDeviceOnly. InMemory variant for tests. - ClaudeWebUsageClient: URLSession-based, explicit error mapping (unauthorized / rateLimited(retryAfter:) / schemaMismatch / httpError / transportError / missingCookie), User-Agent set to "OpenIsland/1.0" (per DS — don't fake a browser). - ClaudeWebUsagePoller: 5-minute timer, auto-resolves org_id via /api/organizations on first run, only writes the cache file on success, fires onAuthFailure on 401/403, fires onSchemaDrift exactly once when consecutive failures cross 10 (~50 min). State is exposed as an Equatable struct + onStateChange callback. OpenIslandApp: - AppModel: feature flag (UserDefaults), lazy poller, cookie save via setClaudeWebUsageCookie, refreshNow trigger, observable poller state. - ClaudeWebUsageSection (new view): Settings → Setup → "Realtime Web Usage (experimental)" with toggle, cookie paste, auto-filled org_id, status badge (Active / No cookie / Expired / Drift), and a Refresh-now button. Includes inline help on how to copy the cookie from Chrome. - IslandPanelView: orange dashed border around the usage row when the poller's drift threshold is crossed (visible signal that numbers may be stale). Privacy: - PRIVACY_POLICY.md updated in both EN and ZH to disclose the optional outbound HTTPS to claude.ai and the Keychain-stored cookie. Default-off; user-supplied cookie; nothing transmitted beyond claude.ai. Tests: 13 new tests covering cookie store roundtrip + whitespace rejection, client schema parsing, 200/401/403/429/schema-mismatch mapping, missing-cookie short-circuit, poller success cache write, unauthorized callback firing, schema-drift threshold firing exactly once. Suite runs .serialized due to URLProtocol mock global state. Total: 226 / 226 tests pass, swift build clean. Co-Authored-By: claude-flow <ruv@ruv.net>

qw and others added 16 commits April 16, 2026 15:03

fix: remove stale BridgeServerAutoResponseTests blocking compilation

95f86d7

The test referenced AutoResponseRule, RuleConditions, and methods that were never added to BridgeServer.swift. 205 remaining tests now pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: security audit fixes round 1 — Cursor, Sparkle, Watch, socket, binary locator#2

fix: security audit fixes round 1 — Cursor, Sparkle, Watch, socket, binary locator#2
8676311081 wants to merge 16 commits into
mainfrom
worktree-fix+security-audit-round1

8676311081 commented May 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

8676311081 commented May 3, 2026

Summary

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant