fix: security audit fixes round 1 — Cursor, Sparkle, Watch, socket, binary locator#2

Open
8676311081 wants to merge 16 commits into main from worktree-fix+security-audit-round1

@8676311081 (Owner)

Summary

  • CRITICAL: Require user approval for Cursor beforeShellExecution and beforeMCPExecution (was auto-approved)
  • HIGH: Restrict Sparkle allowedChannels to ["release"] + explicit appcast URL
  • HIGH: Expand Watch pairing code from 4→6 digits, 1-hour token expiry, anonymize Bonjour name
  • MEDIUM: Gate .build/ binary search behind #if DEBUG
  • MEDIUM: fchmod 0o600 on Unix bridge socket after bind
  • MEDIUM: Reject HTTP requests >128KB on Watch endpoint
  • MEDIUM: Validate settings.json size (1MB cap) and hook group count (256 max)
  • FIX: Remove stale BridgeServerAutoResponseTests.swift (205 tests pass)

🤖 Generated with Claude Code

qw and others added 16 commits April 16, 2026 15:03
Rebrand as enhanced fork of open-vibe-island, highlighting the instant
terminal detection and manual refresh features unique to this version.
Bilingual (Chinese + English), with proper attribution to upstream.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The test referenced AutoResponseRule, RuleConditions, and methods that
were never added to BridgeServer.swift. 205 remaining tests now pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously handleCursorHook auto-approved beforeShellExecution and
beforeMCPExecution with permission: .allow, bypassing the user completely.
Now these events emit .permissionRequested and queue a
PendingCursorInteraction, routing through the same resolution flow as
Claude Code permissions. The user must explicitly approve in the UI.
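
The routing change described above can be sketched roughly as follows. This is a minimal illustration, not the actual BridgeServer code: `CursorHookRouter`, `HookOutcome`, and the shape of `PendingCursorInteraction` are hypothetical, and `afterFileEdit` is included only as an assumed example of a non-blocking event.

```swift
// Hypothetical types for illustration; the real BridgeServer API may differ.
enum CursorHookEvent: String {
    case beforeShellExecution
    case beforeMCPExecution
    case afterFileEdit // assumed example of an event that needs no approval
}

enum HookOutcome: Equatable {
    case allow                           // answered immediately
    case permissionRequested(id: String) // parked until the user decides
}

struct PendingCursorInteraction {
    let sessionID: String
    let event: CursorHookEvent
}

final class CursorHookRouter {
    private(set) var pending: [String: PendingCursorInteraction] = [:]

    /// Blocking events are queued instead of being auto-approved.
    func handle(_ event: CursorHookEvent, sessionID: String) -> HookOutcome {
        switch event {
        case .beforeShellExecution, .beforeMCPExecution:
            pending[sessionID] = PendingCursorInteraction(sessionID: sessionID, event: event)
            return .permissionRequested(id: sessionID)
        case .afterFileEdit:
            return .allow
        }
    }

    /// Returns the user's decision, or nil if nothing was pending for the session.
    func resolve(sessionID: String, approved: Bool) -> Bool? {
        guard pending.removeValue(forKey: sessionID) != nil else { return nil }
        return approved
    }
}
```

The point of the sketch is that no code path for the two blocking events returns `.allow` without going through `resolve`.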

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sparkle:
- Restrict allowedChannels to ["release"] (was empty = any channel)
- Add explicit feedURLString returning the expected appcast URL
- Note: EdDSA signing key (SUPublicEDKey) should be added to Info.plist
  before release builds ship

Watch:
- Expand pairing code from 4 to 6 digits (10K -> 1M combinations)
- Add 1-hour token expiry (was permanent)
- Anonymize Bonjour service name (was Host.current().localizedName)
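
A minimal sketch of the two Watch changes that lend themselves to code, assuming a zero-padded numeric code and a token that carries its issue time. `makePairingCode` and `WatchToken` are hypothetical names, not the project's API.

```swift
import Foundation

/// Returns a zero-padded 6-digit code (1,000,000 possible values).
func makePairingCode() -> String {
    var rng = SystemRandomNumberGenerator() // cryptographically secure on Apple platforms
    let value = Int.random(in: 0..<1_000_000, using: &rng)
    let digits = String(value)
    return String(repeating: "0", count: 6 - digits.count) + digits
}

/// Hypothetical token model: valid for one hour after issue.
struct WatchToken {
    static let lifetime: TimeInterval = 60 * 60
    let value: String
    let issuedAt: Date

    func isExpired(now: Date = Date()) -> Bool {
        now.timeIntervalSince(issuedAt) >= WatchToken.lifetime
    }
}
```

Passing `now` explicitly keeps the expiry check testable without waiting an hour.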

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- HooksBinaryLocator: gate .build/ search behind #if DEBUG (release
  builds only use the managed install directory)
- BridgeServer: fchmod 0o600 on Unix socket after bind
- WatchHTTPEndpoint: reject HTTP requests >128KB (413 Payload Too Large)
- ClaudeHookInstaller: validate settings.json size (1MB cap) and hook
  group count (256 max per event) before deserializing
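
The settings.json limits could look something like the sketch below. It assumes the file has a top-level `hooks` object mapping event names to arrays of hook groups, and it reads "1MB" as 1 MiB; `validateSettings` and the error type are hypothetical. The size check runs before any parsing, and the group count is checked on the loosely parsed JSON before it would be decoded into typed models.

```swift
import Foundation

enum SettingsValidationError: Error, Equatable {
    case fileTooLarge(bytes: Int)
    case tooManyHookGroups(event: String, count: Int)
    case malformed
}

let maxSettingsBytes = 1_048_576 // assumption: "1MB" interpreted as 1 MiB
let maxHookGroupsPerEvent = 256

/// Rejects oversized or suspicious settings before further decoding.
func validateSettings(_ data: Data) throws -> [String: Any] {
    guard data.count <= maxSettingsBytes else {
        throw SettingsValidationError.fileTooLarge(bytes: data.count)
    }
    guard let root = try? JSONSerialization.jsonObject(with: data) as? [String: Any] else {
        throw SettingsValidationError.malformed
    }
    if let hooks = root["hooks"] as? [String: Any] {
        for (event, groups) in hooks {
            if let list = groups as? [Any], list.count > maxHookGroupsPerEvent {
                throw SettingsValidationError.tooManyHookGroups(event: event, count: list.count)
            }
        }
    }
    return root
}
```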

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses findings from independent security review:

HIGH:
- /pair brute-force protection: regenerate code + 2s block after
  5 failed attempts per code
- Rate-limit state resets on successful pair or code regeneration
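
One way to read the /pair protection above, as a sketch only: count failures per code, and on the fifth failure rotate the code and refuse attempts for two seconds. `PairingGate` and its `Outcome` enum are hypothetical, and the code generator is injected so the behavior can be tested deterministically.

```swift
import Foundation

struct PairingGate {
    static let maxFailures = 5
    static let blockDuration: TimeInterval = 2

    enum Outcome: Equatable { case paired, rejected, blocked }

    private(set) var code: String
    private var failures = 0
    private var blockedUntil = Date.distantPast
    private let makeCode: () -> String

    init(makeCode: @escaping () -> String) {
        self.makeCode = makeCode
        self.code = makeCode()
    }

    mutating func attempt(_ candidate: String, now: Date = Date()) -> Outcome {
        if now < blockedUntil { return .blocked }
        if candidate == code {
            failures = 0 // rate-limit state resets on a successful pair
            return .paired
        }
        failures += 1
        if failures >= PairingGate.maxFailures {
            code = makeCode() // regenerate the code
            failures = 0      // regeneration also resets the counter
            blockedUntil = now.addingTimeInterval(PairingGate.blockDuration)
        }
        return .rejected
    }
}
```

Under this reading an attacker gets at most five guesses per code, so the chance of hitting a 6-digit code before it rotates is 5 in 1,000,000.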

MEDIUM:
- Bind Watch HTTP listener to WiFi interface only (was all interfaces,
  including Tailscale/VPN)

LOW:
- Proactive expired token pruning on each code regeneration

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The fork's SUFeedURL and SUPublicEDKey point to upstream
Octane0411/open-vibe-island. Enabling auto-updates would silently
overwrite fork-specific enhancements with the upstream release.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Added createdAt timestamps to PendingApproval, PendingClaudeToolContext,
PendingClaudeInteraction, PendingOpenCodeInteraction, and
PendingCursorInteraction. A DispatchSourceTimer fires every 2 minutes
to sweep entries older than 10 minutes, preventing unbounded growth
when sessions crash without cleanup events.
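
A rough sketch of the sweep, assuming each pending entry exposes `createdAt`. `PendingEntry`, `sweepExpired`, and `makeSweepTimer` are hypothetical stand-ins for the real pending types and timer setup.

```swift
import Dispatch
import Foundation

struct PendingEntry {
    let sessionID: String
    let createdAt: Date
}

let pendingTTL: TimeInterval = 10 * 60 // entries older than this are dropped

/// Removes expired entries and returns how many were removed.
@discardableResult
func sweepExpired(_ pending: inout [String: PendingEntry], now: Date = Date()) -> Int {
    let before = pending.count
    pending = pending.filter { now.timeIntervalSince($0.value.createdAt) <= pendingTTL }
    return before - pending.count
}

/// Builds a timer that fires every 2 minutes; the caller must call resume().
func makeSweepTimer(on queue: DispatchQueue, handler: @escaping () -> Void) -> DispatchSourceTimer {
    let timer = DispatchSource.makeTimerSource(queue: queue)
    timer.schedule(deadline: .now() + .seconds(120), repeating: .seconds(120))
    timer.setEventHandler(handler: handler)
    return timer
}
```

With a 2-minute interval and a 10-minute TTL, an orphaned entry lives for at most about 12 minutes.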

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- TerminalTextSender: escape \n \r \t in AppleScript strings (was only
  \\ and \")
- BridgeTransport.decodeLines: cap buffer at 8 MiB when no newline
  arrives, throw malformedEnvelope to prevent OOM DoS
- HooksBinaryLocator: gate OPEN_ISLAND_HOOKS_BINARY env var behind
  #if DEBUG (was accepted from any source in release builds)
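
The buffer cap in decodeLines might be sketched as below. `LineDecoder` and `TransportError` are hypothetical; only the 8 MiB limit and the malformedEnvelope error name come from the commit message. In this simplified version, lines decoded in the same call as the overflow are discarded along with the buffer.

```swift
import Foundation

enum TransportError: Error { case malformedEnvelope }

struct LineDecoder {
    static let maxBufferedBytes = 8 * 1024 * 1024 // 8 MiB
    private var buffer = Data()

    init() {}

    /// Appends a chunk and returns every complete newline-terminated line.
    /// Throws if the unterminated remainder grows past the cap.
    mutating func append(_ chunk: Data) throws -> [Data] {
        buffer.append(chunk)
        var lines: [Data] = []
        while let newline = buffer.firstIndex(of: 0x0A) {
            lines.append(Data(buffer[buffer.startIndex..<newline]))
            buffer.removeSubrange(buffer.startIndex...newline)
        }
        if buffer.count > LineDecoder.maxBufferedBytes {
            buffer.removeAll() // drop the oversized partial line
            throw TransportError.malformedEnvelope
        }
        return lines
    }
}
```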

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The same function in TerminalJumpService was missed in fa60355 --
only TerminalTextSender was updated. Both copies now escape
newline, carriage return, and tab in addition to backslash and quote.
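
For reference, an escaping helper of this kind might look like the sketch below. `escapeForAppleScript` is a hypothetical name; the real functions in TerminalTextSender and TerminalJumpService may be structured differently. It walks Unicode scalars rather than Characters because Swift treats "\r\n" as a single Character, which a per-Character match would miss.

```swift
/// Escapes text for embedding inside an AppleScript double-quoted string literal.
func escapeForAppleScript(_ text: String) -> String {
    var out = ""
    for scalar in text.unicodeScalars {
        switch scalar {
        case "\\": out += "\\\\"
        case "\"": out += "\\\""
        case "\n": out += "\\n"
        case "\r": out += "\\r"
        case "\t": out += "\\t"
        default: out.unicodeScalars.append(scalar)
        }
    }
    return out
}
```

Keeping one shared helper instead of two copies would avoid the kind of miss this commit fixes.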

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… tests

- BridgeServer: guard against pending dict overwrite by a second client
  with the same sessionID (8 write sites now skip if key exists)
- SettingsView: "Check for Updates" now opens the fork GitHub releases
  page instead of calling Sparkle (which would pull upstream)
- Tests: 9 new security regression tests across 2 new suites:
  - BridgeServerSecurityTests: Cursor blocking hooks, TTL sweep timer,
    BridgeServer lifecycle
  - WatchHTTPEndpointSecurityTests: pairing code length, regeneration,
    token expiry, body size limit
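
The overwrite guard amounts to an insert-if-absent on each pending dictionary. A generic sketch, with `insertIfAbsent` as a hypothetical helper name:

```swift
/// Stores the value only if the key is unused; returns whether it was stored.
@discardableResult
func insertIfAbsent<Value>(_ value: Value, forKey key: String, into dict: inout [String: Value]) -> Bool {
    guard dict[key] == nil else { return false } // a first client already owns this sessionID
    dict[key] = value
    return true
}
```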

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- UpdateChecker.releasesURL now points to 8676311081/open-island
  (was Octane0411/open-vibe-island)
- Removed sweepTimerIsConfigured (hardcoded constants, always passes)
- Replaced httpBodySizeLimitIsEnforced with endpointIntegratesWithoutCrashing
  that tests actual start/stop lifecycle

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ClaudeUsageSnapshot.cachedAt and CodexUsageSnapshot.capturedAt
were already populated by the loaders but never reached the UI.
When Claude Code CLI is idle (e.g. user works only in Claude
Desktop), the statusline-fed cache freezes at the last snapshot,
making the island percentages diverge silently from the actual
account-level rate limits.

Pipe both timestamps into UsageWindowPresentation and apply
visible decay:
- < 5 min:   percentage at full opacity (fresh)
- 5–30 min:  percentage at 60% opacity (aging)
- > 30 min:  percentage at 35% opacity (stale)
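
The decay table above maps directly onto a small function. A sketch, with `usageOpacity` as a hypothetical name and the boundary values (exactly 5 and exactly 30 minutes) assigned to the "aging" band as an assumption:

```swift
import Foundation

/// Opacity for a usage percentage, given the age of its snapshot in seconds.
func usageOpacity(age: TimeInterval) -> Double {
    switch age {
    case ..<300: return 1.0   // fresh: under 5 minutes
    case ...1800: return 0.6  // aging: 5 to 30 minutes
    default: return 0.35      // stale: over 30 minutes
    }
}
```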

In .full layout, append a yellow "Nm ago" / "Nh ago" tag once
the data crosses the 5-minute threshold, with an accessibility
label so VoiceOver reads "Cached 6h ago".
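
The tag text could be produced along these lines. `ageTag` and `ageAccessibilityLabel` are hypothetical helpers, and rounding down to whole minutes or hours is an assumption.

```swift
import Foundation

/// Returns "Nm ago" or "Nh ago" once the snapshot is at least 5 minutes old, else nil.
func ageTag(age: TimeInterval) -> String? {
    guard age >= 300 else { return nil }
    let minutes = Int(age / 60)
    return minutes < 60 ? "\(minutes)m ago" : "\(minutes / 60)h ago"
}

/// VoiceOver label for the same tag, for example "Cached 6h ago".
func ageAccessibilityLabel(age: TimeInterval) -> String? {
    ageTag(age: age).map { "Cached \($0)" }
}
```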

Co-Authored-By: claude-flow <ruv@ruv.net>
After implementing stale-percentage opacity (634b16a), users still
have no way to learn *why* the number is stale. This adds a hover
tooltip on the "Nm ago" / "Nh ago" badge that explains the data
source and why Claude Desktop usage may differ.

Confirmed via two PTY experiments that Claude Code's statusline
hook can't be triggered from outside a real terminal (even with
TERM_PROGRAM=iTerm.app + ITERM_SESSION_ID + a real prompt over an
expect-managed PTY, the statusline script is never invoked). So
the statusline-fed cache is the upper bound of what we can refresh
within Claude Code's public surface; the tooltip makes that boundary
visible to the user.

Tooltip text:
- Claude windows: "Cached Nh ago. Claude usage updates only when
  Claude Code's interactive statusline receives fresh rate_limits.
  Claude Desktop and web usage may already be newer."
- Codex windows: "Cached Nh ago. Codex usage is read from local
  rollout files; numbers refresh on the next assistant turn."

Co-Authored-By: claude-flow <ruv@ruv.net>
Codify what we learned over the four-experiment investigation so the
next worker (human or AI) doesn't re-run the same dead ends.

The full record lives in docs/usage-freshness-investigation.md:
- statusline is the only public-surface writer of used_percentage
- claude -p / stream-json / hooks / debug-file / transcript: none
  contain equivalent fields
- PTY + faked iTerm env (TERM_PROGRAM, ITERM_SESSION_ID,
  COLORTERM, FORCE_HYPERLINK) + real prompt + 30 s wait does NOT
  trigger statusline — Claude Code's terminal check is deeper
- staleness UI (opacity decay + "Nh ago" tag + tooltip) is the
  ceiling under Claude Code's public surface

Real-time parity with Claude Desktop is possible only via a private
Anthropic endpoint (with Keychain OAuth, a feature flag, and fail-closed
behavior), which is explicitly scoped as a separate multi-day project.

A pointer comment at the top of ClaudeUsage.swift redirects future
readers to the doc before they try a hidden poller again.

Co-Authored-By: claude-flow <ruv@ruv.net>
Pulls Max-plan 5h/7d utilization directly from
https://claude.ai/api/organizations/<org-id>/usage so the island
panel matches Claude Desktop's Settings → Usage page even when the
user works only in Claude Desktop and Claude Code's interactive
statusline never fires.

Investigation closed in three independent passes (me + Codex + DeepSeek).
Empirical findings (see docs/usage-freshness-investigation.md):
- statusline is the only public surface that exposes
  used_percentage; -p / stream-json / hooks / debug-file all do not
- PTY-wrapped interactive claude does not trigger statusline
  either, even with TERM_PROGRAM=iTerm.app + a real prompt
- the web API endpoint Claude Desktop calls is reachable from any
  HTTPS client given the user's session cookie; schema
  (`utilization` field) was already supported by ClaudeUsageLoader
  as a fallback, so the cache file path needs zero translation

Implementation (all opt-in, default off, fail-closed):

OpenIslandCore:
- ClaudeWebUsageCookieStore: Keychain wrapper using
  kSecClassInternetPassword (per DeepSeek review), server="claude.ai",
  AfterFirstUnlockThisDeviceOnly. InMemory variant for tests.
- ClaudeWebUsageClient: URLSession-based, explicit error mapping
  (unauthorized / rateLimited(retryAfter:) / schemaMismatch /
  httpError / transportError / missingCookie), User-Agent set to
  "OpenIsland/1.0" (per DeepSeek review: don't fake a browser).
- ClaudeWebUsagePoller: 5-minute timer, auto-resolves org_id via
  /api/organizations on first run, only writes the cache file on
  success, fires onAuthFailure on 401/403, fires onSchemaDrift
  exactly once when consecutive failures cross 10 (~50 min). State
  is exposed as an Equatable struct + onStateChange callback.
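
The "fires exactly once" behavior of onSchemaDrift can be modeled with a small counter. This sketch assumes the callback fires on the 10th consecutive failure (10 polls at 5 minutes is about 50 minutes) and that a later success re-arms it; `DriftTracker` is a hypothetical type, not the poller's actual state.

```swift
struct DriftTracker {
    static let threshold = 10
    private(set) var consecutiveFailures = 0
    private var didFire = false

    init() {}

    /// Records one poll result. Returns true only on the call where the
    /// failure streak first reaches the threshold.
    mutating func record(success: Bool) -> Bool {
        if success {
            consecutiveFailures = 0
            didFire = false // assumption: a success re-arms the drift signal
            return false
        }
        consecutiveFailures += 1
        if consecutiveFailures >= DriftTracker.threshold && !didFire {
            didFire = true
            return true
        }
        return false
    }
}
```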

OpenIslandApp:
- AppModel: feature flag (UserDefaults), lazy poller, cookie save
  via setClaudeWebUsageCookie, refreshNow trigger, observable
  poller state.
- ClaudeWebUsageSection (new view): Settings → Setup → "Realtime
  Web Usage (experimental)" with toggle, cookie paste, auto-filled
  org_id, status badge (Active / No cookie / Expired / Drift), and
  a Refresh-now button. Includes inline help on how to copy the
  cookie from Chrome.
- IslandPanelView: orange dashed border around the usage row when
  the poller's drift threshold is crossed (visible signal that
  numbers may be stale).

Privacy:
- PRIVACY_POLICY.md updated in both EN and ZH to disclose the
  optional outbound HTTPS to claude.ai and the Keychain-stored
  cookie. Default-off; user-supplied cookie; nothing transmitted
  beyond claude.ai.

Tests: 13 new tests covering cookie store roundtrip + whitespace
rejection, client schema parsing, 200/401/403/429/schema-mismatch
mapping, missing-cookie short-circuit, poller success cache write,
unauthorized callback firing, schema-drift threshold firing exactly
once. Suite runs .serialized due to URLProtocol mock global state.
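
The status mapping these tests exercise might look roughly like this. The case names follow the commit message, but the associated-value types, the `mapStatus` function, and parsing Retry-After as plain seconds are assumptions.

```swift
import Foundation

// Subset of the error cases named above; payload types are assumed.
enum ClaudeWebUsageError: Error, Equatable {
    case unauthorized
    case rateLimited(retryAfter: TimeInterval?)
    case httpError(status: Int)
}

/// Maps an HTTP status to an error, or nil for a 2xx response.
func mapStatus(_ status: Int, retryAfterHeader: String?) -> ClaudeWebUsageError? {
    switch status {
    case 200..<300:
        return nil
    case 401, 403:
        return .unauthorized // triggers onAuthFailure in the poller
    case 429:
        return .rateLimited(retryAfter: retryAfterHeader.flatMap { TimeInterval($0) })
    default:
        return .httpError(status: status)
    }
}
```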

Total: 226 / 226 tests pass, swift build clean.

Co-Authored-By: claude-flow <ruv@ruv.net>