feat(dashboard): operator cockpit — Approvals/Calibration/Activity surfaces with gate + send execution#8

Merged
OriginalGary merged 12 commits into main from feat/cockpit-calibration-telemetry
May 8, 2026

@OriginalGary
Owner

Summary

Folds the GaryOS cockpit into Graze as Next.js routes under (operator)/, gated by OPERATOR_BUILD=true so the public OSS build 404s these paths. The three cockpit surfaces (Approvals, Calibration, Activity) read from the gary-ui sidecar via internal proxy routes.

  • Approvals (/approvals) — unified queue of Sam-gates and pending sends. Inline expansion (M-modal) shows full context + recommended answer + rationale. ⌘↵ multiplexes: submit() for gates, submitSend() for sends. Recommended option pre-selected; gate detail (sam-gate.md + stage outputs) loaded on demand from /api/gates/:id. Hash deep-link (#gate=<id>) with one-shot mount guard so the writer effect doesn't clobber the URL on initial load.
  • Calibration (/calibration) — match-rate hero + per-window stats, fed by /api/quality-metrics and /api/quality-events.
  • Activity (/activity) — decisions log + override / outputs feed.
  • FrequencyStrip (operator layout chrome) — gates-per-day sparkline + match-rate badge across all surfaces.

Operator-only API proxies under src/app/api/operator/:

  • decide/[id] — POST → gary-ui /api/decide/:id, derives learning_tag from answer vs recommended.
  • send/[id] — POST → gary-ui /api/send/:id.
  • gate-detail/[id] — GET → gary-ui /api/gates/:id.
  • quality-events — POST → gary-ui /api/quality-events.

Authz: /api/operator/* short-circuits to PUBLIC in the proxy classifier when OPERATOR_BUILD=true. Browser fetches carry no credentials; the gary-ui bearer token is read server-side from ~/.gary/gary-ui-token and never reaches the client.

Test plan

  • Typecheck (npx tsc --noEmit -p tsconfig.typecheck-core.json) clean across new files.
  • ESLint clean on src/app/(operator), src/lib/gary-ui, src/app/api/operator.
  • End-to-end gate flip — OPERATOR_BUILD=true npm run dev, navigate /approvals, select a Sam-gate, ⌘↵ → kernel actions/<id>/index.md awaiting_sam.answer flips, history/decisions.jsonl audit row written. Verified against 27-p0-autonomous-monitoring (answer a, recommended_match=true, learning_tag=confirmed_recommendation).
  • Send proxy reachable — POST /api/operator/send/<nonexistent> returns gary-ui's 404 {"error":"No pending-send.yaml for: …"} envelope, proving the full proxy → auth → handler chain.
  • Send execution end-to-end — gated on a real pending-send.yaml appearing in the kernel (none currently). Will run when one lands.
  • Hash deep-link doesn't get clobbered on mount (one-shot ref guard).
  • Public OSS build (OPERATOR_BUILD unset) 404s every operator route — layout + each /api/operator/* route check.

Companion change (not in this PR)

OpenGaryBot/garyos@729dde6 on main — fix(docker): mount workspace rw so gary-ui can write gate answers back to kernel. The previous read_only: true on the gary-ui workspace bind made gate-answer writes EROFS-fail; the cockpit cannot function until that's deployed.

🤖 Generated with Claude Code

samtuckerdavis and others added 10 commits May 8, 2026 06:27
First step toward folding the GaryOS cockpit into Graze. Adds an (operator)
route group gated by OPERATOR_BUILD env so the public OSS build 404s these
paths, and an /approvals landing page that reads gates+sends from a running
gary-ui sidecar. Read-only — write paths and Tailwind pass land next.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pprovals

Replaces inline styles with graze's Tailwind v4 + shared component library
(Button/Card/Badge/EmptyState). Adds the gate-answer write path: an
internal /api/operator/decide/:id proxy that forwards to gary-ui's
POST /api/decide/:id with the operator's bearer token kept server-side.
Approvals page becomes a two-pane layout — queue left, inline expansion
right (M-modal per cockpit spec, not a modal route). Recommended option
pre-selected; window-level keydown listener for ⌘↵ submit.

Sends still render but write-execute is pass 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…trip

Adds the foundation the cockpit surfaces hang off:
  - OperatorNav: tab nav across the four operator surfaces
  - FrequencyStrip: server-rendered top-of-page card showing 30-day
    gates-per-day sparkline + recommended-match aggregate ratio. Fails
    silently to placeholder copy when /api/metrics returns empty (fresh
    install) or /api/decisions errors. Threshold-crossing alert banner
    surfaces when /api/metrics carries a recent crossing.
  - Sparkline: dependency-free SVG bar chart, also exported as a
    server-side helper (buildDailyBuckets) so other server components
    can reuse the daily-bucket logic.
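
The daily-bucket logic mentioned above might look roughly like the sketch below. Only the name buildDailyBuckets comes from the commit; the signature, the record shape, and the choice of UTC day boundaries are assumptions made for illustration.

```typescript
// Hypothetical record shape; the real decision rows likely carry more fields.
interface TimestampedRecord {
  timestamp: string; // ISO-8601 string (assumed)
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Counts records per UTC day over the `days` days ending at `now`.
// Index 0 is the oldest day; the last index is the day containing `now`.
function buildDailyBuckets(
  records: TimestampedRecord[],
  days: number,
  now: number,
): number[] {
  const buckets = new Array<number>(days).fill(0);
  const todayStart = Math.floor(now / DAY_MS) * DAY_MS;
  const windowStart = todayStart - (days - 1) * DAY_MS;
  for (const record of records) {
    const t = Date.parse(record.timestamp);
    // Skip unparseable timestamps and anything outside the window.
    if (Number.isNaN(t) || t < windowStart || t > now) continue;
    const index = Math.floor((t - windowStart) / DAY_MS);
    buckets[index] = (buckets[index] ?? 0) + 1;
  }
  return buckets;
}
```

Passing `now` in as a parameter keeps the helper pure, so a server component can read the clock once and reuse the value.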

Extends src/lib/gary-ui/client.ts with typed clients for /api/decisions,
/api/metrics, /api/quality-metrics, /api/gates/:id, /api/outputs, plus
a POST helper for /api/quality-events.

Patches /api/operator/decide to compute and forward learning_tag on
every gate answer (confirmed_recommendation / operator_override /
no_recommendation), so the calibration corpus carries a reason label
on every record. Manual learning_tag from the client wins when
present.
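
A minimal sketch of the derivation described above. The three tag names come from this commit; the function name, the parameter shapes, and the use of strict string equality are assumptions.

```typescript
type LearningTag =
  | "confirmed_recommendation"
  | "operator_override"
  | "no_recommendation";

// Hypothetical helper: the real route may normalise values before comparing.
function deriveLearningTag(
  answer: string,
  recommended?: string | null,
  manualTag?: LearningTag,
): LearningTag {
  // A tag supplied by the client takes precedence over the derived one.
  if (manualTag) return manualTag;
  // No recommendation on the gate means there is nothing to compare against.
  if (recommended == null || recommended === "") return "no_recommendation";
  return answer === recommended
    ? "confirmed_recommendation"
    : "operator_override";
}
```

For example, `deriveLearningTag("a", "a")` yields `confirmed_recommendation`, while `deriveLearningTag("b", "a")` yields `operator_override`.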

Adds /api/operator/quality-events POST proxy so client components can
record override / sample_flag / customer_correction events.
…tats

The M-modal cockpit surface. Single huge match-rate number leads, with
the per-window aggregate ('X of Y aligned with Gary'). Below that:
decisions-per-day sparkline with a week-over-week trend label, then a
row of supporting stats (decisions in window, marked overrides,
auto-approved %, customer corrections).

Window selector is server-side via URL search param (?window=7|30|90).
Reload-safe, deep-linkable, no client state to invalidate. The selector
is the only client component on the page; everything else is server-
rendered from /api/decisions and /api/quality-metrics.
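
Parsing the window search param on the server could be sketched as follows. The allowed values 7, 30, and 90 come from the commit; the helper name and the fallback of 30 days are assumptions.

```typescript
const WINDOWS = [7, 30, 90] as const;
type WindowDays = (typeof WINDOWS)[number];

// Assumed default; the commit does not say which window is used when the
// param is missing or invalid.
const DEFAULT_WINDOW: WindowDays = 30;

// Next.js search params may arrive as a string, an array of strings, or
// undefined, so all three cases are handled.
function parseWindow(raw: string | string[] | undefined): WindowDays {
  const value = Array.isArray(raw) ? raw[0] : raw;
  const parsed = Number(value);
  return (WINDOWS as readonly number[]).includes(parsed)
    ? (parsed as WindowDays)
    : DEFAULT_WINDOW;
}
```

Because the result is derived from the URL on every request, a reload or a shared link reproduces the same view without any client state.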

Renders gracefully empty when /api/quality-metrics returns nulls (fresh
install) or when no decisions in the window have a recommendation
(match rate shows '—' with copy explaining the loop).
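
The empty-state rule above can be illustrated with a small sketch. The record shape and function names are hypothetical; the only behaviour taken from the commit is that the rate is undefined, and rendered as a dash, when no decision in the window carries a recommendation.

```typescript
// Hypothetical minimal shape of a decision row.
interface DecisionRecord {
  answer: string;
  recommended?: string | null;
}

interface MatchRate {
  aligned: number; // decisions whose answer equals the recommendation
  total: number; // decisions that had a recommendation at all
  rate: number | null; // null when total is 0
}

function computeMatchRate(decisions: DecisionRecord[]): MatchRate {
  let aligned = 0;
  let total = 0;
  for (const d of decisions) {
    if (d.recommended == null) continue; // not part of the denominator
    total += 1;
    if (d.answer === d.recommended) aligned += 1;
  }
  return { aligned, total, rate: total === 0 ? null : aligned / total };
}

const PLACEHOLDER = "\u2014"; // the dash shown when there is no rate

function formatMatchRate(rate: number | null): string {
  return rate === null ? PLACEHOLDER : `${Math.round(rate * 100)}%`;
}
```

Keeping `aligned` and `total` alongside the rate makes the 'X of Y aligned with Gary' line available from the same computation.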

Pulls from existing /api/decisions for the per-day decision sparkline
since /api/quality-metrics doesn't expose daily breakdown.
…s feed

The Activity surface. Two-column on wide viewports, stacked on narrow:
left = recent decisions, right = stage outputs.

Each decision row carries the full enrichment the gap analysis flagged:
  - title + slug
  - answer alongside recommended (when divergent) with 'rec' label
  - aligned/overrode badge, playbook, risk pill (low/medium/high/
    irreversible), outcome (auto-approved / executed / deferred),
    sensitivity tier (Tier-2/3 only — Tier-1 stays uncluttered)
  - Override button with optimistic update; marked state cached in
    localStorage so the button stays disabled across page reloads

Override clicks hit POST /api/operator/quality-events which proxies to
gary-ui's /api/quality-events with event_kind: 'override' (the closest
backend kind to 'I would now decide differently'). Backend has no
'good_decision' kind, so positive signal is read off the existing
aligned indicator on each row.

Stage outputs feed shows the latest 30 autonomous outputs with a
collapsible content viewer per row.
The proxy.ts middleware runs runAuthzPipeline over /api/* and routes any
non-v1, non-public path into the MANAGEMENT class, which 401s without a
management bearer. The cockpit's /api/operator/* proxies need to be
reachable from the browser without a graze management token because the
upstream auth (against gary-ui) happens server-side using a token read
from disk; the request from the browser carries no credentials.

OPERATOR_BUILD is the existing gate that 404s the operator route group
in public OSS builds; piggy-backing on it here keeps the public exposure
matched to the build mode.

Discovered when Ctrl+Enter from /approvals returned 401 from the proxy
before reaching the route handler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
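
The classification change described in the commit above might be sketched like this. It is a simplified model, not the actual proxy.ts code: the function names are hypothetical, and the existing classifier is represented by a `fallback` callback that defaults to MANAGEMENT.

```typescript
type RouteClass = "PUBLIC" | "MANAGEMENT";

const OPERATOR_PREFIX = "/api/operator";

// Matches /api/operator and /api/operator/..., but not /api/operatorial.
function isOperatorPath(pathname: string): boolean {
  return (
    pathname === OPERATOR_PREFIX || pathname.startsWith(OPERATOR_PREFIX + "/")
  );
}

function classifyRoute(
  pathname: string,
  env: Record<string, string | undefined>,
  fallback: (pathname: string) => RouteClass = () => "MANAGEMENT",
): RouteClass {
  // Fail closed: only the exact string "true" opens the operator proxies.
  if (isOperatorPath(pathname) && env.OPERATOR_BUILD === "true") {
    return "PUBLIC";
  }
  // Everything else keeps whatever the existing classifier decides.
  return fallback(pathname);
}
```

Comparing against the literal `"true"` means values such as `"1"` or `"TRUE"` leave the routes in the MANAGEMENT class.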
ApprovalsClient now auto-fetches /api/operator/gate-detail/[id] when a
gate row is selected. The detail panel below the decide controls
surfaces the sam-gate decision card markdown and stage outputs the
inline /api/gates summary doesn't carry. Decide buttons stay on the
main card — one canonical decision surface per gate.

URL hash sync: /approvals#gate=<action-id> auto-selects that row on
load (deep-link from Slack/email). Selecting a row updates the hash
via replaceState (no history pollution).

Submit posts now include the gate's recommended value so the server
can derive learning_tag without re-fetching the gate detail.

Adds /api/operator/gate-detail/[id] proxy to gary-ui's /api/gates/:id.
Stale-request guard via incrementing request id keeps a slow earlier
fetch from clobbering a newer selection.
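
The stale-request guard can be sketched without React as below. The class and function names are hypothetical; the technique shown is the incrementing request id the commit describes.

```typescript
// Each new request takes the next id; only the newest id is "current".
class RequestGuard {
  private latest = 0;

  begin(): number {
    this.latest += 1;
    return this.latest;
  }

  isCurrent(requestId: number): boolean {
    return requestId === this.latest;
  }
}

// Hypothetical loader: `fetchDetail` and `apply` stand in for the real
// fetch call and state setter.
async function loadDetail<T>(
  guard: RequestGuard,
  gateId: string,
  fetchDetail: (id: string) => Promise<T>,
  apply: (detail: T) => void,
): Promise<void> {
  const requestId = guard.begin();
  const detail = await fetchDetail(gateId);
  // A newer selection started while this fetch was in flight: drop it.
  if (guard.isCurrent(requestId)) apply(detail);
}
```

If the operator selects gate A and then gate B, the response for A is discarded even when it arrives last.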
The hash deep-link reader and writer effects raced on initial mount: the
reader read the URL hash and scheduled setSelectedId(<hash-id>), but the
writer also fired on mount with the initial selectedId (rows[0].id) and
overwrote the URL hash before the reader's state change could commit.

A useRef guard skips the writer's first run so the reader effect is the
sole source of truth on initial load. The writer fires only on
user-initiated selection changes after that.

Surfaced when an end-to-end test navigated to /approvals#gate=102-... and
the keystroke landed on rows[0] (action 27) instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
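
A framework-free sketch of the one-shot guard from the commit above. In the component the flag lives in a useRef; here a closure variable plays that role. All names are hypothetical, and `write` stands in for the replaceState call.

```typescript
const GATE_HASH_PREFIX = "#gate=";

// Reader side: extracts the action id from a hash such as "#gate=<id>".
function parseGateHash(hash: string): string | null {
  if (!hash.startsWith(GATE_HASH_PREFIX)) return null;
  const id = hash.slice(GATE_HASH_PREFIX.length);
  return id === "" ? null : id;
}

// Writer side: ignores its first invocation so that the initial selection
// (rows[0]) cannot overwrite a deep-link hash before the reader applies it.
function createSkipFirstWriter(
  write: (hash: string) => void,
): (selectedId: string | null) => void {
  let isFirstRun = true;
  return (selectedId) => {
    if (isFirstRun) {
      isFirstRun = false;
      return;
    }
    if (selectedId !== null) write(GATE_HASH_PREFIX + selectedId);
  };
}
```

The first call corresponds to the effect firing on mount; every later call corresponds to a selection change and updates the hash.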
Wires the operator send-execute path: client.send(actionId), an internal
/api/operator/send/[id] proxy, and a Send-now button on the send card.
⌘↵ now multiplexes — calls submit() when a gate is selected, submitSend()
when a send is selected. Send button is disabled when the pending-send's
approved_at is null (gate decision still required upstream).

Also fixes the send-row data shape: pass 1 mapped s.action_id/s.channel
(undefined fields), but gary-ui actually returns {id, pendingSend: {...}}.
The page now extracts channel/tool/to/subject/preview/queued_at/approved_at
from pendingSend.arguments correctly.
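
The corrected mapping might look like the sketch below. The field names come from the commit; the types, the optionality of each field, and the `canSend` flag are assumptions, and the real gary-ui payload may carry more than is modelled here.

```typescript
// Assumed shape of the relevant part of a gary-ui send entry.
interface PendingSendArguments {
  channel?: string;
  tool?: string;
  to?: string;
  subject?: string;
  preview?: string;
  queued_at?: string;
  approved_at?: string | null;
}

interface GaryUiSend {
  id: string;
  pendingSend: { arguments: PendingSendArguments };
}

interface SendRow {
  id: string;
  channel: string | null;
  tool: string | null;
  to: string | null;
  subject: string | null;
  preview: string | null;
  queuedAt: string | null;
  approvedAt: string | null;
  canSend: boolean; // false while the upstream gate decision is pending
}

function toSendRow(send: GaryUiSend): SendRow {
  const args = send.pendingSend.arguments;
  const approvedAt = args.approved_at ?? null;
  return {
    id: send.id,
    channel: args.channel ?? null,
    tool: args.tool ?? null,
    to: args.to ?? null,
    subject: args.subject ?? null,
    preview: args.preview ?? null,
    queuedAt: args.queued_at ?? null,
    approvedAt,
    canSend: approvedAt !== null,
  };
}
```

Deriving `canSend` in the mapper keeps the Send-now button and the ⌘↵ handler reading from the same flag.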

Verified wiring: POST /api/operator/send/<nonexistent> → 404 with
gary-ui's {"error":"No pending-send.yaml for: ..."} envelope, proving
the full proxy → auth → handler chain. End-to-end execution test gated
on a real pending-send.yaml appearing in the kernel — none currently.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds operator-classification.test.ts covering the OPERATOR_BUILD-gated
PUBLIC classification of /api/operator/* paths:
- PUBLIC when OPERATOR_BUILD=true (decide, send, gate-detail, quality-events)
- MANAGEMENT when env unset, =false, or any non-'true' value (fail-closed)
- non-operator routes unaffected by the env flag
- prefix-boundary check (/api/operator vs /api/operatorial)

Also runs `npm audit fix` to patch three pre-existing moderate-severity
transitive vulnerabilities (hono, ip-address, express-rate-limit) that
were blocking CI/Lint. Lockfile-only — no package.json changes, no
breaking version bumps.

Satisfies CI/PR Test Policy (was failing because production-code changes
shipped without test additions) and CI/Lint (npm audit:deps).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented May 8, 2026

CI Coverage Report

  • Coverage job: success
  • PR test policy: success

Coverage artifact was not available for this run.

samtuckerdavis and others added 2 commits May 8, 2026 13:22
- FrequencyStrip + calibration: hoist Date.now() with an eslint-disable
  for react-hooks/purity. The rule fires on async server components, but
  Date.now() runs at request time on the server, not during a client
  render — there's no purity concern. Hoisting once also avoids reading
  the clock per filter iteration.
- activity + calibration: escape "you've" / "Gary's" with `&apos;` per
  react/no-unescaped-entities.

CI/Lint flagged these on the previous push. No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Graze's t06 check requires every route doing request.json() to validate
via validateBody() or .safeParse(). Both /api/operator/decide/[id] and
/api/operator/quality-events were doing typeof-narrowing instead — same
runtime guarantees, but the regex-based static check doesn't see them.

Replaces the manual narrowing with z.safeParse against a small schema
per route. Decide accepts {answer, learning_tag?, recommended?};
quality-events accepts {action_id, event_kind: enum, instance_id?}.
Behaviour and error envelopes preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@OriginalGary OriginalGary merged commit cdc25d5 into main May 8, 2026
63 checks passed
@OriginalGary OriginalGary deleted the feat/cockpit-calibration-telemetry branch May 8, 2026 03:38
2 participants