
feat: per-card agent execution with LangGraph #86

Open
siracusa5 wants to merge 39 commits into main from feat/card-agent

Conversation

@siracusa5
Collaborator

Summary

  • New packages/agent-server/ — standalone Node.js process (port 4801) running a LangGraph JS StateGraph with 5 phases: spec → planning → coding → qa_review → qa_fixing
  • Per-card live log streaming in Aperant-style structured event view (timestamps, agent thoughts, color-coded full-width tool blocks with collapsible output)
  • Phase-grouped subtask list auto-generated during the planning phase
  • Steering input (inject messages into running agent), Take Over (handoff), Pause, Stop controls
  • SQLite checkpointing (~/.harness/board/agent-checkpoints.sqlite) — agent state survives app restarts
  • Board-aware context via @langchain/mcp-adapters → board-server MCP at port 4800
  • Tauri launchd service management for agent-server (agent_server.rs, mirrors board_server.rs)
  • Auth: ANTHROPIC_API_KEY env var → Claude Code OAuth keychain fallback
  • AgentExecutionBadge on kanban cards showing animated phase + progress bar
  • TaskDetailDialog gains 5 tabs: Overview | Subtasks | Logs | Files | Diff — CSS ported from approved mock
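
The five-phase flow listed above can be sketched without any LangGraph dependency. This is a minimal illustration, not the graph in packages/agent-server: the phase names come from the summary, while `nextPhase`, `phaseProgress`, and the rule that qa_fixing loops back to qa_review are assumptions made for the example.

```typescript
// Phase names are taken from the PR summary; everything else is illustrative.
const PHASES = ["spec", "planning", "coding", "qa_review", "qa_fixing"] as const;
type Phase = (typeof PHASES)[number];

// Assumption: a failed QA review routes to qa_fixing, which then loops
// back to qa_review. The PR text does not spell out this edge.
function nextPhase(current: Phase, qaPassed: boolean): Phase | "done" {
  switch (current) {
    case "spec":
      return "planning";
    case "planning":
      return "coding";
    case "coding":
      return "qa_review";
    case "qa_review":
      return qaPassed ? "done" : "qa_fixing";
    case "qa_fixing":
      return "qa_review";
  }
}

// Hypothetical progress formula: fraction of phases reached so far.
function phaseProgress(current: Phase): number {
  return (PHASES.indexOf(current) + 1) / PHASES.length;
}
```

A badge such as AgentExecutionBadge could feed a value like `phaseProgress("coding")` into its progress bar, though the actual calculation in the PR may differ.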

Architecture

apps/desktop          → agent-api.ts (HTTP+WS client for :4801)
                        useAgentEvents (WS event stream)
                        AgentExecutionBadge, TaskDetailDialog tabs (ported from mock)
packages/agent-server → LangGraph StateGraph, tools, HTTP routes, WS server
packages/board-server → unchanged (MCP tools consumed by agent-server)
apps/desktop/src-tauri → agent_server.rs (launchd plist management)

Start/stop routing

  • Harness claude → routes through agentApi.start() (LangGraph agent)
  • All other harnesses → keeps existing PTY-based useTaskExecution path (legacy)
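
The routing rule above amounts to a single branch on the harness id. A minimal sketch, assuming a hypothetical `Starters` pair of callbacks standing in for agentApi.start() and the legacy PTY path (neither signature is taken from the PR):

```typescript
// Hypothetical callbacks: startAgent would wrap agentApi.start(),
// startPty would wrap the existing useTaskExecution path.
interface Starters {
  startAgent: (taskId: string) => string;
  startPty: (taskId: string) => string;
}

// Only the "claude" harness id is taken from the PR description.
function startTask(harness: string, taskId: string, starters: Starters): string {
  return harness === "claude"
    ? starters.startAgent(taskId)
    : starters.startPty(taskId);
}
```

Keeping the branch in one helper means any harness other than claude keeps its current behavior.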

Test plan

  • Start agent-server: cd packages/agent-server && ANTHROPIC_API_KEY=<key> pnpm dev
  • Open board, click a task, click Start — logs tab streams structured events
  • Subtasks tab populates after planning phase completes
  • Overview tab shows phase stepper and steering textarea
  • Stop button terminates the agent gracefully
  • Restart desktop app — agent resumes from last checkpoint (SQLite)
  • Kanban card shows AgentExecutionBadge with correct phase when running

🤖 Generated with Claude Code

siracusa5 and others added 30 commits April 7, 2026 00:13
… tool approval panel

Replaces the hardcoded --allowedTools flag with a user-configurable
permission mode system. Three modes: Skip All (default,
--dangerously-skip-permissions), Auto (--permission-mode auto, for paid
plans), and Allowed Tools (--allowedTools with per-tool checklist).

Security → Permissions now leads with a mode selector section featuring
icon cards, flag badges, and a clear selected state. An allowed-tools
checklist and per-harness overrides are available below. A one-time
first-run modal surfaces before the first task execution. A slide-in
ToolApprovalPanel lets users manage the allowed tools list mid-session
from within the terminals view.

All preferences persist in localStorage under harness-kit- keys.
resetPermissionDefaults() clears tool list, ack flag, and overrides.

638/638 tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…llback

getAllowedTools now filters stored entries against /^[A-Za-z]+(\([^)]+\))?$/
so arbitrary localStorage strings cannot reach the Claude CLI flag.

Empty allowed-tools list no longer falls back to --dangerously-skip-permissions;
Claude's default prompting behavior (no permission flag) is now used instead,
which matches user expectation that an empty list means "prompt for everything."
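
A sketch of how the filter and the empty-list rule described in this commit might fit together. It assumes the pattern is meant to read `^[A-Za-z]+(\([^)]+\))?$`. The mode ids, both function names, and the comma-joined list format are illustrative assumptions; only the three CLI flags are taken from the commit messages.

```typescript
// Mode ids are hypothetical labels for Skip All / Auto / Allowed Tools.
type PermissionMode = "skip-all" | "auto" | "allowed-tools";

// A bare tool name, optionally followed by one parenthesised scope.
const TOOL_ENTRY = /^[A-Za-z]+(\([^)]+\))?$/;

// Drop anything in storage that is not a well-formed tool entry, so
// arbitrary strings never reach the CLI arguments.
function filterAllowedTools(stored: unknown[]): string[] {
  return stored.filter(
    (entry): entry is string => typeof entry === "string" && TOOL_ENTRY.test(entry),
  );
}

function permissionArgs(mode: PermissionMode, storedTools: unknown[]): string[] {
  switch (mode) {
    case "skip-all":
      return ["--dangerously-skip-permissions"];
    case "auto":
      return ["--permission-mode", "auto"];
    case "allowed-tools": {
      const tools = filterAllowedTools(storedTools);
      // Empty list: pass no permission flag, so the CLI prompts by default.
      // Assumption: the tools are joined with commas into a single argument.
      return tools.length === 0 ? [] : ["--allowedTools", tools.join(",")];
    }
  }
}
```

With this shape, an entry such as `Bash(git status)` passes the filter, while a stored string that starts with `--` is discarded before any flag is built.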

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…stance

- tauri.ts: export detectClaudeAccount + ClaudeAccountInfo interface
- tool-names.ts: add scopeHint/scopeLabel fields for per-tool scope input hints
- PermissionsPage.test.tsx: add window.__TAURI_INTERNALS__ setup/teardown
- vite.config.ts: ignore .auto-claude/** to prevent Vite watching worktree artifacts
- Cargo.toml + Cargo.lock: add tauri-plugin-single-instance dependency
- settings.rs: detect_claude_account command (parses claude auth status)
- lib.rs: register single-instance plugin and detect_claude_account command

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mirrors board_server.rs for the agent-server (port 4801). Registers
AgentServerState + all four commands (check_installed, install, start,
restart) in lib.rs setup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds optional phase/thread_id to TaskExecution and optional phase to
Subtask, enabling agent-driven execution tracking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Typed HTTP + WebSocket client for port 4801. Defines AgentEvent
discriminated union inline to avoid cross-package import complexity.
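
For readers unfamiliar with the pattern, an inline discriminated union for the event stream might look like the sketch below. The `type` tags agent_thought, agent_tool, agent_subtask, and agent_done appear elsewhere in this PR; every payload field and both helper functions are assumptions for illustration.

```typescript
// Payload fields are hypothetical; only the `type` tags come from the PR.
type AgentEvent =
  | { type: "agent_thought"; taskId: string; text: string }
  | { type: "agent_tool"; taskId: string; tool: string; status: "start" | "done" | "error" }
  | { type: "agent_subtask"; taskId: string; title: string; phase: string }
  | { type: "agent_done"; taskId: string };

const KNOWN_TYPES = ["agent_thought", "agent_tool", "agent_subtask", "agent_done"];

// Narrowing on `type` gives each branch access to its own fields.
function describeEvent(event: AgentEvent): string {
  switch (event.type) {
    case "agent_thought":
      return `thought: ${event.text}`;
    case "agent_tool":
      return `tool ${event.tool} (${event.status})`;
    case "agent_subtask":
      return `subtask [${event.phase}] ${event.title}`;
    case "agent_done":
      return "done";
  }
}

// Parse one WebSocket text frame. This checks only the tag, not the
// payload fields, so it is a sketch rather than full validation.
function parseAgentEvent(raw: string): AgentEvent | null {
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value !== "object" || value === null) return null;
    const tag = (value as { type?: unknown }).type;
    return typeof tag === "string" && KNOWN_TYPES.indexOf(tag) !== -1
      ? (value as AgentEvent)
      : null;
  } catch {
    return null;
  }
}
```

Defining the union next to the client avoids a cross-package import, at the cost of keeping two copies of the event shape in sync by hand.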

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Subscribes to AgentEvent WebSocket stream, tracks phase/progress/running
state, and accumulates all events for log rendering.
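
The state tracking described here can be pictured as a pure reducer folded over the stream. A minimal sketch: the `StreamEvent` fields, the `agent_error` tag, and the rule that any non-terminal event means "running" are assumptions, not the hook's actual logic.

```typescript
// Hypothetical event shape: optional phase/progress carried on any event.
interface StreamEvent {
  type: string;
  phase?: string;
  progress?: number;
}

interface AgentEventsState {
  phase: string | null;
  progress: number;
  isRunning: boolean;
  events: StreamEvent[];
}

const initialAgentState: AgentEventsState = {
  phase: null,
  progress: 0,
  isRunning: false,
  events: [],
};

// Assumption: these two tags end a run; "agent_error" is a guessed name.
const TERMINAL_TYPES = ["agent_done", "agent_error"];

function reduceAgentEvent(state: AgentEventsState, event: StreamEvent): AgentEventsState {
  return {
    phase: event.phase ?? state.phase,
    progress: event.progress ?? state.progress,
    // Running is derived from events, not from the act of subscribing.
    isRunning: TERMINAL_TYPES.indexOf(event.type) === -1,
    // Every event is kept so the Logs tab can render the full history.
    events: [...state.events, event],
  };
}
```

In a React hook, a reducer like this could be passed to `useReducer` and dispatched from the WebSocket `onmessage` handler.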

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Ports .mini-badge, .phase-dot.anim, .phase-label, .mini-progress styles
from the mock. Adds agent-pulse keyframe to app.css. Wires badge into
TaskCard when task.execution.phase is set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds 'diff' tab, imports agent tab components, wires useAgentEvents,
adds phase/thread_id to TaskExecution and phase to Subtask in board-api.
Agent tasks (execution.thread_id set) get agent-specific tab views.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Full-width tool blocks with 2.5px colored left border, expandable output
with line numbers, agent thought bubbles, auto-scroll to bottom.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Phase-grouped subtask view (Planning / Coding / QA) with done/active/pending
status icons, ported from mock .subtasks-body CSS.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fix TS errors (implicit any reducers, stream await, index type),
rename graph nodes to avoid collision with state field names,
update LangChain dependency versions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…trols

Phase stepper with connector lines, current subtask spinner, steering
textarea with Cmd+Enter send, Take Over / Pause / Stop controls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
siracusa5 and others added 4 commits April 7, 2026 01:12
FilesTab derives modified files from agent_tool write/edit events.
DiffTab renders output lines with +/- color coding and line numbers.
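
The two derivations in this commit can be sketched as pure functions. The tool names `write_file` and `edit_file`, the `path` field, and both function names are assumptions. Treating `+++` and `---` file headers as context lines is also a choice made for this example, not something the commit states.

```typescript
type DiffLineKind = "added" | "removed" | "context";

interface DiffLine {
  lineNumber: number;
  kind: DiffLineKind;
  text: string;
}

function lineKind(text: string): DiffLineKind {
  // File headers ("+++ b/x", "--- a/x") are treated as context here.
  if (text.startsWith("+") && !text.startsWith("+++")) return "added";
  if (text.startsWith("-") && !text.startsWith("---")) return "removed";
  return "context";
}

// Tag each output line so the view can pick a color and show a line number.
function classifyDiff(output: string): DiffLine[] {
  return output.split("\n").map((text, index) => ({
    lineNumber: index + 1,
    kind: lineKind(text),
    text,
  }));
}

// Hypothetical event shape for agent_tool events that touch a file.
interface ToolEvent {
  type: "agent_tool";
  tool: string;
  path?: string;
}

// Collect each written or edited path once, in first-seen order.
function modifiedFiles(events: ToolEvent[]): string[] {
  const seen: string[] = [];
  for (const event of events) {
    const writes = event.tool === "write_file" || event.tool === "edit_file";
    if (writes && event.path !== undefined && seen.indexOf(event.path) === -1) {
      seen.push(event.path);
    }
  }
  return seen;
}
```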

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Route agentApi.start() for claude harness; keep PTY fallback for others
- Route agentApi.stop() when stopping agent-managed tasks
- Set thread_id + phase on task execution record on start
- Expand isAgentTask to cover tasks currently running with claude harness
- Add packages/agent-server/dist/ to .gitignore, remove tracked dist files
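
The expanded isAgentTask check might reduce to something like the following. The `harness` and `status` fields on the execution record are assumptions; only `thread_id` and `phase` are named in this PR.

```typescript
// Hypothetical shape of a task's execution record.
interface TaskExecution {
  harness: string;
  status: "running" | "paused" | "completed" | "failed";
  phase?: string;
  thread_id?: string;
}

function isAgentTask(execution: TaskExecution | undefined): boolean {
  if (!execution) return false;
  // A thread_id means the agent-server already owns this task.
  if (execution.thread_id) return true;
  // Expanded rule: a claude task that is running right now also counts,
  // even before its thread_id has been written back.
  return execution.harness === "claude" && execution.status === "running";
}
```

The second rule would cover the short window between clicking Start and the execution record being updated with its thread_id.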

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- coding.ts: emit agent_tool events (start/done/error) for each tool call;
  move the dynamic ToolMessage import to top level; emit agent_thought on prose
- planning.ts: emit agent_subtask events after each subtask created;
  strip markdown fences from LLM JSON output before parse
- runner.ts: update board task execution status to completed/failed on agent_done/error
- http.ts: restrict CORS from wildcard to tauri://localhost only;
  add Zod schema validation on task body in POST /start
- checkpointer.ts: document that setup() is called lazily by SqliteSaver
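
Stripping markdown fences before JSON.parse, as planning.ts is described as doing, can be sketched as below. The function names are hypothetical and the handling of malformed input is a guess. The fence marker is built with `repeat` only so that this example does not contain a literal fence of its own.

```typescript
// Three backticks, built indirectly to keep this example fence-safe.
const FENCE = "`".repeat(3);

// Remove one surrounding markdown fence (with an optional language tag)
// from an LLM reply, leaving unfenced replies untouched.
function stripJsonFences(raw: string): string {
  const text = raw.trim();
  if (!text.startsWith(FENCE)) return text;
  const firstNewline = text.indexOf("\n");
  if (firstNewline === -1) return text; // malformed: opening fence only
  let body = text.slice(firstNewline + 1);
  if (body.endsWith(FENCE)) body = body.slice(0, -FENCE.length);
  return body.trim();
}

// Hypothetical wrapper: parse the planner's JSON reply after cleanup.
function parsePlan(raw: string): unknown {
  return JSON.parse(stripJsonFences(raw));
}
```

Models often wrap JSON in a fenced block even when asked not to, so cleaning the reply first avoids a parse failure on otherwise valid output.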

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment on lines +80 to +84
return fetch(`http://localhost:${port}/api/v1/projects/${slug}/tasks/${taskId}/execution`, {
  method: 'PATCH',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ status, finished_at: new Date().toISOString() }),
});
siracusa5 and others added 4 commits April 7, 2026 01:34
- http.ts: add POST /pause and POST /resume routes
- runner.ts: pauseAgent() (abort with checkpoint preserved), resumeAgent() (resume from checkpoint via null input to stream())
- agent-api.ts: expose pause() and resume() client methods
- AgentExecutionBadge: self-subscribe to WS via optional taskId prop for live progress
- TaskCard: pass taskId to badge when task has a thread_id
- OverviewTab: sending state + inline error on steer; Pause/Resume toggle wired to real API; onTaskUpdated propagated
- board-api.ts, board-server/types.ts: add 'paused' to ExecutionStatus union
- auth.ts: document why OAuth token is passed as apiKey
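
The Pause/Resume toggle and the new 'paused' status can be pictured as a small transition function. This is a sketch under assumptions: the union members other than 'paused', 'completed', and 'failed' are guessed, and `applyControl` and `toggleLabel` are hypothetical names.

```typescript
// 'paused' is added by this commit; 'running' is an assumed member.
type ExecutionStatus = "running" | "paused" | "completed" | "failed";
type Control = "pause" | "resume";

// Pause only applies to a running task and resume only to a paused one;
// any other combination leaves the status unchanged.
function applyControl(status: ExecutionStatus, control: Control): ExecutionStatus {
  if (control === "pause" && status === "running") return "paused";
  if (control === "resume" && status === "paused") return "running";
  return status;
}

// Label for the toggle button, or null when no toggle should be shown.
function toggleLabel(status: ExecutionStatus): "Pause" | "Resume" | null {
  if (status === "running") return "Pause";
  if (status === "paused") return "Resume";
  return null;
}
```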

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lidation

- Add shared-secret token auth to all HTTP endpoints and WebSocket upgrades
  (token stored at ~/.harness-kit/agent-server.token, mode 0600)
- Replace bash denylist with explicit allowlist of known-safe dev tools;
  add secondary blocks for pipe-to-interpreter and rm-targeting-root patterns
- Add Zod validation to /steer endpoint (message: min 1, max 4000 chars)
- Guard steerAgent against concurrent execution (pause-before-steer)
- Add allowedTools to AgentState; pass from StartAgentOptions through to
  buildFsTools so permission mode is enforced server-side
- Add Tauri get_agent_server_token command; update agent-api.ts to include
  Bearer token in all HTTP requests and ?token= in WebSocket URL
- Fix useAgentEvents: remove unconditional setIsRunning(true) on subscribe
- Fix TaskDetailDialog: check agentApi.start() response and throw on error
- Fix auth.test.ts: replace vacuous second test with real assertion
- Update CLAUDE.md and README.md to document agent-server package
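
A sketch of the shared-secret check for both transports: a Bearer header for HTTP and a `?token=` query parameter for the WebSocket upgrade. All function names are hypothetical, and the sketch assumes the token is URL-safe and so needs no decoding. The comparison is hand-rolled only to keep the example dependency-free.

```typescript
// Pull the presented token from the Authorization header, or failing
// that from a ?token= query parameter on the request URL.
function tokenFromRequest(authorization: string | undefined, requestUrl: string): string | null {
  if (authorization !== undefined && authorization.startsWith("Bearer ")) {
    const token = authorization.slice("Bearer ".length).trim();
    return token === "" ? null : token;
  }
  const queryStart = requestUrl.indexOf("?");
  if (queryStart === -1) return null;
  for (const pair of requestUrl.slice(queryStart + 1).split("&")) {
    const [key, value = ""] = pair.split("=");
    if (key === "token" && value !== "") return value;
  }
  return null;
}

// Compare without stopping at the first mismatching character.
function constantTimeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i % (b.length || 1));
  }
  return diff === 0;
}

function isAuthorized(expected: string, authorization: string | undefined, requestUrl: string): boolean {
  const presented = tokenFromRequest(authorization, requestUrl);
  return presented !== null && expected !== "" && constantTimeEqual(presented, expected);
}
```

In a Node.js server, `crypto.timingSafeEqual` on equal-length buffers is the usual choice for the comparison step.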

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- qa-review.ts: respect state.allowedTools — intersect with QA-safe
  subset ['read_file','list_directory','bash'] instead of ignoring it
- fs-tools.ts: extend rm/rmdir block to cover all absolute paths and
  home-relative paths (was only bare / or ~ at end-of-line)
- fs-tools.test.ts: add cases for rm ~/path, rm /abs/path, rmdir,
  sudo, pipe variants, and safe relative-path rm
- broadcaster.ts: add clearSubscribers(taskId) to prevent unbounded
  Map growth; runner.ts calls it in both startAgent/resumeAgent finally
- agent_server.rs: fix error message — was pointing to non-existent
  'pnpm build:agent-server' script
- capabilities/default.json: add all five agent-server Tauri commands
  (get-agent-server-token, check-installed, install, start, restart)
- CHANGELOG.md: add 0.2.1 entries for agent execution, agent-server
  service management, and auth hardening
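
The allowlist-plus-secondary-blocks approach for bash commands can be sketched as follows. The allowlist contents, both regular expressions, and the function name are assumptions, not the rules in fs-tools.ts. The sketch is deliberately conservative and does not handle quoting, command substitution, or relative paths that escape the project.

```typescript
// Hypothetical allowlist; the real list of known-safe dev tools is not shown in the PR.
const ALLOWED_COMMANDS = ["git", "ls", "cat", "echo", "mkdir", "rm", "rmdir", "pnpm", "node"];

// Secondary block: output piped into an interpreter.
const PIPE_TO_INTERPRETER = /\|\s*(sudo\s+)?(sh|bash|zsh|python3?|node|perl|ruby)\b/;

// Secondary block: rm/rmdir with an argument starting at / or ~.
const RM_ABSOLUTE_OR_HOME = /\b(rm|rmdir)\b[^|;&]*\s(\/|~)/;

function isCommandAllowed(command: string): boolean {
  if (PIPE_TO_INTERPRETER.test(command)) return false;
  if (RM_ABSOLUTE_OR_HOME.test(command)) return false;
  // Split on ;, &, |, && and || so every chained command is checked,
  // not just the first one.
  const segments = command
    .split(/&&|\|\||[;|&]/)
    .map((segment) => segment.trim())
    .filter((segment) => segment.length > 0);
  if (segments.length === 0) return false;
  return segments.every(
    (segment) => ALLOWED_COMMANDS.indexOf(segment.split(/\s+/)[0] ?? "") !== -1,
  );
}
```

Because `sudo` is not on the allowlist, `sudo rm x` is rejected at the first-token check, while a relative `rm build/out.js` passes both secondary blocks.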

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
tauri-build requires explicit AppManifest.commands() registration for
ACL permission generation — #[tauri::command] + invoke_handler alone
is not sufficient. Add all five agent-server commands so tauri-build
generates the allow-agent-server-* and allow-get-agent-server-token
permissions referenced in capabilities/default.json.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2 participants