First off, thank you for considering contributing! This project is a local-first, permission-gated agent workbench, and every contribution helps.
- Development Setup
- Project Structure
- Code Style
- Phase Workflow
- Making Changes
- Testing
- Pull Request Process
- Architecture Principles
- Bun >= 1.0 — Install Bun
git clone https://github.com/MerverliPy/agent-workbench.git
cd agent-workbench
bun installbash scripts/build-all.shThis compiles TypeScript to dist/ for all workspace packages in dependency order (17 packages + apps).
agent-workbench/
├── apps/
│ ├── tui/ # OpenTUI + SolidJS terminal client
│ ├── server/ # Hono HTTP/SSE server
│ ├── mobile-web/ # SolidJS PWA (touch-optimized)
│ ├── dashboard/ # SolidJS observability dashboard
│ └── cli/ # CLI entrypoint (plugin management)
├── packages/
│ ├── core/ # Runtime orchestration
│ ├── protocol/ # Zod schemas, route contracts
│ ├── sdk/ # Typed HTTP/SSE client
│ ├── tools/ # Tool definitions & registry
│ ├── permissions/ # Allow/ask/deny engine
│ ├── shell/ # Simple + PTY command runners
│ ├── storage/ # SQLite/Drizzle persistence
│ ├── models/ # Provider adapters + marketplace
│ ├── tokens/ # Token health & budgets
│ ├── diff/ # Patch preview/apply
│ ├── planner/ # Pre-run plan gates
│ ├── events/ # Event bus
│ ├── cache/ # Read/grep/glob cache
│ ├── telemetry/ # OpenTelemetry tracing, metrics
│ ├── plugin-sdk/ # Plugin extension interfaces
│ ├── config/ # Config loading (scaffold)
│ └── ui/ # Shared UI primitives (scaffold)
├── docs/ # Phase planning docs (00–27)
├── decisions/ # Architecture Decision Records (0017)
├── tests/ # Unit, integration, e2e, fixtures
├── scripts/ # Build & health scripts
└── tools/ # Model-router benchmark tooling
- TypeScript with
strict: trueandnoUncheckedIndexedAccess - Biome for formatting and linting (see
biome.json) - No
any— preferunknownwith type guards - Zod as the single source of truth for all data shapes
- ULID primary keys, ISO-8601 timestamps
Run the formatter:
npx @biomejs/biome format --write .This project follows a phase-based development model (currently 0–26 complete, 27–30 planned). Each phase has:
- A plan doc (
docs/NN_PHASE_NAME.md) defining scope and deliverables - An ADR (
decisions/NNNN-topic.md) recording architectural decisions - Exit gates in
docs/18_PHASE_EXIT_GATES.mdthat must be satisfied - Tests covering the implementation
Rules:
- Do not skip ahead to later phases
- Do not start a phase until its exit gate is complete
- Exceptions require explicit maintainer approval
- Fork the repo and create a branch from
main - Make your changes following the architecture boundaries
- Run tests:
bun test - Ensure all 523 tests pass
- Ensure
bash scripts/build-all.shcompletes cleanly - Submit a pull request
See AGENTS.md for the agent-specific workflow, which includes:
- Read relevant docs and decisions before making changes
- Propose a bounded plan identifying target files and risk
- Obtain approval before writing, editing, or running shell commands
- Execute through the runtime (never through the TUI)
# Full test suite (523 tests, 0 failures)
bun test
# Per-category
bun test unit
bun test integration
bun test e2e
# Specific areas
bun test tests/unit/plugin-sdk/ # Plugin system (26 tests)
bun test tests/unit/telemetry/ # Telemetry (59 tests)
bun test tests/integration/server/ # Server integration
# Build verification
bash scripts/build-all.shtests/unit/— Isolated unit tests per package (core, tools, permissions, models, telemetry, plugin-sdk, etc.)tests/integration/— Cross-package integration + fault injectiontests/e2e/— Full-stack end-to-end validationtests/helpers/— Shared test utilities and fixtures (test-db, test-server, mock-model)
- Ensure all 523 tests pass and CI is green
- Update README and package-level docs if your change affects public API
- Update phase checklists if your change completes a phase exit gate
- Include a clear PR description referencing the relevant docs/decisions
- PRs require at least one review before merging
Phase N: Brief description
feat: Brief description
fix: Brief description
docs: Brief description
chore: Brief description
These principles must be preserved in every contribution:
| Principle | Description |
|---|---|
| TUI is thin | The TUI renders and accepts input only. Never executes tools, permissions, or runtime logic. |
| Server is thin | Routes delegate to core runtime. Never execute tools or shell commands directly. |
| Schema-first | All data shapes are defined in Zod schemas first, then consumed everywhere. |
| Permission-gated | No file mutation or shell execution bypasses the permission engine. |
| Localhost default | The server binds to 127.0.0.1 by default (override with WORKBENCH_HOST=0.0.0.0). |
| Full audit trail | Every tool call, permission decision, and runtime event is recorded in the run ledger. |
| Plugin sandbox | Plugins declare permissions; risky operations are logged and gated. |
- Open an issue for bugs or feature requests
- See the
docs/directory for detailed planning docs - See
AGENTS.mdfor the AI agent workflow - See
docs/27_PROJECT_ROADMAP.mdfor the project roadmap