aremesch/multi-agent-workbench

Multi-Agent Workbench (MAW)

Self-hosted web workbench that orchestrates multiple LLM coding-agent CLIs (Claude Code, Codex, Gemini CLI, …) in parallel across one or more git repos. Each agent runs inside its own tmux session in an isolated git worktree. The backend reattaches to surviving sessions across restarts, streams live terminal output over WebSocket to a SvelteKit frontend, and pushes permission-prompt alerts to an installable PWA on your phone.

Two driving goals:

  1. Phone-first multi-agent management — install the PWA, get Web Push notifications for permission prompts, tap to approve from anywhere.
  2. Daily browser workbench — log in once, resume seamlessly, see every agent's live terminal side by side.

Security — read this before deploying

⚠️ MAW exposes a shell to every authenticated session. Anyone who logs in can spawn agents that run arbitrary commands on the host — the shell adapter is literally a bare bash, and Claude Code / Codex / Gemini agents can read and write any file the host user can, make outbound network requests, and invoke any installed CLI. Treat an authenticated session as equivalent to SSH access to the MAW user account.

The non-negotiables:

  • Never run MAW as root or under any account with sudo rights. Create a dedicated unprivileged system user (maw is the convention used throughout these docs). Everything MAW spawns inherits that user's permissions — that isolation is the sandbox.
  • Use a strong password for every MAW account, and rotate the MAW_BOOTSTRAP_PASSWORD seeded on first boot.
  • Enable fail2ban before exposing MAW to any untrusted network. The shipped jail bans repeat login_fail, pwchange_fail, rate_limited, and ws_origin_reject offenders.
  • Always terminate TLS in front of MAW. Session cookies, WebSocket traffic, and Web Push subscriptions must not travel over plain HTTP. HTTPS is also required for PWA install and Web Push to function at all (except on localhost for dev).
  • Set MAW_PUBLIC_ORIGIN in production so the WebSocket upgrade rejects cross-origin handshakes. Without it MAW falls back to dev behavior and accepts any Origin.
  • Set MAW_TRUST_PROXY=1 when behind a reverse proxy so the auth log records the real client IP, not the proxy's. The fail2ban jail is useless without this.
  • Rotate MAW_SESSION_SECRET if you suspect compromise — it signs every session cookie, and changing it invalidates all active logins.
  • Only grant MAW accounts to people you would give SSH to. There is no per-agent sandbox, no per-user filesystem isolation, no rate limit on command execution. Permission boundaries end at the host user.

For extra paranoia: disable cli-adapters/shell.jsonc in production (move or rename the file; the registry hot-reloads) so users can only spawn adapter-vetted CLIs rather than a free-form bash.
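The MAW_SESSION_SECRET bullet above says that rotating the secret invalidates every login. A minimal sketch of why, assuming an HMAC-SHA256 signature appended to the session id: the cookie layout and all function names here are hypothetical illustrations, not MAW's actual code.

```typescript
import { createHmac, randomBytes, timingSafeEqual } from "node:crypto";

// 32 random bytes, base64-encoded (the shape MAW_SESSION_SECRET is described as having).
function generateSecret(): string {
  return randomBytes(32).toString("base64");
}

function mac(value: string, secret: string): Buffer {
  return createHmac("sha256", Buffer.from(secret, "base64")).update(value).digest();
}

// Assumed cookie layout: "<sessionId>.<base64url signature>".
function signSession(sessionId: string, secret: string): string {
  return `${sessionId}.${mac(sessionId, secret).toString("base64url")}`;
}

// Returns the session id if the signature verifies, otherwise null.
function verifySession(cookie: string, secret: string): string | null {
  const dot = cookie.lastIndexOf(".");
  if (dot < 1) return null;
  const sessionId = cookie.slice(0, dot);
  const given = Buffer.from(cookie.slice(dot + 1), "base64url");
  const expected = mac(sessionId, secret);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== expected.length) return null;
  return timingSafeEqual(given, expected) ? sessionId : null;
}
```

A cookie signed with the old secret no longer verifies once the secret changes, which is what logs everyone out.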

PWA & push notifications

MAW's headline feature: install it on your phone, get push notifications when an agent needs attention, tap to jump straight to that agent's terminal.

  1. Install. Open the deployed URL in Android Chrome or desktop Chrome/Edge and pick Install app. The manifest (static/manifest.webmanifest) and service worker (src/service-worker.ts) drive installability; an offline fallback page is cached at install time.
  2. Enable push. Go to Settings → Notifications, grant permission, and subscribe. Requires VAPID keys on the server — see the Environment section.
  3. What you get notified about. Permission prompts, idle waiting, crashes, errors — detected per adapter. Tapping a notification opens the PWA on the agent that needs attention.
  4. HTTPS is required. Service workers and Web Push only work over HTTPS (localhost is exempt for dev). Put MAW behind a TLS-terminating reverse proxy (Caddy, nginx, Cloudflare Tunnel) for phone installs.
  5. Per-spawn control. The spawn form lets you toggle adapter flags like Claude Code's --dangerously-skip-permissions. Turn it off if you want the agent to actually prompt — that's what drives the push.
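As an illustration of step 3, the sketch below maps an agent alert to the data a push notification needs. Every name in it (AlertKind, AgentAlert, buildPushPayload, the /agents/<id> URL) is a hypothetical stand-in; MAW's real payload shape may differ.

```typescript
// Hypothetical alert kinds, mirroring the four cases listed above.
type AlertKind = "permission_prompt" | "idle" | "crash" | "error";

interface AgentAlert {
  agentId: string;
  agentName: string;
  kind: AlertKind;
}

interface PushPayload {
  title: string;
  body: string;
  tag: string;
  url: string;
}

const ALERT_BODY: Record<AlertKind, string> = {
  permission_prompt: "Waiting for permission",
  idle: "Idle, waiting for input",
  crash: "Agent crashed",
  error: "Agent reported an error",
};

function buildPushPayload(alert: AgentAlert): PushPayload {
  return {
    title: alert.agentName,
    body: ALERT_BODY[alert.kind],
    // Notifications sharing a tag replace each other, so one agent shows one entry.
    tag: `agent-${alert.agentId}`,
    // Assumed route layout; the notificationclick handler would open this URL.
    url: `/agents/${encodeURIComponent(alert.agentId)}`,
  };
}
```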

Stack

  • SvelteKit fullstack (TypeScript strict) on Node 24 via @sveltejs/adapter-node, with a custom server.js that mounts a raw ws WebSocket server on the same HTTP listener.
  • SQLite via better-sqlite3, hand-written migrations under migrations/, typed row helpers — no ORM.
  • tmux + FIFO for agent sessions (pipe-pane → named pipe → AgentRuntime). State lives in tmux + SQLite so the backend can crash, redeploy, or upgrade and reattach on boot without losing any agent.
  • xterm.js in the browser; shadcn-svelte + Tailwind for UI.
  • Argon2id password auth + signed httpOnly session cookie.
  • Config-driven CLI adapters (cli-adapters/*.jsonc, validated against schemas/adapter.schema.json) hot-reloaded via chokidar.
  • Service worker (src/service-worker.ts) + web app manifest for PWA install, offline fallback, and push/notificationclick handling.
  • web-push for VAPID-signed Web Push fan-out from the backend.
  • Production bundle via esbuild: pnpm build emits a single build/server.js that wraps the SvelteKit handler with the /ws listener. Native addons (better-sqlite3, @node-rs/argon2) stay external; prod hosts don't need tsx or the src/ tree.
  • i18n with en / de / fr / es locales under src/lib/i18n/.
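The tmux + FIFO bullet above describes a pipe-pane → named pipe → AgentRuntime chain. A rough sketch of the first hop, assuming the FIFO already exists and that session names are chosen by the backend: pipePaneArgs and shellQuote are hypothetical helpers, and only the tmux subcommand and its -t flag come from tmux itself.

```typescript
// Quote a string for safe use inside a POSIX shell command.
function shellQuote(s: string): string {
  return `'${s.replace(/'/g, `'\\''`)}'`;
}

// Build argv for: tmux -L maw pipe-pane -t <session> 'cat > <fifo>'
// tmux runs the last argument through a shell, so the path is quoted.
function pipePaneArgs(session: string, fifoPath: string): string[] {
  return ["-L", "maw", "pipe-pane", "-t", session, `cat > ${shellQuote(fifoPath)}`];
}
```

The resulting array could be handed to a process-spawning call with tmux as the executable; a reader on the other end of the FIFO then receives everything the pane prints.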

Repository layout

cli-adapters/        JSONC adapter definitions (claude-code, codex, gemini, shell)
docs/plans/          Persisted plans (see the Plans section below)
migrations/          Hand-written NNN_*.sql migrations
schemas/             JSON Schema for adapter configs
scripts/             migrate.ts, test-adapter.ts
static/              PWA manifest, icons, offline fallback
src/service-worker.ts  Install / fetch / push / notificationclick
src/lib/server/      AgentSupervisor, WorktreeManager, FifoStreamer, DB, auth
src/lib/server/push/ PushService + alert fan-out (VAPID / web-push)
src/lib/client/      xterm wrapper, shared WS client
src/lib/i18n/        Locale bundles (en / de / fr / es)
src/lib/shared/      Types shared between client and server
src/routes/          SvelteKit routes (login, dashboard, repos, settings, api)
server.js            adapter-node handler + ws server + boot sequence

Prerequisites

  • Node.js 24 LTS and pnpm
  • git and tmux on PATH
  • Local (non-NFS) disk for SQLite WAL and FIFOs

MAW is Linux-first. macOS works. Windows is not supported: the agent runtime depends on tmux and POSIX named pipes (FIFOs), neither of which exists natively on Windows. Use WSL2 if you must, and treat it as Linux.

Ubuntu/Debian

Ubuntu 24.04 and Debian 12 ship Node 18 in their default apt repos, and Debian 13 ships Node 20, so install Node 24 from NodeSource (or use nvm / fnm if you prefer a version manager).

# system packages
sudo apt update
sudo apt install -y git tmux curl ca-certificates

# Node.js 24 LTS via NodeSource
curl -fsSL https://deb.nodesource.com/setup_24.x | sudo -E bash -
sudo apt install -y nodejs

# pnpm via Corepack (bundled with Node 24)
sudo corepack enable
corepack prepare pnpm@latest --activate

# verify
node --version   # v24.x
pnpm --version
tmux -V
git --version

If pnpm install ever falls back to building better-sqlite3 from source, also install a C/C++ toolchain and Python:

sudo apt install -y build-essential python3

macOS (13+, Apple Silicon or Intel)

Install via Homebrew:

# Homebrew itself (skip if already installed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# runtime + tools
brew install node@24 pnpm tmux git

# node@24 is keg-only; link it so `node` resolves to v24
brew link --overwrite --force node@24

# verify
node --version   # v24.x
pnpm --version
tmux -V
git --version

If a native module ever falls back to a source build, install the Xcode Command Line Tools:

xcode-select --install

Quickstart

pnpm install
cp .env.example .env
# edit .env — at minimum set MAW_SESSION_SECRET and pick a
# MAW_BOOTSTRAP_PASSWORD. Generate VAPID keys if you want push:
#   pnpm dlx web-push generate-vapid-keys
pnpm migrate
pnpm dev

Open the URL pnpm dev prints (Vite defaults to http://127.0.0.1:5173) and log in with the bootstrap credentials from .env. Create a project, point it at a git repo (or an empty directory — MAW will git init it), add a role, then spawn an agent. The terminal view will attach to the live tmux session.
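Spawning an agent implies two steps: creating an isolated git worktree, then starting a tmux session inside it. The sketch below only assembles the two command lines. The branch naming, the directory layout under the worktree root, and the planSpawn helper are assumptions for illustration, not MAW's actual behavior.

```typescript
interface SpawnPlan {
  worktree: string[];
  tmux: string[];
}

function planSpawn(repoDir: string, worktreeRoot: string, agentId: string, command: string): SpawnPlan {
  const worktreePath = `${worktreeRoot}/${agentId}`; // assumed layout
  const branch = `maw/${agentId}`; // hypothetical branch naming
  return {
    // git -C <repo> worktree add -b <branch> <path>
    worktree: ["git", "-C", repoDir, "worktree", "add", "-b", branch, worktreePath],
    // tmux -L maw new-session -d -s <name> -c <cwd> <command>
    tmux: ["tmux", "-L", "maw", "new-session", "-d", "-s", agentId, "-c", worktreePath, command],
  };
}
```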

For a production-ish run against the built bundle:

pnpm build
pnpm start        # runs server.js directly

Scripts

Command              What it does
pnpm dev             Vite dev server with HMR
pnpm build           Production build
pnpm start           Run server.js (adapter-node + ws)
pnpm check           svelte-kit sync + svelte-check
pnpm migrate         Apply pending SQL migrations
pnpm test            Vitest
pnpm test:adapter    Exercise an adapter end-to-end via shell
pnpm lint / format   ESLint / Prettier

Environment

All configuration lives in .env. See .env.example for the full list — notable entries:

  • MAW_DATA_DIR — SQLite + push-subscription state. Must be local disk. SQLite WAL does not work on NFS.
  • MAW_FIFO_DIR — one named pipe per agent, local disk only.
  • MAW_WORKTREE_ROOT — where agent worktrees are checked out.
  • MAW_BOOTSTRAP_USERNAME / MAW_BOOTSTRAP_PASSWORD — seeded only on first boot against an empty DB; change via the UI afterwards.
  • MAW_SESSION_SECRET — 32 random bytes (base64) for signing cookies.
  • MAW_VAPID_PUBLIC_KEY / MAW_VAPID_PRIVATE_KEY / MAW_VAPID_SUBJECT — Web Push credentials. Generate a keypair with pnpm dlx web-push generate-vapid-keys. MAW_VAPID_SUBJECT must be a mailto: address or an https:// URL. Leaving all three blank disables push cleanly; the rest of the app still runs.
  • MAW_PUBLIC_ORIGIN — browser-visible origin (e.g. https://maw.example.com); required in prod so the WebSocket upgrade can reject mismatched Origin headers.
  • MAW_TRUST_PROXY — set to 1 when behind a reverse proxy so the auth log / rate limiter honor X-Forwarded-For.
  • MAW_AUTH_LOG_PATH: override for the auth event log. Defaults to ${MAW_DATA_DIR}/auth.log. Point the included fail2ban jail's logpath at this real file rather than at a symlink.
  • MAW_LOGIN_RATE_LIMIT: count/windowSeconds (default 10/60).
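Two of the values above have a fixed shape that is easy to check at startup. This is a minimal sketch under stated assumptions: parseRateLimit and isValidVapidSubject are hypothetical helpers, and falling back to the default on malformed input is a guess at reasonable behavior, not a description of what MAW does.

```typescript
interface RateLimit {
  count: number;
  windowSeconds: number;
}

const DEFAULT_RATE_LIMIT: RateLimit = { count: 10, windowSeconds: 60 };

// Parse "count/windowSeconds"; fall back to the documented default 10/60.
function parseRateLimit(raw: string | undefined): RateLimit {
  const m = /^(\d+)\/(\d+)$/.exec((raw ?? "").trim());
  if (!m) return DEFAULT_RATE_LIMIT;
  const count = Number(m[1]);
  const windowSeconds = Number(m[2]);
  if (count < 1 || windowSeconds < 1) return DEFAULT_RATE_LIMIT;
  return { count, windowSeconds };
}

// MAW_VAPID_SUBJECT must be a mailto: address or an https:// URL.
function isValidVapidSubject(subject: string): boolean {
  return /^mailto:[^\s@]+@[^\s@]+$/.test(subject) || /^https:\/\/[^\s/]+/.test(subject);
}
```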

Never commit .env or any credential. See CLAUDE.md for the full rules.

fail2ban (prod)

deploy/fail2ban/ ships a filter (filter.d/maw-auth.conf) and jail (jail.d/maw.conf) that watch ${MAW_DATA_DIR}/auth.log for repeat login_fail, pwchange_fail, rate_limited, and ws_origin_reject entries. Defaults: 5 hits in 10 min → 1 h ban.
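To make the default thresholds concrete, here is a small model of the "5 hits in 10 min → 1 h ban" rule. It illustrates the arithmetic only and is not how fail2ban is implemented; the constant and function names are hypothetical.

```typescript
// Defaults quoted above: 5 hits in 10 min -> 1 h ban.
const MAX_RETRY = 5;
const FIND_TIME_S = 600;
const BAN_TIME_S = 3600;

// Given failure timestamps (in seconds) for one IP, return the time at which
// the first ban would end, or null if the threshold is never reached.
function banUntil(failures: number[]): number | null {
  const sorted = [...failures].sort((a, b) => a - b);
  for (const t of sorted) {
    // Count failures inside the window that ends at t.
    const recent = sorted.filter((f) => f <= t && f > t - FIND_TIME_S);
    if (recent.length >= MAX_RETRY) return t + BAN_TIME_S;
  }
  return null;
}
```

Five failures spread over 400 seconds trigger a ban; five failures spaced 200 seconds apart never have more than three inside any 10-minute window, so they do not.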

Install

sudo apt install -y fail2ban
sudo cp deploy/fail2ban/filter.d/maw-auth.conf /etc/fail2ban/filter.d/
sudo cp deploy/fail2ban/jail.d/maw.conf        /etc/fail2ban/jail.d/

Edit /etc/fail2ban/jail.d/maw.conf so logpath is the real file path of your auth log. It must equal ${MAW_DATA_DIR}/auth.log exactly — symlinks are unreliable with the pyinotify backend, so don't use one. For the shipped systemd unit with MAW_DATA_DIR=/var/lib/maw the default logpath = /var/lib/maw/auth.log already matches.

Then:

sudo systemctl enable --now fail2ban
sudo systemctl restart fail2ban        # use restart, not reload:
                                       # reload doesn't re-tail logs
                                       # if the old path went away

Restart, not reload. fail2ban-client reload re-reads the jail config but keeps the previously-opened file descriptor. If you ever change logpath (or the old path is deleted), you need a full restart for the new file to be tailed. Symptom of a stale reload: fail2ban-client status maw shows the new path under File list but Total failed never increments after fresh events.

Port and MAW_TRUST_PROXY — pick the scenario that matches prod

Behind a reverse proxy (recommended) — MAW listens on 3000 on localhost, nginx/Caddy terminates TLS on 443 and forwards with X-Forwarded-For. Set MAW_TRUST_PROXY=1 in MAW's env so clientIp() honors the forwarded header; the real attacker IP lands in auth.log. The shipped jail's port = http,https then bans at the edge (80/443) which is the only thing exposed publicly.
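A sketch of what honoring the forwarded header can look like. This is a hypothetical stand-in for the clientIp() helper named above, assuming exactly one trusted proxy that appends the connecting address to X-Forwarded-For; MAW's real logic may differ.

```typescript
// socketAddr: address of the TCP peer (the proxy itself when proxied).
// forwardedFor: raw X-Forwarded-For header value, if any.
// trustProxy: stands in for MAW_TRUST_PROXY=1.
function clientIp(socketAddr: string, forwardedFor: string | undefined, trustProxy: boolean): string {
  if (!trustProxy || !forwardedFor) return socketAddr;
  const hops = forwardedFor
    .split(",")
    .map((h) => h.trim())
    .filter((h) => h.length > 0);
  // With one trusted proxy, the last entry is the one that proxy appended.
  // Earlier entries are client-supplied and can be forged.
  return hops.pop() ?? socketAddr;
}
```

With trustProxy off, the header is ignored entirely, which is why a proxied deployment without MAW_TRUST_PROXY=1 logs only the proxy's address.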

Direct exposure on 3000 — only safe on a trusted LAN. Leave MAW_TRUST_PROXY unset and change the jail to:

port = 3000

then sudo systemctl restart fail2ban. Without this change the ban installs iptables rules on 80/443 and completely misses the actual Node listener.

Also set MAW_PUBLIC_ORIGIN in .env so the WebSocket Origin check is active and ws_origin_reject entries can accrue; with it unset MAW falls back to dev behavior and never rejects.
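The Origin check can be pictured as a small predicate run during the WebSocket upgrade. This is a minimal sketch based only on the behavior described here (unset means dev behavior, any Origin accepted). The name isOriginAllowed and the treatment of a missing header are assumptions.

```typescript
// originHeader: the Origin header of the upgrade request, if present.
// publicOrigin: stands in for the MAW_PUBLIC_ORIGIN value.
function isOriginAllowed(originHeader: string | undefined, publicOrigin: string | undefined): boolean {
  // Dev fallback: with no configured origin, nothing is rejected.
  if (!publicOrigin) return true;
  // Assumption: in production a missing Origin header counts as a mismatch.
  if (!originHeader) return false;
  try {
    // URL#origin normalizes case and default ports (https://host:443 -> https://host).
    return new URL(originHeader).origin === new URL(publicOrigin).origin;
  } catch {
    return false; // unparsable header or config value
  }
}
```

A rejected handshake is the kind of event that would surface as a ws_origin_reject entry in the auth log.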

Verify

  1. Check the filter regex against the live log — decoupled from live daemon state:

    sudo fail2ban-regex /var/lib/maw/auth.log /etc/fail2ban/filter.d/maw-auth.conf

    Expected: every login_fail / pwchange_fail / rate_limited / ws_origin_reject line shows under "Lines matched".

  2. Fire a failed login from a non-local IP (localhost is always dropped by fail2ban's ignoreself rule — bans will never register from 127.0.0.1 regardless of config) and check:

    sudo fail2ban-client status maw
    sudo tail -n 40 /var/log/fail2ban.log

    Total failed increments, and after maxretry hits within findtime the offending IP appears under Banned IP list.

  3. Unban after a test:

    sudo fail2ban-client set maw unbanip <ip>

Adapters

A CLI adapter is a JSONC file under cli-adapters/ that tells MAW how to launch a coding-agent CLI and how to interpret its output (prompt detection, permission prompts, idle state). The registry is hot-reloaded on change. cli-adapters/shell.jsonc is a minimal smoke adapter that runs a plain bash session — useful for testing the pipeline without installing claude, codex, or gemini.

Validate changes against schemas/adapter.schema.json; try them with pnpm test:adapter.
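To show how output interpretation might work, the sketch below classifies a chunk of terminal output using regex patterns supplied by an adapter. The DetectionRules field names are hypothetical; the real ones are defined by schemas/adapter.schema.json and may look quite different.

```typescript
// Hypothetical slice of an adapter definition.
interface DetectionRules {
  permissionPrompt: string[];
  idle: string[];
}

type AgentState = "permission_prompt" | "idle" | "running";

// Strip ANSI CSI sequences (colors, cursor movement) from terminal output.
function stripAnsi(text: string): string {
  return text.replace(/\x1b\[[0-?]*[ -\/]*[@-~]/g, "");
}

function classify(output: string, rules: DetectionRules): AgentState {
  const plain = stripAnsi(output);
  // Permission prompts win over idle detection because they need user action.
  if (rules.permissionPrompt.some((p) => new RegExp(p).test(plain))) return "permission_prompt";
  if (rules.idle.some((p) => new RegExp(p).test(plain))) return "idle";
  return "running";
}
```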

Running under systemd

MAW uses a dedicated tmux socket (tmux -L maw). For agent sessions to survive systemctl --user restart maw, the tmux server must live in its own user systemd unit, outside maw.service's cgroup. Without that, systemd's default KillMode=control-group stops every process in maw.service's cgroup on each restart, including the tmux server and every agent inside it.

Install the shipped maw-tmux.service user unit once per host:

mkdir -p ~/.config/systemd/user
cp deploy/systemd/maw-tmux.service ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable --now maw-tmux.service

Then update your ~/.config/systemd/user/maw.service so it depends on the tmux unit and (belt + braces) only kills the Node main process:

[Unit]
Description=Multi-Agent Workbench
After=default.target maw-tmux.service
Wants=maw-tmux.service

[Service]
Type=simple
WorkingDirectory=/home/maw
EnvironmentFile=/home/maw/.env
Environment=NODE_ENV=production
ExecStart=/usr/bin/node build/server.js
KillMode=process
Restart=on-failure
RestartSec=5
TimeoutStopSec=10

[Install]
WantedBy=default.target

Reload and restart:

systemctl --user daemon-reload
systemctl --user restart maw

Verify:

cat /proc/$(pgrep -of 'tmux.*-L maw')/cgroup   # -o: oldest match, i.e. the tmux server rather than a client
# should show .../maw-tmux.service — NOT .../maw.service

On macOS dev there is no systemd; tmux auto-spawns on first session and survives a Node crash naturally because it is not in any cgroup.
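Because the tmux server outlives the Node process, the backend can compare what the database expects with what tmux reports when it boots. A minimal sketch of that reconciliation follows. The reconcile function and its result fields are hypothetical, and the tmux session list is assumed to come from tmux -L maw list-sessions -F '#{session_name}'.

```typescript
interface Reconciliation {
  reattach: string[]; // known to the DB and still alive in tmux
  markDead: string[]; // known to the DB but gone from tmux
  orphaned: string[]; // alive in tmux but unknown to the DB
}

// dbSessions: session names the database believes are running.
// tmuxSessions: session names reported by the tmux server.
function reconcile(dbSessions: string[], tmuxSessions: string[]): Reconciliation {
  const live = new Set(tmuxSessions);
  const known = new Set(dbSessions);
  return {
    reattach: dbSessions.filter((s) => live.has(s)),
    markDead: dbSessions.filter((s) => !live.has(s)),
    orphaned: tmuxSessions.filter((s) => !known.has(s)),
  };
}
```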

Plans

Persisted plans live in docs/plans/. Git history is the activity log; plan files are the design record.

License

MIT — see LICENSE.

Copyright (c) 2026 Alexander Remesch.

