A tiny, single-file web dashboard for headless AI agents running on a machine —
OpenClaw, Hermes, Claude Code, Ollama, vLLM, llama.cpp, or anything you name. It
shows which agents are alive and what they're costing you (CPU, memory, GPU), and
gives you a kill button per agent. Think htop, scoped to just your agents —
the screenshot below is a real run on a fleet node.
No database, no auth framework (one static token file gates the kill switch), no dependencies beyond FastAPI + psutil. Runs on macOS and Linux. It is a per-node monitor and guarded local control panel: fleet schedulers may consume its read-only telemetry, but should own their own scheduling and actuation.
A real run: ten agents grouped by process tree (+N = children rolled up),
per-agent CPU/memory/uptime, launchd-supervised jobs flagged, and a kill button
per row.
AGENT            PID    STATUS     CPU %  MEM MB  GPU MB  UPTIME  COMMAND ┆
openclaw +3      48213  ● running   62.4    1840    7320  2h 11m  openclaw serve …  [kill] [force]
claude-code +9   73590  ● running   97.4    7630       —  1h 02m  claude --chann …  [kill] [force]
hermes           49001  ● running   18.0     512       —     44m  hermes worker …   [kill] [force]
ollama           50122  ● running    3.1    9210   14080  6h 02m  ollama runner …   [kill] [force]
(+N = child processes rolled up under the agent; CPU/mem/GPU are tree totals.)
The schematic above shows the GPU column (NVIDIA only); the screenshot is a real run on Apple Silicon, where per-process GPU stats aren't available so that column is hidden. The UI auto-refreshes every 3s.
- One row per agent. Agents are grouped by process tree: the spawned children of an agent (inference subprocesses, MCP servers, helpers) are rolled up under it with a +N badge instead of cluttering the list as separate rows.
- Liveness: green dot = running, red = zombie/dead. The Status column shows the OS state.
- Usage: CPU %, resident memory (MB), GPU memory (MB, NVIDIA only), and uptime, refreshed every 3s. CPU/mem/GPU are tree totals, the agent's true cost including everything it spawned.
- Kill the tree. kill sends SIGTERM to the agent and its children (so spawned helpers don't leak resources); force sends SIGKILL. SIGTERM auto-escalates to SIGKILL after 3s. The confirm dialog tells you how many child processes will stop.
- Trends, not just snapshots. Each row has a CPU sparkline (last ~20 min, sampled in the background even with no browser open), plus badges for the states worth investigating: a hot 5m+ badge when an agent has been pegged ≥90% CPU for 5+ minutes (runaway), an idle 10m+ badge when a long-running agent has done nothing for 10+ minutes (possibly wedged), and a churn ×N badge when the same agent has died young 3+ times in 10 minutes (crash-looping under a supervisor). Churn is what hot/idle can't see: a crash-looping process is a fresh pid every poll, so no per-process window ever fills. A leak? badge fires when an agent's memory ratchets up ≥30% (and ≥128 MB) over 15 minutes without coming back down.
- Alerts. A dashboard only helps while you're looking at it. Add an alerts: block to agents.yaml and any badge appearing runs your command (desktop notification, Telegram bot, pager, anything) with the details in $AUM_* env vars. It fires once per transition with a cooldown, and never from the list CLI. By default only hot/churn/leak alert; idle is the normal state of an agent fleet that waits for work, so it's opt-in.

    alerts:
      command: 'terminal-notifier -title agents -message "$AUM_MSG"'
      cooldown: 600
      flags: [hot, churn, leak]   # the default; add idle to opt in

- Prometheus /metrics. Per-agent CPU/mem/instances/restarts and badge states in text exposition format, aggregated per label (no pid-churn series bloat). Point Grafana or any Prometheus scraper at http://127.0.0.1:8765/metrics.
- Expand the tree. Click the +N badge to unfold an agent's child processes (per-child CPU/mem/command) and see what a kill would actually stop before clicking it.
- Config hot-reload. Edits to agents.yaml apply on the next poll, no restart. A broken edit keeps the last good config and shows the parse error in the header.
- list subcommand. agent-usage-manager list (or list --json) prints a one-shot table to stdout: no server, good for scripts and cron checks.
- Kill-safe table. Rows keep a stable order (sorted by label) and never reorder while your pointer is over the table, so the kill button can't shift under your cursor mid-click.
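The hot and leak? thresholds in the list above can be read as simple checks over a window of samples. The following is a minimal sketch of that reading, not the project's implementation: the function names, the (timestamp, cpu, rss) sample layout, and the 5% tolerance used to interpret "without coming back down" are all assumptions made for illustration.

```python
from typing import List, Tuple

# (timestamp_s, cpu_percent, rss_mb), oldest first. This layout is assumed.
Sample = Tuple[float, float, float]


def _window(samples: List[Sample], now: float, window_s: float) -> List[Sample]:
    """Samples inside the window, or [] when history is shorter than the window."""
    if not samples or now - samples[0][0] < window_s:
        return []
    return [s for s in samples if now - s[0] <= window_s]


def is_hot(samples: List[Sample], now: float,
           window_s: float = 300.0, threshold: float = 90.0) -> bool:
    """Pegged at >= threshold CPU for the whole window (5 minutes by default)."""
    recent = _window(samples, now, window_s)
    return bool(recent) and all(cpu >= threshold for _, cpu, _ in recent)


def looks_like_leak(samples: List[Sample], now: float, window_s: float = 900.0,
                    min_ratio: float = 0.30, min_mb: float = 128.0,
                    tolerance: float = 0.05) -> bool:
    """Memory grew by >= 30% and >= 128 MB over the window and is still near its peak."""
    recent = [rss for _, _, rss in _window(samples, now, window_s)]
    if len(recent) < 2 or recent[0] <= 0:
        return False
    grew = recent[-1] - recent[0]
    if grew < min_mb or grew / recent[0] < min_ratio:
        return False
    # Assumed reading of "without coming back down": the latest sample
    # sits within `tolerance` of the highest value seen in the window.
    return recent[-1] >= (1.0 - tolerance) * max(recent)
```

Both checks return False until the history covers the full window, which matches the note that a crash-looping process never fills a per-process window and needs the separate churn badge.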
agent-usage-manager is intentionally not a fleet scheduler, dispatcher, or
multi-host orchestrator. It answers local process questions: what agent process is
running here, what resources is its process tree using, did it enter a suspicious
state, and can this local operator safely stop it?
If you run a separate fleet control plane, treat AUM as an optional read-only
input. Scrape list --json, /api/agents, or /metrics for local OS facts, then
make scheduling, budget, restart, and kill/retire decisions in your own
deterministic control layer. Do not route irreversible fleet operations through
AUM's kill endpoint as a central substrate.
This is the important part — a web page that can kill processes needs guardrails:
- Allowlist only. Only processes matching a pattern in agents.yaml are ever listed or killable. The kill endpoint re-checks the match server-side before sending any signal, so the dashboard can never be used to kill an arbitrary PID.
- Protected patterns. Anything matching protect: in agents.yaml (plus the monitor's own process and PID 1) shows a disabled, greyed-out kill button and is refused server-side.
- Secret redaction. Command lines often carry tokens/keys in env vars or flags (FOO_TOKEN=..., --api-key ..., sk-..., ghp_..., JWTs). The command column redacts these to *** before they ever reach the browser, so it is safe to screenshot.
- Browser guard (CSRF + DNS rebinding). Binding to localhost doesn't keep browsers out: any web page you visit can fetch() a localhost port. Requests whose Host is a non-local DNS name are refused (DNS-rebinding guard), and a kill request carrying a foreign Origin is refused (CSRF guard), so a malicious page can't kill your agents or read your process list. curl and the dashboard itself are unaffected.
- Kill requires a token (caller authorization). The allowlist above says what may be killed; the token says who may kill. The monitored agents are themselves untrusted HTTP callers: a prompt-injected agent with an HTTP tool and localhost reach could otherwise POST /api/kill and take down its siblings (Origin headers are trivially forged outside a browser). The token is auto-generated on first run into a 0600 file: ~/Library/Application Support/agent-usage-manager/kill_token on macOS, $XDG_STATE_HOME/agent-usage-manager/kill_token (default ~/.local/state/…) elsewhere. Every kill must send it as an X-Kill-Token header. It is never served over HTTP (anything that can curl the dashboard could read it): the dashboard asks you to paste it once on your first kill and keeps it in the browser's localStorage. Delete the file to rotate the token.
- Action log. Every kill attempt (success and every refusal) appends a JSON line to actions.log next to the token file: timestamp, caller address, target pid/command, outcome. "What was killed at 3am" and "what's been probing the kill endpoint" both have an answer. Append-only, no rotation; one line per attempt stays tiny.
- Non-loopback binds fail closed. It listens on 127.0.0.1; if asked to bind anything else (--host 0.0.0.0, a LAN IP) it refuses to start unless you also pass --unsafe-expose. Exposing the port means one static token is all that stands between the network and your agents, so put real auth in front (reverse proxy + basic auth, SSH tunnel, etc.) before using that flag.
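The browser guard and the token check described above can be pictured as three ordered checks on a kill request. This is a hedged sketch of that decision order only: check_kill_request and its helpers are hypothetical names, the 403 status for a foreign Origin is an assumption (the document only states 403 for the Host and token refusals), and the real server may order or word things differently.

```python
import hmac
import ipaddress
from typing import Optional, Tuple
from urllib.parse import urlsplit


def _is_local_name(name: Optional[str]) -> bool:
    """'localhost' or any bare IP literal counts as local; other DNS names do not."""
    if not name:
        return False
    if name == "localhost":
        return True
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def _hostname(value: str, is_origin: bool) -> Optional[str]:
    """Extract the host part from a Host header ('h:port') or an Origin ('scheme://h:port')."""
    try:
        return urlsplit(value if is_origin else "//" + value).hostname
    except ValueError:
        return None


def check_kill_request(host: str, origin: Optional[str],
                       token: Optional[str], expected_token: str) -> Tuple[int, str]:
    # 1. DNS-rebinding guard: a non-local DNS name in Host is refused.
    if not _is_local_name(_hostname(host, is_origin=False)):
        return 403, "non-local Host"
    # 2. CSRF guard: browsers attach Origin; curl usually sends none.
    if origin is not None and not _is_local_name(_hostname(origin, is_origin=True)):
        return 403, "foreign Origin"
    # 3. Caller authorization: constant-time comparison of the static token.
    if token is None or not hmac.compare_digest(token.encode(), expected_token.encode()):
        return 403, "Kill requires the X-Kill-Token header"
    return 200, "ok"
```

The token check comes last on purpose in this sketch: Host and Origin only keep browsers honest, while the token is what stops a non-browser caller that can forge both headers.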
- GPU column is NVIDIA-only. Per-process GPU memory comes from nvidia-smi --query-compute-apps, which reports NVIDIA compute processes (CUDA), in practice on Linux. AMD/Intel GPUs aren't read, graphics-only workloads don't appear, and Apple Silicon has no per-process GPU accounting API at all, so the column is hidden on Macs.
- Supervision detection is launchd-only (macOS, user domain). Root LaunchDaemons aren't flagged; that needs a privileged launchctl print system/…. On Linux, systemd-supervised services (Restart=always) aren't detected either, so killing one looks like it failed when systemd respawns it. Use systemctl stop for those.
- Same-user privileges only. Signals are sent with the server's own privileges. Agents running as another user (or root) are listed, but a kill won't take (killed: 0 in the response), and CPU/mem can read as 0 where the OS denies access.
- History is in-memory. Sparklines and the hot/idle flags (~20 min window) rebuild from scratch after a server restart.
- Windows is untested. Kill maps to TerminateProcess via psutil and may work, but CI covers Linux + macOS only.
Recommended: one command, nothing to install first:

uvx agent-usage-manager
# then open http://127.0.0.1:8765 (it also opens automatically)

uvx fetches and runs it in one step: no separate install, no virtualenv, no leftovers. Don't have uv yet? One line:

curl -LsSf https://astral.sh/uv/install.sh | sh   # macOS / Linux
# or: pip install uv

Other ways to install

pipx install agent-usage-manager   # clean isolated global CLI (needs pipx)
pip install agent-usage-manager    # universal; use inside a venv, since
                                   # system Python may refuse with
                                   # "externally-managed-environment"

Then run agent-usage-manager (flags below).

From a clone (for hacking on it):

git clone <this-repo> && cd agent-usage-manager
./run.sh   # venv + editable install, serves on :8765

It opens the dashboard in your browser automatically. Flags: --host, --port, --config /path/to/agents.yaml, --no-browser (for headless/server use), --unsafe-expose (required for any non-loopback --host; see Safety).
Edit agents.yaml:

agents:
  - label: openclaw        # shown as the badge in the UI
    match: openclaw        # case-insensitive substring of the command line
  - label: hermes
    match: hermes
  - label: claude-code
    match: "claude(\\s|$|-code)"
    regex: true            # treat `match` as a regex instead of substring
protect:                   # matched + listed, but never killable
  - uvicorn
ignore:                    # never an agent: not listed, not killable
  - crashpad               # incidental processes that share a name/bundle
  - shipit                 # path with a real agent (crash handlers,
  - kiro-cli-term          # auto-updaters, integrated-terminal shells, …)

A process matches if the pattern hits its executable basename + first few arguments. This is deliberately not the whole command line, so a long embedded arg (e.g. a system prompt mentioning "claude") can't misclassify a wrapper. On macOS the outermost .app bundle name is also included, so GUI agents that launch a generically-named binary (Kiro.app → Electron) are still matched by app name. protect: keeps a matched process listed but refuses to kill it; ignore: drops it from agent classification entirely.
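The matching rule above can be sketched in a few lines. This is an illustration under stated assumptions rather than the project's matcher: the names match_text, hits and classify are hypothetical, "first few arguments" is taken to mean three, regex patterns are assumed to be case-insensitive like substrings, protect/ignore entries are treated as plain substrings, and the macOS .app bundle name is left out.

```python
import os
import re
from typing import Dict, List, Optional, Sequence


def match_text(argv: Sequence[str], max_args: int = 3) -> str:
    """Executable basename plus the first few arguments (cutoff of 3 is assumed)."""
    if not argv:
        return ""
    return " ".join([os.path.basename(argv[0]), *argv[1:1 + max_args]])


def hits(pattern: str, text: str, regex: bool = False) -> bool:
    """Case-insensitive substring match, or a regex search when regex=True."""
    if regex:
        return re.search(pattern, text, re.IGNORECASE) is not None
    return pattern.lower() in text.lower()


def classify(argv: Sequence[str], agents: List[Dict],
             protect: Sequence[str] = (), ignore: Sequence[str] = ()) -> Optional[Dict]:
    """Return {'label', 'killable'} for a recognized agent, or None."""
    text = match_text(argv)
    if any(hits(p, text) for p in ignore):
        return None  # ignored: never an agent, so not listed and not killable
    for agent in agents:
        if hits(agent["match"], text, agent.get("regex", False)):
            protected = any(hits(p, text) for p in protect)
            return {"label": agent["label"], "killable": not protected}
    return None
```

Because only the head of the command line is inspected, a wrapper such as python3 wrapper.py --model x --system-prompt "you are claude" is not classified as claude-code in this sketch, which is the misclassification the paragraph above is guarding against.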
Which agents.yaml is used (resolved once at startup, first hit wins):

- AGENTS_CONFIG=/path/to/agents.yaml env var (the --config flag sets this)
- ./agents.yaml in the directory you launched from
- the default bundled with the package

The dashboard header (and list --json) shows the resolved path, so you can
always see which file is live. Hot-reload watches that one file. An
AGENTS_CONFIG path that doesn't exist is an error at startup, not a silent
fallback.
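The resolution order reads naturally as a short function. A minimal sketch, assuming a hypothetical resolve_config helper that takes the environment, the launch directory and the bundled default as arguments (the real code may structure this differently):

```python
from pathlib import Path
from typing import Mapping


def resolve_config(env: Mapping[str, str], cwd: Path, bundled: Path) -> Path:
    """First hit wins: AGENTS_CONFIG, then ./agents.yaml, then the bundled default."""
    explicit = env.get("AGENTS_CONFIG")
    if explicit:
        path = Path(explicit)
        if not path.is_file():
            # A missing explicit path is an error, not a silent fallback.
            raise FileNotFoundError(f"AGENTS_CONFIG points at a missing file: {path}")
        return path
    local = cwd / "agents.yaml"
    if local.is_file():
        return local
    return bundled
```

Failing on a missing AGENTS_CONFIG instead of falling through keeps a typo in the path from quietly loading a different allowlist than the one the operator intended.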
Some agents run as launchd services (a ~/Library/LaunchAgents/*.plist, or
anything started by brew services). If such a job sets KeepAlive, a signal
can't stop it: the process dies, launchd immediately respawns it under a new PID,
and the dashboard's "kill" looks like it silently failed.
The dashboard detects these (via launchctl list) and marks them with a
launchd badge. Instead of dead-end kill/force buttons it shows the command
that actually stops the job — click to copy:
launchctl bootout gui/<uid>/<label>    # stop now
launchctl disable gui/<uid>/<label>    # …and don't auto-start at login

The kill endpoint refuses signals for these jobs (HTTP 409) and returns the same
guidance, so the API never lies about a kill that won't stick. The message is
tailored to the job: KeepAlive jobs are told a signal won't stick at all;
RunAtLoad-only jobs are told a signal works now but the job restarts at next
login. Limitation: root LaunchDaemons aren't flagged; see
Limits & known issues.
Per-process GPU memory comes from nvidia-smi when it's on PATH (Linux / NVIDIA).
Apple Silicon has no per-process GPU accounting API, so the GPU column is hidden
on Macs; CPU and memory are the meaningful resource signals there.
- GET /api/agents → { api_version, aum_version, agents: [...], host, cpu_count, mem_total_mb, mem_used_pct, config_path, config_error, ts }. Each agent includes read-only telemetry such as pid, create_time, label, resource totals, recent CPU trend, flags (hot, idle, churn, leak when present), protection state, and supervised-process guidance. Pair pid with create_time when caching rows so PID reuse cannot alias two different agents. This endpoint is suitable as an input to external tools, not as a fleet-control contract.
- GET /api/tree/{pid} → the agent's process subtree (per-child pid/name/cpu/mem/cmdline); only works on recognized agents, same authorization as kill.
- POST /api/kill/{pid}?force=false → SIGTERM (or SIGKILL with force=true). Requires the X-Kill-Token header; the token lives in the 0600 file shown in the 403 message (see Safety):

    curl -X POST -H "X-Kill-Token: $(cat ~/Library/Application\ Support/agent-usage-manager/kill_token)" \
      http://127.0.0.1:8765/api/kill/48213
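For external consumers of /api/agents, the advice to pair pid with create_time can be sketched as a cache keyed on both values. The helper names index_agents and diff_agents are hypothetical, and the sketch assumes only the documented agents, pid and create_time fields; fetching the payload (for example from GET /api/agents) is left out so the example stays offline.

```python
import json
from typing import Dict, List, Tuple

AgentKey = Tuple[int, float]  # (pid, create_time)


def index_agents(payload_text: str) -> Dict[AgentKey, dict]:
    """Index one /api/agents payload by (pid, create_time) instead of pid alone."""
    payload = json.loads(payload_text)
    return {(a["pid"], a["create_time"]): a for a in payload["agents"]}


def diff_agents(previous: Dict[AgentKey, dict],
                current: Dict[AgentKey, dict]) -> Tuple[List[AgentKey], List[AgentKey]]:
    """Return (started, gone) between two polls; a reused PID shows up in both lists."""
    started = sorted(current.keys() - previous.keys())
    gone = sorted(previous.keys() - current.keys())
    return started, gone
```

With a pid-only key, a new process that happens to receive PID 48213 after the old one exits would silently inherit the old row's history; with the pair as key it is reported as one agent gone and another started.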
Linux (systemd), ~/.config/systemd/user/agent-usage-manager.service:
[Unit]
Description=agent usage manager
[Service]
ExecStart=%h/agent-usage-manager/.venv/bin/uvicorn app:app --port 8765
WorkingDirectory=%h/agent-usage-manager
Restart=on-failure
[Install]
WantedBy=default.target

systemctl --user enable --now agent-usage-manager

git clone https://github.com/minglong51/agent-usage-manager && cd agent-usage-manager
pip install -e ".[dev]"
pytest -q

CI runs the test suite on Linux + macOS (Python 3.9 and 3.12) on every push and PR.
Cross-platform note: kill uses psutil's terminate()/kill(), which map to
SIGTERM/SIGKILL on POSIX and TerminateProcess on Windows.
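The terminate-then-escalate pattern behind the kill button can be shown with the standard library alone. This sketch is a stand-in, not the project's code: it handles a single child started with subprocess rather than a psutil process tree, and stop_with_escalation is a hypothetical name. Popen.terminate() and Popen.kill() follow the same POSIX/Windows mapping as the psutil calls named above.

```python
import subprocess


def stop_with_escalation(proc: subprocess.Popen, grace_s: float = 3.0) -> str:
    """Ask politely first, then force after grace_s seconds.

    Returns "terminated" if the process exited within the grace period,
    "killed" if it had to be forced.
    """
    proc.terminate()  # SIGTERM on POSIX, TerminateProcess on Windows
    try:
        proc.wait(timeout=grace_s)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        proc.wait()  # reap the child so it does not linger as a zombie
        return "killed"
```

For a whole tree, psutil offers Process.children(recursive=True) to collect descendants and psutil.wait_procs() to wait on them with a timeout; how AUM combines those is not shown here.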
- Added api_version and aum_version to /api/agents and list --json.
- Added per-agent create_time so external telemetry consumers can pair it with pid and avoid PID-reuse aliasing.
- Kill endpoint now requires caller authorization via the static token file.
- Non-loopback binds fail closed unless --unsafe-expose is explicitly passed.
- Kill attempts and refusals append to the local action log.
- Added deterministic kill-path regression tests, including pid/create_time pins.
- Added synthetic hot/idle/churn/leak trace fixtures.
- Added adversarial matcher cases so lookalike process names stay test-covered.
- pip install fails building psutil: there is no prebuilt wheel for your Python/platform, so pip compiles it. You need a C toolchain and Python headers (xcode-select --install on macOS; apt install gcc python3-dev on Debian/Ubuntu). Or skip the problem with uvx agent-usage-manager.
- Dashboard is empty / "No matching agents running": first check which config was picked up (resolution order above; the header shows the resolved path). Then remember matching is against the executable basename + first few arguments, not the full command line, so a pattern that only appears deep in the args won't match.
- Kill "doesn't work" (the agent comes back under a new PID): it's supervised. On macOS the row gets a launchd badge with the launchctl bootout command that actually stops it; on Linux, systemd services aren't detected (see limits), so use systemctl stop on them. A killed: 0 in the kill response means nothing was actually signaled (e.g. the agent runs as another user).
- HTTP 403 on every request: the DNS-rebinding guard refuses non-local hostnames. Use http://127.0.0.1:8765 (or a bare IP) instead of a custom DNS name pointing at the box.
- HTTP 403 on kill only ("Kill requires the X-Kill-Token header"): send the token from the file named in the message. In the dashboard, the paste prompt reappears on your next kill click (a stored stale token is forgotten automatically when the server rejects it).
- "refusing to bind …" at startup: non-loopback --host values fail closed; add --unsafe-expose only with auth in front (see Safety).
MIT
