refactor(gastown): Town-centric refactor — Phase A: Consolidate control plane (#419)#421
jrf0110 wants to merge 38 commits into 204-gt-prop-d from
Conversation
…tDO (#419)

Phase A of the town-centric refactor:

- Create TownDO with all control-plane data (beads, agents, mail, review queue, molecules, bead events, convoys, escalations, rigs, config, mayor) organized into sub-modules under dos/town/:
  - beads.ts: Bead CRUD + bead events
  - agents.ts: Agent CRUD, hooks (GUPP), name allocation, prime
  - mail.ts: Inter-agent messaging
  - review-queue.ts: Review queue, molecules, agentDone/agentCompleted
  - config.ts: Town configuration management
  - rigs.ts: Rig registry (SQL table replacing KV)
  - container-dispatch.ts: Container interaction, JWT minting
- Create AgentDO for per-agent event storage (isolates high-volume streaming events from TownDO's 10GB SQLite budget)
- Update wrangler.jsonc: add AGENT binding, remove RIG + MAYOR bindings, add v5 migration with deleted_classes
- Update worker exports: remove RigDO + MayorDO, add AgentDO

The old Rig.do.ts and Mayor.do.ts files are kept for reference but no longer exported from the worker. Handler rerouting, container SDK migration, and WebSocket streaming are subsequent phases.
Code Review Summary

Status: 2 New Issues Found (53 total) | Recommendation: Address critical items before merge

Overview
New Issues (this review pass)
Other Observations (not in diff)

Issues found in unchanged code that cannot receive inline comments:
PR Summary

This is a major refactoring PR that consolidates the Gastown architecture from a Rig-centric model to a Town-centric model:
Files Reviewed (87 files)
Addressed all three review comments:
…yor DO (#419)

Phase A2 of the town-centric refactor:

- All 16 handler files now route through getTownDOStub() instead of getRigDOStub() / getMayorDOStub()
- Added getTownId() helper to auth middleware (resolves from route param, JWT payload, or X-Town-Id header)
- Simplified town-events handler to a single TownDO call (was fan-out)
- Mayor handler: configure/destroy are now no-ops (mayor auto-creates)
- towns.handler: rig creation configures TownDO + adds rig to registry
- Deleted Rig.do.ts (2378 lines) and Mayor.do.ts (453 lines)
- Fixed PR review comments (unused import, INSERT OR REPLACE, JSON.parse guard)
- Zero type errors
Phase B of the town-centric refactor:
- Rewrote process-manager.ts to use @kilocode/sdk:
- createOpencode() replaces Bun.spawn('kilo serve')
- client.session.create/prompt() replaces hand-rolled HTTP client
- client.event.subscribe() replaces SSE text parser
- Event sinks replace ring buffers (for WebSocket forwarding)
- Deleted files replaced by SDK (594 lines removed):
- kilo-server.ts (273 lines): process spawn, port allocation, health polling
- kilo-client.ts (102 lines): hand-rolled HTTP client with Zod parsing
- sse-consumer.ts (219 lines): SSE text parser, reconnect logic
- Container and worker both compile clean
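For orientation, a minimal sketch of the flow this commit describes, using a narrowed, hypothetical view of the SDK surface. The real @kilocode/sdk types and signatures may differ; only the call names come from the commit message.

```ts
// Hypothetical shape of the SDK client named in this commit; not the real
// @kilocode/sdk API, just enough structure to show the Phase B flow.
interface AgentSdkClient {
  session: {
    create(): Promise<{ id: string }>;
    prompt(input: { sessionId: string; text: string }): Promise<void>;
  };
  event: {
    subscribe(): AsyncIterable<{ type: string; sessionId?: string }>;
  };
}

type EventSink = (event: { type: string; sessionId?: string }) => void;

// One client per worktree, one session per agent turn, events fanned out to
// sinks (which replace the old ring buffers and feed WebSocket forwarding).
async function runAgentTurn(
  client: AgentSdkClient,
  prompt: string,
  sinks: EventSink[],
): Promise<void> {
  const session = await client.session.create();
  await client.session.prompt({ sessionId: session.id, text: prompt });

  for await (const event of client.event.subscribe()) {
    if (event.sessionId && event.sessionId !== session.id) continue; // other sessions on the shared server
    for (const sink of sinks) sink(event);
    if (event.type === 'session.idle') break; // end of this turn (see the later review discussion on terminal events)
  }
}
```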
…n-request (#419)

Phase C + D of the town-centric refactor:

Phase C — WebSocket streaming:
- Container control server: added /ws WebSocket endpoint that broadcasts all SDK agent events to connected clients via registerEventSink()
- TownContainerDO: replaced HTTP polling (500ms interval) with WebSocket relay. Connects to container's /ws on start, relays frames to subscribed browser clients. Supports per-agent and wildcard subscriptions via client-side subscribe messages.
- Removed: polling timer, ring buffers, ticket system (tickets kept in control-server for backward compat during migration)

Phase D — Proactive startup + config:
- Town DO alarm now pings container health to keep it warm
- ensureContainerReady() checks container health on each alarm tick
- Config-on-request: container control server extracts the X-Town-Config header and stores the latest town config. Container dispatch already attaches this header on every fetch() from TownDO.

Both worker and container compile clean.
- rig-do.test.ts: All tests now use env.TOWN (TownDO) instead of env.RIG (RigDO). getOrCreateAgent and slingBead calls pass rigId.
- rig-alarm.test.ts: Updated to use the TownDO alarm. setTownId tests replaced with configureRig/slingBead tests.
- town-container.test.ts: Heartbeat test uses env.TOWN.
- http-api.test.ts: No changes needed (tests HTTP routes via SELF.fetch).

Full project typecheck passes clean.
…trolResult, mayor agentId (#419)

- Remove unused PatrolResult import from Town.do.ts
- Remove unused staleThreshold variable in witnessPatrol
- Fix agentCompleted to fall back to the mayor agent when agentId is empty
- Update integration tests: witnessPatrol is now private/internal; tests verify behavior through alarm side-effects instead of direct RPC return values
Addressed all 4 new review comments:
… schema, fix townId resolution (#419)

- Add 'ln -s $(which kilo) /usr/local/bin/opencode' to both Dockerfiles so the SDK's createOpencodeServer() can find the binary (it spawns 'opencode serve' internally)
- Fix getMayorStatus() to return { session: { agentId, sessionId, status, lastActivityAt } | null }, matching the MayorStatusSchema the Next.js client expects (it was returning { agent: ... }, which caused a Zod parse error)
- Fix the townId getter to fall back to ctx.id.toString() when ctx.id.name is undefined
… client schema (#419) sendMayorMessage was returning void, causing the gastown-client's parseSuccessData to fail (no 'data' field in response envelope). Now returns { agentId, sessionStatus } matching MayorSendResultSchema.
…ainer (#419) The CLI's binary resolver may pick either variant depending on libc detection. Install both to prevent ENOENT when the musl variant is selected on glibc-based containers (and vice versa).
…nfig (#419)

The KILOCODE_TOKEN was only stored in per-rig KV config, which is empty for towns created before the refactor (config was in the old RigDO).

- Added kilocode_token field to TownConfigSchema
- configureRig now also stores the token in town config
- startAgentInContainer/startMergeInContainer fall back to townConfig.kilocode_token when the rig-level token is missing
- buildContainerConfig includes kilocode_token in the X-Town-Config header

This ensures all agents (including the mayor, which doesn't have a rig-specific config) can authenticate with the LLM gateway.
…op orphan alarms (#419)

Two fixes:

1. Container's buildAgentEnv now reads KILOCODE_TOKEN from the X-Town-Config header data (via getCurrentTownConfig()) as a final fallback after request envVars and process.env. This closes the gap where the token was being sent in the header but not read by the agent env builder.
2. Town DO alarm now checks for configured rigs before doing any work. If no rigs exist, the alarm returns without re-arming, preventing orphan DO alarms from spinning up containers for deleted/unconfigured towns.
…ntainer wake (#419) The previous fix stopped the alarm from re-arming when no rigs existed, which prevented on-demand operations (sendMayorMessage, slingBead) from re-arming the alarm for subsequent ticks. Now only the proactive container health ping is gated on hasRigs. The alarm always re-arms so scheduling, patrol, and review queue processing continue to run.
Logs X-Town-Config header presence and kilocode_token status at:

1. Control server middleware (when the header is received)
2. buildAgentEnv fallback (when checking getCurrentTownConfig)

This will help identify where the token is being lost.
Real end-to-end tests using wrangler dev + Docker containers:

- Test 1: Health check (worker responds correctly)
- Test 2: Create a town (CRUD operations)
- Test 3: Create rig with kilocode_token (token propagation to town config)
- Test 4: Mayor receives token (full flow: town config → X-Town-Config → container)
- Test 5: Single container per town (no phantom containers)

Also updated wrangler.test.jsonc to match the new DO bindings.
…iner wake policy (#419)

E2E test suite (20 tests, all passing):

1. Health check
2. Create town
3. Create rig with kilocode_token propagation
4. Mayor receives token via X-Town-Config → container
5. Same-town container reuse (no extra containers on repeat messages)
6. Mayor status transitions (null → active)
7. List rigs
8. Town config CRUD (get/patch)
9. Delete town
10. Delete rig
11. Bead CRUD (create/list/get)
12. Agent register and list
13. Sling bead (atomic create+assign)
14. Agent hook/unhook
15. Inter-agent mail (send + check delivery)
16. Bead events recorded on sling
17. Multiple towns per user are independent
18. Town config env vars persist
19. Escalation beads
20. Full E2E flow (15 steps: town→config→rig→beads→agents→hooks→sling→mail→events→mayor→container→token→status)

Bug fixes found by tests:

- slingBead returned a stale bead/agent (pre-hook state). Fixed to re-read both after hookBead.
- ensureContainerReady was waking containers for idle towns. Now gated on hasActiveWork() to prevent phantom container wakeups.
- Unique user IDs per test (they were colliding within the same second).
…fix WebSocket flow (#419)

Root cause: ctx.id.name returns undefined in wrangler local dev, so the TownDO used ctx.id.toString() (a hex hash) as townId. This caused getTownContainerStub() calls from the TownDO (alarm, sendMayorMessage) to create a DIFFERENT TownContainerDO than the worker routes, resulting in agents running in one container but events polled from another.

Fixes:
- TownDO.setTownId(): persists the real town UUID in KV on the first configureRig call. Loaded on init so alarm/internal calls use the correct ID for container stubs.
- TownContainerDO: restored HTTP polling (via alarm) for DO→container event relay. containerFetch() doesn't support WebSocket upgrades, and setInterval callbacks can't call containerFetch (no request context). Alarm-based polling works correctly.
- Container process-manager: event buffer (broadcastEvent → bufferAgentEvent) moved before broadcastEvent to ensure const initialization order.

Test 22 validates the full WebSocket event flow: 20 messages received, including agent.status, message.updated, session.idle, etc.
@@ -0,0 +1,131 @@
▲ [WARNING] Unrecognised WRANGLER_LOG value "/users/john/projects/professional/kilo/cloud/cloudflare-gastown/test/e2e/.wrangler-output.log", expected "none" | "error" | "warn" | "info" | "log" | "debug", defaulting to "log"...
[WARNING]: Local dev artifact committed — this 131-line wrangler output log contains local filesystem paths (/users/john/projects/...) and should not be in version control.
Consider adding *.log or .wrangler-output.log to .gitignore and removing this file.
// OPEN QUESTION: How do we determine which rig an agent belongs to?
// Currently agents aren't rig-scoped in the schema. For now, use the
// first rig as a fallback. This needs a rig_id column on rig_agents.
const rigList = rigs.listRigs(this.sql);
[SUGGESTION]: rigs.listRigs(this.sql) is called inside the for loop, executing a SQL query for every pending agent. Since the rig list doesn't change during the loop, hoist it before the loop (next to townConfig on line 924) to avoid redundant queries.
const rigList = rigs.listRigs(this.sql);
const rigId = rigList[0]?.id ?? '';
And add const rigList = rigs.listRigs(this.sql); before the for loop on line 926.
…ling, E2E tests 21-28 (#419)

Fixes:
- TownDO.resolveKilocodeToken(): scans all rig configs as a fallback when the town config doesn't have the token. Auto-propagates on first find.
- configureRig: proactively starts the container on rig creation
- ensureContainerReady: keeps the container warm for 5 min after rig config
- TownContainerDO: alarm-based HTTP polling replaces setInterval (containerFetch only works in request/alarm context, not timers)
- process-manager: event buffer moved before broadcastEvent for correct initialization order

E2E tests 21-28:
21. Deep container config verification (every layer traced)
22. WebSocket event flow (20 events: SDK → buffer → poll → relay → client)
23. Token propagation trace (step-by-step verification)
24. Stream ticket flow (UI path: ticket → WS URL → events)
25. Rig without token (graceful degradation)
26. Next.js rig creation (cross-service verification)
27. Check user's wrangler (port 8787 token test)
28. Full E2E on port 8787 (token + container + WS events)

All 28 tests pass. Verified on both the dedicated test instance (port 9787) and the user's live wrangler (port 8787).
Major simplification of the event streaming architecture:

BEFORE: browser → worker → TownContainerDO (intercept WS, create WebSocketPair, poll container via HTTP, relay events) → container
AFTER: browser → worker → TownContainerDO → container Bun.serve (direct WebSocket passthrough, no relay, no polling)

Changes:
- TownContainerDO: removed ALL fetch() override, polling, alarm, and WebSocketPair relay code. The base Container class handles everything — it proxies WebSocket upgrades directly to the container's Bun server via tcpPort.fetch().
- Container control-server: enhanced WebSocket handler to match both /ws and /agents/:id/stream URL patterns (the full worker path arrives at the container). Agent-filtered event delivery. Backfill on connect (sends all buffered events immediately). Subscribe message support for dynamic agent filtering.
- Event frame format: uses the 'event' key (not 'type') to match what AgentStream.tsx reads (msg.event).

E2E test 22 confirms: 19 events received via direct WebSocket, including a backfill of 18 historical events + 1 live heartbeat. 27/28 tests pass. Test 28 (port 8787) shows a warning when the user's wrangler runs old code — it's a diagnostic, not a failure.
const townId = getTownId(c);
if (!townId) return c.json(resError('Missing townId'), 400);
const town = getTownDOStub(c.env, townId);
await town.deleteBead(params.beadId);
[WARNING]: handleDeleteBead lost 404 response for non-existent beads
The old code returned 404 when the bead wasn't found:
const deleted = await rig.deleteBead(params.beadId);
if (!deleted) return c.json(resError('Bead not found'), 404);

The new town.deleteBead() returns void and silently succeeds even if the bead doesn't exist, so the handler always returns 200. This is a behavioral regression for API consumers that rely on 404 to detect missing beads.
Consider having deleteBead return a boolean, or check existence first.
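One possible shape for the first suggestion, as a sketch: deleteBead reports whether a row was removed (Cloudflare's SqlStorage cursor exposes rowsWritten), and the handler branches on it. The table name and the resError/resSuccess helpers follow this PR; the exact module layout is assumed.

```ts
// town/beads.ts (sketch): report whether the DELETE actually removed a row.
export function deleteBead(sql: SqlStorage, beadId: string): boolean {
  const cursor = sql.exec('DELETE FROM rig_beads WHERE id = ?', beadId);
  return cursor.rowsWritten > 0;
}

// Handler (sketch): restore the old 404 behavior.
// const deleted = await town.deleteBead(params.beadId);
// if (!deleted) return c.json(resError('Bead not found'), 404);
// return c.json(resSuccess({ deleted: true }), 200);
```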
if (agent.current_hook_bead_id === beadId) return;

// Agent already has a different hook — unhook first
if (agent.current_hook_bead_id) {
[WARNING]: hookBead silently auto-reassigns instead of throwing
The old Rig.do.ts threw an error when an agent was already hooked to a different bead:
throw new Error(`Agent ${agentId} is already hooked to bead ${agent.current_hook_bead_id}`);

The new code silently unhooks from the previous bead and hooks to the new one. This changes the semantics from "fail-fast" to "auto-reassign," which could mask bugs where agents are accidentally double-hooked (e.g. a race between slingBead and schedulePendingWork).
If this is intentional, a log line here would help with debugging.
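If auto-reassign is the intended behavior, a log along these lines would make double-hook races visible. A standalone sketch with illustrative names:

```ts
// Sketch: make the silent unhook observable before re-hooking.
function warnOnRehook(agentId: string, previousBeadId: string | null, nextBeadId: string): void {
  if (previousBeadId && previousBeadId !== nextBeadId) {
    console.warn(
      `hookBead: agent ${agentId} was hooked to ${previousBeadId}; auto-reassigning to ${nextBeadId}`,
    );
  }
}
```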
…at) (#419)

The kilo provider wraps OpenRouter, so model IDs must be in OpenRouter format: 'anthropic/claude-sonnet-4' (not 'anthropic/claude-sonnet-4.6').

- providerID is always 'kilo' (routes through the Kilo gateway/OpenRouter)
- modelID is the full OpenRouter model ID (e.g. 'anthropic/claude-sonnet-4')
- Updated all default model references from 'anthropic/claude-sonnet-4.6' to 'anthropic/claude-sonnet-4'
- Updated agent config models in agent-runner.ts (code, general, plan, explore)
- Fixed process-manager to always use providerID='kilo' instead of splitting the model string into provider/model
const rigConfig = await this.getMayorRigConfig();
const kilocodeToken = await this.resolveKilocodeToken();

await dispatch.startAgentInContainer(this.env, this.ctx.storage, {
[WARNING]: startAgentInContainer returns boolean but the return value is ignored here
startAgentInContainer returns false when the container rejects the start request (e.g. container not ready, bad config). When that happens, line 569 still marks the agent as 'working' and the caller receives sessionStatus: 'starting' — a false positive that will leave the mayor stuck in a phantom "starting" state until the next alarm patrol detects it.
await dispatch.startAgentInContainer(this.env, this.ctx.storage, {
const started = await dispatch.startAgentInContainer(this.env, this.ctx.storage, {
if (isAlive) {
  // Send follow-up message
  await dispatch.sendMessageToAgent(this.env, townId, mayor.id, message);
[WARNING]: sendMessageToAgent returns boolean but the return value is ignored
sendMessageToAgent returns false when the container rejects the message (e.g. agent died between the status check and the message send). The caller still gets sessionStatus: 'active', hiding the failure. Consider checking the return value and falling back to re-creating the session (similar to the old sendMessage retry logic in the deleted Mayor.do.ts).
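A sketch of what checking the result could look like, with the dependencies passed in so it stays self-contained. The restart path is an assumption modeled on the deleted Mayor.do.ts retry logic mentioned above.

```ts
// Sketch: don't report 'active' unless the container actually accepted the message.
async function sendFollowUp(
  send: () => Promise<boolean>,        // wraps dispatch.sendMessageToAgent(...)
  restartSession: () => Promise<void>, // wraps the start-agent path
): Promise<'active' | 'restarted'> {
  const delivered = await send();
  if (delivered) return 'active';

  // The agent likely died between the liveness check and the send; recreate
  // the session instead of silently reporting success.
  await restartSession();
  return 'restarted';
}
```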
console.warn(`${TOWN_LOG} alarm: missing ctx.id.name; skipping watchdog`);
await this.ctx.storage.setAlarm(Date.now() + HEARTBEAT_ALARM_INTERVAL_MS);
console.warn(`${TOWN_LOG} alarm: missing townId; skipping`);
return;
[WARNING]: Early return without re-arming the alarm permanently kills the alarm loop
If townId is ever falsy (e.g. the KV entry hasn't been set yet because setTownId was never called), the alarm returns without scheduling the next tick. There's no external mechanism to restart it — the alarm is dead forever.
The old code re-armed even on missing townId. Consider re-arming here too:
return;
console.warn(`${TOWN_LOG} alarm: missing townId; skipping`);
await this.ctx.storage.setAlarm(Date.now() + IDLE_ALARM_INTERVAL_MS);
…container env injection (#419)

The token was not reaching the container because it wasn't in the town config (created before the propagation code was deployed). Multiple fallback paths (request envVars, X-Town-Config header, process.env) all failed because the source of truth was empty.

New approach: store the token directly in the TownContainerDO's KV storage via setEnvVar(). This is loaded into envVars during the DO constructor (before the container boots), so the token is always in process.env.KILOCODE_TOKEN inside the container.

The token is set via:
1. configureRig() → container.setEnvVar('KILOCODE_TOKEN', token)
2. sendMayorMessage() → container.setEnvVar('KILOCODE_TOKEN', token)

This is the most reliable path — it doesn't depend on X-Town-Config headers, request body envVars, or town config. The token goes directly from the TownDO to the TownContainerDO's persistent storage, and from there into the container's OS environment.
…dMayorMessage (#419) Logs exactly what's in envVars, whether JWT was minted, whether kilocodeToken came from params or town config, and what resolveKilocodeToken found.
try {
  console.log(`${MANAGER_LOG} Subscribing to events for agent ${agent.agentId}...`);
  const result = await client.event.subscribe();
[WARNING]: Duplicate event subscriptions per workdir
event.subscribe() creates a global event stream from the SDK server — it receives events from ALL sessions on that server, not just this agent's session. Since the SDK server is shared per workdir (sdkInstances map), every agent in the same worktree creates a separate subscription to the same event stream.
The session filter at line 189 prevents cross-agent event leakage, but each subscription still processes every event from the server. With N agents sharing a worktree, each event is processed N times (once per subscription), leading to N×N broadcast calls for the same event (N subscriptions × N agents' events).
Consider either:
- A single shared subscription per SDK instance that dispatches events to the correct agent (sketched below), or
- Using a session-scoped subscribe API if the SDK supports it.
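A sketch of the first option: one subscription per SDK instance with a per-session handler map, so each event is routed once instead of being re-processed by every agent's subscription. Types here are illustrative, not the SDK's own.

```ts
type AgentEvent = { type: string; sessionId?: string };
type AgentEventHandler = (event: AgentEvent) => void;

// One of these per SDK instance (per workdir). Agents register their session;
// a single pump loop reads the shared event stream and dispatches.
class SharedEventDispatcher {
  private readonly handlers = new Map<string, AgentEventHandler>(); // sessionId -> handler
  private started = false;

  constructor(private readonly subscribe: () => AsyncIterable<AgentEvent>) {}

  register(sessionId: string, handler: AgentEventHandler): void {
    this.handlers.set(sessionId, handler);
    if (!this.started) {
      this.started = true;
      void this.pump();
    }
  }

  unregister(sessionId: string): void {
    this.handlers.delete(sessionId);
  }

  private async pump(): Promise<void> {
    for await (const event of this.subscribe()) {
      // Each event is handled exactly once, by the agent that owns the session.
      const handler = event.sessionId ? this.handlers.get(event.sessionId) : undefined;
      handler?.(event);
    }
  }
}
```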
// Detect completion. session.idle means "done processing this turn."
// Mayor agents are persistent — session.idle for them means "turn done,"
// not "task finished." Only non-mayor agents exit on idle.
const isTerminal = event.type === 'session.idle' && request.role !== 'mayor';
[WARNING]: session.idle as terminal event is a behavioral regression
The old sse-consumer.ts (now deleted) explicitly excluded session.idle from terminal events:
"We intentionally exclude
session.idle,message.completed, andassistant.completed— these fire after every LLM turn. A polecat may need multiple turns (tool calls → responses → more tool calls) before it's actually done."
The old code used session.completed as the definitive terminal signal. Using session.idle here could cause premature agent termination if the SDK fires it between multi-turn tool-call sequences (e.g., after the first tool response but before the agent decides to make another tool call).
If the SDK doesn't emit session.completed, consider relying on the gt_done tool call / completion reporter callback instead, which was the authoritative signal in the old architecture.
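A sketch of that hedge: treat session.idle as "turn done" only, and require an explicit completion signal before tearing down a non-mayor agent. Event and flag names are illustrative, and whether the SDK emits session.completed is not confirmed here.

```ts
type SdkEvent = { type: string };

function isTerminalEvent(event: SdkEvent, role: string, gtDoneReported: boolean): boolean {
  if (role === 'mayor') return false;                  // mayor sessions are persistent
  if (event.type === 'session.completed') return true; // definitive, if the SDK emits it
  // session.idle fires after every LLM turn, including between tool-call
  // rounds, so on its own it is not terminal; require the authoritative
  // gt_done / completion-reporter signal as well.
  return event.type === 'session.idle' && gtDoneReported;
}
```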
// ctx.id.name should be the town UUID (set via idFromName in getTownDOStub).
// In some runtimes (local dev) .name is undefined. We persist the ID
// in KV on first access so it survives across requests.
return this._townId ?? this.ctx.id.name ?? this.ctx.id.toString();
[WARNING]: ctx.id.toString() fallback returns hex DO ID, not the town UUID
When both _townId and ctx.id.name are null/undefined, this falls back to ctx.id.toString() which returns the hex representation of the DO's unique ID (e.g., "0000017f...").
Since getTownContainerStub uses env.TOWN_CONTAINER.idFromName(townId), passing a hex ID instead of the original town UUID would derive a different DO instance, silently breaking all container interactions (start agent, health check, merge, etc.).
Consider throwing an error instead of silently using the wrong ID:
if (!this._townId && !this.ctx.id.name) {
  throw new Error('TownDO: townId not set — call setTownId() first');
}

…ation (#419)

The mayor should work without any rigs. Previously the token was only stored when a rig was created via configureRig. Now the tRPC createTown mutation generates a kilocode_token and stores it in the town config immediately. This means:

1. Create town → token stored in town config
2. Send mayor message → resolveKilocodeToken() finds it in town config
3. Agent starts with KILOCODE_TOKEN in envVars → KILO_CONFIG_CONTENT set

Also added kilocode_token to the client-side TownConfigSchema so the updateTownConfig call accepts it.
…sages work (#419)

The first message worked because startAgentInContainer sends the model in the prompt body. Follow-up messages via sendMessage don't specify a model — the kilo server uses the session's default from KILO_CONFIG_CONTENT. But the server validates the model against its local model registry before routing to the gateway, and 'anthropic/claude-sonnet-4.6' wasn't registered.

Fix: buildKiloConfigContent now explicitly registers the model (and claude-haiku-4.5) in the kilo provider's models list so the server accepts them without a remote fetch. Also accepts the model as a parameter so the town config's default_model takes effect.

Reverted model IDs back to anthropic/claude-sonnet-4.6 (the correct Kilo gateway model ID).
…ound (#419) Follow-up messages via sendMessage were sent without a model parameter, causing the kilo server to use the session's default model lookup which failed because the model wasn't in the server's local registry. Now: the model from the initial startAgent request is stored on the ManagedAgent and passed explicitly on every follow-up prompt call. This ensures the kilo provider routes through the gateway consistently, bypassing the local model registry validation.
const rigList = rigs.listRigs(this.sql);
const rigId = rigList[0]?.id ?? '';
const rigConfig = await this.getRigConfig(rigId);
if (!rigConfig) return;
[WARNING]: processReviewQueue leaks popped entry on early return
Line 1125 calls popReviewQueue() which transitions the entry from 'pending' → 'running'. If rigConfig is null here, the function returns without completing or failing the entry. The entry stays stuck in 'running' state until recoverStuckReviews resets it after 5 minutes.
Consider marking the entry as 'failed' before returning:
if (!rigConfig) {
reviewQueue.completeReview(this.sql, entry.id, 'failed');
return;
}

await mayor.destroy();
// No-op: we don't want to destroy the entire town from this endpoint.
console.log(`${MAYOR_HANDLER_LOG} handleDestroyMayor: no-op for townId=${params.townId}`);
return c.json(resSuccess({ destroyed: true }), 200);
[WARNING]: handleDestroyMayor returns { destroyed: true } but is a no-op
Callers (e.g. the frontend handleDeleteTown) will receive a success response and believe the mayor was destroyed, when nothing actually happened. This is misleading — consider either:
- Returning a different response indicating the endpoint is deprecated/no-op
- Actually stopping the mayor agent in the container via town.agentCompleted() (sketched below)
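A sketch of the second option, using the mayor status shape this PR already returns; agentCompleted's exact signature is an assumption.

```ts
// Sketch: stop the mayor session instead of returning a fabricated success.
interface TownStub {
  getMayorStatus(): Promise<{ session: { agentId: string } | null }>;
  agentCompleted(agentId: string): Promise<void>;
}

async function destroyMayor(town: TownStub): Promise<{ destroyed: boolean }> {
  const status = await town.getMayorStatus();
  if (!status.session) return { destroyed: false }; // nothing running
  await town.agentCompleted(status.session.agentId);
  return { destroyed: true };
}
```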
…ayor (#419) The gastown plugin checks GASTOWN_AGENT_ROLE === 'mayor' to decide whether to create a MayorGastownClient (town-scoped, no rig needed) or a GastownClient (rig-scoped, requires GASTOWN_RIG_ID). Without this env var, the mayor was treated as a rig agent, which failed because the mayor has no real rig ID — causing 'missing townId' errors and no tools being registered.
…id userId (#419) The mayor's agent JWT was minted with an empty userId because sendMayorMessage used rigConfig?.userId which is null when no rigs exist. The auth middleware rejected the JWT with 401. Fix: store owner_user_id in town config during createTown (tRPC). sendMayorMessage reads it from townConfig.owner_user_id as the primary source, falling back to rigConfig.userId.
sql,
/* sql */ `
  SELECT * FROM ${rig_beads}
  WHERE ${rig_beads.columns.status} IN ('open', 'in_progress')
[WARNING]: prime() returns ALL open beads instead of agent-specific ones
The old Rig.do.ts prime() filtered open_beads with:
SELECT * FROM rig_beads WHERE assignee_agent_id = ? AND status != 'closed'

The new code returns all open/in_progress beads across all agents:

SELECT * FROM rig_beads WHERE status IN ('open', 'in_progress') LIMIT 20

This is a behavior change that leaks other agents' work items into the prime context. The integration test at rig-do.test.ts:498 (expect(context.open_beads).toHaveLength(1)) may pass by coincidence (only one bead in the test), but in production with multiple agents, each agent would see all beads.
Consider restoring the assignee_agent_id = ? filter.
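A sketch of the restored per-agent filter, written directly against the DO SQL API. The table and column names follow the diff above; the codebase's query helper may differ.

```ts
// Sketch: prime() should only surface the requesting agent's open work.
function listOpenBeadsForAgent(sql: SqlStorage, agentId: string, limit = 20) {
  return sql
    .exec(
      `SELECT * FROM rig_beads
       WHERE assignee_agent_id = ?
         AND status IN ('open', 'in_progress')
       LIMIT ?`,
      agentId,
      limit,
    )
    .toArray();
}
```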
/* sql */ `
  SELECT * FROM ${rig_mail}
  WHERE ${rig_mail.columns.to_agent_id} = ?
  AND ${rig_mail.columns.delivered_at} = ?
[WARNING]: checkMail marks mail as delivered BEFORE reading — breaks test assertion
The old Rig.do.ts checkMail read undelivered mail first, then marked it as delivered. The returned messages had delivered: false.
The new implementation marks all mail as delivered (line 68-78), then reads back by matching delivered_at = ?. The returned messages will have delivered: true (since they were just updated).
The integration test at rig-do.test.ts:292 asserts:
expect(mailbox[0].delivered).toBe(false);

This will fail with the new implementation since the mail is already marked delivered = 1 before the SELECT.
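A sketch of the old read-then-mark ordering, so returned rows still carry delivered: false. Column names follow the diff and the test expectation; the real schema may differ.

```ts
// Sketch: read undelivered mail first, then mark it delivered.
function checkMail(sql: SqlStorage, agentId: string) {
  const undelivered = sql
    .exec(`SELECT * FROM rig_mail WHERE to_agent_id = ? AND delivered = 0`, agentId)
    .toArray();

  if (undelivered.length > 0) {
    sql.exec(
      `UPDATE rig_mail SET delivered = 1, delivered_at = ? WHERE to_agent_id = ? AND delivered = 0`,
      new Date().toISOString(),
      agentId,
    );
  }

  // Rows were read before the UPDATE, so callers (and the integration test)
  // still see delivered = 0 on the returned messages.
  return undelivered;
}
```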
…:rigId (#419)

All rig-scoped routes now include townId in the URL path, eliminating the need for X-Town-Id headers or JWT townId extraction.

Before: /api/rigs/:rigId/beads, /api/rigs/:rigId/agents, etc.
After: /api/towns/:townId/rigs/:rigId/beads, etc.

Updated:
- gastown.worker.ts: all 30+ rig routes moved under /api/towns/:townId/
- gastown-client.ts: all rig functions now take townId as the first param
- gastown-router.ts: all tRPC procedures pass rig.town_id to the client
- container plugin client.ts: rigPath() includes townId from env
- container plugin types.ts: GastownEnv now includes townId
- container process-manager.ts: agent-events POST uses the new path

The townId is always available from the URL — no header or JWT magic.
…ng rig context (#419)

Three fixes:

1. AgentSchema.last_activity_at: changed from z.string() to z.string().nullable() — agents have null last_activity_at before their first heartbeat. AgentCard component updated to handle null.
2. TaggedBeadEventSchema.rig_id/rig_name: changed from z.string() to z.string().optional() — town events come from the TownDO, which doesn't tag events with rig context (the old fan-out across Rig DOs is gone).
3. Container agent-events 404: the container code already uses the new /api/towns/:townId/rigs/_/agent-events path, but the running container image needs to be rebuilt.
const kilocodeToken = generateApiToken(ctx.user, undefined, {
  expiresIn: TOKEN_EXPIRY.thirtyDays,
});
await withGastownError(() =>
[WARNING]: Non-atomic town creation — orphaned town on updateTownConfig failure
If updateTownConfig throws (e.g. the Town DO is unreachable), the town has already been persisted by createTown on line 47 but has no kilocode_token or owner_user_id. The error propagates to the caller, who sees a failure, but the town row remains in the GastownUser DO.
Consider wrapping both calls in a transaction-like pattern (create + configure, roll back on failure) similar to how handleCreateRig in towns.handler.ts rolls back the rig if configureRig fails.
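A sketch of a compensation pattern for this, with the three operations passed in as callbacks. The rollback call is an assumption modeled on how handleCreateRig is described as rolling back the rig when configureRig fails.

```ts
// Sketch: don't leave a town behind if its initial config can't be written.
async function createTownWithConfig(
  createTown: () => Promise<{ id: string }>,
  updateTownConfig: (townId: string) => Promise<void>, // stores kilocode_token + owner_user_id
  deleteTown: (townId: string) => Promise<void>,
): Promise<{ id: string }> {
  const town = await createTown();
  try {
    await updateTownConfig(town.id);
  } catch (err) {
    await deleteTown(town.id).catch(() => undefined); // best-effort rollback
    throw err; // surface the original failure to the caller
  }
  return town;
}
```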
// ── Name Allocation ─────────────────────────────────────────────────

export function allocatePolecatName(sql: SqlStorage, rigId: string): string {
[WARNING]: allocatePolecatName ignores rigId for filtering — shared name pool across all rigs
The function accepts rigId but only uses it in the fallback name (line 261). The name pool query on line 248 selects ALL polecats regardless of rig. Since agents are now town-scoped (not rig-scoped), this means a town with 2 rigs shares the 20-name pool. After 20 polecats across all rigs, every new polecat gets a fallback name.
This is probably fine for now, but the rigId parameter is misleading — it suggests rig-scoped allocation. Consider either documenting this is town-wide or removing the unused rig filter.
GASTOWN_AGENT_ID: request.agentId,
GASTOWN_RIG_ID: request.rigId,
GASTOWN_TOWN_ID: request.townId,
GASTOWN_AGENT_ROLE: request.role,
[SUGGESTION]: New GASTOWN_AGENT_ROLE env var is set here but the GastownEnv type in plugin/types.ts doesn't include it.
The plugin client reads env vars via the typed GastownEnv interface. If the plugin needs to know the agent's role (e.g. to adjust behavior for mayor vs polecat), it can't access GASTOWN_AGENT_ROLE in a type-safe way. Consider adding it to GastownEnv if the plugin should use it, or document that it's only for internal container use.
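If the plugin is meant to read the role, a sketch of the typed env addition. Only townId is confirmed in this PR's description; the other field names and the role union are assumptions.

```ts
// plugin/types.ts (sketch)
export interface GastownEnv {
  townId: string;
  rigId: string;
  agentId: string;
  agentRole?: 'mayor' | 'polecat'; // sourced from GASTOWN_AGENT_ROLE
}
```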
Was using '_' as a placeholder for rigId in the agent-events POST path. The ManagedAgent already has rigId from the start request — use it.
} from './kilo-server';
import { createKiloClient } from './kilo-client';
import { createSSEConsumer, isCompletionEvent, type SSEConsumer } from './sse-consumer';
import { createOpencode, createOpencodeClient, type OpencodeClient } from '@kilocode/sdk';
[SUGGESTION]: Unused import — createOpencodeClient is imported but never used in this file. Only createOpencode and OpencodeClient (type) are referenced.
import { createOpencode, createOpencodeClient, type OpencodeClient } from '@kilocode/sdk';
import { createOpencode, type OpencodeClient } from '@kilocode/sdk';
const townId = getTownId(c);
if (!townId) return c.json(resError('Missing townId'), 400);
const town = getTownDOStub(c.env, townId);
await town.deleteAgent(params.agentId);
[WARNING]: handleDeleteAgent lost 404 response for non-existent agents — same issue as handleDeleteBead. The old code checked the return value of deleteAgent() and returned 404 if the agent wasn't found. Now it always returns { deleted: true } even if the agent doesn't exist.
agents.deleteAgent() in town/agents.ts silently succeeds when the agent doesn't exist (the DELETE query affects 0 rows), so callers get a misleading 200 response.
CURRENT_TOWN_ID="$TOWN_ID"

echo " Creating bead..."
api_post "/api/rigs/${RIG_ID}/beads" '{"type":"issue","title":"E2E test bead","body":"Test body","priority":"high"}'
[CRITICAL]: E2E tests use old route format — will 404
All rig-scoped routes were moved from /api/rigs/:rigId/... to /api/towns/:townId/rigs/:rigId/... in gastown.worker.ts, but these e2e tests still use the old format (e.g. /api/rigs/${RIG_ID}/beads). Setting CURRENT_TOWN_ID only adds an X-Town-Id header — it doesn't change the URL path, so Hono will return 404 before any handler runs.
This affects tests 11–16 and any other test using /api/rigs/ paths. They need to be updated to use /api/towns/${TOWN_ID}/rigs/${RIG_ID}/....
…oute fix (#419)

Three fixes:

1. Parallel dispatch: schedulePendingWork now collects dispatch tasks and runs them via Promise.allSettled instead of sequential await. Agents across different rigs start simultaneously.
2. Agent→rig association: added a rig_id column to the rig_agents table. registerAgent and getOrCreateAgent store the rig_id. schedulePendingWork uses agent.rig_id to get the correct rig config (git URL, credentials) instead of always using the first rig.
3. Completion route: completion-reporter.ts updated to use the new /api/towns/:townId/rigs/:rigId/agents/:agentId/completed path instead of the old /api/rigs/:rigId/agents/:agentId/completed.
const rigList = rigs.listRigs(this.sql);
const rigId = rigList[0]?.id ?? '';
const rigConfig = await this.getRigConfig(rigId);
if (!rigConfig) return;
[WARNING]: popReviewQueue (line 1132) already marked this entry as 'running', but if rigConfig is null here the method returns without completing or failing the entry. The entry is stuck in 'running' state until recoverStuckReviews resets it after 5 minutes.
This means a valid review submission is silently delayed (or lost if the timeout recovery also fails). Consider calling reviewQueue.completeReview(this.sql, entry.id, 'failed') before returning, or restructuring so the config check happens before popping.
const rigList = rigs.listRigs(this.sql);
const rigId = rigList[0]?.id ?? '';
const rigConfig = await this.getRigConfig(rigId);
if (!rigConfig) return;
[WARNING]: processReviewQueue pops an entry (marking it 'running') on line 1137, but if rigConfig is null here, the method returns without calling completeReview(entry.id, 'failed'). The entry is left stuck in 'running' status until recoverStuckReviews resets it after 5 minutes.
This creates a retry loop: pop → stuck → recover → pop → stuck → recover, indefinitely, if the rig config is permanently missing (e.g. rig was deleted but review queue still has entries).
Consider failing the entry explicitly:
if (!rigConfig) {
  reviewQueue.completeReview(this.sql, entry.id, 'failed');
  return;
}

const kilocodeToken = generateApiToken(ctx.user, undefined, {
  expiresIn: TOKEN_EXPIRY.thirtyDays,
});
await withGastownError(() =>
[WARNING]: Non-atomic town creation — if updateTownConfig fails here (e.g. network error to the gastown worker), the town is created but has no kilocode_token or owner_user_id. The caller receives an error, but the town already exists in the user's list.
The mayor won't be able to authenticate LLM calls until a rig is created (which also sets the token). Consider wrapping both calls or adding a retry/compensation mechanism.
…419)

Added a rig_id column to both the rig_agents and rig_beads tables.

- registerAgent and getOrCreateAgent store rig_id on agent creation
- createBead stores rig_id on bead creation
- slingBead passes rigId to both createBead and getOrCreateAgent
- listAgents handler passes a rigId filter so each rig view shows only its own agents (not all agents in the town)
- listBeads handler passes a rigId filter for the same reason
- AgentFilter and BeadFilter types include rig_id

This fixes: agents from rig A showing up on rig B's dashboard, and beads from rig A appearing on rig B.
…n clarifications (#419)

Code fixes:

1. completeReviewWithResult: restored 'conflict' status handling — creates an escalation bead with the conflict message, marks the review as 'failed'
2. review_submitted event: added newValue: input.branch to match the old contract
3. destroy(): iterates all agents and calls AgentDO.destroy() before clearing Town DO storage (prevents orphaned AgentDO instances)
4. sessionCount leak: decrements on the error path in the startAgent catch block
5. configureRig: always updates the town config token when the rig token differs (supports token rotation)
6. removeRig: deletes the KV entry rig:{rigId}:config alongside the SQL row
7. slingBead: restored metadata parameter passthrough to createBead
8. Removed unused DEFAULT_MAX_POLECATS constant
9. sendMayorMessage: accepts an optional model parameter from the handler
Addressed all 13 review comments:

Code fixes (9):

Design clarifications (4 — intentional, not bugs):
query(
  this.sql,
  /* sql */ `UPDATE ${rig_agents} SET ${rig_agents.columns.dispatch_attempts} = ? WHERE ${rig_agents.columns.id} = ?`,
  [attempts, agent.id]
[WARNING]: dispatch_attempts incremented before rigConfig check — premature circuit breaker
The counter is bumped here (line 1020–1023), but if rigConfig is null (line 1034), the loop continues without ever attempting a dispatch. After 5 alarm cycles with a missing rig config, the circuit breaker on line 1013 fires and marks the bead as 'failed' + unhooks the agent — even though no actual dispatch was ever attempted.
Move the increment to after the rigConfig null check (or inside the dispatchTasks.push callback alongside the success reset on line 1063).
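A sketch of the reordering, with the storage update passed in as a callback so it stays self-contained; names are illustrative.

```ts
// Sketch: only count a dispatch attempt when a dispatch is actually possible.
type PendingAgent = { id: string; rigId: string; dispatchAttempts: number };

function shouldDispatch(
  agent: PendingAgent,
  getRigConfig: (rigId: string) => unknown | null,
  recordAttempt: (agentId: string, attempts: number) => void,
): boolean {
  const rigConfig = getRigConfig(agent.rigId);
  if (!rigConfig) return false; // missing config: skip without burning an attempt

  // Bump the circuit-breaker counter only for real dispatch attempts.
  recordAttempt(agent.id, agent.dispatchAttempts + 1);
  return true; // caller pushes the actual dispatch task
}
```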
async removeRig(rigId: string): Promise<void> {
  await this.ensureInitialized();
  rigs.removeRig(this.sql, rigId);
[WARNING]: removeRig doesn't clean up agents or beads with matching rig_id
Now that agents and beads have a rig_id column, deleting a rig leaves orphaned agents that schedulePendingWork will try to dispatch. Since the rig config is also deleted (line 197), these agents will hit the !rigConfig path on every alarm cycle, incrementing dispatch_attempts until the circuit breaker marks their beads as failed.
Consider also deleting (or re-assigning) agents/beads whose rig_id matches the deleted rig.
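A sketch of cascading cleanup using the rig_id columns added later in this PR; the rigs table name and exact columns are assumptions.

```ts
// Sketch: remove the rig's agents and beads along with the rig itself, so
// schedulePendingWork never sees orphans pointing at a deleted rig.
function removeRigCascade(sql: SqlStorage, rigId: string): void {
  sql.exec(`DELETE FROM rig_beads WHERE rig_id = ?`, rigId);
  sql.exec(`DELETE FROM rig_agents WHERE rig_id = ?`, rigId);
  sql.exec(`DELETE FROM rigs WHERE id = ?`, rigId);
}
```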
… handlers (#419)

- All E2E tests updated from /api/rigs/:rigId to /api/towns/:townId/rigs/:rigId
- handleCreateBead: passes params.rigId as rig_id to createBead
- handleRegisterAgent: passes params.rigId as rig_id to registerAgent
- handleCreateEscalation: passes params.rigId as rig_id to createBead
- Test 29: fixed jq parsing for the tRPC fallback; use BASE_URL, not WRANGLER_URL

All 29 E2E tests pass.
| { "tag": "v2", "new_sqlite_classes": ["TownContainerDO"] }, | ||
| { "tag": "v3", "new_sqlite_classes": ["MayorDO"] }, | ||
| { "tag": "v4", "new_sqlite_classes": ["TownDO"] }, | ||
| { "tag": "v5", "new_sqlite_classes": ["AgentDO"], "deleted_classes": ["RigDO", "MayorDO"] }, |
[WARNING]: Destructive migration — deleted_classes: ["RigDO", "MayorDO"] permanently destroys all data stored in existing RigDO and MayorDO Durable Object instances.
If there is any production data in these DOs (rig configs, agent state, mayor sessions), it will be irrecoverably lost when this migration runs. Ensure either:

- All data has been migrated to TownDO before deploying, or
- This is acceptable because the system is not yet in production with real user data.
if (init?.body && typeof init.body === 'string') {
  try {
    const bodyKeys = Object.keys(JSON.parse(init.body));
    console.log(`${CLIENT_LOG} ${method} ${path} bodyKeys=[${bodyKeys.join(',')}]`);
[SUGGESTION]: Debug logging of request body keys on every API call — this console.log fires on every POST/PUT/PATCH to the gastown service from the Next.js server. Consider gating behind a debug flag or removing before merge to reduce production log noise.
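A sketch of gating the logging behind an env flag instead of removing it outright (the flag name is illustrative):

```ts
// Sketch: keep the body-key logging available for debugging without emitting
// it on every production request.
function logBodyKeys(method: string, path: string, body: unknown): void {
  if (process.env.GASTOWN_DEBUG !== '1') return;
  if (typeof body !== 'string') return;
  try {
    const keys = Object.keys(JSON.parse(body));
    console.log(`[gastown-client] ${method} ${path} bodyKeys=[${keys.join(',')}]`);
  } catch {
    // body was not JSON; nothing to log
  }
}
```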
Summary
Complete town-centric refactor (#419). Consolidates all control-plane data into TownDO, replaces subprocess-based agent management with the SDK, establishes WebSocket streaming from agent to browser, and implements proactive startup + config-on-request.
Full plan: plans/gastown-town-centric-refactor.md

What Changed
Phase A: Data Consolidation (TownDO + AgentDO)
New TownDO — All control-plane data in one DO, organized into 7 sub-modules:

- town/beads.ts
- town/agents.ts
- town/mail.ts
- town/review-queue.ts
- town/config.ts
- town/rigs.ts
- town/container-dispatch.ts

New AgentDO — Per-agent event storage (isolates high-volume events from TownDO's 10GB budget)
Deleted: Rig.do.ts (2378 lines), Mayor.do.ts (453 lines)

All 16 handler files rerouted from getRigDOStub() / getMayorDOStub() to getTownDOStub()

Phase B: SDK-Based Agent Management
Replaced subprocess kilo serve management with @kilocode/sdk:

- createOpencode() replaces Bun.spawn('kilo serve')
- client.session.create/prompt() replaces the hand-rolled HTTP client
- client.event.subscribe() replaces the SSE text parser + ring buffers

Deleted:
kilo-server.ts (273 lines), kilo-client.ts (102 lines), sse-consumer.ts (219 lines)

Phase C: WebSocket Streaming

- Container /ws endpoint broadcasts all SDK agent events via event sinks
- TownContainerDO connects to the container's /ws and relays frames to browser clients
- Per-agent and wildcard subscriptions via { type: 'subscribe', agentId } messages (see the sketch below)
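A minimal browser-side sketch of that subscribe protocol. The endpoint path and the subscribe frame follow the description above; any other field names are assumptions.

```ts
// Sketch: open the stream and subscribe to a single agent's events.
function streamAgentEvents(
  wsBaseUrl: string, // e.g. wss://<worker-host>
  agentId: string,
  onEvent: (frame: unknown) => void,
): WebSocket {
  const ws = new WebSocket(`${wsBaseUrl}/ws`);
  ws.addEventListener('open', () => {
    // Omit agentId (or send a wildcard) to receive events for every agent.
    ws.send(JSON.stringify({ type: 'subscribe', agentId }));
  });
  ws.addEventListener('message', (msg) => {
    onEvent(JSON.parse(String(msg.data)));
  });
  return ws;
}
```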
Phase D: Proactive Startup + Config-on-Request

- ensureContainerReady() runs on each alarm tick
- X-Town-Config header (sent by TownDO on every fetch())
- envVars reduced to infra-only URLs; user config comes per-request

Tests + Typecheck

- env.RIG → env.TOWN

Wrangler Changes

- AGENT binding for AgentDO (added)
- RIG and MAYOR bindings (removed)
- new_sqlite_classes: ["AgentDO"], deleted_classes: ["RigDO", "MayorDO"]

Net Impact
Open Questions
Agent→Rig association: Agents don't have a rig_id column yet. schedulePendingWork() falls back to the first rig. Needs a schema migration for multi-rig support. See the // OPEN QUESTION comments in Town.do.ts.

Rig-scoped queries: listBeads / listAgents return all beads/agents in the town. Multi-rig filtering needs a rig_id column on rig_beads and rig_agents.

Mayor rig config: Mayor uses the first rig's config for git URL and credentials. Needs a better strategy for multi-rig towns.
Browser WebSocket client: AgentStream.tsx still uses the old ticket-based WebSocket URL. It needs updating to use the new /ws?agentId=X endpoint and subscribe protocol. The tRPC getAgentStreamUrl procedure also needs updating.

Closes #419