Town-Centric Refactor: Consolidate control plane into TownDO, SDK-based agents, WebSocket streaming #419

@jrf0110

Overview

Major refactoring of the gastown architecture to simplify the DO topology, replace subprocess-based agent management with the SDK, and establish a clean WebSocket event pipeline from agent to browser.

Full plan: plans/gastown-town-centric-refactor.md

Motivation

  1. Data is fragmented across DOs. Agent state, beads, mail, and review queues live in the Rig DO. Convoys/escalations live in the Town DO. Mayor state lives in a separate Mayor DO. The Town DO has no complete picture.
  2. Too many indirection layers for streaming. Events flow through 6 hops with SSE parsing, ring buffers, and HTTP polling in the middle.
  3. Spawning kilo serve as a subprocess is unnecessary. The @kilocode/sdk provides createOpencode() for in-process server lifecycle and event.subscribe() for typed event streams.
  4. Config goes stale. Container env vars are injected once at boot and never refreshed. Model/token changes require a container restart.
  5. Model selection is threaded through 6 layers. It should be configured at the town level and resolved at dispatch time.

Target Architecture

Three DOs total:

  • TownDO — All control-plane data (rigs, agents, beads, mail, review queues, convoys, escalations, config). Single alarm for scheduling, health, and review processing.
  • AgentDO (one per agent) — Event storage only. Isolates high-volume event streams from the 10GB Town DO budget.
  • TownContainerDO — Container lifecycle, WebSocket relay.

Rig DO and Mayor DO are eliminated. The mayor becomes a regular agent row. Rigs become rows in a Town DO table.

SDK replaces subprocess: createOpencode() replaces Bun.spawn('kilo serve'). client.event.subscribe() replaces SSE parsing + ring buffers.

WebSocket all the way: SDK events → container WS → TownContainerDO WS → browser. No SSE, no polling, no tickets.
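
To make the multiplexing concrete, here is a minimal sketch of a per-agent envelope and the routing a relay might do with it. The `AgentEnvelope` shape and the names `encodeEnvelope`, `decodeEnvelope`, and `routeFrame` are assumptions for illustration, not taken from the plan. The SDK event payload is treated as opaque.

```typescript
// Hypothetical envelope for carrying many agents' events over one WebSocket.
// Field names (agentId, seq, event) are assumptions, not part of the plan.
interface AgentEnvelope {
  agentId: string;
  seq: number; // per-agent, monotonically increasing
  event: unknown; // opaque SDK event payload
}

type EnvelopeHandler = (env: AgentEnvelope) => void;

function encodeEnvelope(env: AgentEnvelope): string {
  return JSON.stringify(env);
}

// Returns null for frames that are not valid envelopes instead of throwing,
// so one bad frame cannot take down the relay loop.
function decodeEnvelope(raw: string): AgentEnvelope | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const candidate = parsed as Record<string, unknown>;
  if (typeof candidate.agentId !== "string" || typeof candidate.seq !== "number") {
    return null;
  }
  return { agentId: candidate.agentId, seq: candidate.seq, event: candidate.event };
}

// Relay-side routing: deliver a frame only to handlers subscribed to that agent.
// Returns the number of handlers that received the frame.
function routeFrame(raw: string, subscribers: Map<string, EnvelopeHandler[]>): number {
  const env = decodeEnvelope(raw);
  if (env === null) return 0;
  const handlers = subscribers.get(env.agentId) ?? [];
  for (const handler of handlers) handler(env);
  return handlers.length;
}
```

Carrying a per-agent `seq` in the envelope is one way to let the browser detect gaps after a reconnect; the real wire format may differ.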

Config-on-request: TownDO attaches current resolved config to every container fetch() via X-Town-Config header. Zero staleness.
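
A rough sketch of how the header round trip could work, assuming the resolved config is JSON-serializable. The `TownConfig` shape and the helper names are hypothetical; only the `X-Town-Config` header name comes from this issue. The value is URI-encoded so it stays ASCII-safe as a header.

```typescript
// Hypothetical config shape; the real resolved config is defined by the plan.
interface TownConfig {
  model: string;
  [key: string]: unknown;
}

const TOWN_CONFIG_HEADER = "X-Town-Config";

// Serialize the config into an ASCII-only header value.
function encodeTownConfig(config: TownConfig): string {
  return encodeURIComponent(JSON.stringify(config));
}

// Parse the header on the container side. Returns null when the header is
// missing or malformed so the caller can decide how to fall back.
function decodeTownConfig(headerValue: string | null | undefined): TownConfig | null {
  if (!headerValue) return null;
  try {
    const parsed = JSON.parse(decodeURIComponent(headerValue));
    if (typeof parsed === "object" && parsed !== null && typeof parsed.model === "string") {
      return parsed as TownConfig;
    }
    return null;
  } catch {
    return null;
  }
}

// Copy the caller's headers and attach the current config (TownDO side).
function withTownConfig(
  headers: Record<string, string>,
  config: TownConfig,
): Record<string, string> {
  return { ...headers, [TOWN_CONFIG_HEADER]: encodeTownConfig(config) };
}
```

Because the header is rebuilt on every request, the container never needs to cache config across calls. Header size limits are worth checking if the config grows.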

Implementation Phases

Phase A: Data Consolidation

  • A1 — Add all tables to Town DO + create AgentDO
  • A2 — Route all handlers through Town DO + delete Rig DO and Mayor DO

Phase B: SDK-Based Agent Management

  • B1 — Replace kilo-server.ts with SDK createOpencode()
  • B2 — Replace SSE consumer with SDK event.subscribe()
  • B3 — Replace kilo-client.ts with SDK client

Phase C: WebSocket Streaming

  • C1 — Container WebSocket endpoint (multiplexed /ws)
  • C2 — TownContainerDO WebSocket relay
  • C3 — Browser WebSocket client

Phase D: Proactive Startup & Config Cleanup

  • D1 — Proactive container + mayor startup on town creation
  • D2 — Config at rest + config-on-request (eliminate model pass-through, stale injection)
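
For D2, a minimal sketch of resolving the model at dispatch time rather than passing it through each layer. `TownSettings`, `TownConfigStore`, and `buildDispatch` are hypothetical names used only to illustrate the idea; the actual storage would live in TownDO.

```typescript
// Hypothetical town-level settings; real config has more fields.
interface TownSettings {
  defaultModel: string;
}

interface DispatchRequest {
  agentId: string;
  prompt: string;
  model: string;
}

// Stand-in for config at rest in TownDO.
class TownConfigStore {
  private settings: TownSettings;

  constructor(initial: TownSettings) {
    this.settings = initial;
  }

  get(): TownSettings {
    return this.settings;
  }

  update(patch: Partial<TownSettings>): void {
    this.settings = { ...this.settings, ...patch };
  }
}

// The model is read from the store at the moment of dispatch, so callers
// never carry a model value of their own.
function buildDispatch(store: TownConfigStore, agentId: string, prompt: string): DispatchRequest {
  return { agentId, prompt, model: store.get().defaultModel };
}
```

With this shape, a model change takes effect on the next dispatch without a container restart.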

Files Deleted

  • container/src/kilo-server.ts
  • container/src/kilo-client.ts
  • container/src/sse-consumer.ts
  • src/dos/Rig.do.ts
  • src/dos/Mayor.do.ts

Risk Mitigation

  • Data migration: New towns start fresh. Existing towns get a migration alarm.
  • WebSocket reliability: Browser reconnects with backfill from AgentDO.
  • SDK compatibility: Verify createOpencode() works in Bun container.
  • DO storage: Agent events isolated in per-agent AgentDOs (10GB each). Town DO stores only bounded control-plane data.
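
The reconnect-with-backfill mitigation could look roughly like this, assuming each stored event carries a per-agent sequence number. `StoredEvent` and `mergeBackfill` are hypothetical names, and the sequence field is an assumption about how AgentDO would index events.

```typescript
// Hypothetical stored event; seq is assumed to be per-agent and increasing.
interface StoredEvent {
  seq: number;
  payload: unknown;
}

// After a reconnect, combine events backfilled from storage with events that
// arrived live while the backfill was in flight. Events at or below
// lastSeenSeq are dropped, duplicates are collapsed, and the result is
// returned in sequence order.
function mergeBackfill(
  lastSeenSeq: number,
  backfill: StoredEvent[],
  live: StoredEvent[],
): StoredEvent[] {
  const bySeq = new Map<number, StoredEvent>();
  for (const event of [...backfill, ...live]) {
    if (event.seq > lastSeenSeq && !bySeq.has(event.seq)) {
      bySeq.set(event.seq, event);
    }
  }
  return Array.from(bySeq.values()).sort((a, b) => a.seq - b.seq);
}
```

The browser would track the highest `seq` it has rendered and send it when requesting backfill, so overlap between the backfill and the live stream is harmless.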

Parent: #204

Labels: enhancement, kilo-auto-fix, kilo-triaged
