Research: pluggable non-Docker agent runtime (decouple agent from container) #1206

@vybe

Summary

Investigate introducing a second agent runtime kind so an "agent" is no longer necessarily a Docker container running Claude Code. Today every agent is a container (Architectural Invariant #11, "Docker as Source of Truth"). A native, in-process runtime would let purpose-built node types execute custom backend logic — calling the Anthropic SDK directly only for the judgment parts — while reusing the existing agent UX (listing, sharing, access control, channels, operator queue) without per-agent container overhead.

Context

Came out of a design discussion on multi-tenant program management (linked below as the first consumer). For workloads that are mostly deterministic orchestration with occasional LLM judgment, a full Claude Code container per agent is heavy, costly at scale, and drags in slot/heartbeat/stdout-race machinery the work doesn't need. A runtime seam would concentrate that complexity in one tested engine and open the door to future native node types (deterministic routers, webhook processors, aggregators).

The principle that keeps it sane: Trinity owns identity + state + scheduling + delivery; the runtime owns judgment. A native runtime stores its state in the DB and survives reset/redeploy/deletion.
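The split described above can be sketched in a few lines. This is only an illustration under assumptions: the table name `native_agent_state`, the function `run_native_step`, and the `judge` callable (a stand-in for a direct Anthropic SDK call) are hypothetical and not taken from the Trinity codebase. The platform side loads and persists state in the DB; the runtime contributes only the judgment.

```python
import sqlite3
from typing import Callable


def run_native_step(
    db: sqlite3.Connection,
    agent_id: str,
    event: str,
    judge: Callable[[str], str],
) -> str:
    """Handle one event for a native agent (hypothetical sketch).

    State is read from and written back to the DB on every step, so nothing
    is lost when the process restarts. `judge` stands in for the LLM call.
    """
    # Hypothetical table; a real schema would live in a migration.
    db.execute(
        "CREATE TABLE IF NOT EXISTS native_agent_state ("
        "agent_id TEXT PRIMARY KEY, history TEXT NOT NULL)"
    )
    row = db.execute(
        "SELECT history FROM native_agent_state WHERE agent_id = ?",
        (agent_id,),
    ).fetchone()
    history = row[0] if row else ""

    # The only non-deterministic part: ask the runtime for a judgment.
    verdict = judge(history + event)

    # Persist the updated state before returning.
    db.execute(
        "INSERT OR REPLACE INTO native_agent_state (agent_id, history) "
        "VALUES (?, ?)",
        (agent_id, history + event + "\n"),
    )
    db.commit()
    return verdict
```

Because the state lives in a DB row keyed by agent ID rather than in a container filesystem, a second call after a restart sees the history written by the first.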

Acceptance Criteria (research deliverables)

  • Decision doc: runtime interface (execute / start / stop / status / capabilities) and where it slots behind task_execution_service / agent_client
  • runtime_kind model: schema change to agent_ownership (claude_code | native), default claude_code; impact on list_all_agents_fast (must union Docker-labeled agents + DB-native rows)
  • Capabilities descriptor design: how a runtime declares supported tabs/features so the frontend renders a subset ("limited-form" tabs — e.g. native hides Files/SSH/Git/Terminal/Sessions/Loops)
  • Audit of container-assuming code sites (docker_service usage, lifecycle, logs/stats/files/ssh/git) and how each is guarded on runtime_kind or routed through the runtime interface
  • Invariant #11 reconciliation: document that native agents live in the DB, not Docker; propose updated invariant wording
  • Recommendation: build vs. defer, scope estimate, risks/blast radius
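As a starting point for the decision doc, the runtime interface, capabilities descriptor, and unioned agent listing could look roughly like the sketch below. The method names (execute / start / stop / status / capabilities), the `runtime_kind` values, and the hidden tab names come from the criteria above. Everything else is an assumption for illustration: the class names, the "Sharing" and "Channels" tab names, the `list_agents` helper, and the rule that a Docker-labeled agent wins a name collision.

```python
from dataclasses import dataclass
from typing import Protocol

# Files/SSH/Git/Terminal/Sessions/Loops are named in the issue;
# "Sharing" and "Channels" are hypothetical tab names.
ALL_TABS = ("Sharing", "Channels", "Files", "SSH", "Git",
            "Terminal", "Sessions", "Loops")


@dataclass(frozen=True)
class RuntimeCapabilities:
    runtime_kind: str          # "claude_code" or "native"
    tabs: frozenset[str]       # tabs the frontend may render


class AgentRuntime(Protocol):
    def execute(self, agent_id: str, task: str) -> str: ...
    def start(self, agent_id: str) -> None: ...
    def stop(self, agent_id: str) -> None: ...
    def status(self, agent_id: str) -> str: ...
    def capabilities(self) -> RuntimeCapabilities: ...


class NativeRuntime:
    """In-process runtime: no container, so container-only tabs are hidden."""

    def __init__(self) -> None:
        self._running: set[str] = set()

    def execute(self, agent_id: str, task: str) -> str:
        return f"native:{task}"  # placeholder for custom backend logic

    def start(self, agent_id: str) -> None:
        self._running.add(agent_id)

    def stop(self, agent_id: str) -> None:
        self._running.discard(agent_id)

    def status(self, agent_id: str) -> str:
        return "running" if agent_id in self._running else "stopped"

    def capabilities(self) -> RuntimeCapabilities:
        return RuntimeCapabilities("native", frozenset({"Sharing", "Channels"}))


def visible_tabs(runtime: AgentRuntime, all_tabs=ALL_TABS) -> list[str]:
    """Return the subset of tabs the runtime supports, in UI order."""
    supported = runtime.capabilities().tabs
    return [tab for tab in all_tabs if tab in supported]


def list_agents(docker_agents: list[dict], native_rows: list[dict]) -> list[dict]:
    """Union Docker-labeled agents with DB-native rows, tagged by runtime_kind."""
    merged = {a["name"]: {**a, "runtime_kind": "claude_code"} for a in docker_agents}
    for row in native_rows:
        # Assumed policy: keep the Docker entry if a name appears in both.
        merged.setdefault(row["name"], {**row, "runtime_kind": "native"})
    return sorted(merged.values(), key=lambda a: a["name"])
```

Callers such as task_execution_service would depend only on the `AgentRuntime` protocol, and the frontend would ask `visible_tabs` instead of branching on `runtime_kind` directly. Whether that indirection is worth it is one of the questions the research should answer.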

Technical Notes

Metadata

Labels

  • complexity-high: Complexity: high (board points 13)
  • priority-p2: Important
  • status-incubating: Idea under consideration (pre-Todo, not yet greenlit for development)
  • theme-infrastructure: Theme: Infrastructure
  • type-feature: New functionality
