Smart routing — CEO learns to route agents by cost, accuracy, and latency #207


Problem

Today all agents use the same model/runner regardless of task complexity. The Researcher gets the same model as the Archivist. There's no cost tracking per agent, no latency analysis, and no feedback loop from experiment outcomes to routing decisions.

Current state:

  • The model is set globally via an env var or the --runner flag; all agents get the same one
  • The Bob runner tracks invocation counts and duration, but the Claude runner tracks nothing
  • No token usage or cost attribution per agent role
  • No correlation between model choice and experiment success
  • FEEC strategy doesn't consider cost or time — just keyword-based priority

What's needed

The CEO should learn over time how to route work to minimize cost and latency while maintaining accuracy:

  1. Telemetry layer — track tokens, cost, latency, and outcome per (agent role, task type, model) tuple. Extend the existing events.jsonl infrastructure (see the sketch after this list)
  2. Routing statistics — build up a table: for each (role, hypothesis category), which model has the best success rate, cost, and latency?
  3. Smart routing decisions — before spawning an agent, the CEO (or a dedicated router) selects a model (sketched below) based on:
    • Task complexity (scope of hypothesis, number of files likely affected)
    • Budget remaining in cycle
    • Historical success rate for this role + task type + model combo
    • Latency constraints (is this blocking the critical path?)
  4. Graceful degradation — when approaching the budget ceiling, automatically downshift to cheaper models for non-critical roles (Archivist, Researcher on simple lookups)
  5. ACE integration — feed routing statistics into playbook evolution so routing preferences are learned cross-project
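
A minimal sketch of what items 1 and 2 could look like, assuming events.jsonl stays a newline-delimited JSON log. The record_invocation/routing_stats helpers, the field names, and the log path are illustrative, not existing APIs:

```python
import json
import time
from collections import defaultdict
from pathlib import Path

EVENTS_PATH = Path("events.jsonl")  # assumed location of the existing event log


def record_invocation(role: str, task_type: str, model: str,
                      tokens_in: int, tokens_out: int, cost_usd: float,
                      latency_s: float, outcome: str) -> None:
    """Append one routing-telemetry event to events.jsonl."""
    event = {
        "ts": time.time(),
        "kind": "agent_invocation",
        "role": role,              # e.g. "builder", "reviewer"
        "task_type": task_type,    # e.g. "simple_bugfix", "guard_check"
        "model": model,            # e.g. "haiku", "sonnet", "opus"
        "tokens_in": tokens_in,
        "tokens_out": tokens_out,
        "cost_usd": cost_usd,
        "latency_s": latency_s,
        "outcome": outcome,        # "success" | "failure"
    }
    with EVENTS_PATH.open("a") as f:
        f.write(json.dumps(event) + "\n")


def routing_stats() -> dict:
    """Aggregate events into per-(role, task_type, model) statistics."""
    if not EVENTS_PATH.exists():
        return {}
    totals = defaultdict(lambda: {"n": 0, "ok": 0, "cost": 0.0, "latency": 0.0})
    for line in EVENTS_PATH.read_text().splitlines():
        e = json.loads(line)
        if e.get("kind") != "agent_invocation":
            continue  # skip unrelated events already in the log
        t = totals[(e["role"], e["task_type"], e["model"])]
        t["n"] += 1
        t["ok"] += e["outcome"] == "success"
        t["cost"] += e["cost_usd"]
        t["latency"] += e["latency_s"]
    return {key: {"success_rate": t["ok"] / t["n"],
                  "avg_cost": t["cost"] / t["n"],
                  "avg_latency": t["latency"] / t["n"]}
            for key, t in totals.items()}
```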

This could be implemented as routing logic in runner.py, or as a dedicated Router agent that the CEO consults before each spawn.
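
For the selection step itself (items 3 and 4), a rough sketch of one possible decision rule: take the model with the best historical success rate among those clearing a bar, but downshift to the cheapest viable model near the budget ceiling. choose_model, MODEL_COST_RANK, and the 0.8/20% thresholds are all hypothetical:

```python
MODEL_COST_RANK = ["haiku", "sonnet", "opus"]  # assumed tiers, cheapest first


def choose_model(role: str, task_type: str, stats: dict,
                 budget_remaining: float, budget_total: float,
                 min_success_rate: float = 0.8) -> str:
    """Pick a model for (role, task_type) using routing_stats() output."""
    viable = [(m, stats[(role, task_type, m)])
              for m in MODEL_COST_RANK
              if (role, task_type, m) in stats
              and stats[(role, task_type, m)]["success_rate"] >= min_success_rate]
    if not viable:
        # Cold start: no history clears the bar, default to the strongest model.
        return MODEL_COST_RANK[-1]
    if budget_remaining / budget_total < 0.2:
        # Graceful degradation: near the ceiling, take the cheapest viable model.
        return min(viable, key=lambda mv: mv[1]["avg_cost"])[0]
    # Otherwise maximize success rate, breaking ties toward the cheaper model.
    return max(viable, key=lambda mv: (mv[1]["success_rate"], -mv[1]["avg_cost"]))[0]
```

A latency constraint could slot in the same way: filter viable on avg_latency when the task blocks the critical path.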

Example routing decisions

| Role | Task Type | Route To | Why |
| --- | --- | --- | --- |
| Builder | Simple bugfix | Haiku/Sonnet | Low complexity, high success rate on cheap models |
| Builder | Complex feature | Opus | Needs deep reasoning, worth the cost |
| Reviewer | Guard check | Opus | Precision matters, false negatives are expensive |
| Archivist | Record keeping | Haiku | Append-only, minimal reasoning needed |
| Researcher | Deep analysis | Opus | Thorough research pays off |
| Researcher | Quick lookup | Sonnet | Simple web search, fast turnaround |
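
Until enough telemetry accumulates, the table above could be seeded as a static fallback that the router consults before any learned statistics exist; the task-type keys below are illustrative:

```python
# Static fallback routes mirroring the examples above; consulted until the
# learned statistics have enough samples to override them.
DEFAULT_ROUTES = {
    ("builder", "simple_bugfix"): "haiku",   # or "sonnet"
    ("builder", "complex_feature"): "opus",
    ("reviewer", "guard_check"): "opus",
    ("archivist", "record_keeping"): "haiku",
    ("researcher", "deep_analysis"): "opus",
    ("researcher", "quick_lookup"): "sonnet",
}
```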
