MemFleet field report: the fleet fixes hold on a real two-machine deployment, plus two rough edges #36

@Regis-RCR

Ran MemFleet through a full end-to-end coordination exercise on a real two-machine deployment
(one engine, agents reaching it over a private network), driven entirely through the public
fleet_* MCP tools. Two things to report: the fleet fixes from the v0.6.17/v0.6.30 round hold
up under real multi-process use, and two smaller edges remain.

The earlier fleet fixes hold (confirmation)

The fleet issues fixed in v0.6.17 and re-confirmed in v0.6.30 all behave correctly on a current
deployment, exercised cross-process rather than in a single demo:

  • Lease lifecycle is solid across every transition I exercised: an exclusive claim never
    double-grants a second agent on the same scope (it queues, state=requested), renew extends
    it, a higher-priority request preempts a lower one, and releasing auto-grants the next queued
    waiter.
  • Already-running agents see each other in real time. A second client process, started
    separately, published an intent that the first process then saw live through its preflight,
    with the natural-language assignment attached. The audit attributed each action to its own
    session. That is the "refresh before every decision" fix working across processes.
  • Recording edits classifies conflicts as additive, overlap, or destructive. A destructive
    overlap raises a Class C escalation with a mediation request and a suggested partition, and
    resolving it propagates the right per-agent directive (the winner reads "proceed", the other
    reads "defer"). The durable audit trail captures every step with actor, agent, and timestamp.
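
For readers who want the lease transitions from the first bullet in concrete form, here is a minimal toy model in Python. It is a sketch of the observed behavior, not MemFleet's implementation: the class and method names are hypothetical, expiry and renew are left out, and two details are assumptions on my part (a preempted holder is dropped rather than re-queued, and waiters are granted highest priority first, FIFO among equals).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare leases by identity, not by field values
class Lease:
    agent: str
    priority: int
    state: str = "requested"  # requested | granted | preempted | released


@dataclass
class Scope:
    """Toy model of one exclusive scope (hypothetical, for illustration)."""
    holder: Optional[Lease] = None
    queue: List[Lease] = field(default_factory=list)

    def claim(self, agent: str, priority: int = 0) -> Lease:
        lease = Lease(agent, priority)
        if self.holder is None:
            lease.state = "granted"
            self.holder = lease
        elif priority > self.holder.priority:
            # Assumption: the preempted holder loses the lease outright.
            self.holder.state = "preempted"
            lease.state = "granted"
            self.holder = lease
        else:
            # Never double-grant: the request waits with state=requested.
            self.queue.append(lease)
        return lease

    def release(self, agent: str) -> Optional[Lease]:
        if self.holder is None or self.holder.agent != agent:
            raise ValueError(f"{agent} does not hold the lease")
        self.holder.state = "released"
        self.holder = None
        if self.queue:
            # max() returns the first maximal element, so ties stay FIFO.
            nxt = max(self.queue, key=lambda l: l.priority)
            self.queue.remove(nxt)
            nxt.state = "granted"
            self.holder = nxt
        return self.holder
```

In this model a second claim at equal priority comes back with state "requested", a higher-priority claim takes over immediately, and release hands the scope to the next waiter, which matches the transitions exercised in the report.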

27 of 28 coordination operations behaved as documented. Good shape. The one that did not, plus
one restart edge, are below.

Two rough edges

  • fleet_submit_verdict rejects an argument shape its sibling calls accept. It requires its
    verdict as a structured object and rejects the double-encoded JSON string, while
    fleet_publish_intent and fleet_record_episode explicitly accept that same double-encoded
    form (some MCP clients stringify object arguments). The practical effect: a client that
    stringifies object args can drive the human-resolution path (fleet_resolve_escalation,
    string args) but not the agent-mediator path (fleet_submit_verdict). Accepting the
    double-encoded string in verdict too would remove the asymmetry and make the agent-judge
    flow reachable from those clients.
  • A long-running headless deployment cannot be cleanly restarted (license gate). After the
    initial grace period, restarting the daemon exits with "no offline license installed", and
    under a KeepAlive supervisor it throttle-loops instead of coming back up. Running in an
    offline/headless configuration did not bypass it. So a deployment that has been up for a
    while cannot be restarted without intervention. Either let offline/headless mode start
    without the check, or document the requirement loudly before someone hits it in production. A
    side effect worth noting: a headless engine appears to run without a file watcher, so its
    graph likely goes stale unless something re-indexes it; a documented "keep a headless
    deployment fresh" path would help.
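
To make the first request concrete, this is roughly the tolerant decoding I have in mind for the verdict argument, sketched in Python. The function name and the example verdict fields are hypothetical, and I have not seen how fleet_publish_intent or fleet_record_episode implement their handling; the sketch only illustrates accepting either a structured object or a string-encoded one.

```python
import json
from typing import Any


def coerce_object_arg(value: Any, name: str = "verdict") -> dict:
    """Return a dict from a dict or from a JSON string that encodes one.

    Hypothetical helper: unwraps at most two layers of string encoding,
    which covers clients that stringify object arguments.
    """
    for _ in range(2):
        if not isinstance(value, str):
            break
        try:
            value = json.loads(value)
        except json.JSONDecodeError as exc:
            raise ValueError(f"{name}: not valid JSON: {exc}") from None
    if not isinstance(value, dict):
        raise ValueError(
            f"{name}: expected an object, got {type(value).__name__}"
        )
    return value


# Hypothetical verdict payload; the real field names may differ.
verdict = {"winner": "agent-a", "rationale": "owns the file"}
as_object = coerce_object_arg(verdict)
as_string = coerce_object_arg(json.dumps(verdict))
```

Both calls return the same dict, so a handler that runs its argument through something like this would treat the structured and stringified forms identically and still reject anything that does not decode to an object.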

(One already-open item, the opaque "bad intent" error on generic episode metadata, is tracked
separately and not repeated here.)

Net

The coordination model is well thought through, and the most load-bearing result is that the
prior fixes hold up under real cross-process, two-machine use: 27 of 28 operations behaved
exactly as documented. The two edges above are the agent-mediator verdict argument shape and
the headless restart/freshness story. Detail on any of these is available on request.
