
docs(exploration): substrate option — multi-agent text MUD (Evennia + N×LLM NPCs)#31

Merged
cyber-ayi merged 1 commit into main from ops/explore-evennia-multi-agent-mud on Jun 6, 2026

@cyber-ayi

Collaborator

Summary

New exploration note evaluating whether the ladder (adr-0007) should replace its deferred AI Town slot with a self-hosted text MUD (Evennia, possibly DikuMUD-area-fed) populated by N≥3 autonomous LLM-driven NPCs + one operator + one designated "main agent."

Surfaces three core tensions, gates resolution on three concrete spikes (S1/S2/S3) and an operator UX ack. Status remains open pending operator co-presence ack.

Why this is more than a no-op

Per the companion PR's zh (Chinese-language) prior-art supplement, GenerativeAgentsCN empirically validates N=25 LLM agents running on local Ollama + Qwen3-4B, which materially de-risks the local-cost question that previously made AI Town's slot effectively dormant. Combined with telnet/Evennia integration cost being lower than AI Town's Convex+TS stack and project-stack fit (Python+Ollama already in tree), Option B (the Evennia MUD substrate) strictly dominates AI Town on the dimensions AI Town was placed on the ladder for.

Why this isn't a P0/P1 replacement

  • Melvor P0 lock (adr-0003 / adr-0007) holds — multi-agent + non-determinism before the apparatus is proven violates the clean-room principle.
  • Stardew P1 is on the ladder for operator co-presence warmth (named NPCs, calendar, gifts, graphical) — Option B doesn't replicate that.

The three tensions

  1. Dyad-vs-multi-agent attention split: mitigated by a foreground/background architectural asymmetry (operator + main agent fully modeled; NPCs on small local models with short context, kept out of the main agent's reflection loop). This constraint is load-bearing: without it, Option B drifts into "another Generative Agents + Concordia."
  2. Setting choice: wuxia gives a 30-year Chinese-language (zh) asset trove but imports a "level up and fight monsters" grinding culture (升级打怪) that risks adr-0004; cannot resolve until S2 + operator setting ack.
  3. Determinism at this combination (multi-LLM × persistent world × seeded replay) is unprecedented: Paracosm / Miniverse come close on a subset, but neither is MUD-shaped. Either pay the cost (S3) or accept non-CI-replayable runs.
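
One way to read the foreground/background asymmetry in tension 1 is as a per-role tier that caps context and decides whether an agent's events ever reach the main agent's reflection step. The sketch below is illustrative only: the `Tier` fields, the token budgets, the role names, and the event shape are all assumptions, not taken from the exploration note.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    fully_modeled: bool        # does the main agent keep a full model of this party?
    max_context_tokens: int    # hypothetical budget, not a measured number
    in_reflection_loop: bool   # are this party's events fed into reflection?


# Hypothetical tiers: foreground is rich, background is deliberately thin.
FOREGROUND = Tier(fully_modeled=True, max_context_tokens=32_000, in_reflection_loop=True)
BACKGROUND = Tier(fully_modeled=False, max_context_tokens=2_000, in_reflection_loop=False)

# Assumed role names for the dyad (operator + main agent).
FOREGROUND_ROLES = frozenset({"operator", "main_agent"})


def tier_for(role: str) -> Tier:
    """Everything outside the dyad is treated as a background NPC."""
    return FOREGROUND if role in FOREGROUND_ROLES else BACKGROUND


def reflection_inputs(events: list[dict]) -> list[dict]:
    """Drop events authored by roles that sit outside the reflection loop."""
    return [e for e in events if tier_for(e["role"]).in_reflection_loop]
```

Under these assumptions, NPC chatter still exists in the world log but never competes with the operator for the main agent's reflective attention.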

Gating spikes (must all pass)

| ID | Spike | Output |
|---|---|---|
| S1 | Fork GenerativeAgentsCN; N=25 with Qwen3-4B on operator HW; measure tokens/tick. | cost-ceiling number |
| S2 | Import DikuMUD tbamud area into Evennia + skim wuxia-MUD lib (pkuxkx wiki / mudcore). | content-track decision |
| S3 | Single-node Ollama + fixed seed + greedy + logical tick + content-addressed LLM-response cache. Byte-equal replay of 50-tick scenario. | determinism feasibility |
| S0 (after S1–S3) | AgentScope vs ElizaOS-core comms bench-off. | runtime within Option B |
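
The S3 idea (logical ticks plus a content-addressed response cache, checked by byte-equal replay) can be sketched without a live model. Everything here is an assumption for illustration: `stub_generate` stands in for a seeded, greedy Ollama call, and the request shape, agent names, and log format are hypothetical.

```python
import hashlib
import json


class ResponseCache:
    """Stores LLM responses under the SHA-256 of the canonicalized request."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key(request: dict) -> str:
        # Sorted keys and fixed separators make the encoding order-independent.
        canonical = json.dumps(request, sort_keys=True, separators=(",", ":"),
                               ensure_ascii=False)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_call(self, request: dict, generate) -> str:
        k = self.key(request)
        if k not in self._store:
            self._store[k] = generate(request)
        return self._store[k]


def run_scenario(cache, generate, ticks=50, agents=("npc_a", "npc_b", "npc_c")) -> bytes:
    """Advance a logical clock; agents act in a fixed order on every tick."""
    log, last = [], ""
    for tick in range(ticks):
        for agent in agents:
            request = {"model": "stub", "seed": 0, "temperature": 0,
                       "prompt": f"tick={tick} agent={agent} prev={last}"}
            last = cache.get_or_call(request, generate)
            log.append(f"{tick}\t{agent}\t{last}")
    return "\n".join(log).encode("utf-8")


def stub_generate(request: dict) -> str:
    # Hypothetical stand-in for a seeded, greedy local model call.
    return "say:" + hashlib.sha256(request["prompt"].encode("utf-8")).hexdigest()[:8]


def no_llm(request: dict) -> str:
    raise RuntimeError("cache miss during replay")


cache = ResponseCache()
first = run_scenario(cache, stub_generate)   # records responses
replay = run_scenario(cache, no_llm)         # must be served entirely from cache
assert first == replay                       # byte-equal transcript
```

In this sketch the replay run never touches the model, so any nondeterminism in scheduling or prompt construction surfaces as a cache miss rather than as a silently different transcript.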

Companion PR

#30 — adds the zh supplement to exploration/prior-art-l2-l3.md that this note references via [[prior-art-l2-l3]]. The two PRs are intentionally separate: the prior-art update is independently useful regardless of how Option B resolves.

Test plan

  • Frontmatter validates (status: open, related-adrs cited, resolves-to blank).
  • All [[name]] links resolve to existing notes or are intentional forward references.
  • All external URLs render and are reachable.
  • Read the "Three core tensions" section critically — these are the load-bearing constraints; flag anything that feels weak.
  • harness-ci signal (path-filtered; no harness/ touched, but runs on every PR per recent change).
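
The `[[name]]` link check in the test plan could be automated along these lines. This is a rough sketch, assuming Obsidian-style wikilinks where an optional `|alias` or `#anchor` may follow the target; the function name and the forward-reference allowlist are hypothetical and not part of the repository.

```python
import re

# Captures the link target; an optional |alias or #anchor suffix is ignored.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:[|#][^\[\]]*)?\]\]")


def unresolved_links(text, known_notes, forward_refs=frozenset()):
    """Return link targets that are neither existing notes nor allowed forward refs."""
    targets = {m.group(1).strip() for m in WIKILINK.finditer(text)}
    return sorted(targets - set(known_notes) - set(forward_refs))


sample = "See [[prior-art-l2-l3]] and [[adr-0007|the ladder]]; later [[future-note]]."
missing = unresolved_links(sample, known_notes={"prior-art-l2-l3", "adr-0007"})
```

Passing intentional forward references explicitly keeps them visible in review instead of being silently tolerated.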

🤖 Generated with Claude Code

… N×LLM NPCs)

A new exploration evaluating whether the ladder (adr-0007) should replace its
deferred AI Town slot with a self-hosted Evennia + N≥3 autonomous LLM NPCs +
operator + main agent. Surfaces three core tensions, gates resolution on
three concrete spikes (S1/S2/S3) and an operator UX ack.

Why this isn't a no-op:
- Per the companion zh prior-art supplement, GenerativeAgentsCN empirically
  validates N=25 LLM agents on local Ollama + Qwen3-4B — materially de-risks
  the local-cost question that previously made AI Town's slot effectively
  dormant. Combined with telnet/Evennia integration cost being lower than
  AI Town's Convex+TS stack and project-stack fit (Python+Ollama already in
  tree), the option *strictly dominates* AI Town on the dimensions AI Town
  was placed on the ladder for.

Why this isn't proposed as a P0/P1 replacement:
- Melvor P0 lock (adr-0003/0007) holds — multi-agent + non-determinism
  before the apparatus is proven violates the clean-room principle.
- Stardew P1 is on the ladder for *operator co-presence warmth* (named
  NPCs, calendar, gifts, graphical) — Option B doesn't replicate that.

The three core tensions, recorded so future readers can interrogate them:
1. Dyad-vs-multi-agent attention split — load-bearing mitigation:
   foreground/background architectural asymmetry (operator + main agent
   fully-modeled; NPCs small local + short context + not in reflection loop).
2. Setting choice cuts across criteria — wuxia gives a zh asset trove but
   imports a skill-grinding culture that risks adr-0004; cannot resolve
   until S2 + operator ack.
3. Determinism at multi-LLM × persistent world × seeded replay is
   unprecedented — Paracosm/Miniverse close on a subset.

Gating spikes:
- S1: fork GenerativeAgentsCN, measure tokens/tick at N=25 on operator HW.
- S2: import a DikuMUD area into Evennia + skim wuxia-MUD lib for content
  decision input.
- S3: seeded local Ollama + logical tick + content-cached LLM responses;
  byte-equal replay of a 50-tick scripted scenario.
- S0 (after S1–S3): AgentScope vs ElizaOS-core comms bench.
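
For S1, the tokens/tick figure could be accumulated from per-call response metadata. The sketch below assumes each response dict carries Ollama-style `prompt_eval_count` and `eval_count` fields; the `TickMeter` class and the sample numbers are hypothetical and only show the bookkeeping.

```python
from collections import defaultdict
from statistics import mean


class TickMeter:
    """Sums prompt and completion tokens per logical tick."""

    def __init__(self):
        self._per_tick = defaultdict(int)

    def record(self, tick: int, response: dict) -> None:
        # Missing fields count as zero so a partial response does not crash the run.
        self._per_tick[tick] += (response.get("prompt_eval_count", 0)
                                 + response.get("eval_count", 0))

    def tokens_per_tick(self) -> dict:
        return dict(sorted(self._per_tick.items()))

    def summary(self) -> dict:
        values = list(self._per_tick.values())
        return {"ticks": len(values), "mean": mean(values), "max": max(values)}


meter = TickMeter()
for tick in range(2):
    for _ in range(25):  # N=25 agents, one call each per tick (made-up token counts)
        meter.record(tick, {"prompt_eval_count": 100, "eval_count": 20})
```

Reporting the maximum alongside the mean matters for a cost ceiling, since a single busy tick sets the worst case.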

Decision gates and out-of-scope set in note. Status stays `open` pending
operator co-presence UX ack — the blocking signal before any spike.

Sources include LIGHT (Meta 2019, closest historical MUD-AI precedent),
GenerativeAgentsCN, AgentScope, AgentVerse, Concordia, Paracosm,
Miniverse, vLLM reproducibility, DikuMUD `tbamud`, AI People (GoodAI,
closed commercial parallel), and the zh ecosystem links (pkuxkx wiki,
mudcore, mudchina, mud.ren 炎黄 MUD).

Companion PR adds the underlying zh supplement to exploration/prior-art-l2-l3.md.

Session-Id: 019e9e62-7e3f-7286-9de2-7b3bc7b9369d
Agent: cc-rc-bot
@cyber-ayi cyber-ayi merged commit bd19f54 into main Jun 6, 2026
1 check passed
@cyber-ayi cyber-ayi deleted the ops/explore-evennia-multi-agent-mud branch June 6, 2026 19:33