From 8c1ec75ebaaed125a1ddf957381fb928052708a5 Mon Sep 17 00:00:00 2001
From: cyber-ayi <259769279+cyber-ayi@users.noreply.github.com>
Date: Wed, 3 Jun 2026 16:24:19 -0700
Subject: [PATCH] docs(exploration): prior-art survey for L2 runtimes + L3 cognition
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Fills the T2/T4 context gap — the L2/L3 prior-art survey was only in chat,
never written to docs (grep found just scattered name-drops).

- exploration/prior-art-l2-l3.md — current (2026) community/research work for
  the harness runtime (L2: ElizaOS/OpenClaw/Hermes/Voyager/PIANO) and cognition
  layer (L3: Generative Agents, Letta/MemGPT/Mem0, Colas/MAGELLAN,
  needs/personality frameworks), mapped to our needs. Two tensions: optimization
  bias is everywhere (borrow goal-generation, drop the reward objective —
  adr-0004); the "non-optimal believable long-horizon" gap is confirmed = the
  project's contribution. Recommendations feed T2 (runtime: ElizaOS-core +
  OpenClaw-gateway + Hermes-fallback) and T4 (reuse Generative Agents
  memory/reflection + Letta/Mem0; goal-gen from Colas minus learning-progress;
  needs/identity from Sophia/needs-emergence).
- ROADMAP — harness-runtime gate now cites the lean; note added to Open
  explorations.

Session-Id: 019e8d56-605a-7b45-8ef0-21ee576aa7a9
Agent: cc-rc-bot
Co-authored-by: cyber-ayi <259769279+cyber-ayi@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context)
---
 ROADMAP.md                     |  3 +-
 exploration/prior-art-l2-l3.md | 93 ++++++++++++++++++++++++++++++++++
 2 files changed, 95 insertions(+), 1 deletion(-)
 create mode 100644 exploration/prior-art-l2-l3.md

diff --git a/ROADMAP.md b/ROADMAP.md
index 8ea1e81..df6b842 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -83,7 +83,7 @@ Each downstream task carries its own small decision-gate. None block T1.
 
 | Open decision | Surfaces in | Status |
 |---|---|---|
-| harness **runtime**: OpenClaw vs ElizaOS vs Hermes | T2 / T5 | candidates listed in `harness/README.md`, unchosen |
+| harness **runtime**: OpenClaw vs ElizaOS vs Hermes | T2 / T5 | unchosen; **lean = ElizaOS-core + OpenClaw-gateway + Hermes-fallback** per `exploration/prior-art-l2-l3.md` |
 | **IPC transport** TS↔Python (how JSON crosses the seam) | T5 | explicitly deferred by ADR-0006 |
 | Melvor **act path**: mod API vs CDP / headless-Chromium | T2 | `TASKS.md` requires choose-and-justify |
 | **salient push** channel: ntfy vs Discord | T3 | **converging → MVP: AstrBot over Discord, two-way** (transport-not-brain; `discord.py` fallback). See `exploration/gateway-selection.md` |
@@ -96,6 +96,7 @@ Evaluations still in motion live in `exploration/` and resolve into ADRs.
 |---|---|---|
 | `exploration/substrate-selection.md` | **resolved → adr-0007** | substrate ladder: Melvor P0, Stardew P1+ (dual-control staged to P1) |
 | `exploration/gateway-selection.md` | converging | operator↔agent channel; MVP = AstrBot over Discord, two-way (transport-not-brain; `discord.py` fallback) |
+| `exploration/prior-art-l2-l3.md` | converging | L2 runtimes + L3 cognition survey; feeds T2 (runtime) and T4 (drive design): reuse-vs-refuse, drop the optimization objective |
 
 ## Governance / ops track (parallel to the product)
 
diff --git a/exploration/prior-art-l2-l3.md b/exploration/prior-art-l2-l3.md
new file mode 100644
index 0000000..cc5ed96
--- /dev/null
+++ b/exploration/prior-art-l2-l3.md
@@ -0,0 +1,93 @@
+---
+topic: prior-art-l2-l3
+status: converging
+date: 2026-06-03
+related-adrs: [adr-0004, adr-0006]
+resolves-to:
+---
+
+# Prior art — L2 runtimes & L3 cognition (what to build on, what to refuse)
+
+> Status: **converging**. A survey of current (2026) community/research work for
+> the harness runtime (L2) and the cognition layer (L3), mapped to our needs.
+> Feeds T2 (runtime choice) and T4 (drive-layer design). Resolves into ADRs when
+> those are committed.
+
+## Question
+
+What existing L2 runtimes and L3 cognition systems do we stand on, and which
+carry biases (reward/optimization) we must **not** adopt given autotelic-not-reward
+(`adr-0004`)?
+
+## Criteria
+
+- **L2 harness:** runtime + perception↔action loop, local-model friendly, async
+  operator channel, tolerant of dual-control; **no built-in reward/curriculum bias**.
+- **L3 cognition:** autotelic-not-reward, identity + needs/drives + memory/reflection,
+  long-horizon *non-optimal believable* behavior.
+
+## L2 — harness / runtime / game-playing agents
+
+| Impl | What | Fits | Don't / risk |
+|---|---|---|---|
+| **ElizaOS** | TS agent runtime; plugin = actions/providers/**evaluators** loop; PostgreSQL memory; Worlds/Rooms; targets game NPCs/companions | best cognition-friendly harness; TS (adr-0006); "agent-in-world" precedent | heavily **Web3/crypto**-oriented now — use only the core loop |
+| **OpenClaw** | TS/Node/Electron; messaging-platform UI; **browser relay over CDP**; Ollama | browser/CDP + **operator async gateway** | heavy (Electron/sandboxes); CDP moot for SMAPI/MCP substrates |
+| **Hermes** (NousResearch) | provider-agnostic, **local-first** (Ollama/LM Studio) | local "fallback brain" (privacy routing, adr-0006 rule 6) | needs ≥64k ctx; no world precedent |
+| **Voyager / ODYSSEY / MindForge / Co-Voyager** | Minecraft open-world agents: **auto-curriculum + skill library** | reference architecture for open-world exploration | **skill-acquisition + curriculum = reward-shaped → conflicts with adr-0004**; reference only |
+| **PIANO** (Project Sid / Altera) | parallel multi-stream cognition; agents **generate own goals** from social motivation; 1000+ agents | good **L3** reference (multi-stream + self-generated goals) | civilization/multi-agent framing; not a single-dyad harness |
+
+## L3 — cognition / memory / autotelic motivation
+
+| Impl | What | Fits | Gap / risk |
+|---|---|---|---|
+| **Generative Agents** (Park 2023) | memory stream + **reflection** + planning | memory/reflection **paradigm reusable directly** | behavior still serves schedules/goals |
+| **Letta/MemGPT · Mem0** | tiered memory (core/recall/archival), auto promote/compress | **memory infrastructure off the shelf** | memory layer only; no drive/identity |
+| **autotelic line** (Colas; **MAGELLAN** ICML'25; "Beyond Utility" NeurIPS'25) | agents self-generate NL goals; MAGELLAN uses **learning-progress (LP)** to guide goal choice | goal-generation machinery = the selector's theoretical core | **LP is an intrinsic reward → still optimizing**; borrow the mechanism, **drop the optimization objective** |
+| **needs / personality / artificial life** (**Sophia** 2512.18202; "personality from **needs alone**"; evolving_personality; SPeCtrum) | personality/behavior **emerging from basic needs**; persistent identity | closest to the identity+needs engine | social-emergence framing; known **persona drift** + convergence to "average persona" |
+
+## Two tensions
+
+1. **Optimization bias is everywhere.** Open-world agents (Voyager/PIANO) and
+   "autotelic RL" (Colas/MAGELLAN learning-progress) ultimately **optimize**
+   (skills / curriculum / LP). Our stance ("purposeless, non-optimal, an end in
+   itself") is *more radical*. → borrow goal-generation machinery, **deliberately
+   discard the optimization objective**, or the drive layer collapses back into a
+   reward maximizer (`adr-0004`).
+2. **The gap is confirmed.** "Persistent personality/needs sustaining long-horizon
+   *non-optimal believable* behavior" is essentially unstudied — exactly the niche
+   `design/autotelic-drives.md` claims. The 2026 needs/artificial-life line is the
+   closest but still social-emergence-framed and drift-prone. **This unfilled niche
+   is the project's contribution.**
+
+## Recommendations
+
+- **L2 (don't build a runtime — adopt one):** **ElizaOS** core (cognition-friendly
+  loop + memory + world precedent, TS) — *strip the crypto*; **OpenClaw** for
+  browser/CDP + the operator async gateway when needed; **Hermes** as the local-first
+  fallback brain. Voyager/PIANO are **reference architectures only** (reward bias).
+- **L3 (self-build — it's the IP — but stand on giants):** reuse **Generative Agents**
+  memory+reflection + **Letta/Mem0** for storage; take goal-generation from
+  **Colas/MAGELLAN** but **cut the learning-progress reward**; take identity+needs
+  from the **needs-emergence/Sophia** line. Spend the budget on the unfilled niche:
+  long-horizon, non-optimal, believable inhabitation.
+
+## Per-task relevance
+
+- **T2 (Melvor adapter / runtime):** the L2 table + lean (ElizaOS-core / OpenClaw-
+  gateway / Hermes-fallback) is the input to the `harness runtime` decision-gate
+  (ROADMAP). Note: for Melvor/Stardew (mod/SMAPI/MCP adapters) OpenClaw's CDP edge
+  is less decisive.
+- **T4 (drive layer):** the L3 table + the two tensions are the design backdrop —
+  *what to reuse, what to refuse*. The "drop the optimization objective" rule is the
+  load-bearing constraint, alongside `adr-0004`.
+
+## Open items
+
+- Commit the L2 runtime choice → a `harness runtime` ADR (resolves the ROADMAP gate).
+- T4 will likely spawn its own exploration (selector design without an optimization
+  objective; memory/identity stack choice).
+
+## Sources
+
+- L2: [ElizaOS](https://www.elizaos.ai/) · [ElizaOS/OpenClaw/Hermes compared](https://innfactory.ai/en/blog/openclaw-vs-hermes-agent-comparison/) · [OpenClaw browser harness](https://openclawlaunch.com/guides/openclaw-browser-harness) · [Hermes Agent](https://github.com/nousresearch/hermes-agent) · [Voyager](https://voyager.minedojo.org/) · [ODYSSEY](https://openreview.net/pdf?id=vtGLtSxtqv) · [MindForge](https://arxiv.org/pdf/2411.12977) · [Project Sid / PIANO](https://arxiv.org/abs/2411.00114)
+- L3: [Generative Agents](https://arxiv.org/pdf/2304.03442) · [Letta/MemGPT vs Mem0](https://vectorize.io/articles/mem0-vs-letta) · [Augmenting Autotelic Agents w/ LLMs (Colas)](https://proceedings.mlr.press/v232/colas23a/colas23a.pdf) · [Colas publications (MAGELLAN)](https://cedriccolas.com/publications/) · [LLM Agents Beyond Utility](https://arxiv.org/abs/2510.14548) · [Sophia: Persistent Agent Framework for Artificial Life](https://arxiv.org/pdf/2512.18202) · [Personality from needs alone](https://www.eurekalert.org/news-releases/1099709) · [SPeCtrum identity](https://arxiv.org/pdf/2502.08599)