From 53dbd6eb26d369de2be2adfeda76fdec7e65c996 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Fri, 20 Mar 2026 05:45:43 +0000
Subject: [PATCH 1/7] Initial plan


From af66230c8a6456e62f4a69770d85e6b2d4c3446b Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Fri, 20 Mar 2026 05:49:45 +0000
Subject: [PATCH 2/7] =?UTF-8?q?docs:=20rewrite=20README=20with=20compellin?=
 =?UTF-8?q?g=20problem=20=E2=86=92=20solution=20narrative?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: dgenio <12731907+dgenio@users.noreply.github.com>
---
 README.md | 196 +++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 157 insertions(+), 39 deletions(-)
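
Reviewer note (not part of the commit): the README below describes stages 6
and 7 of the Context Engine as Jaccard-based deduplication followed by greedy
packing into a token budget. The sketch here illustrates that idea only. The
`Candidate` class, the word-set tokenisation, and the 0.8 threshold are
assumptions made for this note, not contextweaver's actual types or defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for a scored context item."""
    id: str
    text: str
    score: float
    tokens: int


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def deduplicate(items, threshold=0.8):
    """Keep the higher-scoring item of any near-duplicate pair (threshold is assumed)."""
    kept = []
    # Sort by score descending, then by id, so ties resolve deterministically.
    for item in sorted(items, key=lambda c: (-c.score, c.id)):
        if all(jaccard(item.text, k.text) < threshold for k in kept):
            kept.append(item)
    return kept


def select_and_pack(items, budget):
    """Greedily take the highest-scoring items that still fit the token budget."""
    selected, used = [], 0
    for item in sorted(items, key=lambda c: (-c.score, c.id)):
        if used + item.tokens <= budget:
            selected.append(item)
            used += item.tokens
    return selected, used


candidates = [
    Candidate("a", "count of users is 1042", 0.9, 40),
    Candidate("b", "count of users is 1042 today", 0.5, 45),
    Candidate("c", "invoice 17 is unpaid", 0.7, 50),
    Candidate("d", "weather is sunny", 0.2, 30),
]
unique = deduplicate(candidates)
packed, used = select_and_pack(unique, budget=100)
print([c.id for c in packed], used)  # ['a', 'c'] 90
```

Item `b` is dropped as a near-duplicate of `a` (similarity 5/6), and `d` is
skipped because it would push the total past the 100-token budget.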

diff --git a/README.md b/README.md
index ba8184c..752a867 100644
--- a/README.md
+++ b/README.md
@@ -1,33 +1,96 @@
 # contextweaver
 
-> Dynamic context management for tool-using AI agents.
+> Phase-specific, budget-aware context compilation for tool-using AI agents.
 
-contextweaver solves the **context window problem**: as tool catalogs grow and
-conversations accumulate history, naive concatenation blows past token limits.
-contextweaver provides **phase-specific budgeted context compilation**, a
-**context firewall** for large tool outputs, **result envelopes** with
-structured fact extraction, and **bounded-choice routing** over large tool
-catalogs via DAG + beam search.
+**536 tests passing · zero runtime dependencies · deterministic output · Python ≥ 3.10**
 
-## Features
+---
 
-- **Context Engine** — seven-stage pipeline that compiles a phase-aware,
-  budget-constrained prompt from the event log.
-- **Context Firewall** — intercepts large tool outputs, stores raw data
-  out-of-band, and injects compact summaries.
-- **Routing Engine** — navigates catalogs of 100+ tools via a bounded DAG
-  so the LLM only sees a focused shortlist.
-- **Protocol Adapters** — first-class adapters for MCP and A2A protocols.
-- **Zero Dependencies** — pure Python ≥ 3.10, stdlib only.
-- **Deterministic** — identical inputs always produce identical outputs.
+## The Problem
 
-## Installation
+Imagine a tool-using agent with a 100-tool catalog and a 50-turn conversation history.
+At each step the agent must answer four questions:
+
+1. **Route** — which tool should I call?
+2. **Call** — what arguments?
+3. **Interpret** — what did it return?
+4. **Answer** — how do I respond to the user?
+
+**Naive approach A — concatenate everything:**
+
+```
+100 tool schemas (≈50k tokens) + 50 turns (≈30k tokens) = 80k tokens
+Token limit: 8k → 10× overflow
+```
+
+**Naive approach B — cherry-pick manually:**
+
+```
+Pick 10 tools, last 5 turns → lose dependency chains
+Agent hallucinates tool calls, repeats questions, forgets context
+```
+
+**contextweaver approach — phase-specific budgeted compilation:**
+
+```
+Route phase:  5 tool cards (≈500 tokens), no full schemas
+Answer phase: 3 relevant turns + dependency closure (≈2k tokens)
+Result:       2.5k tokens, complete context, deterministic
+```
+
+See [`examples/before_after.py`](examples/before_after.py) for a runnable side-by-side comparison.
+
+---
+
+## How contextweaver Solves It
+
+contextweaver provides two cooperating engines:
+
+```
+                ┌────────────────────────────┐
+  Events ──────>│      Context Engine         │──> ContextPack (prompt)
+                │  candidates → closure →     │
+                │  sensitivity → firewall →   │
+                │  score → dedup → select →   │
+                │  render                     │
+                └────────────────────────────┘
+                           ▲ facts / episodes
+                ┌──────────┴─────────────────┐
+  Tools ───────>│      Routing Engine         │──> ChoiceCards
+                │  Catalog → TreeBuilder →    │
+                │  ChoiceGraph → Router       │
+                └────────────────────────────┘
+```
+
+**Context Engine** — eight-stage pipeline:
+
+1. **generate_candidates** — pull events from the log; inject episodic memory and facts.
+2. **dependency_closure** — if a selected item has a `parent_id`, include the parent automatically.
+3. **sensitivity_filter** — drop or redact items at or above the configured sensitivity floor.
+4. **apply_firewall** — large tool outputs are summarised; raw bytes move to `ArtifactStore`.
+5. **score_candidates** — rank by recency, tag match, kind priority, and token cost.
+6. **deduplicate_candidates** — remove near-duplicates using Jaccard similarity.
+7. **select_and_pack** — greedily pack highest-scoring items into the phase token budget.
+8. **render_context** — assemble final prompt string with `BuildStats` metadata.
+
+**Routing Engine** — four-stage pipeline:
+
+1. **Catalog** — register and manage `SelectableItem` objects.
+2. **TreeBuilder** — convert a flat catalog into a bounded `ChoiceGraph` DAG.
+3. **Router** — beam-search over the graph; deterministic tie-breaking by ID.
+4. **ChoiceCards** — compact, LLM-friendly cards that never include full schemas.
+
+---
+
+## Quickstart
+
+### Install
 
 ```bash
 pip install contextweaver
 ```
 
-Or install from source:
+Or from source:
 
 ```bash
 git clone https://github.com/dgenio/contextweaver.git
@@ -35,7 +98,7 @@ cd contextweaver
 pip install -e ".[dev]"
 ```
 
-## Quick start
+### Minimal agent loop
 
 ```python
 from contextweaver.context.manager import ContextManager
@@ -43,24 +106,25 @@ from contextweaver.types import ContextItem, ItemKind, Phase
 
 mgr = ContextManager()
 mgr.ingest(ContextItem(id="u1", kind=ItemKind.user_turn, text="How many users?"))
-mgr.ingest(ContextItem(id="tc1", kind=ItemKind.tool_call, text="db_query('SELECT COUNT(*) FROM users')", parent_id="u1"))
-mgr.ingest(ContextItem(id="tr1", kind=ItemKind.tool_result, text="count: 1042", parent_id="tc1"))
+mgr.ingest(ContextItem(id="tc1", kind=ItemKind.tool_call,
+                       text="db_query('SELECT COUNT(*) FROM users')", parent_id="u1"))
+mgr.ingest(ContextItem(id="tr1", kind=ItemKind.tool_result,
+                       text="count: 1042", parent_id="tc1"))
 
 pack = mgr.build_sync(phase=Phase.answer, query="user count")
-print(pack.prompt)       # budget-aware compiled context
-print(pack.stats)        # what was kept, dropped, deduplicated
+print(pack.prompt)   # budget-aware compiled context
+print(pack.stats)    # what was kept, dropped, deduplicated
 ```
 
-## Routing large tool catalogs
+### Route a large tool catalog
 
 ```python
 from contextweaver.routing.catalog import Catalog, load_catalog_json
 from contextweaver.routing.tree import TreeBuilder
 from contextweaver.routing.router import Router
 
-items = load_catalog_json("catalog.json")
 catalog = Catalog()
-for item in items:
+for item in load_catalog_json("catalog.json"):
     catalog.register(item)
 
 graph = TreeBuilder(max_children=10).build(catalog.all())
@@ -69,14 +133,58 @@ result = router.route("send a reminder email about unpaid invoices")
 print(result.candidate_ids)
 ```
 
+---
+
+## Framework Integrations
+
+| Framework | Guide | Use Case |
+|---|---|---|
+| MCP | [Guide](docs/integration_mcp.md) | Tool conversion, session loading, firewall |
+| A2A | [Guide](docs/integration_a2a.md) | Agent cards, multi-agent sessions |
+| LlamaIndex | Guide (coming soon) | RAG + tools with budget control |
+| OpenAI Agents SDK | Guide (coming soon) | Function-calling agents with routing |
+| Google ADK | Guide (coming soon) | Gemini tool-use with context budgets |
+| LangChain / LangGraph | Guide (coming soon) | Chain + graph agents with firewall |
+
+---
+
+## Why Trust contextweaver?
+
+| Proof point | Detail |
+|---|---|
+| **536 tests passing** | Context pipeline, routing engine, firewall, adapters, CLI, sensitivity enforcement |
+| **Zero runtime dependencies** | Stdlib-only, Python ≥ 3.10. Works with any LLM provider. No vendor lock-in. |
+| **Deterministic** | Tie-break by ID, sorted keys. Identical inputs always produce identical outputs. |
+| **Protocol-based stores** | `EventLog`, `ArtifactStore`, `EpisodicStore`, `FactStore` are `typing.Protocol` interfaces — swap any backend. |
+| **MCP + A2A adapters** | First-class support for both emerging agentic standards. |
+| **`BuildStats` transparency** | Every context build reports exactly what was kept, dropped, deduplicated, and why. |
+
+---
+
+## Core Concepts
+
+| Concept | Description |
+|---|---|
+| `ContextItem` | Atomic event log entry: user turn, agent message, tool call, tool result, fact, plan state. |
+| `Phase` | `route` / `call` / `interpret` / `answer` — each with its own token budget. |
+| `ContextFirewall` | Intercepts large tool outputs: stores raw bytes out-of-band, injects compact summary. |
+| `ChoiceGraph` | Bounded DAG over the tool catalog. Router beam-searches it; LLM sees only a focused shortlist. |
+| `ResultEnvelope` | Structured tool output: summary + extracted facts + artifact handles + views. |
+| `BuildStats` | Per-build diagnostics: candidate count, included/dropped counts, token usage, drop reasons. |
+
+See [`docs/concepts.md`](docs/concepts.md) for the full glossary and
+[`docs/architecture.md`](docs/architecture.md) for pipeline detail and design rationale.
+
+---
+
 ## CLI
 
 contextweaver ships with a CLI for quick experimentation:
 
 ```bash
-contextweaver demo                          # end-to-end demonstration
-contextweaver init                          # scaffold config + sample catalog
-contextweaver build --catalog c.json --out g.json  # build routing graph
+contextweaver demo                                    # end-to-end demonstration
+contextweaver init                                    # scaffold config + sample catalog
+contextweaver build --catalog c.json --out g.json    # build routing graph
 contextweaver route --graph g.json --query "send email"
 contextweaver print-tree --graph g.json
 contextweaver ingest --events session.jsonl --out session.json
@@ -94,18 +202,11 @@ contextweaver replay --session session.json --phase answer
 | `mcp_adapter_demo.py` | MCP adapter: tool conversion, session loading, firewall |
 | `a2a_adapter_demo.py` | A2A adapter: agent cards, multi-agent sessions |
 
-Run all examples:
-
 ```bash
-make example
+make example   # run all examples
 ```
 
-## Documentation
-
-- [Architecture](docs/architecture.md) — package layout, pipeline stages, design principles
-- [Concepts](docs/concepts.md) — ContextItem, phases, firewall, ChoiceGraph, etc.
-- [MCP Integration](docs/integration_mcp.md) — adapter functions, JSONL format, end-to-end example
-- [A2A Integration](docs/integration_a2a.md) — adapter functions, multi-agent sessions
+---
 
 ## Development
 
@@ -119,6 +220,23 @@ make demo     # run the built-in demo
 make ci       # all of the above
 ```
 
+See [CONTRIBUTING.md](CONTRIBUTING.md) for setup instructions.
+
+---
+
+## Roadmap
+
+| Milestone | Status | Highlights |
+|---|---|---|
+| **v0.1 — Foundation** | ✅ complete | Context Engine, Routing Engine, MCP + A2A adapters, CLI, sensitivity enforcement, logging |
+| **v0.2 — Integrations** | 🚧 in progress | Framework integration guides (LlamaIndex, OpenAI Agents SDK, Google ADK, LangChain) |
+| **v0.3 — Tooling** | 📋 planned | DAG visualization, merge compression, LLM-assisted labeler |
+| **Future** | 📋 planned | Context versioning, distributed stores, multi-agent coordination |
+
+See [CHANGELOG.md](CHANGELOG.md) for the detailed release history.
+
+---
+
 ## License
 
 Apache-2.0

From c896cb7bc4d04e3b68d5c846fbbc967284a85dbc Mon Sep 17 00:00:00 2001
From: Diogo Andre Passagem Santos <dandrsantos@corporativo.pt>
Date: Fri, 20 Mar 2026 06:32:59 +0000
Subject: [PATCH 3/7] docs: align stage-1 pipeline wording with implementation

---
 README.md            | 2 +-
 docs/architecture.md | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 752a867..e5b0c7d 100644
--- a/README.md
+++ b/README.md
@@ -64,7 +64,7 @@ contextweaver provides two cooperating engines:
 
 **Context Engine** — eight-stage pipeline:
 
-1. **generate_candidates** — pull events from the log; inject episodic memory and facts.
+1. **generate_candidates** — pull phase-relevant events from the log for this request.
 2. **dependency_closure** — if a selected item has a `parent_id`, include the parent automatically.
 3. **sensitivity_filter** — drop or redact items at or above the configured sensitivity floor.
 4. **apply_firewall** — large tool outputs are summarised; raw bytes move to `ArtifactStore`.
diff --git a/docs/architecture.md b/docs/architecture.md
index a6a2359..6c924fa 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -43,8 +43,8 @@ the "context window problem" for tool-using AI agents.
 The Context Engine compiles a phase-aware, budget-constrained prompt from
 the event log. The pipeline has eight stages:
 
-1. **generate_candidates** — pull events from the event log and inject
-   episodic memory and facts into the candidate pool.
+1. **generate_candidates** — pull phase-relevant events from the event log
+   into the initial candidate pool.
 2. **dependency_closure** — if a selected item has a `parent_id`, bring
    the parent along even if it scored lower.
 3. **sensitivity_filter** — drop or redact items whose `sensitivity`

From 599a17f96c0afb87e3bac81ca290003193b1216d Mon Sep 17 00:00:00 2001
From: Diogo Andre Passagem Santos <dandrsantos@corporativo.pt>
Date: Fri, 20 Mar 2026 06:35:27 +0000
Subject: [PATCH 4/7] docs: align firewall wording with implementation

---
 README.md            | 4 ++--
 docs/architecture.md | 6 +++---
 docs/concepts.md     | 7 ++++---
 3 files changed, 9 insertions(+), 8 deletions(-)
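
Reviewer note (not part of the commit): the wording in this patch says every
tool result is stored out-of-band and that only large outputs are
summarized or truncated. A minimal sketch of that behaviour follows. The
`InMemoryArtifactStore` class, the handle format, and the 200-character
limit are hypothetical; the real `ArtifactStore` and `Summarizer`
interfaces may differ.

```python
import hashlib


class InMemoryArtifactStore:
    """Hypothetical store for this note; not contextweaver's ArtifactStore."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Content-addressed handle keeps the example deterministic.
        handle = "artifact:" + hashlib.sha256(data).hexdigest()[:12]
        self._blobs[handle] = data
        return handle

    def get(self, handle: str) -> bytes:
        return self._blobs[handle]


def apply_firewall(text, store, max_chars=200):
    """Store raw text out-of-band and return (prompt-facing text, handle)."""
    handle = store.put(text.encode("utf-8"))  # every result is stored
    if len(text) <= max_chars:
        summary = text  # small outputs pass through unchanged
    else:
        summary = text[:max_chars].rstrip() + " [truncated]"
    return f"{summary} (raw: {handle})", handle


store = InMemoryArtifactStore()
small, h1 = apply_firewall("count: 1042", store)
large, h2 = apply_firewall("row " * 500, store)
print(small)
print(len(large), "chars in prompt vs", len(store.get(h2)), "bytes stored")
```

Both results end up in the store, but only the large one is shortened before
it reaches the prompt, which matches the distinction this patch draws.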

diff --git a/README.md b/README.md
index e5b0c7d..5259d30 100644
--- a/README.md
+++ b/README.md
@@ -67,7 +67,7 @@ contextweaver provides two cooperating engines:
 1. **generate_candidates** — pull phase-relevant events from the log for this request.
 2. **dependency_closure** — if a selected item has a `parent_id`, include the parent automatically.
 3. **sensitivity_filter** — drop or redact items at or above the configured sensitivity floor.
-4. **apply_firewall** — large tool outputs are summarised; raw bytes move to `ArtifactStore`.
+4. **apply_firewall** — tool results are stored out-of-band; large outputs are summarized/truncated before prompt assembly.
 5. **score_candidates** — rank by recency, tag match, kind priority, and token cost.
 6. **deduplicate_candidates** — remove near-duplicates using Jaccard similarity.
 7. **select_and_pack** — greedily pack highest-scoring items into the phase token budget.
@@ -167,7 +167,7 @@ print(result.candidate_ids)
 |---|---|
 | `ContextItem` | Atomic event log entry: user turn, agent message, tool call, tool result, fact, plan state. |
 | `Phase` | `route` / `call` / `interpret` / `answer` — each with its own token budget. |
-| `ContextFirewall` | Intercepts large tool outputs: stores raw bytes out-of-band, injects compact summary. |
+| `ContextFirewall` | Intercepts tool results: stores raw bytes out-of-band, injects compact summary (with truncation for large outputs). |
 | `ChoiceGraph` | Bounded DAG over the tool catalog. Router beam-searches it; LLM sees only a focused shortlist. |
 | `ResultEnvelope` | Structured tool output: summary + extracted facts + artifact handles + views. |
 | `BuildStats` | Per-build diagnostics: candidate count, included/dropped counts, token usage, drop reasons. |
diff --git a/docs/architecture.md b/docs/architecture.md
index 6c924fa..d49ee89 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -49,9 +49,9 @@ the event log. The pipeline has eight stages:
    the parent along even if it scored lower.
 3. **sensitivity_filter** — drop or redact items whose `sensitivity`
    level meets or exceeds `ContextPolicy.sensitivity_floor`.
-4. **apply_firewall** — large tool results (above threshold) are
-   summarised; the raw output is stored in the ArtifactStore and replaced
-   with a compact reference + summary.
+4. **apply_firewall** — tool results are stored out-of-band in the
+   ArtifactStore and replaced with summarized/truncated text for prompt
+   assembly.
 5. **score_candidates** — rank candidates by recency, tag match, kind
    priority, and token cost.
 6. **deduplicate_candidates** — remove near-duplicate items using Jaccard
diff --git a/docs/concepts.md b/docs/concepts.md
index 9731c12..e9337c9 100644
--- a/docs/concepts.md
+++ b/docs/concepts.md
@@ -46,9 +46,10 @@ Key fields: `id`, `kind`, `name`, `description`, `tags`, `namespace`,
 
 ## Context Firewall
 
-The context firewall prevents large tool outputs from consuming the
-entire token budget. When a tool result exceeds the configured threshold
-(default 2 000 characters), the firewall:
+The context firewall intercepts `tool_result` items before their raw
+output reaches the prompt, which keeps large tool outputs from
+consuming the entire token budget. For each intercepted item, the
+firewall:
 
 1. Stores the raw output in the `ArtifactStore`.
 2. Generates a compact summary using the `Summarizer`.

From 0e26d53dd2ab22f632b1c2851749aef281509b4d Mon Sep 17 00:00:00 2001
From: Diogo Andre Passagem Santos <dandrsantos@corporativo.pt>
Date: Fri, 20 Mar 2026 06:36:54 +0000
Subject: [PATCH 5/7] docs: make README test count resilient

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 5259d30..29458cb 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
 
 > Phase-specific, budget-aware context compilation for tool-using AI agents.
 
-**536 tests passing · zero runtime dependencies · deterministic output · Python ≥ 3.10**
+**500+ tests passing · zero runtime dependencies · deterministic output · Python ≥ 3.10**
 
 ---
 
@@ -152,7 +152,7 @@ print(result.candidate_ids)
 
 | Proof point | Detail |
 |---|---|
-| **536 tests passing** | Context pipeline, routing engine, firewall, adapters, CLI, sensitivity enforcement |
+| **500+ tests passing** | Context pipeline, routing engine, firewall, adapters, CLI, sensitivity enforcement |
 | **Zero runtime dependencies** | Stdlib-only, Python ≥ 3.10. Works with any LLM provider. No vendor lock-in. |
 | **Deterministic** | Tie-break by ID, sorted keys. Identical inputs always produce identical outputs. |
 | **Protocol-based stores** | `EventLog`, `ArtifactStore`, `EpisodicStore`, `FactStore` are `typing.Protocol` interfaces — swap any backend. |

From 2053c96178afd633343d3212c1bd170c063f77fc Mon Sep 17 00:00:00 2001
From: Diogo Andre Passagem Santos <dandrsantos@corporativo.pt>
Date: Fri, 20 Mar 2026 06:44:31 +0000
Subject: [PATCH 6/7] docs: fix architecture pipeline diagram order

---
 docs/architecture.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
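
Reviewer note (not part of the commit): the corrected diagram places
closure directly after candidate generation. As a rough illustration of
what `parent_id` closure means, the sketch below follows parent links until
no new ancestors appear. Whether the real stage is transitive, and the plain
`dict` used for the parent map, are assumptions made for this note.

```python
def dependency_closure(selected_ids, parents):
    """Return the selected ids plus every ancestor reachable via parent_id."""
    closed = set(selected_ids)
    stack = list(selected_ids)
    while stack:
        parent = parents.get(stack.pop())
        if parent is not None and parent not in closed:
            closed.add(parent)
            stack.append(parent)
    # Sorted output keeps the result deterministic.
    return sorted(closed)


# Hypothetical parent map mirroring the README quickstart ids.
parents = {"u1": None, "tc1": "u1", "tr1": "tc1", "u2": None}
print(dependency_closure(["tr1"], parents))  # ['tc1', 'tr1', 'u1']
```

Selecting only the tool result pulls in its tool call and the user turn that
triggered it, while the unrelated `u2` stays out.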

diff --git a/docs/architecture.md b/docs/architecture.md
index d49ee89..c2682d9 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -8,9 +8,10 @@ the "context window problem" for tool-using AI agents.
 ```
                ┌────────────────────────────┐
   Events ─────>│      Context Engine         │──> ContextPack (prompt)
-               │  candidates → score →       │
-               │  dedup → select → firewall  │
-               │  → prompt                   │
+               │  candidates → closure →     │
+               │  sensitivity → firewall →   │
+               │  score → dedup → select →   │
+               │  render                     │
                └────────────────────────────┘
                           ▲ facts / episodes
                ┌──────────┴─────────────────┐

From cd1727dd9431e308dce2c63b0535875a6b7e8a48 Mon Sep 17 00:00:00 2001
From: Diogo Andre Passagem Santos <dandrsantos@corporativo.pt>
Date: Fri, 20 Mar 2026 06:45:36 +0000
Subject: [PATCH 7/7] docs: clarify firewall fact extraction output

---
 docs/concepts.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/concepts.md b/docs/concepts.md
index e9337c9..96d53c2 100644
--- a/docs/concepts.md
+++ b/docs/concepts.md
@@ -53,7 +53,7 @@ large tool outputs from consuming the entire token budget. In practice:
 
 1. Stores the raw output in the `ArtifactStore`.
 2. Generates a compact summary using the `Summarizer`.
-3. Extracts structured facts for the `FactStore`.
+3. Extracts structured facts into the `ResultEnvelope`.
 4. Replaces the original item text with a summary + artifact reference.
 
 ## Result Envelope