diff --git a/README.md b/README.md
index 41c09d30..f0b67e0c 100644
--- a/README.md
+++ b/README.md
@@ -650,7 +650,7 @@ mcp_servers:
 memory:
   provider: agentmemory

-Verify with `curl http://localhost:3111/agentmemory/health`. Open http://localhost:3113 for the real-time viewer. For deeper 6-hook memory provider integration (pre-LLM context injection, turn capture, archived local memory removals, system prompt block), copy integrations/hermes from the agentmemory repo to ~/.hermes/plugins/agentmemory.
+Verify with `curl http://localhost:3111/agentmemory/health`. Open http://localhost:3113 for the real-time viewer. For deeper 7-hook memory provider integration (pre-LLM context injection, turn and tool-result capture, archived local memory removals, task completion, system prompt block), copy integrations/hermes from the agentmemory repo to ~/.hermes/plugins/agentmemory.
 ```

 Full guide: [`integrations/hermes/`](integrations/hermes/)
diff --git a/docs/todos/2026-06-19-issue-331-hermes-python-plugin-hooks/plan.md b/docs/todos/2026-06-19-issue-331-hermes-python-plugin-hooks/plan.md
new file mode 100644
index 00000000..3daec684
--- /dev/null
+++ b/docs/todos/2026-06-19-issue-331-hermes-python-plugin-hooks/plan.md
@@ -0,0 +1,326 @@
+# Hermes Python Plugin Hooks Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Expose Hermes Python plugin tool-level `post_tool_use` observations and a task-completion lifecycle hook without changing core REST, MCP, schema, auth, or persistence behavior.
+
+**Architecture:** Keep the change inside the Hermes adapter boundary.
`sync_turn()` will prefer explicit completed tool-use/tool-result pairs from `messages`, otherwise preserve the existing conversation fallback; a shared observation helper will keep session/project/cwd/timestamp/agent-id behavior consistent. `on_task_completed()` will emit the existing `task_completed` observe payload shape and be declared as a Hermes Python method-name hook.
+
+**Tech Stack:** Python Hermes plugin, TypeScript/Vitest subprocess tests, YAML manifest contract.
+
+---
+
+## Files
+
+- Modify: `integrations/hermes/__init__.py`
+  - Add helper functions for truncation, message content block walking, tool pair extraction, and shared observe emission.
+  - Extend `sync_turn()` to emit structured tool observations when explicit pairs exist.
+  - Add `on_task_completed(..., **kwargs)` and fallback `MemoryProvider` stub method.
+- Modify: `integrations/hermes/plugin.yaml`
+  - Add `on_task_completed` to the declared Hermes lifecycle hooks.
+- Modify: `integrations/hermes/README.md`
+  - Update the Hermes hook badge/count and behavior list from 6 to 7 lifecycle hooks.
+- Modify: `README.md`
+  - Update only the Hermes-specific deeper integration text from 6-hook to 7-hook; do not touch Codex/Factory hook counts.
+- Modify: `test/hermes-plugin.test.ts`
+  - Update expected hook list and manifest/implementation parity coverage.
+- Modify: `test/integration-plaintext-http.test.ts`
+  - Add Python subprocess tests for sync-turn extraction, fallback preservation, malformed input tolerance, agent-id propagation, and task completion payload shape.
+- Modify: `docs/todos/2026-06-19-issue-331-hermes-python-plugin-hooks/todo.md`
+  - Keep progress, review notes, and verification evidence current.
+
+## Task 1: Add Red Tests For Hermes Hook Behavior
+
+**Files:**
+- Modify: `test/integration-plaintext-http.test.ts`
+- Modify: `test/hermes-plugin.test.ts`
+
+- [x] **Step 1: Add a manifest contract expectation**
+
+In `test/hermes-plugin.test.ts`, add `on_task_completed` to `expectedHermesHooks`.
+
+```ts
+const expectedHermesHooks = [
+  "prefetch",
+  "sync_turn",
+  "on_session_end",
+  "on_pre_compress",
+  "on_memory_write",
+  "on_task_completed",
+  "system_prompt_block",
+];
+```
+
+- [x] **Step 2: Add Python behavior tests**
+
+Append focused tests under `describe("Hermes plaintext bearer guard", ...)` or split into a nearby `describe("Hermes hook payloads", ...)` in `test/integration-plaintext-http.test.ts`.
+
+Test cases:
+- `sync_turn extracts Anthropic-style tool_use/tool_result messages`
+- `sync_turn preserves conversation fallback when no completed tool pairs exist`
+- `sync_turn ignores malformed message history without throwing`
+- `sync_turn does not re-emit the same tool_use_id for cumulative histories`
+- `sync_turn truncates long string tool output with the Node-hook sentinel`
+- `sync_turn truncates large object tool output as serialized JSON with the Node-hook sentinel`
+- `on_task_completed emits task_completed observations`
+
+Use Python subprocess snippets that:
+- import `integrations/hermes/__init__.py` with `importlib.util`
+- initialize `AgentMemoryProvider`
+- monkeypatch `mod._api` before `provider.initialize(...)` so tests never make real startup network calls
+- monkeypatch `mod._api_bg` to capture observe payloads synchronously
+- assert request body fields instead of making network calls
+
+Expected Anthropic-style message fixture:
+
+```python
+messages = [
+    {
+        "role": "assistant",
+        "content": [
+            {
+                "type": "tool_use",
+                "id": "toolu_1",
+                "name": "Bash",
+                "input": {"command": "pwd"},
+            }
+        ],
+    },
+    {
+        "role": "user",
+        "content": [
+            {
+                "type": "tool_result",
+                "tool_use_id": "toolu_1",
+                "content": "repo\n",
+            }
+        ],
+    },
+]
+```
+
+Expected observation body:
+
+```python
+{
+    "hookType": "post_tool_use",
+    "sessionId": "session-331",
+    "project": "/tmp/project",
+    "cwd": "/tmp/project",
+    "data": {
+        "tool_name": "Bash",
+        "tool_input": {"command": "pwd"},
+        "tool_output": "repo\n",
+    },
+}
+```
+
+Expected task completion body:
+
+```python
+{
+    "hookType": "task_completed",
+    "sessionId": "session-331",
+    "data": {
+        "task_id": "task-1",
+        "task_subject": "Hermes hooks",
+        "task_description": "x" * 2000,
+        "teammate_name": "worker",
+        "team_name": "team",
+    },
+}
+```
+
+Expected long string truncation:
+
+```python
+assert calls[-1]["body"]["data"]["tool_output"] == "x" * 8000 + "\n[...truncated]"
+```
+
+Expected large object truncation:
+
+```python
+assert calls[-1]["body"]["data"]["tool_output"].endswith("...[truncated]")
+assert len(calls[-1]["body"]["data"]["tool_output"]) <= 8014
+```
+
+- [x] **Step 3: Run red tests**
+
+Run:
+
+```bash
+corepack pnpm exec vitest run test/hermes-plugin.test.ts test/integration-plaintext-http.test.ts --exclude test/integration.test.ts
+```
+
+Expected: FAIL because `on_task_completed` is not implemented/declared and `sync_turn()` still emits only `tool_name: "conversation"`.
+
+## Task 2: Implement Minimal Hermes Hook Support
+
+**Files:**
+- Modify: `integrations/hermes/__init__.py`
+- Modify: `integrations/hermes/plugin.yaml`
+- Modify: `integrations/hermes/README.md` if exact hook list text is stale
+
+- [x] **Step 1: Add helper methods in `AgentMemoryProvider`**
+
+Add private helpers near `_with_agent_id` / `sync_turn`:
+
+```python
+def _timestamp(self) -> str:
+    return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
+
+def _observe_bg(self, hook_type: str, data: dict, session_id: str | None = None) -> None:
+    _api_bg(self._base, "observe", self._with_agent_id({
+        "hookType": hook_type,
+        "sessionId": session_id or self._session_id,
+        "project": self._project,
+        "cwd": self._project,
+        "timestamp": self._timestamp(),
+        "data": data,
+    }), secret=self._secret)
+```
+
+Use real final names that match local style and keep comments minimal.
+
+- [x] **Step 2: Add conservative extraction helpers**
+
+Implement helpers that:
+- accept only a list of message dicts
+- walk `message["content"]` lists and single dicts
+- record calls where `type == "tool_use"` and `id`, `name` exist
+- record results where `type == "tool_result"` and `tool_use_id` exists
+- emit only completed pairs
+- keep a private per-provider set of emitted `(session_id, tool_use_id)` pairs so cumulative `messages` histories do not re-emit old tool observations
+- ignore malformed/unsupported content
+- truncate string/object outputs to 8000 characters using the same sentinel style as the Node hook
+
+- [x] **Step 3: Extend `sync_turn()`**
+
+Change `sync_turn()` so:
+- it reads `kwargs.get("messages", [])`
+- if completed tool observations exist, emits those observations and skips the conversation fallback
+- if all completed pairs were already emitted for that session, preserves the existing conversation fallback instead of duplicating old tool observations
+- otherwise emits the current conversation fallback data shape
+- it continues honoring
`kwargs.get("session_id", self._session_id)`
+
+- [x] **Step 4: Add `on_task_completed()`**
+
+Add fallback stub to the local `MemoryProvider` class:
+
+```python
+def on_task_completed(self, **kwargs: Any) -> None: pass
+```
+
+Add provider method:
+
+```python
+def on_task_completed(self, **kwargs: Any) -> None:
+    self._observe_bg("task_completed", {
+        "task_id": kwargs.get("task_id"),
+        "task_subject": kwargs.get("task_subject"),
+        "task_description": kwargs.get("task_description")[:2000] if isinstance(kwargs.get("task_description"), str) else "",
+        "teammate_name": kwargs.get("teammate_name"),
+        "team_name": kwargs.get("team_name"),
+    }, session_id=kwargs.get("session_id", self._session_id))
+```
+
+Keep the signature permissive so Hermes can pass extra context without breaking the plugin.
+
+- [x] **Step 5: Update manifest/docs**
+
+Add `on_task_completed` to `integrations/hermes/plugin.yaml`.
+
+Update exact Hermes hook-count surfaces:
+- `integrations/hermes/README.md` badge text/alt from `Hooks-6_lifecycle` / `6 lifecycle hooks` to 7.
+- `integrations/hermes/README.md` prompt text from `6-hook memory provider plugin` to 7.
+- `integrations/hermes/README.md` lifecycle bullet list for structured `sync_turn()` tool capture and new `on_task_completed()`.
+- `README.md` Hermes-specific deeper integration sentence from `6-hook` to 7.
+
+Do not update unrelated Codex/Factory hook counts.
+
+## Task 3: Green Tests, Cleanup, And Verification
+
+**Files:**
+- Verify all changed files.
+- Update `docs/todos/2026-06-19-issue-331-hermes-python-plugin-hooks/todo.md`.
+
+- [x] **Step 1: Run targeted green tests**
+
+Run:
+
+```bash
+corepack pnpm exec vitest run test/hermes-plugin.test.ts test/integration-plaintext-http.test.ts --exclude test/integration.test.ts
+```
+
+Expected: PASS.
+
+- [x] **Step 2: Run Python syntax check without source-tree pycache**
+
+Run:
+
+```bash
+PYTHONPYCACHEPREFIX=/tmp/agentmemory-pycache python3 -m py_compile integrations/hermes/__init__.py
+```
+
+Expected: exit 0.
+
+- [x] **Step 3: Run focused simplification pass**
+
+Use `$simple-code` on the touched Hermes/plugin/test surface only:
+- remove duplicate helper branches
+- keep explicit ID-pairing logic readable
+- preserve API, manifest, auth, URL, persistence, and observe contracts
+
+- [x] **Step 4: Run final verification**
+
+Run:
+
+```bash
+git diff --check
+corepack pnpm test
+semgrep scan --config p/default --error --metrics=off .
+```
+
+Expected:
+- no whitespace errors
+- non-integration suite passes
+- Semgrep exits 0
+
+Before any commit, stage only task-owned files and run:
+
+```bash
+gitleaks protect --staged --redact
+```
+
+Expected: exit 0.
+
+## Task 4: Local GitHub PR Preparation
+
+**Files:**
+- Task-owned changed files only.
+
+- [x] **Step 1: Run `$github-push-prepare` local branch-prep**
+
+Allowed by this `$github-feature-loop` invocation:
+- local task-owned cleanup
+- local staging/commit
+- narrow `git fetch origin main`
+- merging captured `origin/main` base into this branch if needed
+- post-base verification
+
+Not allowed without separate explicit current-turn approval:
+- `git push`
+- PR creation
+- PR merge
+- force push
+- destructive cleanup
+
+- [x] **Step 2: Handoff**
+
+Report:
+- task state path and this plan path
+- changed behavior and files
+- commits created
+- verification/security gate results
+- PR base SHA
+- whether push/PR creation did not run due to approval gate
diff --git a/docs/todos/2026-06-19-issue-331-hermes-python-plugin-hooks/todo.md b/docs/todos/2026-06-19-issue-331-hermes-python-plugin-hooks/todo.md
new file mode 100644
index 00000000..a33a0977
--- /dev/null
+++ b/docs/todos/2026-06-19-issue-331-hermes-python-plugin-hooks/todo.md
@@ -0,0 +1,232 @@
+# Issue 331 Hermes Python Plugin Hooks
+
+Task id:
`2026-06-19-issue-331-hermes-python-plugin-hooks`
+
+## Scope
+
+Handle GitHub issue #331 on branch
+`issue/331-hermes-python-plugin-hooks` in the isolated Codex worktree
+`/Users/A1538552/.codex/worktrees/6c2b/agentmemory`.
+
+## Sprint Contract
+
+Goal: validate and, if approved at the plugin-interface checkpoint, implement a
+surgical Hermes Python plugin enhancement so Hermes can emit tool-level
+`post_tool_use` observations from `sync_turn(..., messages=...)` and expose a
+structured task completion hook.
+
+Scope:
+- Target repository remote is only `origin`
+  (`https://github.com/wbugitlab1/agentmemory.git`).
+- Start ref: `origin/main` at
+  `67bb438b4158d74771ed285e06c9ac078985d603`.
+- Primary code surface: `integrations/hermes/__init__.py`.
+- Primary tests: Hermes plugin contract tests and direct Python plugin behavior
+  tests through existing Vitest harnesses.
+- Documentation or manifest changes only if needed to keep the Hermes plugin
+  contract truthful.
+
+Non-goals:
+- Do not target or push to `https://github.com/rohitg00/agentmemory/`.
+- Do not add new MCP tools, REST endpoints, persistence schemas, or external
+  dependencies.
+- Do not rewrite the Hermes provider, change auth behavior, or alter core
+  observe semantics.
+- Do not implement broad heuristic memory extraction beyond forwarding
+  structured hook-compatible observations.
+
+Acceptance criteria:
+- Issue legitimacy is validated with `$arena` before implementation.
+- A Human Checkpoint approves the exact plugin-interface boundary before
+  implementation changes.
+- Hermes `sync_turn` continues to emit the existing conversation fallback when
+  no tool result history is available.
+- Hermes extracts tool call/result pairs from supported `messages` shapes and
+  sends `post_tool_use` observations with `tool_name`, `tool_input`, and
+  `tool_output`.
+- Hermes exposes a task completion hook that sends `task_completed` observations
+  with a Claude hook-compatible data shape.
+- Focused tests cover message extraction, fallback behavior, session/project
+  fields, agent identity propagation, and task completion payloads.
+
+Intended verification:
+- Targeted Vitest tests for Hermes plugin behavior.
+- Python syntax compilation for `integrations/hermes/__init__.py`.
+- Repo-native build or the narrowest relevant build/type check if full build is
+  too expensive.
+- Required security gates for plugin/interface code changes: Semgrep and staged
+  Gitleaks before commit; OSV only if dependency or lockfile surfaces change.
+
+Known boundaries:
+- `upstream` remote exists locally but is out of scope and must not be targeted.
+- The change touches the Hermes Python plugin interface/host integration
+  contract, so implementation must stop at a Human Checkpoint first.
+- Existing issue #331 body is imported upstream text and is treated as
+  untrusted data until confirmed by local repo evidence.
+
+## Feature / Verification Matrix
+
+| Change | Verification method | Status | Evidence |
+| --- | --- | --- | --- |
+| Context confirmed | Local git and instruction inspection | Done | `AGENTS.md`, triage skill, `git status`, remotes, worktrees, and `origin/main` SHA inspected. |
+| Branch prepared | Git branch switch | Done | Created `issue/331-hermes-python-plugin-hooks` from `origin/main` `67bb438b4158d74771ed285e06c9ac078985d603`. |
+| Issue validity | `$arena` synthesis | Done | Three candidates and cross-judge agree issue is valid locally; base synthesis uses Candidate C with grafts from B and A. |
+| Plugin-interface checkpoint | Human approval | Done | User approved implementing the arena best solution via `$github-feature-loop`, including `sync_turn` tool extraction and `on_task_completed`. |
+| Hermes post-tool-use extraction | Focused tests | Done | Red tests failed against conversation-only behavior; targeted green run passed 30/30 after implementation.
|
+| Hermes task completion hook | Focused tests | Done | Red test failed with missing `on_task_completed`; targeted green run passed 30/30 after implementation. |
+| Docs hook-count truth | Targeted stale search | Done | Hermes-specific `6-hook`/`Hooks-6` references updated; remaining `6 lifecycle hooks` matches are Codex-specific and out of scope. |
+| Security and final checks | Repo-native commands | Done | Targeted tests passed 34/34; Python compile passed; `git diff --check` passed; staged Gitleaks passed before commit; final post-merge `corepack pnpm test` passed 212 files / 2951 tests; post-merge Semgrep passed with 0 findings. |
+| Local PR base prep | `$github-push-prepare` local workflow | Done | Fetched `origin/main`, merged base `be1b0094ff417d75b1bc6a32be3ae38c0ebca200` into the issue branch, resolved Hermes docs conflicts with Issue 335, and reran targeted/full/security verification. |
+
+## Subagent Ledger
+
+| Workstream | Scope | Edits allowed | Expected output | Result | Residual risk |
+| --- | --- | --- | --- | --- | --- |
+| Arena candidate A | Read-only issue validity and fix direction | No | Validity report, affected paths, tests, recommended boundary | Done | High-confidence valid; strongest compact source-evidence checklist. |
+| Arena candidate B | Read-only issue validity and fix direction | No | Validity report, affected paths, tests, recommended boundary | Done | High-confidence valid; strongest helper/test outline and issue-identity risk wording. |
+| Arena candidate C | Read-only issue validity and fix direction | No | Validity report, affected paths, tests, recommended boundary | Done | Medium-high confidence valid; selected as base for conservative explicit tool-use/result pairing. |
+| Arena judge | Read-only candidate scoring | No | Rubric scores and base recommendation | Done | Recommended Candidate C as base with corrections, grafts from B/A, and task-completion callback-name checkpoint.
|
+| Plan architecture reviewer | Read-only plan review | No | High/Medium scope/integration findings only | Done | Initial Medium findings on cumulative-history idempotency and README scope were fixed in `plan.md`; re-review returned ACCEPT. |
+| Plan test/verification reviewer | Read-only plan review | No | High/Medium test/verification findings only | Done | Initial Medium findings on `_api` stubbing, truncation tests, and README count/list updates were fixed in `plan.md`; re-review returned ACCEPT. |
+| Final security/privacy reviewer | Read-only implementation review | No | Critical/Important findings only | Done | ACCEPT; no actionable security/privacy issue found. |
+| Final test coverage reviewer | Read-only implementation review | No | Critical/Important findings only | Done | ACCEPT; targeted tests, compile, and diff-check evidence inspected. |
+| Final maintainability/integration reviewer | Read-only implementation review | No | Critical/Important findings only | Done | Initial Important finding on cumulative-history fallback suppression was fixed with a red/green regression; re-review returned ACCEPT. |
+
+## Progress Notes
+
+- 2026-06-19: Read active repo instructions and the project-local
+  `triage-next-github-issues` skill.
+- 2026-06-19: Confirmed worktree path, detached start state, remotes, worktree
+  list, and `origin/main` SHA. Created and switched to
+  `issue/331-hermes-python-plugin-hooks`.
+- 2026-06-19: Public GitHub issue #331 is open on
+  `wbugitlab1/agentmemory`; local code evidence confirms Hermes `sync_turn`
+  currently emits only a synthetic conversation `post_tool_use`, while Claude
+  hook scripts have richer `post_tool_use` and `task_completed` payloads.
+- 2026-06-19: Arena completed. Validity decision: valid local Hermes adapter
+  gap, not already fixed by adjacent Hermes config/manifest work.
Synthesis
+  base is Candidate C: parse only explicit completed tool-use/tool-result
+  structures paired by IDs, preserve conversation fallback when no supported
+  pairs exist, and do not declare a task completion hook until the exact Hermes
+  callback name/signature is approved. Grafts from Candidate B: shared observe
+  helper, concrete Python subprocess tests, and issue-identity risk wording.
+  Graft from Candidate A: no REST/MCP/schema/dependency/core-observe changes
+  are needed.
+- 2026-06-19: Cross-judge noted branch is behind current `origin/main`, but the
+  intervening commits do not touch `integrations/hermes/__init__.py`,
+  `integrations/hermes/plugin.yaml`, `test/hermes-plugin.test.ts`, or
+  `test/integration-plaintext-http.test.ts`. The task remains grounded on the
+  required start ref until integration/PR prep.
+
+## Arena Synthesis
+
+Decision: issue #331 is valid for this fork's local `origin` issue. The issue
+body is imported from upstream issue #826, so the authoritative task contract is
+the local `wbugitlab1/agentmemory#331` body plus repo evidence, not upstream
+issue number #331.
+
+Base: Candidate C.
+
+Grafts:
+- From Candidate B: use a small shared observe helper in the Hermes provider
+  that preserves `_api_bg`, `_with_agent_id`, session/project/cwd, URL/secret,
+  and agent ID behavior; add direct Python subprocess red/green tests.
+- From Candidate A: keep the change within Hermes plugin code, manifest, docs
+  if needed, and focused tests; do not add REST endpoints, MCP tools, schema,
+  iii functions, dependencies, or core observe changes.
+
+Rejected:
+- Broad transcript mining or heuristic experience extraction from assistant
+  prose.
+- Emitting both structured tool observations and the conversation fallback for
+  the same turn.
+- Declaring `TaskCompleted` directly in `plugin.yaml`; Hermes manifest entries
+  are Python lifecycle method names.
+- Adding OpenAI-style `tool_calls` extraction unless implementation can do so
+  with explicit IDs and no broad role heuristics.
+
+Checkpoint boundary:
+- Proposed implementation uses `sync_turn(..., messages=...)` to extract
+  explicit Anthropic/Hermes-style `tool_use` and `tool_result` content blocks
+  paired by `id` / `tool_use_id`; unsupported shapes keep current conversation
+  fallback.
+- Proposed task completion hook name is `on_task_completed`, with a permissive
+  `**kwargs` signature and `plugin.yaml` entry of the same name. It emits the
+  existing `task_completed` observe shape: `task_id`, `task_subject`,
+  `task_description` truncated to 2000 characters, `teammate_name`, and
+  `team_name`.
+- 2026-06-19: User approved implementing the arena best solution via
+  `$github-feature-loop`.
+- 2026-06-19: Wrote `plan.md`, ran two read-only plan review lanes, fixed all
+  Medium findings in the plan, and received ACCEPT from both lanes.
+- 2026-06-19: Added red tests for Hermes manifest parity, structured
+  `sync_turn` extraction, fallback preservation, malformed messages,
+  cumulative-history idempotency, large-output truncation, and
+  `on_task_completed`. First targeted Vitest run failed as expected against
+  missing hook/extraction behavior after deterministic dependency setup.
+- 2026-06-19: Implemented Hermes structured tool extraction, per-session
+  emitted tool-use ID guard, shared observe helper, `on_task_completed`,
+  manifest/docs updates, and focused simplification. Targeted Vitest passed
+  `test/hermes-plugin.test.ts` + `test/integration-plaintext-http.test.ts`
+  with 30/30 tests. Python compile and `git diff --check` passed.
+- 2026-06-19: Final maintainability review found an Important issue where
+  cumulative histories containing only already-emitted tool IDs suppressed the
+  conversation fallback.
Added a failing regression test, fixed `sync_turn()` to
+  return early only after emitting at least one new tool observation, and reran
+  targeted Hermes tests: 31/31 passed.
+- 2026-06-19: Final review lanes: security/privacy ACCEPT, test coverage ACCEPT,
+  maintainability/integration ACCEPT after fix. Verification passed:
+  `PYTHONPYCACHEPREFIX=/tmp/agentmemory-pycache python3 -m py_compile
+  integrations/hermes/__init__.py`, `git diff --check`,
+  `corepack pnpm test` (211 files / 2909 tests), and
+  `semgrep scan --config p/default --error --metrics=off .` (0 findings).
+- 2026-06-19: Staged task-owned files and ran
+  `gitleaks protect --staged --redact` before commit; result: no leaks found.
+  Created commit `4229948a` (`feat(hermes): capture tool and task hooks`).
+- 2026-06-19: Fetched current `origin/main` and merged base
+  `51f926fe918a100228d6ce92dadb19933e46ccbf` into the issue branch, producing
+  merge commit `724cd97a`. Post-merge PR diff still contains only the eight
+  Issue 331 task-owned files.
+- 2026-06-19: Post-merge verification passed:
+  `corepack pnpm exec vitest run test/hermes-plugin.test.ts
+  test/integration-plaintext-http.test.ts --exclude test/integration.test.ts`
+  (31/31 tests), `PYTHONPYCACHEPREFIX=/tmp/agentmemory-pycache python3 -m
+  py_compile integrations/hermes/__init__.py`, `git diff --check
+  refs/remotes/origin/main...HEAD`, `corepack pnpm test` (211 files / 2926
+  tests), and `semgrep scan --config p/default --error --metrics=off .` (0
+  findings across 955 tracked files).
+- 2026-06-20: Before push/PR/merge, fetched current `origin/main` again and
+  merged base `a461f1e52e637a7ca587fede9506aa9f521f72ed` into the issue
+  branch. Post-merge PR diff still contains only the eight Issue 331 task-owned
+  files.
Reverification passed: targeted Hermes tests (31/31),
+  `PYTHONPYCACHEPREFIX=/tmp/agentmemory-pycache python3 -m py_compile
+  integrations/hermes/__init__.py`, `git diff --check
+  refs/remotes/origin/main...HEAD`, `semgrep scan --config p/default
+  --error --metrics=off .` (0 findings across 957 tracked files), and final
+  `corepack pnpm test` (211 files / 2930 tests). Two earlier full-suite
+  attempts timed out in unrelated CLI/project tests while the failing files
+  passed in isolation; final full-suite rerun passed.
+- 2026-06-20: `origin/main` advanced again to
+  `682a133e66fa1650d62ae460d089b8ec19aaa92b` via Issue 335 while preparing the
+  PR. Merged the new base, resolved README/Hermes README conflicts by
+  preserving Issue 331's 7-hook/tool-result/task-completion language and Issue
+  335's archived local memory removals language. Reverification is required
+  after this merge before push.
+- 2026-06-20: Final post-Issue-335-merge verification passed: targeted Hermes
+  tests (34/34), `PYTHONPYCACHEPREFIX=/tmp/agentmemory-pycache python3 -m
+  py_compile integrations/hermes/__init__.py`, `git diff --check`,
+  `gitleaks protect --staged --redact`, `corepack pnpm exec vitest run
+  --exclude test/integration.test.ts --maxWorkers=4` (211 files / 2933 tests),
+  and `semgrep scan --config p/default --error --metrics=off .` (0 findings
+  across 959 tracked files). Default-worker full test attempts hit unrelated
+  CLI/Codex timeout flakes; each failing file passed in isolation and the
+  bounded-worker full suite passed.
+- 2026-06-20: `origin/main` advanced again to
+  `be1b0094ff417d75b1bc6a32be3ae38c0ebca200` via Issue 313. Merged that base;
+  PR diff remained limited to the eight Issue 331 files.
Final verification on
+  that base passed: targeted Hermes tests (34/34),
+  `PYTHONPYCACHEPREFIX=/tmp/agentmemory-pycache python3 -m py_compile
+  integrations/hermes/__init__.py`, `git diff --check
+  refs/remotes/origin/main...HEAD`, `corepack pnpm test` (212 files / 2951
+  tests), and `semgrep scan --config p/default --error --metrics=off .` (0
+  findings across 973 tracked files).
diff --git a/integrations/hermes/README.md b/integrations/hermes/README.md
index 77407471..b14ba932 100644
--- a/integrations/hermes/README.md
+++ b/integrations/hermes/README.md
@@ -14,7 +14,7 @@

61 MCP tools - 6 lifecycle hooks + 7 lifecycle hooks 95.2% R@5 Self-hosted Apache 2.0 @@ -51,10 +51,10 @@ and viewer fields. Open the real-time viewer at http://localhost:3113 to watch memories being captured live. If I want deeper integration — pre-LLM context injection, turn-level -capture, archived local memory removals, and system prompt block -injection — copy `integrations/hermes` from the agentmemory repo to +and tool-result capture, archived local memory removals, task completion, +and system prompt block injection — copy `integrations/hermes` from the agentmemory repo to `~/.hermes/plugins/agentmemory` instead. That gives me the -6-hook memory provider plugin on top of the MCP server. +7-hook memory provider plugin on top of the MCP server. ``` That's it. Hermes handles the rest. @@ -106,10 +106,11 @@ npx @agentmemory/agentmemory The plugin auto-detects the running server and hooks into the Hermes agent loop. Make sure `memory.provider` is set to `agentmemory` in `~/.hermes/config.yaml`: - `prefetch()` injects relevant memories before each LLM call -- `sync_turn()` captures every conversation turn in the background +- `sync_turn()` captures every conversation turn and structured tool results in the background - `on_session_end()` marks sessions complete for summarization - `on_pre_compress()` re-injects context before compaction - `on_memory_write()` skips add/update events to avoid duplicate memories and archives removed local memory entries to agentmemory +- `on_task_completed()` captures structured task completion signals - `system_prompt_block()` injects project profile at session start ### Environment variables diff --git a/integrations/hermes/__init__.py b/integrations/hermes/__init__.py index 87c5529e..22947a45 100644 --- a/integrations/hermes/__init__.py +++ b/integrations/hermes/__init__.py @@ -47,6 +47,7 @@ def sync_turn(self, user: str, assistant: str, **kwargs: Any) -> None: pass def on_session_end(self, messages: list, **kwargs: Any) -> None: 
pass
         def on_pre_compress(self, messages: list, **kwargs: Any) -> None: pass
         def on_memory_write(self, action: str, target: str, content: str, **kwargs: Any) -> None: pass
+        def on_task_completed(self, **kwargs: Any) -> None: pass
         def shutdown(self, **kwargs: Any) -> None: pass


@@ -270,6 +271,8 @@ def initialize(self, session_id: str, **kwargs: Any) -> None:
         self._archive_removed_memory = _archive_removed_memory_enabled()
         self._session_id = session_id
         self._project = kwargs.get("cwd", os.getcwd())
+        if not hasattr(self, "_emitted_tool_use_ids"):
+            self._emitted_tool_use_ids = set()
         if os.environ.get("AGENTMEMORY_REQUIRE_HTTPS") == "1":
             _check_plaintext_bearer_guard(self._base, self._secret)

@@ -283,6 +286,19 @@ def _with_agent_id(self, body: dict) -> dict:
         agent_id = getattr(self, "_agent_id", "")
         return {**body, "agentId": agent_id} if agent_id else body

+    def _timestamp(self) -> str:
+        return time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
+
+    def _observe_bg(self, hook_type: str, data: dict, session_id: str | None = None) -> None:
+        _api_bg(self._base, "observe", self._with_agent_id({
+            "hookType": hook_type,
+            "sessionId": session_id or self._session_id,
+            "project": self._project,
+            "cwd": self._project,
+            "timestamp": self._timestamp(),
+            "data": data,
+        }), secret=self._secret)
+
     def get_config_schema(self) -> list[dict]:
         return [
             {
@@ -429,19 +445,80 @@ def handle_tool_call(self, name: str, args: dict) -> str:

         return json.dumps({"error": f"Unknown tool: {name}"})

+    def _content_blocks(self, content: Any) -> list[dict]:
+        if isinstance(content, dict):
+            return [content]
+        if isinstance(content, list):
+            return [item for item in content if isinstance(item, dict)]
+        return []
+
+    def _truncate_tool_output(self, value: Any, max_length: int = 8000) -> Any:
+        if isinstance(value, str):
+            return value[:max_length] + "\n[...truncated]" if len(value) > max_length else value
+        if isinstance(value, (dict, list)):
+            text = json.dumps(value, ensure_ascii=False,
separators=(",", ":"))
+            return text[:max_length] + "...[truncated]" if len(text) > max_length else value
+        return value
+
+    def _tool_observations_from_messages(self, messages: Any) -> list[dict]:
+        if not isinstance(messages, list):
+            return []
+
+        calls: dict[str, dict] = {}
+        results: list[tuple[str, Any]] = []
+        for message in messages:
+            if not isinstance(message, dict):
+                continue
+            for block in self._content_blocks(message.get("content")):
+                block_type = block.get("type")
+                if block_type == "tool_use":
+                    tool_use_id = block.get("id")
+                    tool_name = block.get("name")
+                    if isinstance(tool_use_id, str) and isinstance(tool_name, str):
+                        calls[tool_use_id] = {
+                            "tool_name": tool_name,
+                            "tool_input": block.get("input"),
+                        }
+                elif block_type == "tool_result":
+                    tool_use_id = block.get("tool_use_id")
+                    if isinstance(tool_use_id, str):
+                        results.append((tool_use_id, block.get("content")))
+
+        observations: list[dict] = []
+        for tool_use_id, output in results:
+            call = calls.get(tool_use_id)
+            if call:
+                observations.append({
+                    "tool_use_id": tool_use_id,
+                    "data": {
+                        **call,
+                        "tool_output": self._truncate_tool_output(output),
+                    },
+                })
+        return observations
+
     def sync_turn(self, user: str, assistant: str, **kwargs: Any) -> None:
-        _api_bg(self._base, "observe", self._with_agent_id({
-            "hookType": "post_tool_use",
-            "sessionId": kwargs.get("session_id", self._session_id),
-            "project": self._project,
-            "cwd": self._project,
-            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
-            "data": {
-                "tool_name": "conversation",
-                "tool_input": user[:500],
-                "tool_output": assistant[:2000],
-            },
-        }), secret=self._secret)
+        session_id = kwargs.get("session_id", self._session_id)
+        observations = self._tool_observations_from_messages(kwargs.get("messages", []))
+        if observations:
+            emitted = getattr(self, "_emitted_tool_use_ids", set())
+            emitted_new_observation = False
+            for observation in observations:
+                key = (session_id, observation["tool_use_id"])
+                if key
in emitted:
+                    continue
+                emitted.add(key)
+                emitted_new_observation = True
+                self._observe_bg("post_tool_use", observation["data"], session_id=session_id)
+            self._emitted_tool_use_ids = emitted
+            if emitted_new_observation:
+                return
+
+        self._observe_bg("post_tool_use", {
+            "tool_name": "conversation",
+            "tool_input": user[:500],
+            "tool_output": assistant[:2000],
+        }, session_id=session_id)
 
     def on_session_end(self, messages: list, **kwargs: Any) -> None:
         _api(self._base, "session/end", self._with_agent_id({
@@ -485,6 +562,16 @@ def on_memory_write(self, action: str, target: str, content: str, **kwargs: Any)
             body["files"] = [file_target]
         _api_bg(self._base, "remember", self._with_agent_id(body), secret=self._secret)
 
+    def on_task_completed(self, **kwargs: Any) -> None:
+        description = kwargs.get("task_description")
+        self._observe_bg("task_completed", {
+            "task_id": kwargs.get("task_id"),
+            "task_subject": kwargs.get("task_subject"),
+            "task_description": description[:2000] if isinstance(description, str) else "",
+            "teammate_name": kwargs.get("teammate_name"),
+            "team_name": kwargs.get("team_name"),
+        }, session_id=kwargs.get("session_id", self._session_id))
+
     def shutdown(self, **kwargs: Any) -> None:
         pass
 
diff --git a/integrations/hermes/plugin.yaml b/integrations/hermes/plugin.yaml
index 9ea5cb98..879e6b92 100644
--- a/integrations/hermes/plugin.yaml
+++ b/integrations/hermes/plugin.yaml
@@ -9,4 +9,5 @@ hooks:
   - on_session_end
   - on_pre_compress
   - on_memory_write
+  - on_task_completed
   - system_prompt_block
diff --git a/test/hermes-plugin.test.ts b/test/hermes-plugin.test.ts
index 87e7ed64..11d6ebe8 100644
--- a/test/hermes-plugin.test.ts
+++ b/test/hermes-plugin.test.ts
@@ -7,6 +7,7 @@ const expectedHermesHooks = [
   "on_session_end",
   "on_pre_compress",
   "on_memory_write",
+  "on_task_completed",
   "system_prompt_block",
 ];
 
diff --git a/test/integration-plaintext-http.test.ts b/test/integration-plaintext-http.test.ts
index 6c09e7f5..7e7c8220 100644
---
a/test/integration-plaintext-http.test.ts
+++ b/test/integration-plaintext-http.test.ts
@@ -222,6 +222,15 @@ describe("Hermes plaintext bearer guard", () => {
     rmSync(home, { recursive: true, force: true });
   });
 
+  function runHermesPython(script: string) {
+    const result = spawnSync("python3", ["-c", script], {
+      cwd: process.cwd(),
+      env: { ...process.env, HOME: home },
+      encoding: "utf8",
+    });
+    expect(result.status, result.stderr || result.stdout).toBe(0);
+  }
+
   it("covers loopback, remote HTTP, HTTPS, and require-HTTPS behavior", () => {
     const script = String.raw`
 import importlib.util
@@ -769,4 +778,302 @@ assert provider.is_available() is False
     });
     expect(result.status, result.stderr || result.stdout).toBe(0);
   });
+
+  it("sync_turn extracts Anthropic-style tool_use/tool_result messages", () => {
+    const script = String.raw`
+import importlib.util
+import os
+from pathlib import Path
+
+for key in ("AGENTMEMORY_SECRET", "AGENTMEMORY_URL", "AGENTMEMORY_REQUIRE_HTTPS", "AGENTMEMORY_AGENT_ID"):
+    os.environ.pop(key, None)
+
+spec = importlib.util.spec_from_file_location("agentmemory_hermes", "integrations/hermes/__init__.py")
+mod = importlib.util.module_from_spec(spec)
+assert spec.loader is not None
+spec.loader.exec_module(mod)
+
+provider = mod.AgentMemoryProvider()
+hermes_home = Path(os.environ["HOME"]) / "custom-hermes"
+hermes_home.mkdir()
+provider.save_config({"url": "https://memory.example"}, str(hermes_home))
+
+observe_calls = []
+def fake_api(base, path, body=None, method="POST", secret=""):
+    return {"success": True}
+def fake_api_bg(base, path, body=None, secret=""):
+    observe_calls.append({"base": base, "path": path, "body": body, "secret": secret})
+
+mod._api = fake_api
+mod._api_bg = fake_api_bg
+provider.initialize("session-331", hermes_home=str(hermes_home), cwd="/tmp/project", agent_identity="hermes")
+
+messages = [
+    {"role": "assistant", "content": [{"type": "tool_use", "id": "toolu_1", "name": "Bash", "input": {"command": "pwd"}}]},
+
{"role": "user", "content": [{"type": "tool_result", "tool_use_id": "toolu_1", "content": "repo\\n"}]},
+]
+provider.sync_turn("user", "assistant", session_id="session-331", messages=messages)
+
+assert len(observe_calls) == 1, observe_calls
+call = observe_calls[0]
+assert call["base"] == "https://memory.example", call
+assert call["path"] == "observe", call
+assert call["body"]["hookType"] == "post_tool_use", call
+assert call["body"]["sessionId"] == "session-331", call
+assert call["body"]["project"] == "/tmp/project", call
+assert call["body"]["cwd"] == "/tmp/project", call
+assert call["body"]["agentId"] == "hermes", call
+assert call["body"]["data"] == {
+    "tool_name": "Bash",
+    "tool_input": {"command": "pwd"},
+    "tool_output": "repo\\n",
+}, call
+`;
+    runHermesPython(script);
+  });
+
+  it("sync_turn preserves the conversation fallback when no completed tool pairs exist", () => {
+    const script = String.raw`
+import importlib.util
+import os
+from pathlib import Path
+
+for key in ("AGENTMEMORY_SECRET", "AGENTMEMORY_URL", "AGENTMEMORY_REQUIRE_HTTPS"):
+    os.environ.pop(key, None)
+
+spec = importlib.util.spec_from_file_location("agentmemory_hermes", "integrations/hermes/__init__.py")
+mod = importlib.util.module_from_spec(spec)
+assert spec.loader is not None
+spec.loader.exec_module(mod)
+
+provider = mod.AgentMemoryProvider()
+hermes_home = Path(os.environ["HOME"]) / "custom-hermes"
+hermes_home.mkdir()
+provider.save_config({"url": "https://memory.example"}, str(hermes_home))
+
+observe_calls = []
+mod._api = lambda base, path, body=None, method="POST", secret="": {"success": True}
+mod._api_bg = lambda base, path, body=None, secret="": observe_calls.append({"path": path, "body": body})
+provider.initialize("session-331", hermes_home=str(hermes_home), cwd="/tmp/project")
+provider.sync_turn("user text", "assistant text", session_id="session-331", messages=[])
+
+assert len(observe_calls) == 1, observe_calls
+assert observe_calls[0]["body"]["data"] == {
+
"tool_name": "conversation",
+    "tool_input": "user text",
+    "tool_output": "assistant text",
+}, observe_calls
+`;
+    runHermesPython(script);
+  });
+
+  it("sync_turn tolerates malformed message history and keeps the fallback", () => {
+    const script = String.raw`
+import importlib.util
+import os
+from pathlib import Path
+
+for key in ("AGENTMEMORY_SECRET", "AGENTMEMORY_URL", "AGENTMEMORY_REQUIRE_HTTPS"):
+    os.environ.pop(key, None)
+
+spec = importlib.util.spec_from_file_location("agentmemory_hermes", "integrations/hermes/__init__.py")
+mod = importlib.util.module_from_spec(spec)
+assert spec.loader is not None
+spec.loader.exec_module(mod)
+
+provider = mod.AgentMemoryProvider()
+hermes_home = Path(os.environ["HOME"]) / "custom-hermes"
+hermes_home.mkdir()
+provider.save_config({"url": "https://memory.example"}, str(hermes_home))
+
+observe_calls = []
+mod._api = lambda base, path, body=None, method="POST", secret="": {"success": True}
+mod._api_bg = lambda base, path, body=None, secret="": observe_calls.append({"path": path, "body": body})
+provider.initialize("session-331", hermes_home=str(hermes_home), cwd="/tmp/project")
+provider.sync_turn(
+    "user text",
+    "assistant text",
+    session_id="session-331",
+    messages=[{"content": [{"type": "tool_result", "tool_use_id": "missing", "content": "ignored"}]}, "bad"],
+)
+
+assert len(observe_calls) == 1, observe_calls
+assert observe_calls[0]["body"]["data"]["tool_name"] == "conversation", observe_calls
+`;
+    runHermesPython(script);
+  });
+
+  it("sync_turn does not re-emit the same tool_use_id for cumulative histories", () => {
+    const script = String.raw`
+import importlib.util
+import os
+from pathlib import Path
+
+for key in ("AGENTMEMORY_SECRET", "AGENTMEMORY_URL", "AGENTMEMORY_REQUIRE_HTTPS"):
+    os.environ.pop(key, None)
+
+spec = importlib.util.spec_from_file_location("agentmemory_hermes", "integrations/hermes/__init__.py")
+mod = importlib.util.module_from_spec(spec)
+assert spec.loader is not None
+spec.loader.exec_module(mod)
+
+provider = mod.AgentMemoryProvider()
+hermes_home = Path(os.environ["HOME"]) / "custom-hermes"
+hermes_home.mkdir()
+provider.save_config({"url": "https://memory.example"}, str(hermes_home))
+
+observe_calls = []
+mod._api = lambda base, path, body=None, method="POST", secret="": {"success": True}
+mod._api_bg = lambda base, path, body=None, secret="": observe_calls.append({"path": path, "body": body})
+provider.initialize("session-331", hermes_home=str(hermes_home), cwd="/tmp/project")
+
+messages = [
+    {"content": [{"type": "tool_use", "id": "toolu_1", "name": "Read", "input": {"file_path": "README.md"}}]},
+    {"content": [{"type": "tool_result", "tool_use_id": "toolu_1", "content": "ok"}]},
+]
+provider.sync_turn("user", "assistant", session_id="session-331", messages=messages)
+provider.sync_turn("user", "assistant", session_id="session-331", messages=messages)
+
+assert len(observe_calls) == 2, observe_calls
+assert observe_calls[0]["body"]["data"]["tool_name"] == "Read", observe_calls
+assert observe_calls[1]["body"]["data"]["tool_name"] == "conversation", observe_calls
+`;
+    runHermesPython(script);
+  });
+
+  it("sync_turn captures the conversation when cumulative tool history has no new pairs", () => {
+    const script = String.raw`
+import importlib.util
+import os
+from pathlib import Path
+
+for key in ("AGENTMEMORY_SECRET", "AGENTMEMORY_URL", "AGENTMEMORY_REQUIRE_HTTPS"):
+    os.environ.pop(key, None)
+
+spec = importlib.util.spec_from_file_location("agentmemory_hermes", "integrations/hermes/__init__.py")
+mod = importlib.util.module_from_spec(spec)
+assert spec.loader is not None
+spec.loader.exec_module(mod)
+
+provider = mod.AgentMemoryProvider()
+hermes_home = Path(os.environ["HOME"]) / "custom-hermes"
+hermes_home.mkdir()
+provider.save_config({"url": "https://memory.example"}, str(hermes_home))
+
+observe_calls = []
+mod._api = lambda base, path, body=None, method="POST", secret="": {"success": True}
+mod._api_bg =
lambda base, path, body=None, secret="": observe_calls.append({"path": path, "body": body})
+provider.initialize("session-331", hermes_home=str(hermes_home), cwd="/tmp/project")
+
+messages = [
+    {"content": [{"type": "tool_use", "id": "toolu_1", "name": "Read", "input": {"file_path": "README.md"}}]},
+    {"content": [{"type": "tool_result", "tool_use_id": "toolu_1", "content": "ok"}]},
+]
+provider.sync_turn("first user", "first assistant", session_id="session-331", messages=messages)
+provider.sync_turn("second user", "second assistant", session_id="session-331", messages=messages)
+
+assert len(observe_calls) == 2, observe_calls
+assert observe_calls[0]["body"]["data"]["tool_name"] == "Read", observe_calls
+assert observe_calls[1]["body"]["data"] == {
+    "tool_name": "conversation",
+    "tool_input": "second user",
+    "tool_output": "second assistant",
+}, observe_calls
+`;
+    runHermesPython(script);
+  });
+
+  it("sync_turn truncates large tool outputs like the Node post-tool-use hook", () => {
+    const script = String.raw`
+import importlib.util
+import os
+from pathlib import Path
+
+for key in ("AGENTMEMORY_SECRET", "AGENTMEMORY_URL", "AGENTMEMORY_REQUIRE_HTTPS"):
+    os.environ.pop(key, None)
+
+spec = importlib.util.spec_from_file_location("agentmemory_hermes", "integrations/hermes/__init__.py")
+mod = importlib.util.module_from_spec(spec)
+assert spec.loader is not None
+spec.loader.exec_module(mod)
+
+provider = mod.AgentMemoryProvider()
+hermes_home = Path(os.environ["HOME"]) / "custom-hermes"
+hermes_home.mkdir()
+provider.save_config({"url": "https://memory.example"}, str(hermes_home))
+
+observe_calls = []
+mod._api = lambda base, path, body=None, method="POST", secret="": {"success": True}
+mod._api_bg = lambda base, path, body=None, secret="": observe_calls.append({"path": path, "body": body})
+provider.initialize("session-331", hermes_home=str(hermes_home), cwd="/tmp/project")
+
+long_text = "x" * 9000
+provider.sync_turn("user", "assistant",
session_id="session-331", messages=[
+    {"content": [{"type": "tool_use", "id": "toolu_text", "name": "Bash", "input": {"command": "cat big.txt"}}]},
+    {"content": [{"type": "tool_result", "tool_use_id": "toolu_text", "content": long_text}]},
+])
+provider.sync_turn("user", "assistant", session_id="session-331", messages=[
+    {"content": [{"type": "tool_use", "id": "toolu_obj", "name": "Fetch", "input": {"url": "https://example.com"}}]},
+    {"content": [{"type": "tool_result", "tool_use_id": "toolu_obj", "content": {"body": "y" * 9000}}]},
+])
+
+text_output = observe_calls[0]["body"]["data"]["tool_output"]
+object_output = observe_calls[1]["body"]["data"]["tool_output"]
+assert text_output == "x" * 8000 + "\n[...truncated]", len(text_output)
+assert isinstance(object_output, str), object_output
+assert object_output.endswith("...[truncated]"), object_output[-30:]
+assert len(object_output) == 8014, len(object_output)
+`;
+    runHermesPython(script);
+  });
+
+  it("on_task_completed emits task_completed observations", () => {
+    const script = String.raw`
+import importlib.util
+import os
+from pathlib import Path
+
+for key in ("AGENTMEMORY_SECRET", "AGENTMEMORY_URL", "AGENTMEMORY_REQUIRE_HTTPS", "AGENTMEMORY_AGENT_ID"):
+    os.environ.pop(key, None)
+
+spec = importlib.util.spec_from_file_location("agentmemory_hermes", "integrations/hermes/__init__.py")
+mod = importlib.util.module_from_spec(spec)
+assert spec.loader is not None
+spec.loader.exec_module(mod)
+
+provider = mod.AgentMemoryProvider()
+hermes_home = Path(os.environ["HOME"]) / "custom-hermes"
+hermes_home.mkdir()
+provider.save_config({"url": "https://memory.example"}, str(hermes_home))
+
+observe_calls = []
+mod._api = lambda base, path, body=None, method="POST", secret="": {"success": True}
+mod._api_bg = lambda base, path, body=None, secret="": observe_calls.append({"path": path, "body": body})
+provider.initialize("session-331", hermes_home=str(hermes_home), cwd="/tmp/project", agent_identity="hermes")
+provider.on_task_completed(
+    session_id="session-task",
+    task_id="task-1",
+    task_subject="Hermes hooks",
+    task_description="x" * 2500,
+    teammate_name="worker",
+    team_name="team",
+)
+
+assert len(observe_calls) == 1, observe_calls
+body = observe_calls[0]["body"]
+assert body["hookType"] == "task_completed", body
+assert body["sessionId"] == "session-task", body
+assert body["project"] == "/tmp/project", body
+assert body["cwd"] == "/tmp/project", body
+assert body["agentId"] == "hermes", body
+assert body["data"] == {
+    "task_id": "task-1",
+    "task_subject": "Hermes hooks",
+    "task_description": "x" * 2000,
+    "teammate_name": "worker",
+    "team_name": "team",
+}, body
+`;
+    runHermesPython(script);
+  });
 });
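
The pairing and truncation rules that the new tests exercise can be sketched in isolation, without the plugin or a running server. This is a minimal standalone sketch, not the plugin code: `truncate_output` and `pair_tool_results` are hypothetical names chosen for illustration, and the 8000-character limit and the two truncation suffixes are copied from `_truncate_tool_output` in the diff above. It assumes Python 3.9+ for the built-in generic annotations.

```python
import json
from typing import Any

MAX_LEN = 8000  # assumption: mirrors the default in the diff's _truncate_tool_output


def truncate_output(value: Any, max_length: int = MAX_LEN) -> Any:
    """Cap strings directly; cap dicts/lists by the length of their compact JSON."""
    if isinstance(value, str):
        return value[:max_length] + "\n[...truncated]" if len(value) > max_length else value
    if isinstance(value, (dict, list)):
        text = json.dumps(value, ensure_ascii=False, separators=(",", ":"))
        # Oversized structures become a truncated JSON string; small ones pass through unchanged.
        return text[:max_length] + "...[truncated]" if len(text) > max_length else value
    return value


def pair_tool_results(messages: Any) -> list[dict]:
    """Return one observation per tool_result whose tool_use_id matches a tool_use block."""
    if not isinstance(messages, list):
        return []
    calls: dict[str, dict] = {}
    results: list[tuple[str, Any]] = []
    for message in messages:
        if not isinstance(message, dict):
            continue  # tolerate malformed history entries such as bare strings
        content = message.get("content")
        blocks = [content] if isinstance(content, dict) else content if isinstance(content, list) else []
        for block in blocks:
            if not isinstance(block, dict):
                continue
            if block.get("type") == "tool_use":
                use_id, name = block.get("id"), block.get("name")
                if isinstance(use_id, str) and isinstance(name, str):
                    calls[use_id] = {"tool_name": name, "tool_input": block.get("input")}
            elif block.get("type") == "tool_result":
                use_id = block.get("tool_use_id")
                if isinstance(use_id, str):
                    results.append((use_id, block.get("content")))
    # Pair only after the full scan, so orphaned results are dropped.
    return [
        {"tool_use_id": use_id, "data": {**calls[use_id], "tool_output": truncate_output(output)}}
        for use_id, output in results
        if use_id in calls
    ]


history = [
    {"content": [{"type": "tool_use", "id": "toolu_1", "name": "Bash", "input": {"command": "pwd"}}]},
    {"content": [{"type": "tool_result", "tool_use_id": "toolu_1", "content": "repo\n"}]},
    {"content": [{"type": "tool_result", "tool_use_id": "orphan", "content": "ignored"}]},
    "bad",
]
observations = pair_tool_results(history)
```

Here `observations` holds a single entry for `toolu_1`; the orphaned result and the malformed string entry are skipped, which is the case where `sync_turn()` in the diff falls back to the `conversation` observation.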