Add hermes-agent plugin for Agent Receipts audit trails#532

Draft
ojongerius wants to merge 5 commits into main from claude/hermes-agent-plugin-28khf

@ojongerius ojongerius commented May 22, 2026

Contributor

⚠️ Draft — do not merge. Pending manual end-to-end verification by the author against a live hermes-agent runtime + agent-receipts daemon. Automated tests (pytest/ruff/pyright) pass, but this has not yet been exercised in a real runtime. See the Checklist.

What

Adds a complete hermes-agent plugin (agent-receipts-hermes) that generates cryptographically signed, hash-linked audit trails for every tool call an agent makes. The plugin classifies tool calls using a bundled taxonomy, forwards frames to the local agent-receipts daemon over AF_UNIX, and exposes two agent-facing tools (ar_query_receipts, ar_verify_chain) for introspection.

Why

Provides operators with a tamper-evident audit trail of agent actions. Implements ADR-0010 (Flavor B) where the daemon owns signing, hashing, chain state, and persistence — the plugin's only job is to classify and forward frames. Mirrors the openclaw plugin's API so operators can use either implementation with the same muscle memory.

Implementation

  • Plugin entry (__init__.py): Registers pre_tool_call / post_tool_call hooks and agent-facing tools via best-effort introspection of the host ctx.
  • Hook handlers (hooks.py): Classify tool calls, forward unsigned frames to daemon, correlate pre/post calls via pending map with thread-safe eviction.
  • Classification (classify.py): Tool name → action type + risk level via exact mapping or prefix pattern fallback; supports custom taxonomies that merge with bundled defaults.
  • Daemon integration (daemon_store.py): Read-only access to daemon's SQLite database; surfaces DaemonUnavailable error when daemon is absent.
  • Agent tools (tools.py): ar_query_receipts (filter by action/risk/status, return newest-first with stats) and ar_verify_chain (verify signatures and hash linkage).
  • CLI (cli.py): Receipt Explorer (agent-receipts-hermes receipts|verify|export) for auditing outside agent sessions.
  • Config (config.py): Resolution of daemon socket, database, and public key paths with XDG/env var support.
  • Taxonomy (taxonomy.json): Bundled tool → action mappings and prefix patterns (filesystem, system, browser, network, etc.).
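
The exact-mapping-then-prefix lookup described for classify.py could look roughly like the sketch below. It is an illustration only, not the plugin's code: the taxonomy layout, the `classify` and `merge_taxonomies` signatures, the action-type strings, and the `unknown` fallback are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    action_type: str
    risk_level: str


# Hypothetical taxonomy shape; the real taxonomy.json layout may differ.
BUNDLED = {
    "tools": {"read_file": ("filesystem.file.read", "low")},
    "prefixes": {
        "memory_": ("memory.access", "low"),
        "subagent_": ("agent.delegate", "medium"),
    },
}

# Assumed fallback for tools that match nothing.
UNKNOWN = Classification("unknown", "medium")


def merge_taxonomies(base: dict, custom: dict) -> dict:
    """Overlay custom entries on the bundled defaults (custom wins on conflict)."""
    return {
        "tools": {**base.get("tools", {}), **custom.get("tools", {})},
        "prefixes": {**base.get("prefixes", {}), **custom.get("prefixes", {})},
    }


def classify(tool_name: str, taxonomy: dict) -> Classification:
    """Try an exact tool-name match first, then fall back to prefix patterns."""
    exact = taxonomy["tools"].get(tool_name)
    if exact is not None:
        return Classification(*exact)
    # Check longer prefixes first so the most specific pattern wins.
    for prefix in sorted(taxonomy["prefixes"], key=len, reverse=True):
        if tool_name.startswith(prefix):
            return Classification(*taxonomy["prefixes"][prefix])
    return UNKNOWN
```

With this shape, a custom taxonomy only needs to list the entries it adds or overrides; everything else falls through to the bundled defaults.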

Testing

  • Unit tests for classification, config resolution, tool execution, and hook state management.
  • Integration test exercises full plugin lifecycle (register → hooks → real AF_UNIX socket) with a fake daemon server.
  • CLI smoke tests verify table and JSON output.
  • All tests pass; linter and type checker (strict mode) pass.
  • ⚠️ Not yet manually verified against a real hermes-agent runtime or a real agent-receipts daemon — see Checklist.

Security

  • Unserialisable tool arguments are dropped from frames, not stringified via __repr__ (prevents attacker-controllable content in signed audit trail).
  • Pending map is guarded by threading.Lock to prevent concurrent pre/post invocations from corrupting state.
  • No crypto material or chain state held by plugin; daemon is single source of truth.
  • Fire-and-forget frame delivery: if the daemon socket is unreachable, a warning is logged once and the frame is dropped without raising.
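
The drop-instead-of-stringify rule in the first bullet can be sketched as follows. This is a minimal sketch under assumptions: `safe_args` and `Hostile` are hypothetical names, and the plugin's own `_safe_json` may work differently in detail.

```python
import json
from typing import Any


def safe_args(args: dict[str, Any]) -> dict[str, Any]:
    """Keep only argument fields that serialise to JSON; drop the rest."""
    kept: dict[str, Any] = {}
    for name, value in args.items():
        try:
            # No default= hook is passed, so repr() is never consulted.
            json.dumps(value)
        except (TypeError, ValueError):
            continue  # drop the field, keep the frame
        kept[name] = value
    return kept


class Hostile:
    """An object whose repr tries to smuggle content into the audit trail."""

    def __repr__(self) -> str:
        return '{"status": "approved"}'
```

Calling `safe_args({"path": "/tmp/x", "obj": Hostile()})` keeps `path` and drops `obj`, so the forged string never reaches the frame.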

Checklist

  • Tests pass for all changed components
  • Linter passes (ruff check)
  • No real keys or secrets in the diff
  • AGENTS.md added with architecture overview and code conventions
  • Security: inputs validated at trust boundaries, edge cases tested (None, empty, concurrent, unserialisable args)
  • Manual end-to-end verification by the author against a live hermes-agent + agent-receipts daemon — pending; PR stays draft until done

https://claude.ai/code/session_01Cm5msM2JMUWSzoPnHvJhx2

Comment thread hermes/src/agent_receipts_hermes/hooks.py Fixed
Comment thread plugins/hermes/src/agent_receipts_hermes/classify.py Fixed
Comment thread plugins/hermes/tests/helpers.py Fixed
Comment thread plugins/hermes/tests/helpers.py Fixed
Comment thread hermes/tests/test_hooks.py Fixed

@ojongerius ojongerius left a comment

Contributor Author

Two latent issues from a thorough review — neither blocks today's CI (the suite, ruff, and pyright are green against the pinned agent-receipts==0.9.0), but both are worth addressing before merge. Inline comments have the detail:

  1. Bundled taxonomy / taxonomyPath is log-only — it doesn't affect the signed audit trail; the daemon does the authoritative classification.
  2. Pinning published 0.9.0 diverges from the in-tree SDK — bumping past it breaks the import and inverts the "fire-and-forget" guarantee (emit now raises EmitTransportError by default under ADR-0025).

The security handling (no-repr JSON, adversarial-__repr__ on-wire test, pending-map locking + stale eviction, read-only store) is solid.


Generated by Claude Code

Comment thread plugins/hermes/src/agent_receipts_hermes/hooks.py
Comment thread plugins/hermes/src/agent_receipts_hermes/__init__.py
Comment thread plugins/hermes/src/agent_receipts_hermes/classify.py Fixed
@ojongerius ojongerius force-pushed the claude/hermes-agent-plugin-28khf branch from 7d27f40 to cfd9d7d June 9, 2026 06:02
claude added 4 commits June 10, 2026 00:30
Adds a new top-level `hermes/` Python package — an Agent Receipts
plugin for NousResearch's hermes-agent. Mirrors the openclaw plugin's
architecture under ADR-0010 (Flavor B): the daemon owns signing,
hashing, chain state, and storage; the plugin is a thin emitter that
classifies tool calls and forwards frames over AF_UNIX.

Wires `pre_tool_call` / `post_tool_call` via `ctx.register_hook`, plus
best-effort introspection of `ctx.register_tool` to expose two
agent-callable tools — `ar_query_receipts` and `ar_verify_chain` —
that read directly from the daemon's SQLite database.

Ships a bundled taxonomy covering ~50 common hermes tools (`read_file`,
`bash`, `web_fetch`, `memory_*`, `subagent_*`, etc.) with overridable
prefix patterns, a Receipt Explorer CLI (`agent-receipts-hermes
receipts|verify|export`), and a pytest suite (60 tests) including an
end-to-end integration test that boots the real Emitter against an
in-process AF_UNIX server.

Strict pyright + ruff (lint + format) clean.

https://claude.ai/code/session_01Cm5msM2JMUWSzoPnHvJhx2
…lter, tautology

Addresses critical-review feedback on the POC:

- _safe_json no longer falls back to ``repr()`` for unknown objects.
  An attacker-controllable ``__repr__`` could otherwise inject
  misleading content into the signed audit trail. Non-serialisable
  values now drop the field; the frame still goes through.
- query_receipts: remove the post-filter on ``timestamp != after``.
  The SDK's ReceiptQuery already uses ``timestamp > ?`` (strictly
  exclusive) so the extra pass was dead code, and the accompanying
  comment inverted the SDK's actual semantics.
- test_integration.test_frame_layout_matches_daemon_wire_protocol:
  replace the ``struct.pack(...) == struct.pack(...)`` tautology
  with a real round-trip check of the length-prefix encoding.
- _attempt_register_tool: narrow swallowed exceptions to
  ``TypeError | AttributeError`` and log each rejected candidate at
  DEBUG so operators can diagnose mismatches against undocumented
  hermes APIs.
- plugin.yaml: drop the ``tools:`` block (manifest-driven tool
  registration isn't documented upstream and would risk double
  registration alongside the ctx-based probe).
- __all__ trimmed to ``["VERSION", "register"]``; everything else
  remains importable from its submodule.
- summarise_receipt / broken_at_or_none lifted into daemon_store and
  shared between the agent tools and the CLI (removes two copies).
- Rename loop variable ``field`` → ``field_name`` in classify.py.

61/61 tests pass, ruff lint+format clean, pyright strict clean.

https://claude.ai/code/session_01Cm5msM2JMUWSzoPnHvJhx2
…at test, type tightening

Round-2 review found that the previous "fix" pass left two real
issues open and missed several smaller ones. This commit closes
all of them and adds tests that actually prove the property under
review.

Thread safety (MEDIUM-C):
- HookState.pending is now guarded by a threading.Lock so concurrent
  pre/post invocations cannot trip _evict_stale mid-iteration with
  RuntimeError: dictionary changed size during iteration.
- The lock is released before the emitter call so socket I/O never
  blocks other hooks.
- New test_concurrent_pre_post_does_not_raise exercises 8 worker
  threads × 200 pre/post pairs and asserts no races escape.
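
A minimal sketch of the locking pattern described above, assuming a pending map keyed by call id. The field names, the `max_age_s` threshold, and the `on_pre` / `on_post` method names are hypothetical; only the lock around the map and the release before emitting come from the commit message.

```python
import threading
import time
from dataclasses import dataclass, field


@dataclass
class HookState:
    max_age_s: float = 300.0  # hypothetical staleness threshold
    pending: dict[str, float] = field(default_factory=dict)
    _lock: threading.Lock = field(default_factory=threading.Lock)

    def on_pre(self, call_id: str) -> None:
        now = time.monotonic()
        with self._lock:
            self._evict_stale(now)
            self.pending[call_id] = now

    def on_post(self, call_id: str, emit) -> None:
        with self._lock:
            started = self.pending.pop(call_id, None)
        # The lock is released here, so I/O inside emit() never blocks other hooks.
        if started is not None:
            emit({"call_id": call_id, "duration_s": time.monotonic() - started})

    def _evict_stale(self, now: float) -> None:
        # Caller must hold the lock. Collect keys first, then delete,
        # so the dict is never resized while it is being iterated.
        stale = [k for k, t in self.pending.items() if now - t > self.max_age_s]
        for k in stale:
            del self.pending[k]
```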

Real wire-format check (MEDIUM-B):
- FakeSocketServer now records the 4-byte length prefix separately
  from the body so the integration test can assert
  unpack(">I", captured_header)[0] == len(body) — a meaningful
  round-trip, replacing the previous self-equality tautology.
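
The round-trip assertion above can be illustrated with a small sketch. The 4-byte big-endian length prefix is taken from the commit message; the JSON body and the `encode_frame` / `split_frame` helper names are assumptions for illustration.

```python
import json
import struct


def encode_frame(payload: dict) -> bytes:
    """Prefix the encoded body with its length as a 4-byte big-endian integer."""
    body = json.dumps(payload).encode("utf-8")  # JSON body is an assumption
    return struct.pack(">I", len(body)) + body


def split_frame(raw: bytes) -> tuple[bytes, bytes]:
    """Separate the 4-byte header from the body, as a recording server might."""
    return raw[:4], raw[4:]
```

A test can then check `struct.unpack(">I", header)[0] == len(body)` on bytes captured from the socket, which fails if the sender miscounts the body length, unlike comparing two identical `struct.pack` calls.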

Adversarial __repr__ defence in depth (MEDIUM-A):
- test_unserialisable_args_drop_field_not_frame now also asserts
  the forged content appears nowhere in the recorded frame.
- New test_adversarial_repr_never_reaches_wire boots the real
  Emitter against the fake socket and inspects raw bytes, catching
  any regression that bypasses _safe_json directly.

Other fixes:
- _parse_limit explicitly rejects bool — isinstance(True, int) is
  True in Python, so limit=True silently became limit=1.
- read_public_key distinguishes EACCES from missing-file so an
  operator running the agent as the wrong user gets a pointed hint
  instead of a misleading "daemon not running" message.
- summarise_receipt, _format_table, _print_verify, _receipt_to_jsonable,
  and _wrap_presentation tightened from Any to AgentReceipt /
  ChainVerification / StoreStats; pyright-strict now catches callers
  passing the wrong model.
- _load_default_taxonomy_or_empty wraps the bundled-taxonomy load
  in a try/except so a malformed taxonomy.json cannot brick
  `import agent_receipts_hermes` at module-load time.

67/67 tests pass (was 61), ruff lint+format clean, pyright strict clean.

https://claude.ai/code/session_01Cm5msM2JMUWSzoPnHvJhx2
…axonomy is log-only

Review follow-ups for the hermes plugin.

_emit now swallows transport-class failures in addition to ValueError /
RuntimeError. The pinned agent-receipts 0.9.0 swallows transport errors
inside emit(), but newer releases (sdk/py is at 0.12.0a1) raise
EmitTransportError by default under ADR-0025 unless built with
best_effort=True — a kwarg 0.9.0 does not accept. Catching the base class
keeps the fire-and-forget guarantee across SDK versions without importing a
type that may be absent on the installed SDK. Covered by a new test.
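
One way to catch a transport failure without importing the SDK's exception type is to catch a standard-library base class. The sketch below assumes, without confirmation from the source, that the transport error derives from `OSError`; `FakeTransportError` and `FireAndForget` are hypothetical stand-ins, not the SDK's or the plugin's names.

```python
import logging

log = logging.getLogger("agent_receipts_hermes.sketch")


class FakeTransportError(OSError):
    """Stand-in for an SDK transport error; its OSError base is an assumption."""


class FireAndForget:
    def __init__(self, emitter) -> None:
        self._emitter = emitter
        self._warned = False

    def emit(self, frame: dict) -> bool:
        """Forward the frame; return False instead of raising on failure."""
        try:
            self._emitter.emit(frame)
            return True
        except (ValueError, RuntimeError, OSError) as exc:
            # Caught by stdlib base class: no import of an SDK type
            # that may be missing on the installed version.
            if not self._warned:
                log.warning("receipt delivery failed, dropping frames: %s", exc)
                self._warned = True
            return False
```

The wrapper behaves the same whether the installed SDK swallows transport errors itself or raises them, which is the property the commit is after.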

Documented that the plugin's classification (classify.py, taxonomy.json,
taxonomyPath) is diagnostic-only: the plugin forwards just the tool name and
the daemon performs the authoritative classification that lands in the
signed receipt. Updated README, AGENTS.md, and docstrings accordingly.

Cleared code-quality lint nits: EmitterLike.emit Protocol stub carries a
docstring, FakeSocketServer.stop teardown except blocks are commented, and
the thread-safety test narrows BaseException to Exception.
@ojongerius ojongerius force-pushed the claude/hermes-agent-plugin-28khf branch from cfd9d7d to f8b2f73 June 10, 2026 00:30
Move the plugin from hermes/ to plugins/hermes/ to group third-party
agent-runtime integrations (hermes, future openclaw) apart from the
first-party binaries Agent Receipts ships itself (hook, mcp-proxy, daemon,
collector), which stay at the repo root.

Add .github/workflows/hermes.yml — a path-filtered "CI: hermes" job
modelled on sdk-py.yml (uv sync, ruff check, ruff format --check, pyright
src, pytest across Python 3.11-3.13). Until now nothing ran the plugin's
tests/lint/types in CI.

Also: add plugins/AGENTS.md defining what belongs under plugins/, note the
new directory in the root AGENTS.md layout + quick-reference table, and fix
the relative doc links and pyproject Homepage URL for the deeper path.
Comment thread plugins/hermes/src/agent_receipts_hermes/classify.py
@ojongerius ojongerius marked this pull request as draft June 10, 2026 06:02

2 participants