Releases: iampantherr/SecureContext

v0.18.0 — Sprint 2 Baseline: Self-Improving Skill Engine

29 Apr 02:02

[0.18.0] — 2026-04-29 — Sprint 2 baseline: skill mutation engine + replay + agentskills.io interop

The self-improving skill loop. Skills become first-class hash-protected
artifacts; replay against synthetic fixtures produces composite outcome
scores; mutators propose candidate variants; winners promote atomically.
Per-project skills override global at resolve time. Cross-project
promotion candidates surface via findGlobalPromotionCandidates.

This is the Sprint 2 baseline — verified end-to-end with both unit
tests and a live cross-project demo against Postgres. v0.18.1 (next)
adds the CLI-based runtime mutator + outcome-trigger guardrails + operator-
gated global promotion queue, all without requiring an Anthropic API key.
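
The per-project-over-global resolve rule can be sketched as a pure lookup. The shapes and names below are illustrative, not the actual `src/skills/types.ts` definitions:

```typescript
// Hypothetical row shape -- the real types live in src/skills/types.ts.
interface SkillRow {
  name: string;
  scope: "project" | "global";
  active: boolean;
  body: string;
}

// Per-project skills shadow global ones of the same name at resolve time.
function resolveSkill(rows: SkillRow[], name: string): SkillRow | undefined {
  const candidates = rows.filter((r) => r.active && r.name === name);
  return (
    candidates.find((r) => r.scope === "project") ??
    candidates.find((r) => r.scope === "global")
  );
}
```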

Added — skill subsystem (src/skills/)

  • types.ts (192 lines) — Skill, SkillRun, SkillMutation, MutationContext type graph
  • loader.ts (323 lines) — markdown frontmatter parser + HMAC-SHA256 body sign
  • storage.ts (259 lines) — SQLite CRUD + tamper detection (SkillTamperedError)
  • storage_pg.ts (248 lines) — Postgres mirror for skills_pg / skill_runs_pg / skill_mutations_pg
  • storage_dual.ts (146 lines) — backend-aware dispatch (sqlite | postgres | dual)
  • scoring.ts (246 lines) — composite outcome score (accuracy + cost + speed) + acceptance
  • replay.ts (234 lines) — synthetic-fixture replay harness with HMAC-verify gate
  • mutator.ts (228 lines) — pluggable Mutator interface + helpers
  • mutators/local_mock.ts (71 lines) — deterministic test mutator
  • mutators/realtime_sonnet.ts (125 lines) — Anthropic Messages API direct
  • mutators/batch_sonnet.ts (159 lines) — Anthropic Batch API (50% discount)
  • orchestrator.ts (256 lines) — full select→mutate→replay→promote cycle
  • format/agentskills_io.ts (144 lines) — agentskills.io interop import/export
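
The hash protection follows the usual HMAC sign-then-verify pattern. A minimal sketch with `node:crypto` (function names are illustrative; the real logic lives in loader.ts and storage.ts):

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sign a skill body with HMAC-SHA256; verify before any read or replay.
function signBody(body: string, key: Buffer): string {
  return createHmac("sha256", key).update(body, "utf8").digest("hex");
}

// Constant-time comparison; a mismatch is what storage.ts surfaces
// as SkillTamperedError.
function verifyBody(body: string, sig: string, key: Buffer): boolean {
  const expected = Buffer.from(signBody(body, key), "hex");
  const given = Buffer.from(sig, "hex");
  return expected.length === given.length && timingSafeEqual(expected, given);
}
```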

Added — cron primitive (src/cron/)

  • scheduler.ts (190 lines) — in-process scheduler with persistence, daily/interval triggers, history bound
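
A minimal next-fire computation for daily/interval triggers might look like the following. The trigger shapes are assumptions for illustration, not the scheduler.ts API:

```typescript
// Hypothetical trigger shapes for an in-process scheduler.
type Trigger =
  | { kind: "interval"; everyMs: number }
  | { kind: "daily"; hourUtc: number };

// Next fire time (ms since epoch) strictly after `now`.
function nextFire(trigger: Trigger, now: number, lastRun?: number): number {
  if (trigger.kind === "interval") {
    // Interval triggers fire relative to the last run (or now, if never run).
    const base = lastRun ?? now;
    return base + trigger.everyMs;
  }
  // Daily triggers fire at a fixed UTC hour: today if still ahead, else tomorrow.
  const d = new Date(now);
  const todayAt = Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate(), trigger.hourUtc);
  return todayAt > now ? todayAt : todayAt + 24 * 60 * 60 * 1000;
}
```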

Added — 3 SQLite migrations (20-22) and 3 PG migrations (6-8)

  • skills / skills_pg — versioned hash-protected skill registry (UNIQUE active per name+scope)
  • skill_runs / skill_runs_pg — execution telemetry with composite outcome score
  • skill_mutations / skill_mutations_pg — proposal + replay + promotion ledger

Added — 7 new MCP tools

| Tool | Purpose |
| --- | --- |
| zc_skill_list | List active skills with recent score |
| zc_skill_show | Full skill detail (HMAC-verified) |
| zc_skill_score | Aggregate score + acceptance check |
| zc_skill_run_replay | Replay against fixtures via LocalDeterministicExecutor |
| zc_skill_propose_mutation | Run one mutation cycle on demand |
| zc_skill_export | Export as agentskills.io markdown |
| zc_skill_import | Accept agentskills.io markdown → store as skill |

Added — entrypoint scripts

  • scripts/run-nightly-mutations.mjs — OS cron entrypoint (Linux cron / Windows Task Scheduler)
  • scripts/sprint2-cross-project-demo.mjs — live cross-project promotion demo (verified)
  • scripts/sprint2-live-demo.mjs — single-project mutation cycle demo (verified)

Added — RT-S2-* security tests

  • RT-S2-05: ZC_MUTATOR_MODEL allowlist falls back to local-mock on unknown values
  • RT-S2-07: pre-submission secret_scanner rejects API-key / AWS-key payloads
  • RT-S2-08: skill body HMAC mismatch → SkillTamperedError on storage read
  • RT-S2-09: candidate body HMAC verified before replay; mismatch → marked failed

Documentation

  • docs/SKILLS_WALKTHROUGH.md (~250 lines) — comprehensive usage guide

Test suite: 786/786 (was 645)

  • 132 new Sprint 2 unit tests
  • 9 new PG-mirror integration tests (require live PG)
  • All quality gates green: ESLint 0 errors, env-pinning linter 0 unclassified
  • Live cross-project demo: 9/9 steps pass against real Postgres

Migration notes

  • 3 new SQLite migrations (20-22) auto-apply on first run
  • 3 new PG migrations (6-8) require ZC_TELEMETRY_BACKEND=postgres|dual for activation
  • New env var ZC_MUTATOR_MODEL (allowlist-enforced; defaults to local-mock)
  • No breaking changes — Sprint 2 additions are additive

Architectural decisions ratified (D1-D6)

  • D1: Storage = dual (SQLite per-project default + PG centralized; both supported in this release)
  • D2: Skill scope = hierarchical (per-project overrides global at resolve time)
  • D3: Replay benchmark source = synthetic fixtures first (real-historical replay deferred to Sprint 2.5)
  • D4: Mutation engine = Sonnet 4.6 batch primary + realtime fallback + LocalMock for tests
  • D5: Per-tool-call cost storage (skill_runs.total_cost rolls up)
  • D6: Existing learnings/ JSONL kept; auto-feedback loop from v0.17.2 preserved

Sprint 2.5 deferrals

Tracked in C:\Users\Amit\AI_projects\.harness-planning\ARCHITECTURAL_LESSONS.md:

  • S2.5-1 Subprocess sandbox executor (RT-S2-03/04)
  • S2.5-2 Real-historical replay
  • S2.5-3 Override confirmation prompt (RT-S2-06)
  • S2.5-4 Cross-project auto-promotion
  • S2.5-5 Compacted-segment HMAC (RT-S2-08 for compaction)
  • S2.5-7 zc_unredact tool
  • S2.5-8 Skill injection scanner (RT-S2-01 hardening)

v0.17.2 — Architectural Lints (L1+L3) + Learning-Loop Closure (L4)

20 Apr 14:13

[0.17.2] — 2026-04-20 — Architectural lints (L1+L3) + learning-loop closure (L4)

Pre-Sprint-2 hardening round. Closes three classes of bugs identified by
the v0.17.1 verification retrospective before the mutation-engine build
begins. All three follow the same principle: catch future regressions
automatically so we don't keep rediscovering the same class of bug by luck.

Added — L1: env-pinning linter (scripts/check-env-pinning.mjs)

Static analysis script that walks src/**/*.ts for every process.env.ZC_*
reference, classifies each as CRITICAL / SHARED_PROPAGATED / OPERATIONAL,
and verifies CRITICAL vars are explicitly pinned in BOTH orchestrator +
worker launcher heredocs of A2A_dispatcher/start-agents.ps1.

Would have caught the v0.17.0 ZC_AGENT_ID pollution bug that silently
mis-attributed 16 consecutive tool_calls to the wrong agent_id (breaking
per-agent HKDF subkey isolation + RLS + log scoping).

  • 14-case self-test (scripts/check-env-pinning.test.mjs) covering happy
    path, missing pin, unclassified var, shared-propagation warnings,
    bracket-notation refs, missing dispatcher path.
  • Run via npm run check:env (production) or npm run check:env:test (selftest).
  • Exit 0 = all green, exit 1 = new var unclassified OR critical missing.
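
The scan step can be sketched as a regex pass over source text. The real linter also classifies each var and checks the dispatcher heredocs, which this sketch omits:

```typescript
// Find every process.env.ZC_* reference in a source string, covering both
// dot access and bracket notation (the cases the self-test exercises).
const ENV_REF =
  /process\.env(?:\.(ZC_[A-Z0-9_]+)|\[\s*['"](ZC_[A-Z0-9_]+)['"]\s*\])/g;

function findZcEnvRefs(source: string): string[] {
  const refs = new Set<string>();
  for (const m of source.matchAll(ENV_REF)) {
    refs.add(m[1] ?? m[2]); // group 1: dot access, group 2: bracket access
  }
  return [...refs].sort();
}
```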

Added — L3: ESLint flat config with @typescript-eslint/no-floating-promises

Installed eslint@9 + typescript-eslint@8 with a minimal config focused
on the single most-load-bearing rule: no-floating-promises. When the
outcomes.ts module became async in v0.12.0, the posttool-outcomes.mjs
hook kept calling resolveGitCommitOutcome(...) without await — the
process exited before the async DB write completed. 9 months of
undetected outcome-data loss. The lint would have caught it on the
first write.

  • Scanned src/ on install: found 3 real floating-promise violations
    (2 recordToolCall in server.ts, 1 reader.cancel in fetcher.ts).
    All fixed with explicit void operator + comments documenting intent.
  • Self-test (scripts/test-lint-catches-floating-promise.mjs) creates
    a synthetic TS file with an unawaited call, confirms ESLint fails on
    it, and confirms void + await both silence the rule. 5/5 pass.
  • Run via npm run lint or npm run lint:test.
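
A minimal flat config for this setup might look like the following. The file name, tsconfig path, and glob are illustrative; the repo's actual config may differ:

```javascript
// eslint.config.mjs -- minimal flat config carrying the one load-bearing rule.
// no-floating-promises needs type information, so parserOptions.project
// must point at a tsconfig (path here is an assumption).
import tseslint from "typescript-eslint";

export default tseslint.config({
  files: ["src/**/*.ts"],
  languageOptions: {
    parserOptions: { project: "./tsconfig.json" },
  },
  rules: {
    "@typescript-eslint/no-floating-promises": "error",
  },
});
```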

Added — L4: outcome → learnings JSONL auto-feedback (src/outcome_feedback.ts)

Closes the learning loop. Previously, a failure becoming a learning
required agent discipline: (1) notice failure, (2) write to
failures.jsonl, (3) remember the format, (4) let the hook mirror. Four
points of failure, all behavioral.

Now: recordOutcome({outcomeKind: 'rejected' | 'failed' | 'insufficient' | 'errored' | 'reverted'}) atomically appends a structured JSON line
to <projectPath>/learnings/failures.jsonl. Successful outcomes
(shipped, accepted) with confidence ≥ 0.9 append to
learnings/experiments.jsonl. Future sessions retrieve via zc_search
without any agent discipline required.

Features:

  • Best-effort; swallows errors (never affects the primary outcome row).
  • Auto-creates learnings/ dir if missing (guard: projectPath must exist).
  • Symlink-escape guard: target must resolve inside <projectPath>/learnings/.
  • Payload capped at 64 KB per line; oversized evidence → dropped with a marker.
  • Concurrent writers don't corrupt — single appendFileSync per line.
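
The capped-append behavior can be sketched as follows. Function names and the record shape are illustrative, not the `src/outcome_feedback.ts` API:

```typescript
import { appendFileSync, existsSync, mkdirSync } from "node:fs";
import { join } from "node:path";

const MAX_LINE_BYTES = 64 * 1024;

// Serialize one record; oversized evidence is dropped with a marker.
function encodeLearningLine(record: Record<string, unknown>): string {
  const line = JSON.stringify(record);
  if (Buffer.byteLength(line, "utf8") <= MAX_LINE_BYTES) return line;
  return JSON.stringify({ ...record, evidence: "[dropped: >64KB]" });
}

// Best-effort append: errors are swallowed so feedback can never affect
// the primary outcome row. One appendFileSync per line keeps concurrent
// writers from interleaving partial lines.
function appendLearning(dir: string, record: Record<string, unknown>): void {
  try {
    if (!existsSync(dir)) mkdirSync(dir, { recursive: true });
    appendFileSync(join(dir, "failures.jsonl"), encodeLearningLine(record) + "\n", "utf8");
  } catch {
    /* intentionally swallowed */
  }
}
```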

16 unit tests covering every outcome-kind branch, security guards
(symlink escape, ghost projectPath), large-evidence truncation, rapid
concurrent appends, and downstream-consumer format (learnings-indexer
can mirror these rows into PG).

Live verified end-to-end: calling recordOutcome with kind='rejected'
added one structured line to failures.jsonl tagged
"source":"auto-feedback-v0.17.1". A low-confidence accepted outcome was
correctly skipped; a high-confidence shipped outcome landed in
experiments.jsonl.

Test suite: 645/645 (+16 from v0.17.1)

  • New: src/outcome_feedback.test.ts (16 tests)
  • New: scripts/check-env-pinning.test.mjs (14 cases)
  • New: scripts/test-lint-catches-floating-promise.mjs (5 cases)

Migration

  • No schema changes. No behavior changes for existing outcomes — the
    feedback module is additive. Projects with no learnings/ dir get one
    auto-created on the first failure/success outcome.
  • Operators running CI should add npm run check:env + npm run lint
    to the pipeline.

v0.17.1 — Agent-Idle Fixes + Recall Cache + Cost Correctness

20 Apr 13:56

[0.17.1] — 2026-04-20 — Agent-idle fixes (A+B+C+D) + recall cache + cost-correctness (Tier 1+2)

Hotfix round addressing five issues found in live verification of v0.17.0:
(a) agents going idle after zc_summarize_session instead of draining the
task queue, (b) zc_recall_context dominating session cost at ~82% on Opus,
(c) tool-call cost accounting billed at the wrong rate (5× over-reported on
Opus), (d) infra-tool noise polluting the orchestrator's "do it myself vs.
delegate to Sonnet developer" cost comparisons, and (e) seven
architectural bugs surfaced by end-to-end data-flow tracing.

Added — src/recall_cache.ts (60s TTL + change-detection)

  • In-memory cache for zc_recall_context keyed by (project_path, agent_id).
    TTL 60s; cache miss on any new working_memory / broadcasts /
    session_events row. Repeat calls inside the window return the prior
    response prefixed with (cached Xs ago) — saves ~800 output tokens per hit.
    Estimated savings: ~$0.06/call on Opus, ~$0.012/call on Sonnet.
  • force: true arg bypasses the cache when an agent explicitly wants fresh data.
  • Cache is scoped per (project_hash, agent_id) — no cross-agent leakage.
  • Process-lifetime only; max 64 entries with FIFO prune.
  • 11 unit tests.
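
A minimal version of this cache, with an injectable clock so TTL behavior is testable (names illustrative; real code in src/recall_cache.ts):

```typescript
interface CacheEntry { value: string; storedAt: number }

// TTL cache keyed by (projectHash, agentId); process-lifetime only.
class RecallCache {
  private entries = new Map<string, CacheEntry>();
  constructor(
    private ttlMs = 60_000,
    private maxEntries = 64,
    private clock: () => number = Date.now,
  ) {}

  get(projectHash: string, agentId: string): string | undefined {
    const hit = this.entries.get(`${projectHash}::${agentId}`);
    if (!hit) return undefined;
    if (this.clock() - hit.storedAt > this.ttlMs) return undefined; // stale
    return hit.value;
  }

  set(projectHash: string, agentId: string, value: string): void {
    const key = `${projectHash}::${agentId}`;
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      // FIFO prune: Map preserves insertion order, so first key is oldest.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, storedAt: this.clock() });
  }
}
```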

Added — Tier 1 pricing: computeToolCallCost() in src/pricing.ts

Tool calls now billed from the LLM's perspective:

  • Tool call args (what the LLM generated to invoke) → billed at model's output rate
  • Tool response (what the LLM reads on its next turn) → billed at model's input rate

The naive computeCost() inverted these, over-reporting cost by ~5× on Opus
(output $75/Mtok vs. input $15/Mtok). For zc_recall_context:

  • Before: 798 × $75/Mtok = $0.060 (treated as Opus output)
  • After: 798 × $15/Mtok = $0.012 (Opus reads as input on next turn)

Matters because the Opus orchestrator uses cost tracking to decide "do I
handle this myself vs. delegate to the Sonnet developer" — inflated
numbers nudge toward unnecessary delegation.
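
The corrected formula can be sketched as follows, with rates taken from the figures above; the real function in src/pricing.ts may differ in shape:

```typescript
// Per-model rates in USD per million tokens.
interface ModelRates { inputPerMtok: number; outputPerMtok: number }

const OPUS: ModelRates = { inputPerMtok: 15, outputPerMtok: 75 };

// From the LLM's perspective: call arguments were *generated* by the model
// (output rate); the tool's response is *read* on the next turn (input rate).
function computeToolCallCost(
  rates: ModelRates, argTokens: number, responseTokens: number,
): number {
  return (
    (argTokens * rates.outputPerMtok + responseTokens * rates.inputPerMtok) /
    1_000_000
  );
}
```

The 798-token zc_recall_context response from the example above lands at `computeToolCallCost(OPUS, 0, 798)` ≈ $0.012 rather than the naive $0.060.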

Added — Tier 2 infra-tool zero-cost (INFRA_TOOLS set)

DB-assembly tools (zc_recall_context, zc_file_summary, zc_project_card,
zc_status) now return cost_usd=0. Rationale: their responses are
deterministic from DB state — no LLM, no Ollama, no external service — so
per-call work is negligible. Token counts still accurate so audits can
recompute via computeToolCallCost.

Override: set ZC_DISABLE_INFRA_ZERO_COST=1 when you want full cost
reconciliation against Anthropic invoices.

Added — HTTP endpoint GET /api/v1/queue/stats-by-role

Returns { role: { queued, claimed, done, failed } } for task_queue_pg.
Used by the A2A dispatcher's new checkWorkerWake (see A2A_dispatcher
v0.17.1) to poke idle workers when their role has claimable work.

Fixed — outcomes resolver pipeline (3 latent bugs from v0.12.0+)

  1. getMostRecentToolCallForSession was SQLite-only. In Postgres mode
    session lookups returned null → resolveGitCommitOutcome +
    resolveFollowUpOutcomes silently no-op'd. Result: every outcome row
    since v0.12.0 (when the function became async) failed to persist.
  2. posttool-outcomes.mjs hook had the same SQLite-only query for session
    id discovery. Fixed with the same PG lookup + SQLite fallback pattern.
  3. Hook called resolveGitCommitOutcome(...) without await. Process
    exited before the async resolver's DB write completed. 9 months of
    undetected outcome-data loss
    (L3 in the architectural-lessons doc).

Fixed — learnings-indexer.mjs hook coverage gaps

  1. Previously matched only Write|Edit|MultiEdit|NotebookEdit. Agents
    using echo ... >> learnings/X.jsonl via Bash silently bypassed the
    hook. Now matches Bash too and parses >> / > redirection
    targets from the command.
  2. Hook only wrote to SQLite; Postgres learnings_pg populated only via
    manual scripts/backfill-learnings.mjs. Now mirrors to PG when
    ZC_TELEMETRY_BACKEND=postgres|dual. Module-resolution handles running
    from ~/.claude/hooks/ with no node_modules via file:// fallback
    to SC repo's node_modules/pg.
  3. projectPath hashing normalized via realpathSync so forward-slash /
    backslash variants on Windows hash consistently.

Test suite: 629/629 (+12 from v0.17.0)

  • Added src/recall_cache.test.ts (11 tests: cold-miss, hit, staleness,
    cross-agent/project isolation, TTL, undefined-agent bucketing).
  • Added telemetry non-infra-tool cost test.
  • Updated postgres_backend.test.ts RT-S3-06 + sprint1_integration.test.ts
    for new cost formula.

Migration

  • Pure code fixes — no schema changes.
  • Historical tool_calls_pg rows retain their old cost_usd values; new
    rows use corrected formula.
  • To use -WorkerCount N with PG backend, ensure sc-api is rebuilt from
    v0.17.1 source (adds /api/v1/queue/stats-by-role endpoint).

v0.17.0 — Work-Stealing Queue + Model Router + Ownership Guard + Multi-Worker Pools

20 Apr 03:07

[0.17.0] — 2026-04-20 — Sprint 3 Phase 3: Work-Stealing Queue + Model Router + Ownership Guard + Multi-Worker Pools

Sprint 3 Phase 3 — the pieces that let multiple workers in the same role share one task queue without stepping on each other. Closes the "single worker per role" limit that v0.15.0/v0.16.0 left in place.

Added — Postgres work-stealing queue (§8.2)

  • task_queue_pg table (migration id=5) with state CHECK constraint + routing index (project_hash, role, state, ts) + partial heartbeat index WHERE state='claimed'.
  • src/task_queue.ts — seven operations backed by FOR UPDATE SKIP LOCKED so N workers can race-claim atomically without blocking each other:
    • enqueueTask() — idempotent (ON CONFLICT DO NOTHING)
    • claimTask() — atomic primitive (UPDATE ... WHERE task_id = (SELECT ... FOR UPDATE SKIP LOCKED LIMIT 1))
    • heartbeatTask() — workers must call every 30s
    • completeTask() / failTask() — terminal states (fail bumps retries)
    • reclaimStaleTasks(staleAfterSeconds=300) — sweep dead claims back to queue
    • getQueueStats() — counts by state
  • 13 unit tests (src/task_queue.test.ts) including:
    • RT-S4-01: 50 concurrent workers × 100 tasks → each task claimed EXACTLY once (no double-claim; core correctness property of SKIP LOCKED)
    • RT-S4-02: 600s-stale heartbeat → reclaim back to queued + retries++
    • RT-S4-03: failTask bumps retries + persists failure_reason
    • RT-S4-04: cross-role + cross-project scope isolation
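
The claim primitive's essence, sketched against a minimal client interface; exact SQL and column names in src/task_queue.ts may differ:

```typescript
// Minimal PG-client shape so the sketch stays dependency-free; in the
// real code this is a pg Pool.
interface PgLike {
  query(sql: string, params: unknown[]): Promise<{ rows: any[] }>;
}

// FOR UPDATE SKIP LOCKED is the core of the work-stealing queue: a row
// another transaction has already locked is skipped instead of waited on,
// so N racing workers each claim a different task, exactly once.
const CLAIM_SQL = `
  UPDATE task_queue_pg
     SET state = 'claimed', claimed_by = $3, claimed_at = now()
   WHERE task_id = (
           SELECT task_id FROM task_queue_pg
            WHERE project_hash = $1 AND role = $2 AND state = 'queued'
            ORDER BY ts
            FOR UPDATE SKIP LOCKED
            LIMIT 1
         )
  RETURNING *`;

async function claimTask(
  pool: PgLike, projectHash: string, role: string, workerId: string,
): Promise<Record<string, unknown> | null> {
  const res = await pool.query(CLAIM_SQL, [projectHash, role, workerId]);
  return res.rows[0] ?? null; // null => nothing claimable right now
}
```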

Added — 6 MCP tools exposing the queue

  • zc_enqueue_task (orchestrator) · zc_claim_task (worker) · zc_heartbeat_task · zc_complete_task · zc_fail_task · zc_queue_stats
  • Worker agent_id is sourced from ZC_AGENT_ID env var so a multi-worker pool (e.g. developer-1/2/3 all role=developer) shares one queue keyed by (project_hash, role) and claims atomically.
  • 5 MCP integration tests (src/task_queue_mcp.test.ts) covering end-to-end lifecycle, 3-worker race, fail path, stats aggregation, cross-project isolation.

Added — Complexity-based model router (§8.5)

  • src/indexing/model_router.ts — chooseModel(complexity 1-5) returns {model, tier, reason, estimatedInputCostPerMtok, inputClamped}:
    • 1-2 → Haiku 4.5 (trivial tasks, $0.25/Mtok)
    • 3-4 → Sonnet 4.6 (standard work, $3.00/Mtok — cost/quality sweet spot)
    • 5 → Opus 4.7 (hard reasoning, $15.00/Mtok)
  • Env overrides: ZC_MODEL_TIER_{HAIKU,SONNET,OPUS} resolved per call so operators can flip at runtime.
  • Safe defaults: null / undefined / NaN / Infinity / out-of-range → Sonnet with inputClamped=true.
  • 19 unit tests covering tier mapping, rounding, clamping edges, env overrides, result shape.
  • zc_choose_model MCP tool wraps it.
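
A sketch of the mapping and its safe defaults (model ids and result fields are illustrative; the `reason` field is omitted here):

```typescript
interface RouteResult {
  model: string;
  tier: "haiku" | "sonnet" | "opus";
  estimatedInputCostPerMtok: number;
  inputClamped: boolean;
}

function chooseModel(complexity: unknown): RouteResult {
  const n = typeof complexity === "number" && Number.isFinite(complexity)
    ? Math.round(complexity)
    : NaN;
  // Safe default: null / undefined / NaN / Infinity / out-of-range -> Sonnet.
  if (!(n >= 1 && n <= 5)) {
    return { model: "sonnet", tier: "sonnet", estimatedInputCostPerMtok: 3, inputClamped: true };
  }
  if (n <= 2) return { model: "haiku", tier: "haiku", estimatedInputCostPerMtok: 0.25, inputClamped: false };
  if (n <= 4) return { model: "sonnet", tier: "sonnet", estimatedInputCostPerMtok: 3, inputClamped: false };
  return { model: "opus", tier: "opus", estimatedInputCostPerMtok: 15, inputClamped: false };
}
```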

Added — File-ownership overlap guard at /api/v1/broadcast (§8.2)

  • HTTP API rejects ASSIGN whose file_ownership_exclusive overlaps any in-flight (unmerged) ASSIGN's exclusive set → HTTP 409 Conflict with overlapping_files + conflicting_broadcast_id. Prevents two workers being assigned the same file.
  • "In-flight" = ASSIGN whose task has no subsequent MERGE in the last 200 broadcasts.
  • 5 integration tests (src/ownership_guard.test.ts):
    • RT-S4-05: overlapping exclusive → 409
    • RT-S4-06: disjoint exclusive → 200
    • RT-S4-07: re-ASSIGN allowed after MERGE of the prior task
    • Plus back-compat (no excl set) + non-ASSIGN types bypass guard

Fixed — recallSharedChannel was silently dropping v0.15.0 §8.1 structured columns

SQLite-path recallSharedChannel only projected legacy columns. All downstream consumers saw file_ownership_exclusive=undefined even when the DB column was populated — the ownership-guard work surfaced this hidden v0.15.0 gap. Now projects all 7 v0.15.0 §8.1 columns with NULL → undefined semantics.

Added — -WorkerCount N on start-agents.ps1 + role-tagged registration (A2A_dispatcher side)

  • New -WorkerCount param (1-20, default 1). When > 1, expands each -Roles entry into N numbered workers suffixed -1..-N:
    start-agents.ps1 -Roles developer -WorkerCount 3
    # → spawns developer-1, developer-2, developer-3
    #   each with its own WT window, worktree, registration
    #   all sharing role="developer" — one work-stealing queue
  • Get-AgentRole helper strips -N suffix so $roleMeta + roles.json deep-prompt lookups still work.
  • register.mjs accepts --role flag / ZC_AGENT_ROLE env → writes _agent_roles[agentId] sidecar so dispatcher can route by role without breaking the existing agentId → pane string map.
  • Back-compat: WorkerCount=1 (default) preserves legacy plain names ("developer" not "developer-1").
  • Env propagation fix: worker/orchestrator launch scripts now also propagate ZC_POSTGRES_* + ZC_TELEMETRY_BACKEND so the agent's MCP server can reach task_queue_pg (closes the longstanding v0.10.4 env-propagation follow-up).

Added — scripts/backfill-learnings.mjs (close the learning loop)

  • The PostToolUse learnings-indexer.mjs hook only mirrors NEW Write/Edit events — prior <project>/learnings/*.jsonl rows never get indexed into learnings / learnings_pg. So agents couldn't zc_search past decisions/failures from earlier sessions.
  • New script scans <project>/learnings/*.jsonl, categorizes by filename stem, idempotently upserts (via UNIQUE), mirrors to PG when ZC_TELEMETRY_BACKEND=postgres|dual.
  • Verified on Test_Agent_Coordination: 6 rows backfilled (3 decisions + 3 metrics). Previously both SQLite and PG had 0 learnings rows despite JSONL content existing.

Test Suite

  • 617/617 unit+integration tests pass (was 575 pre-v0.17.0; +42 new: 13 task_queue + 19 model_router + 5 ownership guard + 5 task_queue MCP).
  • Live E2E on Test_Agent_Coordination with -WorkerCount 3: the agent called zc_choose_model (verified the 2→haiku, 4→sonnet, 5→opus tier mapping), enqueued 3 disjoint-ownership tasks via zc_enqueue_task, and workers atomically claimed them via zc_claim_task and committed actual file hardening (e.g. checkRequest(req) in src/rate-limiter.js now throws TypeError: rate-limiter: req argument is required; "harden: validate argv in index", commit f25acf5a).

Migration

  • Schema: migration id=5 (task_queue_pg) is idempotent + additive — Postgres-only feature (no SQLite companion).
  • API: zero breaking changes. All new MCP tools are additive.
  • Env for workers: if you run in HTTP/Postgres mode, restart agents via start-agents.ps1 so they pick up the updated launch scripts that propagate ZC_POSTGRES_*. Until then, zc_enqueue_task/zc_claim_task return Postgres pool unavailable.

v0.16.0 — Sprint 3 Phase 2: Postgres Backend + T3.1 SET LOCAL ROLE + T3.2 RLS

19 Apr 01:40

Sprint 3 Phase 2 — Postgres backend (deferred since v0.12.x) + both Tier 3 access-control fixes from §8.6 of the canonical plan. Closes the v0.15.0 limitation where structured ASSIGN fields were silently dropped in HTTP API mode.

Three major adds

Postgres backend for telemetry/outcomes

`ChainedTablePostgres` mirrors `ChainedTableSqlite` using `BEGIN; SELECT row_hash ... FOR UPDATE; INSERT; COMMIT` (Postgres analog of SQLite's BEGIN IMMEDIATE). Same chain content (HKDF-keyed HMAC) — rows byte-identical across backends, migration is a SQL copy.

Wired in via existing `ZC_TELEMETRY_BACKEND=sqlite|postgres|dual` env switch.

Tier 3 fix T3.1 — per-query SET LOCAL ROLE

Each agent now writes telemetry under a per-agent Postgres role (`zc_agent_<agent_id>`), lazily provisioned with minimum INSERT/SELECT/UPDATE grants. Each chained INSERT runs inside `BEGIN; SET LOCAL ROLE <agent_role>; INSERT; COMMIT` — Postgres' `current_user` reflects the actual writing agent, not the pool's user.

Tier 3 fix T3.2 — Row-Level Security on outcomes_pg

4 RLS policies enforce read tiers (Chin & Older 2011 Ch5+13, Bell-LaPadula confidentiality):

  • `public/internal` → any role
  • `confidential` → registered agent
  • `restricted` → ONLY `created_by_agent_id` (matched against `current_setting('zc.current_agent')`)

This is enforced INSIDE Postgres, not in app code. Even a compromised agent process with valid DB credentials cannot read other agents' restricted outcomes.

HTTP API forwards structured ASSIGN columns

Closes the v0.15.0 known limitation. `POST /api/v1/broadcast` now accepts and forwards all 7 v0.15.0 structured fields.

Tests

  • 575/575 pass (565 + 10 new Postgres tests)
  • Postgres tests run against real local Docker container, auto-skip when no PG reachable
  • RT-S3-05 verified live: cross-agent read of `'restricted'` row blocked by Postgres RLS even with shared DB credentials
  • RT-S3-06 verified live: chain hashes byte-identical across SQLite + Postgres (rows migrate without rehashing)

Bugs found + fixed during integration

  1. `provisionAgentRole` originally inside writer txn → grants invisible to SET LOCAL ROLE. Fixed via separate-connection provisioning.
  2. `SELECT FOR UPDATE` needs `UPDATE` privilege on most PG versions — added explicit GRANT.
  3. Missing `GRANT USAGE ON SCHEMA public` — required for table access.

Known limitations

  • Existing `securecontext-api` Docker container is v0.8.0 — needs rebuild (`docker compose build sc-api && docker compose up -d sc-api`) to pick up v0.16.0 endpoints
  • Live multi-agent test through `start-agents.ps1` with `ZC_TELEMETRY_BACKEND=postgres` requires that container rebuild — functionally validated via 10 unit tests against real Postgres + RT-S3-05 cross-agent RLS test

Upgrade notes

Backward-compatible by default. Don't set `ZC_TELEMETRY_BACKEND` and SQLite continues exactly as v0.15.0.

To enable Postgres backend:

  1. Set `ZC_POSTGRES_PASSWORD` (or `ZC_POSTGRES_URL`)
  2. Set `ZC_TELEMETRY_BACKEND=postgres` (or `=dual` for parity verification)
  3. Pool's owning role needs `CREATEROLE` privilege (bundled `scuser` already has it)
  4. Rebuild + redeploy the Docker `securecontext-api` container

What's next

v0.17.0 — §8.2-8.5 work-stealing queue + worker pool spawning + file-ownership enforcement + complexity-based model routing. Uses the Postgres backend shipped here.


See CHANGELOG.md for full details.

v0.15.0 — Sprint 3 Phase 1: Structured ASSIGN + MAC Classification (Tier 3 part)

19 Apr 00:52

First slice of Sprint 3. Foundation pieces that don't require Postgres backend.

Two features

§8.1 Structured ASSIGN broadcast schema (additive, backward-compatible)

7 new optional fields on `zc_broadcast` for type=ASSIGN:

  • `acceptance_criteria` (testable assertions)
  • `complexity_estimate` (1-5)
  • `file_ownership_exclusive` + `file_ownership_read_only` (path-traversal-filtered)
  • `task_dependencies` (broadcast IDs that must MERGE first)
  • `required_skills`
  • `estimated_tokens`

Existing ASSIGN broadcasts work unchanged (backward-compat). Dispatcher in v0.17.0 will consume these for tier routing + file-ownership enforcement.

§8.6 T3.2 MAC-style classification on outcomes (Chin & Older 2011 Ch5+Ch13)

Classification labels: `public` / `internal` / `confidential` / `restricted` with read-filter:

  • `'restricted'` rows readable ONLY by `created_by_agent_id` — closes the cross-agent leak gate from §8.6 T3.2

`resolveUserPromptOutcome` now auto-tags `'restricted'` with the agent's identity (sentiment about user messages belongs to the originating agent only).

Tests

  • 565/565 pass (541 baseline + 24 new)
  • RT-S3-02: cross-agent read of `'restricted'` row blocked
  • RT-S3-03: legacy rows get `'internal'` default; CHECK blocks NULL
  • RT-S3-04: SQL injection via classification value blocked by CHECK constraint
  • Edge cases: complexity clamping, oversize cap, path traversal, integer-only deps, downgrade of restricted-without-creator

Live verification

Real Claude CLI agent on Test_Agent_Coordination processed broadcast #1037 (4 tool_calls). Local-mode broadcastFact verified all 7 structured fields round-trip through SQLite.

Known limitations (deferred to v0.16.0)

  • HTTP API mode: existing api-server (Docker container) doesn't yet know about structured ASSIGN columns. Local mode works fully.
  • T3.1 per-agent Postgres role: deferred since it depends on Postgres backend landing first (per §8.6 acceptance criteria).
  • v0.17.0 will land §8.2-8.5: work-stealing queue, worker pool spawning, file-ownership enforcement, complexity-based routing.

See CHANGELOG.md for full details.

v0.14.0 — Native AST + Provenance Tagging + Louvain Community Detection

18 Apr 22:36

The "deeper internal capabilities" release. Three features that complement v0.13.0's graphify integration — bringing similar structural-understanding capabilities natively to SC's KB even when graphify isn't available.

Three features

Phase A — Provenance tagging

Every `working_memory` and `source_meta` row now carries a `provenance` flag (Chin & Older 2011 Ch6+Ch7 'speaks-for' formalism — every claim carries its trust chain):

  • EXTRACTED — read directly from a primary source
  • INFERRED — produced by an LLM
  • AMBIGUOUS — multiple plausible readings
  • UNKNOWN — legacy default

API additive (backward compat). Promotion/downgrade via re-assert. Migrations 16+17 with CHECK constraint. RT-S3-01 verifies SQL injection blocked.

Phase B — AST extractor (TS/JS/Python)

Regex-based deterministic L0/L1 for code files without an LLM call. ~80% LLM cost reduction on indexing for code-heavy projects.

Live samples from the agent run:

  • `rate-limiter.js` → "REST API Rate Limiter Middleware. Contains 1 class, 1 function."
  • `search.js` → "Task Search — Fuzzy Matching... Contains 2 functions, 1 import."

Why regex first, tree-sitter later: tree-sitter requires per-language WASM grammars (~500KB each) that aren't bundled. Regex covers 80/20 case at zero install friction. Interface designed for v0.15.0 swap with no breaking change.
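
A toy version of the regex pass, reproducing the summary format of the live samples above (heuristic only; the shipped extractor covers far more cases and languages):

```typescript
// Count top-level JS/TS constructs without parsing -- a deterministic,
// LLM-free L0 summary. Regexes are deliberately simple (80/20 coverage).
function summarizeJs(source: string): string {
  const classes = source.match(/^\s*(?:export\s+)?class\s+\w+/gm)?.length ?? 0;
  const functions =
    source.match(/^\s*(?:export\s+)?(?:async\s+)?function\s+\w+/gm)?.length ?? 0;
  const imports =
    source.match(/^\s*(?:import\s.+from\s|const\s+\w+\s*=\s*require\()/gm)?.length ?? 0;
  const parts: string[] = [];
  if (classes) parts.push(`${classes} class${classes > 1 ? "es" : ""}`);
  if (functions) parts.push(`${functions} function${functions > 1 ? "s" : ""}`);
  if (imports) parts.push(`${imports} import${imports > 1 ? "s" : ""}`);
  return parts.length
    ? `Contains ${parts.join(", ")}.`
    : "No top-level constructs found.";
}
```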

Phase C — Louvain community detection

`zc_kb_cluster` + `zc_kb_community_for` MCP tools cluster KB sources by graph topology (no embeddings needed). For "what's related to X" questions, two files that import each other are obviously related — no embedding call needed.

Live verification: clustered 26 sources from Test_Agent_Coordination into 5 communities (sizes 6+6+5+2+1).

(Algorithm note: Louvain not Leiden — Leiden isn't published as npm package. Same family, similar quality.)

Test summary

  • 541/541 tests pass (470 baseline + 71 new)
  • Live agent run: All three features fired correctly with real Claude CLI agents on Test_Agent_Coordination
  • Edge cases covered: empty files, syntax-broken files, very-large >5MB files, comments-only, abstract classes, generator functions, default exports, Python `__all__`, async def, decorators

Two new MCP tools

| Tool | Purpose |
| --- | --- |
| `zc_kb_cluster()` | Run Louvain over KB; persist communities |
| `zc_kb_community_for(source)` | Look up a source's community + community-mates |

Backward compatible

All existing code paths unchanged:

  • `rememberFact` and `indexContent` keep old signatures (provenance is new optional last arg)
  • AST is automatic for code extensions (no API change)
  • Migrations 16+17 are defensive (idempotent)

Recommended workflow for agents

| Question | Right tool |
| --- | --- |
| "What's the architecture of this project?" | `zc_kb_cluster` first, drill in with `zc_kb_community_for` |
| "What's related to file X?" | `zc_kb_community_for("file:src/X.ts")` |
| "Summarize this code file" | `zc_file_summary` (now AST-extracted if TS/JS/Python) |

What's next

Sprint 3 picks up Tier 3 access-control fixes — see `HARNESS_EVOLUTION_PLAN.md §8.6` (locked with hard "DO NOT START" gate).


See CHANGELOG.md for full details.

v0.13.0 — graphify Integration: Structural Knowledge Graph as a First-Class SC Capability

18 Apr 21:24

SC + graphify stacked. SC now proxies to graphify (29.7k★, AI coding assistant skill) so agents can navigate the structural knowledge graph alongside SC's persistent state + telemetry. They solve different problems and stack multiplicatively for token savings on architectural questions.

Three new MCP tools

  • `zc_graph_query(query)` — natural-language query over the structural graph (god nodes, communities, relationships)
  • `zc_graph_path(from, to)` — shortest path between two named nodes
  • `zc_graph_neighbors(node)` — immediate neighbors of a named node

All three return helpful hints when graphify isn't set up — they're inert until `pip install graphifyy && /graphify .` is run.

Auto-index `GRAPH_REPORT.md`

`zc_index_project` now auto-detects `graphify-out/GRAPH_REPORT.md` and indexes it into SC's KB so agents discover it via normal `zc_search` without needing to know graphify exists.

Token savings (combined SC + graphify)

| Question | Without either | SC alone | graphify alone | Both stacked |
| --- | --- | --- | --- | --- |
| Architectural ("how does auth work") | ~25k | ~2k | ~500 (orient) | ~1.5k |
| State / history | N/A | ~1.5k | N/A | ~1.5k |
| Specific implementation | N/A | ~800 | N/A | ~800 |

Tests

  • 470/470 pass (459 baseline + 11 new graph_proxy tests)
  • Live subprocess path not unit-tested (requires Python + graphifyy in CI; covered by manual integration)

How to enable

```bash
# One-time
pip install graphifyy && graphify install

# Per project
/graphify .

# In your AI assistant
zc_graph_query "how does the auth flow connect to the database?"
```

If graphify isn't installed, SC works exactly as before. The new tools just return hints.

Recommended workflow

| Question type | Right tool |
| --- | --- |
| Architectural / structural | `zc_graph_query` first, then `zc_search` for precise content |
| State / history | `zc_recall_context` |
| Specific implementation | `zc_search` |
| What's connected to X | `zc_graph_neighbors` |

Deferred to v0.14.0

The deeper internal capabilities (these complement graphify rather than replacing it):

  • Native AST tree-sitter pre-pass for code files (LLM-free L0 — ~50% indexing cost reduction)
  • EXTRACTED / INFERRED / AMBIGUOUS provenance tagging (Chin & Older 2011 "speaks-for" formalism — every claim carries its trust chain)
  • Leiden community detection over SC's KB (graph topology beats vector similarity for some queries)

Then Sprint 3 picks up Tier 3 access-control fixes — see `HARNESS_EVOLUTION_PLAN.md §8.6`.


See CHANGELOG.md for full details.

v0.12.1 — Tier 2: Reference Monitor + session_token Binding for Telemetry

18 Apr 21:17

Choose a tag to compare

Closes the two largest remaining access-control gaps from the v0.12.0 design review. Telemetry writes now have a single bypass-proof enforcement point that authenticates the writer's identity rather than merely verifying row integrity.

Highlights

  • HTTP API Reference Monitor — `POST /api/v1/telemetry/tool_call` and `/outcome` enforce session_token binding before any DB write. Pattern from Chin & Older 2011 Ch12 ("Reference Monitor" — exactly one enforcement point per protected resource, tamper-proof + always invoked + verifiable).
  • session_token binding — every telemetry write requires `Authorization: Bearer <session_token>`. Server asserts the token's bound `agent_id` matches the row's claimed `agent_id` (HTTP 403 on mismatch).
  • `ZC_TELEMETRY_MODE` env switch — `local` (default, unchanged), `api` (route through Reference Monitor), `dual` (write to both for migration).
  • Token cache + lifecycle — fetched lazily, cached 1 hour, re-fetched on 401, falls back to local mode if unreachable.
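The binding check at the enforcement point can be sketched as follows. This is a hypothetical, minimal version: the `SessionToken` shape, the `lookupToken` helper, and the function name are illustrative assumptions, not SC's actual API; only the `aid` field, the Bearer scheme, and the 401/403 outcomes come from the notes above.

```typescript
// Hypothetical sketch of the session_token binding check. The SessionToken
// shape and lookupToken helper are assumptions, not SC's actual API.
type SessionToken = { aid: string; projectId: string };

function authorizeTelemetryWrite(
  authHeader: string | undefined,
  body: { agentId: string },
  lookupToken: (raw: string) => SessionToken | undefined
): number {
  if (!authHeader || !authHeader.startsWith("Bearer ")) {
    return 401; // missing or malformed Authorization header
  }
  const token = lookupToken(authHeader.slice("Bearer ".length));
  if (!token) {
    return 401; // unknown or revoked token
  }
  if (token.aid !== body.agentId) {
    return 403; // row claims a different agent than the token is bound to
  }
  return 200; // identity verified; the DB write may proceed
}
```

The point of the pattern is that this is the only code path that can reach the DB write, so the check cannot be skipped by a misbehaving agent process.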

Security closes Tier 2 gaps

| Gap | Before v0.12.1 | After v0.12.1 |
| --- | --- | --- |
| #1 No bypass-proof enforcement | Each agent's MCP server opened the project DB directly | All writes route through the API; only the API process holds DB write authority |
| #2 `agent_id` was an unauthenticated string | Agent A could write rows claiming to be agent B | API verifies `body.agentId === token.aid`; forgery blocked with HTTP 403 |

Combined with v0.12.0's per-agent HMAC subkey (Tier 1 #1), telemetry rows are now integrity-protected (chain) AND authenticated (token-bound writer).

Red-team tests

  • RT-S2-02: alice's token cannot write a row claiming bob → 403
  • RT-S2-03: missing/malformed/empty Authorization header → 401
  • RT-S2-04: revoked token → 401
  • RT-S2-05: project-A token used against project-B → 401 (project-scoped capability per Ch11)
  • RT-S2-06: end-to-end via `recordToolCallViaApi` succeeds with valid token

Test summary

  • 459/459 tests pass (449 baseline + 10 new Reference Monitor tests)
  • Stress test still passes: chain ✓ OK under 10 concurrent writers × 100 calls (458 writes/sec)

Upgrade notes

Backward-compatible by default. Existing deployments continue using local-mode SQLite unless they set `ZC_TELEMETRY_MODE=api`.

For multi-agent production deployments:

  1. Set `ZC_API_KEY` (already required for v0.9.0+ broadcast RBAC)
  2. Set `ZC_TELEMETRY_MODE=api` in agent environments
  3. Set `ZC_AGENT_ID` + `ZC_AGENT_ROLE` per agent (used for session_token issuance)
  4. Rebuild the SC HTTP API Docker image — the new `/api/v1/telemetry/*` endpoints require the v0.12.1 code, and the currently published image predates it
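Taken together, an agent's environment for API-mode telemetry might look like this. All four variable names come from the steps above; the values (including the `worker` role) are placeholders, not values SC prescribes.

```shell
# Illustrative agent environment for API-mode telemetry (values are placeholders)
export ZC_API_KEY="your-api-key"   # required since v0.9.0 for broadcast RBAC
export ZC_TELEMETRY_MODE=api       # route writes through the Reference Monitor
export ZC_AGENT_ID=alice           # identity used for session_token issuance
export ZC_AGENT_ROLE=worker        # role used for session_token issuance
```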

Deferred to v0.12.2

  • Postgres backend (`ChainedTablePostgres`) — second `ChainedTable` implementation
  • Tier 1 fix #2 (POSIX 0700/0600 hardening)
  • Tier 1 fix #3 (per-agent Postgres role with INSERT-only grant)
  • Cross-backend stress test
  • Docker image rebuild + publish

Sprint 3 then picks up Tier 3 — see `HARNESS_EVOLUTION_PLAN.md §8.6` (locked in with hard "DO NOT START Sprint 3 until..." gate).


See CHANGELOG.md for full details.

v0.12.0 — Sprint 2 Prep: ChainedTable Abstraction + Per-Agent HMAC Subkey (Tier 1 #1)

18 Apr 21:08

Choose a tag to compare

Foundation release for the dual-backend telemetry roadmap. Ships the storage abstraction layer that v0.12.1 will plug Postgres into, and closes the largest pre-existing access-control gap in v0.11.0's hash-chain design.

Highlights

  • ChainedTable backend-agnostic abstraction with HKDF-derived per-agent HMAC subkey
  • Tier 1 access-control fix #1 closed: per-agent HMAC subkey blocks cross-agent row forgery (RT-S2-01 verifies)
  • Async public API (Option 4): recordToolCall, recordOutcome, and the 3 resolvers are all async — SQLite path stays sync internally; future backends drop in without API change
  • Removed _lastHashCache from v0.11.0 (was redundant with BEGIN IMMEDIATE, added a Heisenbug surface)
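As an illustration of what a backend-agnostic chained table with an async public API looks like (not SC's actual code — the class, its key handling, and the single-key HMAC are simplified assumptions; SC uses per-agent subkeys):

```typescript
import { createHmac } from "node:crypto";

// Minimal in-memory sketch of the ChainedTable idea: each row's HMAC covers
// the previous row's hash plus its own payload, so any edit breaks the chain.
type Row = { payload: string; prevHash: string; hash: string };

class InMemoryChainedTable {
  private rows: Row[] = [];
  constructor(private key: Buffer) {}

  async recordToolCall(payload: string): Promise<void> {
    const prevHash = this.rows.length > 0 ? this.rows[this.rows.length - 1].hash : "genesis";
    const hash = createHmac("sha256", this.key).update(prevHash + payload).digest("hex");
    this.rows.push({ payload, prevHash, hash });
  }

  async verifyChain(): Promise<{ ok: boolean; brokenAt?: number }> {
    let prev = "genesis";
    for (let i = 0; i < this.rows.length; i++) {
      const r = this.rows[i];
      const expected = createHmac("sha256", this.key).update(prev + r.payload).digest("hex");
      if (r.prevHash !== prev || r.hash !== expected) {
        return { ok: false, brokenAt: i };
      }
      prev = r.hash;
    }
    return { ok: true };
  }
}
```

Keeping the public surface async even though the SQLite path is synchronous underneath is what lets a Postgres (or dual) backend slot in later without touching callers.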

Security closes Tier 1 Gap #5 (Chin & Older 2011, Ch6+Ch7)

v0.11.0 used the raw machine secret as the HMAC key, making chains integrity-only. An insider with the machine secret could compute valid HMACs for any agent_id.

v0.12.0 derives per-agent subkeys: `HKDF-Expand(machine_secret, "zc-chain:" || agent_id, 32)`. Verifier reads each row's stored agent_id and derives the matching subkey — a row claiming the wrong identity fails HMAC verification.

Combined with v0.12.1's session_token binding, telemetry rows become authenticated, not just integrity-protected.

⚠️ BREAKING — chain verification

Existing v0.11.0 chains will fail to verify under v0.12.0. The HMAC key derivation changed (raw secret → HKDF subkey). `verifyToolCallChain` reports `brokenAt: 0, brokenKind: "hash-mismatch"` for any pre-upgrade row.

Migration: non-production deployments can truncate and restart. Production deployments should wait for v0.12.1's `scripts/migrate-v011-to-v012-chains.mjs` re-hash helper.

Test summary

  • 449/449 tests pass (433 baseline + 16 new chained_table tests + RT-S2-01)
  • Stress test (10 writers × 100 calls) still chain ✓ OK (regression check against v0.11.0 + a7ed9a1 passes)
  • All 22 prior test files updated for the async call cascade; no test logic changes

What ships next (v0.12.1)

  • Tier 2 fix #1: Reference Monitor (telemetry routes through HTTP API, single bypass-proof enforcement point per Chin & Older Ch12)
  • Tier 2 fix #2: session_token binding for telemetry writes
  • Postgres backend (`ChainedTablePostgres`) with single-statement INSERT + FOR UPDATE
  • `ZC_TELEMETRY_BACKEND=sqlite|postgres|dual` env selection
  • Remaining Tier 1 fixes (POSIX hardening + per-agent Postgres role)
  • Cross-backend stress test

Sprint 3 then picks up Tier 3 — see `HARNESS_EVOLUTION_PLAN.md §8.6`.


See CHANGELOG.md for full details.