Releases: iampantherr/SecureContext
v0.18.0 — Sprint 2 Baseline: Self-Improving Skill Engine
[0.18.0] — 2026-04-29 — Sprint 2 baseline: skill mutation engine + replay + agentskills.io interop
The self-improving skill loop. Skills become first-class hash-protected
artifacts; replay against synthetic fixtures produces composite outcome
scores; mutators propose candidate variants; winners promote atomically.
Per-project skills override global at resolve time. Cross-project
promotion candidates surface via findGlobalPromotionCandidates.
This is the Sprint 2 baseline — verified end-to-end with both unit
tests and a live cross-project demo against Postgres. v0.18.1 (next)
adds the CLI-based runtime mutator + outcome-trigger guardrails + operator-
gated global promotion queue, all without requiring an Anthropic API key.
Added — skill subsystem (src/skills/)
- `types.ts` (192 lines) — Skill, SkillRun, SkillMutation, MutationContext type graph
- `loader.ts` (323 lines) — markdown frontmatter parser + HMAC-SHA256 body signing
- `storage.ts` (259 lines) — SQLite CRUD + tamper detection (`SkillTamperedError`)
- `storage_pg.ts` (248 lines) — Postgres mirror for `skills_pg` / `skill_runs_pg` / `skill_mutations_pg`
- `storage_dual.ts` (146 lines) — backend-aware dispatch (sqlite | postgres | dual)
- `scoring.ts` (246 lines) — composite outcome score (accuracy + cost + speed) + acceptance (see the sketch below)
- `replay.ts` (234 lines) — synthetic-fixture replay harness with HMAC-verify gate
- `mutator.ts` (228 lines) — pluggable Mutator interface + helpers
- `mutators/local_mock.ts` (71 lines) — deterministic test mutator
- `mutators/realtime_sonnet.ts` (125 lines) — Anthropic Messages API, direct
- `mutators/batch_sonnet.ts` (159 lines) — Anthropic Batch API (50% discount)
- `orchestrator.ts` (256 lines) — full select→mutate→replay→promote cycle
- `format/agentskills_io.ts` (144 lines) — agentskills.io interop import/export
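For orientation, a minimal sketch of a composite score of this shape — the 0.5/0.3/0.2 weights and normalization bounds are illustrative assumptions, not the values shipped in `scoring.ts`:

```typescript
interface RunMetrics {
  accuracy: number;  // 0..1 — fraction of fixture assertions passed
  costUsd: number;   // total replay cost
  latencyMs: number; // wall-clock replay time
}

// Weighted blend in [0, 1]; cheaper and faster runs score higher.
function compositeScore(m: RunMetrics, maxCostUsd = 0.5, maxLatencyMs = 30_000): number {
  const cost = 1 - Math.min(m.costUsd / maxCostUsd, 1);
  const speed = 1 - Math.min(m.latencyMs / maxLatencyMs, 1);
  return 0.5 * m.accuracy + 0.3 * cost + 0.2 * speed;
}
```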
Added — cron primitive (src/cron/)
- `scheduler.ts` (190 lines) — in-process scheduler with persistence, daily/interval triggers, bounded history
Added — 3 SQLite migrations (20-22) and 3 PG migrations (6-8)
- `skills` / `skills_pg` — versioned hash-protected skill registry (UNIQUE active per name+scope)
- `skill_runs` / `skill_runs_pg` — execution telemetry with composite outcome score
- `skill_mutations` / `skill_mutations_pg` — proposal + replay + promotion ledger
Added — 7 new MCP tools
| Tool | Purpose |
|---|---|
| `zc_skill_list` | List active skills with recent score |
| `zc_skill_show` | Full skill detail (HMAC-verified) |
| `zc_skill_score` | Aggregate score + acceptance check |
| `zc_skill_run_replay` | Replay against fixtures via LocalDeterministicExecutor |
| `zc_skill_propose_mutation` | Run one mutation cycle on demand |
| `zc_skill_export` | Export as agentskills.io markdown |
| `zc_skill_import` | Accept agentskills.io markdown → store as skill |
Added — entrypoint scripts
- `scripts/run-nightly-mutations.mjs` — OS cron entrypoint (Linux cron / Windows Task Scheduler)
- `scripts/sprint2-cross-project-demo.mjs` — live cross-project promotion demo (verified)
- `scripts/sprint2-live-demo.mjs` — single-project mutation cycle demo (verified)
Added — RT-S2-* security tests
- RT-S2-05: `ZC_MUTATOR_MODEL` allowlist falls back to local-mock on unknown values
- RT-S2-07: pre-submission secret_scanner rejects API-key / AWS-key payloads
- RT-S2-08: skill body HMAC mismatch → `SkillTamperedError` on storage read (sketched below)
- RT-S2-09: candidate body HMAC verified before replay; mismatch → marked failed
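A minimal sketch of the tamper check RT-S2-08/09 exercise, assuming HMAC-SHA256 over the skill body with a locally held key — the real key sourcing and the `SkillTamperedError` class live in the skill subsystem:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Throws when the stored MAC doesn't match a fresh HMAC over the body.
// The plain Error here is a stand-in for the real SkillTamperedError class.
function verifySkillBody(body: string, storedMacHex: string, key: Buffer): void {
  const mac = createHmac("sha256", key).update(body, "utf8").digest();
  const stored = Buffer.from(storedMacHex, "hex");
  if (stored.length !== mac.length || !timingSafeEqual(stored, mac)) {
    throw new Error("SkillTamperedError: skill body HMAC mismatch");
  }
}
```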
Documentation
- `docs/SKILLS_WALKTHROUGH.md` (~250 lines) — comprehensive usage guide
Test suite: 786/786 (was 645)
- 132 new Sprint 2 unit tests
- 9 new PG-mirror integration tests (require live PG)
- All quality gates green: ESLint 0 errors, env-pinning linter 0 unclassified
- Live cross-project demo: 9/9 steps pass against real Postgres
Migration notes
- 3 new SQLite migrations (20-22) auto-apply on first run
- 3 new PG migrations (6-8) require `ZC_TELEMETRY_BACKEND=postgres|dual` for activation
- New env var `ZC_MUTATOR_MODEL` (allowlist-enforced; defaults to `local-mock`)
- No breaking changes — Sprint 2 additions are additive
Architectural decisions ratified (D1-D6)
- D1: Storage = dual (SQLite per-project default + PG centralized; both supported in this release)
- D2: Skill scope = hierarchical (per-project overrides global at resolve time)
- D3: Replay benchmark source = synthetic fixtures first (real-historical replay deferred to Sprint 2.5)
- D4: Mutation engine = Sonnet 4.6 batch primary + realtime fallback + LocalMock for tests
- D5: Per-tool-call cost storage (skill_runs.total_cost rolls up)
- D6: Existing learnings/ JSONL kept; auto-feedback loop from v0.17.2 preserved
Sprint 2.5 deferrals
Tracked in C:\Users\Amit\AI_projects\.harness-planning\ARCHITECTURAL_LESSONS.md:
- S2.5-1 Subprocess sandbox executor (RT-S2-03/04)
- S2.5-2 Real-historical replay
- S2.5-3 Override confirmation prompt (RT-S2-06)
- S2.5-4 Cross-project auto-promotion
- S2.5-5 Compacted-segment HMAC (RT-S2-08 for compaction)
- S2.5-7 zc_unredact tool
- S2.5-8 Skill injection scanner (RT-S2-01 hardening)
v0.17.2 — Architectural Lints (L1+L3) + Learning-Loop Closure (L4)
[0.17.2] — 2026-04-20 — Architectural lints (L1+L3) + learning-loop closure (L4)
Pre-Sprint-2 hardening round. Closes three classes of bugs identified by
the v0.17.1 verification retrospective before the mutation-engine build
begins. All three are "catch future regressions automatically so we
don't keep rediscovering the same class of bug by luck":
Added — L1: env-pinning linter (scripts/check-env-pinning.mjs)
Static-analysis script that walks `src/**/*.ts` for every `process.env.ZC_*`
reference, classifies each as CRITICAL / SHARED_PROPAGATED / OPERATIONAL,
and verifies CRITICAL vars are explicitly pinned in BOTH the orchestrator and
worker launcher heredocs of `A2A_dispatcher/start-agents.ps1` (a toy version
of the scan is sketched after the list below).
It would have caught the v0.17.0 `ZC_AGENT_ID` pollution bug that silently
mis-attributed 16 consecutive tool_calls to the wrong `agent_id` (breaking
per-agent HKDF subkey isolation + RLS + log scoping).
- 14-case self-test (`scripts/check-env-pinning.test.mjs`) covering happy path, missing pin, unclassified var, shared-propagation warnings, bracket-notation refs, missing dispatcher path.
- Run via `npm run check:env` (production) or `npm run check:env:test` (self-test).
- Exit 0 = all green; exit 1 = a new var is unclassified OR a critical var is missing its pin.
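The toy scan — find every `process.env.ZC_*` reference, dot or bracket notation. The regex is illustrative, not the shipped implementation:

```typescript
// Matches process.env.ZC_FOO and process.env["ZC_FOO"] / ['ZC_FOO'] forms.
const ENV_REF = /process\.env(?:\.(ZC_[A-Z0-9_]+)|\[['"](ZC_[A-Z0-9_]+)['"]\])/g;

function findZcEnvRefs(source: string): string[] {
  const refs = new Set<string>();
  for (const m of source.matchAll(ENV_REF)) refs.add(m[1] ?? m[2]!);
  return [...refs];
}
```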
Added — L3: ESLint flat config with @typescript-eslint/no-floating-promises
Installed `eslint@9` + `typescript-eslint@8` with a minimal config focused
on the single most-load-bearing rule: `no-floating-promises`. When the
`outcomes.ts` module became async in v0.12.0, the `posttool-outcomes.mjs`
hook kept calling `resolveGitCommitOutcome(...)` without `await` — the
process exited before the async DB write completed. 9 months of
undetected outcome-data loss. The lint would have caught it on the
first write (toy example after the list below).
- Scanned `src/` on install: found 3 real floating-promise violations (2× `recordToolCall` in `server.ts`, 1× `reader.cancel` in `fetcher.ts`). All fixed with an explicit `void` operator + comments documenting intent.
- Self-test (`scripts/test-lint-catches-floating-promise.mjs`) creates a synthetic TS file with an unawaited call, confirms ESLint fails on it, and confirms `void` + `await` both silence the rule. 5/5 pass.
- Run via `npm run lint` or `npm run lint:test`.
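The bug class, reduced to a toy — with the rule enabled the bare call is flagged, while the `void` form documents fire-and-forget intent (`writeOutcome` is a hypothetical stand-in):

```typescript
async function writeOutcome(): Promise<void> {
  // ... async DB write ...
}

function postToolHook(): void {
  writeOutcome();      // ✗ no-floating-promises: process may exit before the write lands
  void writeOutcome(); // ✓ explicit fire-and-forget, lint-clean
}
```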
Added — L4: outcome → learnings JSONL auto-feedback (src/outcome_feedback.ts)
Closes the learning loop. Previously, a failure becoming a learning
required agent discipline: (1) notice failure, (2) write to
failures.jsonl, (3) remember the format, (4) let the hook mirror. Four
points of failure, all behavioral.
Now: `recordOutcome({outcomeKind: 'rejected' | 'failed' | 'insufficient' | 'errored' | 'reverted'})` atomically appends a structured JSON line
to `<projectPath>/learnings/failures.jsonl`. Successful outcomes
(shipped, accepted) with confidence ≥ 0.9 append to
`learnings/experiments.jsonl`. Future sessions retrieve via `zc_search`
without any agent discipline required.
Features (a minimal sketch follows this list):
- Best-effort; swallows errors (never affects the primary outcome row).
- Auto-creates the `learnings/` dir if missing (guard: projectPath must exist).
- Symlink-escape guard: the target must resolve inside `<projectPath>/learnings/`.
- Payload capped at 64 KB per line; oversized evidence → dropped with a marker.
- Concurrent writers don't corrupt — single `appendFileSync` per line.
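Here's that sketch, with the record shape and file naming as illustrative assumptions (the real module picks the target file from the outcome kind):

```typescript
import { appendFileSync, existsSync, mkdirSync, realpathSync } from "node:fs";
import { join, resolve, sep } from "node:path";

function appendLearning(projectPath: string, file: string, record: object): void {
  try {
    if (!existsSync(projectPath)) return;              // ghost-projectPath guard
    const dir = join(projectPath, "learnings");
    mkdirSync(dir, { recursive: true });               // auto-create learnings/
    const realDir = realpathSync(dir);                 // resolve symlinks in the dir
    const target = resolve(realDir, file);
    if (!target.startsWith(realDir + sep)) return;     // escape guard: stay inside learnings/
    let line = JSON.stringify(record);
    if (Buffer.byteLength(line, "utf8") > 64 * 1024) {
      line = JSON.stringify({ ...record, evidence: "[dropped: >64KB]" }); // oversize marker
    }
    appendFileSync(target, line + "\n");               // one write per line
  } catch {
    /* best-effort: never affect the primary outcome row */
  }
}
```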
16 unit tests covering every outcome-kind branch, security guards
(symlink escape, ghost projectPath), large-evidence truncation, rapid
concurrent appends, and downstream-consumer format (learnings-indexer
can mirror these rows into PG).
Live verified end-to-end: called `recordOutcome` with `kind='rejected'`
→ `failures.jsonl` gained 1 structured line tagged
`"source":"auto-feedback-v0.17.1"`. A low-confidence accepted outcome was
correctly skipped. A high-confidence shipped outcome landed in `experiments.jsonl`.
Test suite: 645/645 (+16 from v0.17.1)
- New: `src/outcome_feedback.test.ts` (16 tests)
- New: `scripts/check-env-pinning.test.mjs` (14 cases)
- New: `scripts/test-lint-catches-floating-promise.mjs` (5 cases)
Migration
- No schema changes. No behavior changes for existing outcomes — the feedback module is additive. Projects with no `learnings/` dir get one auto-created on the first failure/success outcome.
- Operators running CI should add `npm run check:env` + `npm run lint` to the pipeline.
v0.17.1 — Agent-Idle Fixes + Recall Cache + Cost Correctness
[0.17.1] — 2026-04-20 — Agent-idle fixes (A+B+C+D) + recall cache + cost-correctness (Tier 1+2)
Hotfix round addressing five issues found in live verification of v0.17.0:
(a) agents going idle after zc_summarize_session instead of draining the
task queue, (b) zc_recall_context dominating session cost at ~82% on Opus,
(c) tool-call cost accounting billed at the wrong rate (5× over-reported on
Opus), (d) infra-tool noise polluting the orchestrator's "do it myself vs.
delegate to Sonnet developer" cost comparisons, and (e) seven
architectural bugs surfaced by end-to-end data-flow tracing.
Added — src/recall_cache.ts (60s TTL + change-detection)
- In-memory cache for `zc_recall_context` keyed by `(project_path, agent_id)`. TTL 60s; cache miss on any new `working_memory` / `broadcasts` / `session_events` row. Repeat calls inside the window return the prior response prefixed with `(cached Xs ago)` — saves ~800 output tokens per hit. Estimated savings: ~$0.06/call on Opus, ~$0.012/call on Sonnet.
- `force: true` arg bypasses the cache when an agent explicitly wants fresh data.
- Cache is scoped per `(project_hash, agent_id)` — no cross-agent leakage.
- Process-lifetime only; max 64 entries with FIFO prune (sketched below).
- 11 unit tests.
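The cache shape, sketched under the assumption of a plain Map with insertion-order FIFO eviction (change-detection against new rows is elided):

```typescript
const TTL_MS = 60_000;
const MAX_ENTRIES = 64;
const cache = new Map<string, { at: number; response: string }>();

function getCached(projectHash: string, agentId: string): string | null {
  const hit = cache.get(`${projectHash}:${agentId}`);
  if (!hit || Date.now() - hit.at > TTL_MS) return null;
  const ageS = Math.round((Date.now() - hit.at) / 1000);
  return `(cached ${ageS}s ago)\n${hit.response}`;
}

function putCached(projectHash: string, agentId: string, response: string): void {
  if (cache.size >= MAX_ENTRIES) {
    const oldest = cache.keys().next().value;   // Map keeps insertion order → FIFO prune
    if (oldest !== undefined) cache.delete(oldest);
  }
  cache.set(`${projectHash}:${agentId}`, { at: Date.now(), response });
}
```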
Added — Tier 1 pricing: computeToolCallCost() in src/pricing.ts
Tool calls now billed from the LLM's perspective:
- Tool call args (what the LLM generated to invoke) → billed at model's output rate
- Tool response (what the LLM reads on its next turn) → billed at model's input rate
The naive computeCost() inverted these, over-reporting cost by ~5× on Opus
(output $75/Mtok vs. input $15/Mtok). For zc_recall_context:
- Before: 798 × $75/Mtok = $0.060 (treated as Opus output)
- After: 798 × $15/Mtok = $0.012 (Opus reads as input on next turn)
Matters because the Opus orchestrator uses cost tracking to decide "do I
handle this myself vs. delegate to the Sonnet developer" — inflated
numbers nudge toward unnecessary delegation.
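Direction-correct billing in sketch form — the rate-table shape is an assumption; the 798-token example reproduces the numbers above:

```typescript
interface ModelRates { inputPerMtok: number; outputPerMtok: number }

function computeToolCallCost(argTokens: number, responseTokens: number, r: ModelRates): number {
  // Args were *generated* by the LLM → output rate.
  // The response is *read* by the LLM on its next turn → input rate.
  return (argTokens * r.outputPerMtok + responseTokens * r.inputPerMtok) / 1_000_000;
}

// Opus example from the notes: 798 response tokens at $15/Mtok ≈ $0.012,
// not $75/Mtok ≈ $0.060.
computeToolCallCost(0, 798, { inputPerMtok: 15, outputPerMtok: 75 });
```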
Added — Tier 2 infra-tool zero-cost (INFRA_TOOLS set)
DB-assembly tools (zc_recall_context, zc_file_summary, zc_project_card,
zc_status) now return cost_usd=0. Rationale: their responses are
deterministic from DB state — no LLM, no Ollama, no external service — so
per-call work is negligible. Token counts still accurate so audits can
recompute via computeToolCallCost.
Override: set ZC_DISABLE_INFRA_ZERO_COST=1 when you want full cost
reconciliation against Anthropic invoices.
Added — HTTP endpoint GET /api/v1/queue/stats-by-role
Returns `{ role: { queued, claimed, done, failed } }` for `task_queue_pg`.
Used by the A2A dispatcher's new checkWorkerWake (see A2A_dispatcher
v0.17.1) to poke idle workers when their role has claimable work.
Fixed — outcomes resolver pipeline (3 latent bugs from v0.12.0+)
- `getMostRecentToolCallForSession` was SQLite-only. In Postgres mode, session lookups returned null → `resolveGitCommitOutcome` + `resolveFollowUpOutcomes` silently no-op'd. Result: every outcome row since v0.12.0 (when the function became async) failed to persist.
- The `posttool-outcomes.mjs` hook had the same SQLite-only query for session-id discovery. Fixed with the same PG-lookup + SQLite-fallback pattern.
- The hook called `resolveGitCommitOutcome(...)` without `await`. The process exited before the async resolver's DB write completed. 9 months of undetected outcome-data loss (L3 in the architectural-lessons doc).
Fixed — learnings-indexer.mjs hook coverage gaps
- Previously matched only `Write|Edit|MultiEdit|NotebookEdit`. Agents using `echo ... >> learnings/X.jsonl` via Bash silently bypassed the hook. Now matches `Bash` too and parses `>>` / `>` redirection targets from the command.
- The hook only wrote to SQLite; Postgres `learnings_pg` was populated only via the manual `scripts/backfill-learnings.mjs`. Now mirrors to PG when `ZC_TELEMETRY_BACKEND=postgres|dual`. Module resolution handles running from `~/.claude/hooks/` with no `node_modules` via a `file://` fallback to the SC repo's `node_modules/pg`.
- `projectPath` hashing normalized via `realpathSync` so forward-slash / backslash variants on Windows hash consistently.
Test suite: 629/629 (+12 from v0.17.0)
- Added `src/recall_cache.test.ts` (11 tests: cold-miss, hit, staleness, cross-agent/project isolation, TTL, undefined-agent bucketing).
- Added telemetry non-infra-tool cost test.
- Updated `postgres_backend.test.ts` RT-S3-06 + `sprint1_integration.test.ts` for the new cost formula.
Migration
- Pure code fixes — no schema changes.
- Historical `tool_calls_pg` rows retain their old `cost_usd` values; new rows use the corrected formula.
- To use `-WorkerCount N` with the PG backend, ensure sc-api is rebuilt from v0.17.1 source (adds the `/api/v1/queue/stats-by-role` endpoint).
v0.17.0 — Work-Stealing Queue + Model Router + Ownership Guard + Multi-Worker Pools
[0.17.0] — 2026-04-20 — Sprint 3 Phase 3: Work-Stealing Queue + Model Router + Ownership Guard + Multi-Worker Pools
Sprint 3 Phase 3 — the pieces that let multiple workers in the same role share one task queue without stepping on each other. Closes the "single worker per role" limit that v0.15.0/v0.16.0 left in place.
Added — Postgres work-stealing queue (§8.2)
- `task_queue_pg` table (migration id=5) with a state CHECK constraint + routing index `(project_hash, role, state, ts)` + a partial heartbeat index `WHERE state='claimed'`.
- `src/task_queue.ts` — seven operations backed by `FOR UPDATE SKIP LOCKED` so N workers can race-claim atomically without blocking each other (see the sketch below):
  - `enqueueTask()` — idempotent (`ON CONFLICT DO NOTHING`)
  - `claimTask()` — atomic claim primitive (`UPDATE ... WHERE task_id = (SELECT ... FOR UPDATE SKIP LOCKED LIMIT 1)`)
  - `heartbeatTask()` — workers must call every 30s
  - `completeTask()` / `failTask()` — terminal states (fail bumps `retries`)
  - `reclaimStaleTasks(staleAfterSeconds=300)` — sweeps dead claims back to the queue
  - `getQueueStats()` — counts by state
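A sketch of what that claim looks like with node-postgres — column names beyond those named above (`claimed_by`, `heartbeat_at`) are assumptions, not the exact schema:

```typescript
import { Pool } from "pg";

// The subquery locks one queued row with FOR UPDATE SKIP LOCKED, so racing
// workers skip rows already locked by a sibling instead of blocking on them.
async function claimTask(pool: Pool, projectHash: string, role: string, agentId: string) {
  const { rows } = await pool.query(
    `UPDATE task_queue_pg
        SET state = 'claimed', claimed_by = $3, heartbeat_at = now()
      WHERE task_id = (
        SELECT task_id FROM task_queue_pg
         WHERE project_hash = $1 AND role = $2 AND state = 'queued'
         ORDER BY ts
         FOR UPDATE SKIP LOCKED
         LIMIT 1)
      RETURNING *`,
    [projectHash, role, agentId]
  );
  return rows[0] ?? null; // null → nothing claimable right now
}
```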
- 13 unit tests (`src/task_queue.test.ts`) including:
  - RT-S4-01: 50 concurrent workers × 100 tasks → each task claimed EXACTLY once (no double-claim; the core correctness property of `SKIP LOCKED`)
  - RT-S4-02: 600s-stale heartbeat → reclaimed back to queued + retries++
  - RT-S4-03: `failTask` bumps retries + persists `failure_reason`
  - RT-S4-04: cross-role + cross-project scope isolation
Added — 6 MCP tools exposing the queue
- `zc_enqueue_task` (orchestrator) · `zc_claim_task` (worker) · `zc_heartbeat_task` · `zc_complete_task` · `zc_fail_task` · `zc_queue_stats`
- Worker `agent_id` is sourced from the `ZC_AGENT_ID` env var, so a multi-worker pool (e.g. `developer-1/2/3`, all `role=developer`) shares one queue keyed by `(project_hash, role)` and claims atomically.
- 5 MCP integration tests (`src/task_queue_mcp.test.ts`) covering the end-to-end lifecycle, a 3-worker race, the fail path, stats aggregation, and cross-project isolation.
Added — Complexity-based model router (§8.5)
- `src/indexing/model_router.ts` — `chooseModel(complexity 1-5)` returns `{model, tier, reason, estimatedInputCostPerMtok, inputClamped}` (sketched below):
  - 1-2 → Haiku 4.5 (trivial tasks, $0.25/Mtok)
  - 3-4 → Sonnet 4.6 (standard work, $3.00/Mtok — the cost/quality sweet spot)
  - 5 → Opus 4.7 (hard reasoning, $15.00/Mtok)
- Env overrides: `ZC_MODEL_TIER_{HAIKU,SONNET,OPUS}` resolved per call so operators can flip tiers at runtime.
- Safe defaults: null / undefined / NaN / Infinity / out-of-range → Sonnet with `inputClamped=true`.
- 19 unit tests covering tier mapping, rounding, clamping edges, env overrides, result shape.
- `zc_choose_model` MCP tool wraps it.
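Tier mapping and clamping reduced to a sketch — the shipped function also returns `model`, `reason`, and `estimatedInputCostPerMtok`:

```typescript
type Tier = "haiku" | "sonnet" | "opus";

function chooseModel(complexity: number): { tier: Tier; inputClamped: boolean } {
  if (!Number.isFinite(complexity) || complexity < 1 || complexity > 5) {
    return { tier: "sonnet", inputClamped: true };              // safe default
  }
  const c = Math.round(complexity);
  if (c <= 2) return { tier: "haiku", inputClamped: false };    // trivial tasks
  if (c <= 4) return { tier: "sonnet", inputClamped: false };   // standard work
  return { tier: "opus", inputClamped: false };                 // hard reasoning
}
```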
Added — File-ownership overlap guard at /api/v1/broadcast (§8.2)
- The HTTP API rejects an ASSIGN whose `file_ownership_exclusive` overlaps any in-flight (unmerged) ASSIGN's exclusive set → HTTP 409 Conflict with `overlapping_files` + `conflicting_broadcast_id`. Prevents two workers being assigned the same file (the overlap check is sketched after this list).
- "In-flight" = an ASSIGN whose `task` has no subsequent MERGE in the last 200 broadcasts.
- 5 integration tests (`src/ownership_guard.test.ts`):
  - RT-S4-05: overlapping exclusive set → 409
  - RT-S4-06: disjoint exclusive sets → 200
  - RT-S4-07: re-ASSIGN allowed after MERGE of the prior task
  - Plus back-compat (no exclusive set) + non-ASSIGN types bypass the guard
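The core of the guard is a set intersection between the incoming ASSIGN's exclusive set and each in-flight exclusive set; a non-empty result becomes the 409's `overlapping_files`:

```typescript
// Returns the files the incoming ASSIGN would double-own; empty → safe.
function overlappingFiles(incoming: string[], inFlight: string[]): string[] {
  const held = new Set(inFlight);
  return incoming.filter((f) => held.has(f));
}
```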
Fixed — recallSharedChannel was silently dropping v0.15.0 §8.1 structured columns
SQLite-path `recallSharedChannel` only projected legacy columns. All downstream consumers saw `file_ownership_exclusive=undefined` even when the DB column was populated — the ownership-guard work surfaced this hidden v0.15.0 gap. Now projects all 7 v0.15.0 §8.1 columns with NULL → undefined semantics.
Added — -WorkerCount N on start-agents.ps1 + role-tagged registration (A2A_dispatcher side)
- New `-WorkerCount` param (1-20, default 1). When > 1, expands each `-Roles` entry into N numbered workers suffixed `-1..-N`:

```powershell
start-agents.ps1 -Roles developer -WorkerCount 3
# → spawns developer-1, developer-2, developer-3
# each with its own WT window, worktree, registration
# all sharing role="developer" — one work-stealing queue
```

- `Get-AgentRole` helper strips the `-N` suffix so `$roleMeta` + `roles.json` deep-prompt lookups still work.
- `register.mjs` accepts a `--role` flag / `ZC_AGENT_ROLE` env → writes an `_agent_roles[agentId]` sidecar so the dispatcher can route by role without breaking the existing `agentId → pane` string map.
- Back-compat: `WorkerCount=1` (default) preserves legacy plain names ("developer", not "developer-1").
- Env propagation fix: worker/orchestrator launch scripts now also propagate `ZC_POSTGRES_*` + `ZC_TELEMETRY_BACKEND` so the agent's MCP server can reach `task_queue_pg` (closes the longstanding v0.10.4 env-propagation follow-up).
Added — scripts/backfill-learnings.mjs (close the learning loop)
- The PostToolUse `learnings-indexer.mjs` hook only mirrors NEW Write/Edit events — prior `<project>/learnings/*.jsonl` rows never got indexed into `learnings` / `learnings_pg`, so agents couldn't `zc_search` past decisions/failures from earlier sessions.
- The new script scans `<project>/learnings/*.jsonl`, categorizes by filename stem, idempotently upserts (via UNIQUE), and mirrors to PG when `ZC_TELEMETRY_BACKEND=postgres|dual`.
- Verified on Test_Agent_Coordination: 6 rows backfilled (3 decisions + 3 metrics). Previously both SQLite and PG had 0 learnings rows despite existing JSONL content.
Test Suite
- 617/617 unit+integration tests pass (was 575 pre-v0.17.0; +42 new: 13 task_queue + 19 model_router + 5 ownership guard + 5 task_queue MCP).
- Live E2E on Test_Agent_Coordination with `-WorkerCount 3`: the agent called `zc_choose_model` (verified 2→haiku, 4→sonnet, 5→opus tier mapping), enqueued 3 disjoint-ownership tasks via `zc_enqueue_task`, workers atomically claimed them via `zc_claim_task` and committed actual file hardening (e.g. `checkRequest(req)` in `src/rate-limiter.js` throwing `TypeError: rate-limiter: req argument is required`; "harden: validate argv in index" commit `f25acf5a`).
Migration
- Schema: migration id=5 (`task_queue_pg`) is idempotent + additive — a Postgres-only feature (no SQLite companion).
- API: zero breaking changes. All new MCP tools are additive.
- Env for workers: if you run in HTTP/Postgres mode, restart agents via `start-agents.ps1` so they pick up the updated launch scripts that propagate `ZC_POSTGRES_*`. Until then, `zc_enqueue_task` / `zc_claim_task` return `Postgres pool unavailable`.
v0.16.0 — Sprint 3 Phase 2: Postgres Backend + T3.1 SET LOCAL ROLE + T3.2 RLS
Sprint 3 Phase 2 — Postgres backend (deferred since v0.12.x) + both Tier 3 access-control fixes from §8.6 of the canonical plan. Closes the v0.15.0 limitation where structured ASSIGN fields were silently dropped in HTTP API mode.
Three major adds
Postgres backend for telemetry/outcomes
`ChainedTablePostgres` mirrors `ChainedTableSqlite` using `BEGIN; SELECT row_hash ... FOR UPDATE; INSERT; COMMIT` (the Postgres analog of SQLite's `BEGIN IMMEDIATE`). Same chain content (HKDF-keyed HMAC) — rows are byte-identical across backends, so migration is a plain SQL copy.
Wired in via the existing `ZC_TELEMETRY_BACKEND=sqlite|postgres|dual` env switch.
Tier 3 fix T3.1 — per-query SET LOCAL ROLE
Each agent now writes telemetry under a per-agent Postgres role (`zc_agent_<agent_id>`), lazily provisioned with minimum INSERT/SELECT/UPDATE grants. Each chained INSERT runs inside `BEGIN; SET LOCAL ROLE zc_agent_<agent_id>; INSERT; COMMIT` — Postgres' `current_user` reflects the actual writing agent, not the pool's user.
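The transaction shape, sketched with node-postgres. `SET ROLE` can't take a bind parameter, so real code must strictly validate the agent id before interpolating it; the `zc_agent_<agent_id>` naming follows the prefix above:

```typescript
import { Pool } from "pg";

async function chainedInsertAs(pool: Pool, agentId: string, insertSql: string, params: unknown[]) {
  if (!/^[a-z0-9_]+$/.test(agentId)) throw new Error("invalid agent id"); // injection guard
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    // SET LOCAL reverts at COMMIT/ROLLBACK — the role never leaks to the
    // next query on this pooled connection.
    await client.query(`SET LOCAL ROLE zc_agent_${agentId}`);
    await client.query(insertSql, params);
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}
```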
Tier 3 fix T3.2 — Row-Level Security on outcomes_pg
4 RLS policies enforce read tiers (Chin & Older 2011 Ch5+13, Bell-LaPadula confidentiality):
- `public/internal` → any role
- `confidential` → registered agent
- `restricted` → ONLY `created_by_agent_id` (matched against `current_setting('zc.current_agent')`)
This is enforced INSIDE Postgres, not in app code. Even a compromised agent process with valid DB credentials cannot read other agents' restricted outcomes.
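An illustrative shape for the strictest of the four policies — the `restricted` read filter. The policy name and exact predicate are assumptions; the `current_setting('zc.current_agent')` match is from the notes:

```typescript
// DDL a migration might run (sketch). The second arg to current_setting
// makes it return NULL instead of erroring when the setting is unset.
const restrictedReadPolicy = `
  CREATE POLICY outcomes_restricted_read ON outcomes_pg FOR SELECT
  USING (
    classification <> 'restricted'
    OR created_by_agent_id = current_setting('zc.current_agent', true)
  );
`;
```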
HTTP API forwards structured ASSIGN columns
Closes the v0.15.0 known limitation. `POST /api/v1/broadcast` now accepts and forwards all 7 v0.15.0 structured fields.
Tests
- 575/575 pass (565 + 10 new Postgres tests)
- Postgres tests run against real local Docker container, auto-skip when no PG reachable
- RT-S3-05 verified live: cross-agent read of `'restricted'` row blocked by Postgres RLS even with shared DB credentials
- RT-S3-06 verified live: chain hashes byte-identical across SQLite + Postgres (rows migrate without rehashing)
Bugs found + fixed during integration
- `provisionAgentRole` originally inside writer txn → grants invisible to SET LOCAL ROLE. Fixed via separate-connection provisioning.
- `SELECT FOR UPDATE` needs `UPDATE` privilege on most PG versions — added explicit GRANT.
- Missing `GRANT USAGE ON SCHEMA public` — required for table access.
Known limitations
- Existing `securecontext-api` Docker container is v0.8.0 — needs rebuild (`docker compose build sc-api && docker compose up -d sc-api`) to pick up v0.16.0 endpoints
- Live multi-agent test through `start-agents.ps1` with `ZC_TELEMETRY_BACKEND=postgres` requires that container rebuild — functionally validated via 10 unit tests against real Postgres + RT-S3-05 cross-agent RLS test
Upgrade notes
Backward-compatible by default. If you don't set `ZC_TELEMETRY_BACKEND`, SQLite continues to behave exactly as in v0.15.0.
To enable Postgres backend:
- Set `ZC_POSTGRES_PASSWORD` (or `ZC_POSTGRES_URL`)
- Set `ZC_TELEMETRY_BACKEND=postgres` (or `=dual` for parity verification)
- Pool's owning role needs `CREATEROLE` privilege (bundled `scuser` already has it)
- Rebuild + redeploy the Docker `securecontext-api` container
What's next
v0.17.0 — §8.2-8.5 work-stealing queue + worker pool spawning + file-ownership enforcement + complexity-based model routing. Uses the Postgres backend shipped here.
See CHANGELOG.md for full details.
v0.15.0 — Sprint 3 Phase 1: Structured ASSIGN + MAC Classification (Tier 3 part)
First slice of Sprint 3. Foundation pieces that don't require Postgres backend.
Two features
§8.1 Structured ASSIGN broadcast schema (additive, backward-compatible)
7 new optional fields on `zc_broadcast` for type=ASSIGN:
- `acceptance_criteria` (testable assertions)
- `complexity_estimate` (1-5)
- `file_ownership_exclusive` + `file_ownership_read_only` (path-traversal-filtered)
- `task_dependencies` (broadcast IDs that must MERGE first)
- `required_skills`
- `estimated_tokens`
Existing ASSIGN broadcasts work unchanged (backward-compat). Dispatcher in v0.17.0 will consume these for tier routing + file-ownership enforcement.
§8.6 T3.2 MAC-style classification on outcomes (Chin & Older 2011 Ch5+Ch13)
Classification labels: `public` / `internal` / `confidential` / `restricted` with read-filter:
- `'restricted'` rows readable ONLY by `created_by_agent_id` — closes the cross-agent leak gate from §8.6 T3.2
`resolveUserPromptOutcome` now auto-tags `'restricted'` with the agent's identity (sentiment about user messages belongs to the originating agent only).
Tests
- 565/565 pass (541 baseline + 24 new)
- RT-S3-02: cross-agent read of `'restricted'` row blocked
- RT-S3-03: legacy rows get `'internal'` default; CHECK blocks NULL
- RT-S3-04: SQL injection via classification value blocked by CHECK constraint
- Edge cases: complexity clamping, oversize cap, path traversal, integer-only deps, downgrade of restricted-without-creator
Live verification
Real Claude CLI agent on Test_Agent_Coordination processed broadcast #1037 (4 tool_calls). Local-mode broadcastFact verified all 7 structured fields round-trip through SQLite.
Known limitations (deferred to v0.16.0)
- HTTP API mode: existing api-server (Docker container) doesn't yet know about structured ASSIGN columns. Local mode works fully.
- T3.1 per-agent Postgres role: deferred since it depends on Postgres backend landing first (per §8.6 acceptance criteria).
- v0.17.0 will land §8.2-8.5: work-stealing queue, worker pool spawning, file-ownership enforcement, complexity-based routing.
See CHANGELOG.md for full details.
v0.14.0 — Native AST + Provenance Tagging + Louvain Community Detection
The "deeper internal capabilities" release. Three features that complement v0.13.0's graphify integration — bringing similar structural-understanding capabilities natively to SC's KB even when graphify isn't available.
Three features
Phase A — Provenance tagging
Every `working_memory` and `source_meta` row now carries a `provenance` flag (Chin & Older 2011 Ch6+Ch7 'speaks-for' formalism — every claim carries its trust chain):
- EXTRACTED — read directly from a primary source
- INFERRED — produced by an LLM
- AMBIGUOUS — multiple plausible readings
- UNKNOWN — legacy default
API additive (backward compat). Promotion/downgrade via re-assert. Migrations 16+17 with CHECK constraint. RT-S3-01 verifies SQL injection blocked.
Phase B — AST extractor (TS/JS/Python)
Regex-based deterministic L0/L1 for code files without an LLM call. ~80% LLM cost reduction on indexing for code-heavy projects.
Live samples from the agent run:
- `rate-limiter.js` → "REST API Rate Limiter Middleware. Contains 1 class, 1 function."
- `search.js` → "Task Search — Fuzzy Matching... Contains 2 functions, 1 import."
Why regex first, tree-sitter later: tree-sitter requires per-language WASM grammars (~500KB each) that aren't bundled. Regex covers 80/20 case at zero install friction. Interface designed for v0.15.0 swap with no breaking change.
Phase C — Louvain community detection
`zc_kb_cluster` + `zc_kb_community_for` MCP tools cluster KB sources by graph topology (no embeddings needed). For "what's related to X" questions, two files that import each other are obviously related — no embedding call needed.
Live verification: clustered 26 sources from Test_Agent_Coordination into 5 communities (sizes 6+6+5+2+1).
(Algorithm note: Louvain, not Leiden — Leiden isn't published as an npm package. Same family, similar quality.)
Test summary
- 541/541 tests pass (470 baseline + 71 new)
- Live agent run: All three features fired correctly with real Claude CLI agents on Test_Agent_Coordination
- Edge cases covered: empty files, syntax-broken files, very large (>5MB) files, comments-only files, abstract classes, generator functions, default exports, Python `__all__`, `async def`, decorators
Two new MCP tools
| Tool | Purpose |
|---|---|
| `zc_kb_cluster()` | Run Louvain over KB; persist communities |
| `zc_kb_community_for(source)` | Look up a source's community + community-mates |
Backward compatible
All existing code paths unchanged:
- `rememberFact` and `indexContent` keep old signatures (provenance is new optional last arg)
- AST is automatic for code extensions (no API change)
- Migrations 16+17 are defensive (idempotent)
Recommended workflow for agents
| Question | Right tool |
|---|---|
| "What's the architecture of this project?" | `zc_kb_cluster` first, drill in with `zc_kb_community_for` |
| "What's related to file X?" | `zc_kb_community_for("file:src/X.ts")` |
| "Summarize this code file" | `zc_file_summary` (now AST-extracted if TS/JS/Python) |
What's next
Sprint 3 picks up Tier 3 access-control fixes — see `HARNESS_EVOLUTION_PLAN.md §8.6` (locked with hard "DO NOT START" gate).
See CHANGELOG.md for full details.
v0.13.0 — graphify Integration: Structural Knowledge Graph as a First-Class SC Capability
SC + graphify stacked. SC now proxies to graphify (29.7k★, AI coding assistant skill) so agents can navigate the structural knowledge graph alongside SC's persistent state + telemetry. They solve different problems and stack multiplicatively for token savings on architectural questions.
Three new MCP tools
- `zc_graph_query(query)` — natural-language query over the structural graph (god nodes, communities, relationships)
- `zc_graph_path(from, to)` — shortest path between two named nodes
- `zc_graph_neighbors(node)` — immediate neighbors of a named node
All three return helpful hints when graphify isn't set up — they're inert until `pip install graphifyy && /graphify .` is run.
Auto-index `GRAPH_REPORT.md`
`zc_index_project` now auto-detects `graphify-out/GRAPH_REPORT.md` and indexes it into SC's KB so agents discover it via normal `zc_search` without needing to know graphify exists.
Token savings (combined SC + graphify)
| Question | Without either | SC alone | graphify alone | Both stacked |
|---|---|---|---|---|
| Architectural ("how does auth work") | ~25k | ~2k | ~500 orient | ~1.5k |
| State / history | N/A | ~1.5k | N/A | ~1.5k |
| Specific implementation | N/A | ~800 | N/A | ~800 |
Tests
- 470/470 pass (459 baseline + 11 new graph_proxy tests)
- Live subprocess path not unit-tested (requires Python + graphifyy in CI; covered by manual integration)
How to enable
```bash
# One-time
pip install graphifyy && graphify install

# Per project
/graphify .

# In your AI assistant
zc_graph_query "how does the auth flow connect to the database?"
```
If graphify isn't installed, SC works exactly as before. The new tools just return hints.
Recommended workflow
| Question type | Right tool |
|---|---|
| Architectural / structural | `zc_graph_query` first, then `zc_search` for precise content |
| State / history | `zc_recall_context` |
| Specific implementation | `zc_search` |
| What's connected to X | `zc_graph_neighbors` |
Deferred to v0.14.0
The deeper internal capabilities (complement graphify rather than replace):
- Native AST tree-sitter pre-pass for code files (LLM-free L0 — ~50% indexing cost reduction)
- EXTRACTED / INFERRED / AMBIGUOUS provenance tagging (Chin & Older 2011 "speaks-for" formalism — every claim carries its trust chain)
- Leiden community detection over SC's KB (graph topology beats vector similarity for some queries)
Then Sprint 3 picks up Tier 3 access-control fixes — see `HARNESS_EVOLUTION_PLAN.md §8.6`.
See CHANGELOG.md for full details.
v0.12.1 — Tier 2: Reference Monitor + session_token Binding for Telemetry
Closes the two largest remaining access-control gaps from the v0.12.0 design review. Telemetry writes now have a single bypass-proof enforcement point that authenticates the writer's identity, not just verifies row integrity.
Highlights
- HTTP API Reference Monitor — `POST /api/v1/telemetry/tool_call` and `/outcome` enforce session_token binding before any DB write. Pattern from Chin & Older 2011 Ch12 ("Reference Monitor" — exactly one enforcement point per protected resource, tamper-proof + always invoked + verifiable).
- session_token binding — every telemetry write requires `Authorization: Bearer <session_token>`. Server asserts the token's bound `agent_id` matches the row's claimed `agent_id` (HTTP 403 on mismatch).
- `ZC_TELEMETRY_MODE` env switch — `local` (default, unchanged), `api` (route through Reference Monitor), `dual` (write to both for migration).
- Token cache + lifecycle — fetched lazily, cached 1 hour, re-fetched on 401, falls back to local mode if unreachable.
Security closes Tier 2 gaps
| Gap | Before v0.12.1 | After v0.12.1 |
|---|---|---|
| #1 No bypass-proof enforcement | Each agent's MCP server opened the project DB directly | All writes route through the API; only the API process holds DB write authority |
| #2 `agent_id` was an unauthenticated string | Agent A could write rows claiming to be agent B | API verifies `body.agentId === token.aid`; forgery blocked with HTTP 403 |
Combined with v0.12.0's per-agent HMAC subkey (Tier 1 #1), telemetry rows are now integrity-protected (chain) AND authenticated (token-bound writer).
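The binding assertion in sketch form — token lookup is abstracted away, and the `aid` field name follows the table above:

```typescript
interface SessionToken { aid: string; projectHash: string }

// 401 = can't authenticate the caller; 403 = authenticated but forging agent_id.
function assertTelemetryWriter(
  authHeader: string | undefined,
  claimedAgentId: string,
  lookup: (bearer: string) => SessionToken | null
): 200 | 401 | 403 {
  if (!authHeader?.startsWith("Bearer ")) return 401; // missing/malformed header
  const token = lookup(authHeader.slice("Bearer ".length));
  if (!token) return 401;                             // unknown, expired, or revoked
  if (token.aid !== claimedAgentId) return 403;       // row claims another agent
  return 200;
}
```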
Red-team tests
- RT-S2-02: alice's token cannot write a row claiming bob → 403
- RT-S2-03: missing/malformed/empty Authorization header → 401
- RT-S2-04: revoked token → 401
- RT-S2-05: project-A token used against project-B → 401 (project-scoped capability per Ch11)
- RT-S2-06: end-to-end via `recordToolCallViaApi` succeeds with valid token
Test summary
- 459/459 tests pass (449 baseline + 10 new Reference Monitor tests)
- Stress test still chain ✓ OK under 10 concurrent writers × 100 calls (458 writes/sec)
Upgrade notes
Backward-compatible by default. Existing deployments continue using local-mode SQLite unless they set `ZC_TELEMETRY_MODE=api`.
For multi-agent production deployments:
- Set `ZC_API_KEY` (already required for v0.9.0+ broadcast RBAC)
- Set `ZC_TELEMETRY_MODE=api` in agent environments
- Set `ZC_AGENT_ID` + `ZC_AGENT_ROLE` per agent (used for session_token issuance)
- Rebuild the SC HTTP API Docker image — the new `/api/v1/telemetry/*` endpoints need the v0.12.1 code. The shipped image will need a refresh.
Deferred to v0.12.2
- Postgres backend (`ChainedTablePostgres`) — second `ChainedTable` implementation
- Tier 1 fix #2 (POSIX 0700/0600 hardening)
- Tier 1 fix #3 (per-agent Postgres role with INSERT-only grant)
- Cross-backend stress test
- Docker image rebuild + publish
Sprint 3 then picks up Tier 3 — see `HARNESS_EVOLUTION_PLAN.md §8.6` (locked in with hard "DO NOT START Sprint 3 until..." gate).
See CHANGELOG.md for full details.
v0.12.0 — Sprint 2 Prep: ChainedTable Abstraction + Per-Agent HMAC Subkey (Tier 1 #1)
Foundation release for the dual-backend telemetry roadmap. Ships the storage abstraction layer that v0.12.1 will plug Postgres into, and closes the largest pre-existing access-control gap in v0.11.0's hash-chain design.
Highlights
- `ChainedTable` backend-agnostic abstraction with an HKDF-derived per-agent HMAC subkey
- Tier 1 access-control fix #1 closed: the per-agent HMAC subkey blocks cross-agent row forgery (RT-S2-01 verifies)
- Async public API (Option 4): `recordToolCall`, `recordOutcome`, and the 3 resolvers are all `async` — the SQLite path stays sync internally; future backends drop in without API change
- Removed `_lastHashCache` from v0.11.0 (it was redundant with `BEGIN IMMEDIATE` and added a Heisenbug surface)
Security closes Tier 1 Gap #5 (Chin & Older 2011, Ch6+Ch7)
v0.11.0 used the raw machine secret as the HMAC key, making chains integrity-only. An insider with the machine secret could compute valid HMACs for any agent_id.
v0.12.0 derives per-agent subkeys: `HKDF-Expand(machine_secret, "zc-chain:" || agent_id, 32)`. Verifier reads each row's stored agent_id and derives the matching subkey — a row claiming the wrong identity fails HMAC verification.
Combined with v0.12.1's session_token binding, telemetry rows become authenticated, not just integrity-protected.
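A sketch of the derivation with node:crypto — salt handling and the exact chain-HMAC input layout are assumptions; the info string is the one given above:

```typescript
import { createHmac, hkdfSync } from "node:crypto";

// HKDF-Expand(machine_secret, "zc-chain:" || agent_id, 32) → per-agent subkey.
function deriveAgentSubkey(machineSecret: Buffer, agentId: string): Buffer {
  return Buffer.from(
    hkdfSync("sha256", machineSecret, Buffer.alloc(0), `zc-chain:${agentId}`, 32)
  );
}

// A row claiming the wrong agent_id derives the wrong subkey → HMAC mismatch.
function chainMac(subkey: Buffer, prevHash: string, rowJson: string): string {
  return createHmac("sha256", subkey).update(prevHash).update(rowJson).digest("hex");
}
```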
⚠️ BREAKING — chain verification
Existing v0.11.0 chains will fail to verify under v0.12.0. The HMAC key derivation changed (raw secret → HKDF subkey). `verifyToolCallChain` reports `brokenAt: 0, brokenKind: "hash-mismatch"` for any pre-upgrade row.
Migration: non-production deployments can truncate and restart. Production deployments should wait for v0.12.1's `scripts/migrate-v011-to-v012-chains.mjs` re-hash helper.
Test summary
- 449/449 tests pass (433 baseline + 16 new chained_table tests + RT-S2-01)
- Stress test 10×100 still chain ✓ OK (regression from v0.11.0 + a7ed9a1 confirmed)
- All 22 prior test files have async-cascade calls updated; no test logic changes
What ships next (v0.12.1)
- Tier 2 fix #1: Reference Monitor (telemetry routes through HTTP API, single bypass-proof enforcement point per Chin & Older Ch12)
- Tier 2 fix #2: session_token binding for telemetry writes
- Postgres backend (`ChainedTablePostgres`) with single-statement INSERT + FOR UPDATE
- `ZC_TELEMETRY_BACKEND=sqlite|postgres|dual` env selection
- Remaining Tier 1 fixes (POSIX hardening + per-agent Postgres role)
- Cross-backend stress test
Sprint 3 then picks up Tier 3 — see `HARNESS_EVOLUTION_PLAN.md §8.6`.
See CHANGELOG.md for full details.