Releases: boshu2/agentops
v2.37.2
brew update && brew upgrade agentops · bash <(curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install.sh) · checksums · verify provenance
Highlights
This hotfix hardens AgentOps' validation and execution surfaces across hooks,
Codex runtime artifacts, and release gating. It also adds proof-backed swarm
evidence checks and safer worker git defaults, while tightening several CLI,
compile, and harvest edges that surfaced during release prep.
What's New
- Proof-backed swarm validation — release and validation gates now treat swarm evidence as a first-class contract.
- Safer multi-agent git behavior — worker sessions now respect a lead-only git guard in the hook chain.
- More recoverable compile and harvest runs — compile adds reset and repair controls, and harvest explains near-miss exclusions instead of silently dropping them.
All Changes
Added
- Swarm-evidence schema and validator, wired into validation and release gates
- Lead-only worker git guard in the hook chain for multi-agent sessions
- Compile runtime preference plus --reset and --repair, and harvest near-miss reporting
Changed
- Release, pre-push, and Codex/runtime validation now cover more hook, evidence, and artifact surfaces before publish
- Next-work backlog bookkeeping and Codex/runtime docs were normalized to better match shipped behavior
Fixed
- Pre-mortem gate ambiguity now fails closed instead of open
- ao rpi serve --run-id, ao mine --dry-run, compile and harvest edge handling, CI fixture drift, shellcheck drift, and Codex artifact metadata drift
Full changelog
Added
- Swarm evidence validation — AgentOps now ships a swarm-evidence schema and validator, and wires that proof surface into validation and release gates.
- Lead-only worker git guard — worker sessions now have an explicit lead-only git guard in the hook chain, reducing accidental write authority in multi-agent runs.
- Compile and harvest operator controls — ao compile adds runtime preference plus --reset and --repair controls, while harvest now reports excluded low-confidence candidates and top near-misses.
Changed
- Release and pre-push validation — local release, pre-push, and command coverage gates now validate more of the hook, evidence, and Codex runtime surface before publish.
- Codex/runtime artifacts and docs — compile, evolve, post-mortem, swarm, and related runtime docs and artifacts were decomposed and synchronized to better match shipped behavior.
- Flywheel backlog bookkeeping — next-work aggregates, consumed markers, and enum normalization were cleaned up so carry-forward work is recorded consistently.
Fixed
- Pre-mortem gate ambiguity — the crank pre-mortem gate now denies ambiguous state by default instead of failing open.
- CLI and shell reliability edges — ao rpi serve --run-id now accepts legacy 8-hex IDs, ao mine --dry-run emits a single clean JSON payload, and bash invocations are sanitized to bypass unsafe shell aliases.
- Compile, harvest, and release drift — compile repair defaults, malformed frontmatter salvage, YAML parse error surfacing, CI fixture drift, shellcheck drift, and Codex artifact metadata drift were corrected.
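The fail-closed behavior described for the pre-mortem gate can be illustrated with a minimal Python sketch. The state names and the functions parse_state and gate_allows are hypothetical and are not taken from the AgentOps codebase; the only point shown is that anything other than an explicit pass is denied.

```python
from enum import Enum
from typing import Optional


class GateState(Enum):
    """Hypothetical states a gate might observe (names are assumptions)."""
    PASSED = "passed"
    FAILED = "failed"
    AMBIGUOUS = "ambiguous"  # e.g. missing or conflicting evidence


def parse_state(raw: Optional[str]) -> GateState:
    """Map raw input to a state; anything unrecognized becomes AMBIGUOUS."""
    try:
        return GateState(raw)
    except ValueError:
        return GateState.AMBIGUOUS


def gate_allows(state: GateState) -> bool:
    """Fail closed: only an explicit PASSED state opens the gate."""
    return state is GateState.PASSED
```

A fail-open gate would instead deny only on an explicit failure, which lets missing or malformed state slip through.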
Full Changelog: v2.37.1...v2.37.2
v2.37.1
Highlights
Dream now leaves behind actionable morning work instead of just a short
overnight summary. This hotfix adds ranked morning packets, an opt-in long-haul
corroboration pass for weak results, and a reliable headless Claude path for
Dream Council, so the overnight loop is both more useful and more dependable.
What's New
- Actionable morning packets — Dream can turn overnight evidence into ranked next steps with suggested commands and handoff metadata.
- Smarter long-haul mode — extra overnight time is only spent when the first pass is weak, and it is used to corroborate the strongest packet.
- Reliable headless Claude council — Dream Council now works against the real Claude JSON contract instead of the broken spawn path that degraded or timed out runs.
All Changes
Added
- Dream morning packets that carry ranked next steps, evidence, target files, and queue or bead handoff metadata
- Yield telemetry and an adaptive long-haul mode that can corroborate weak overnight output before handing it off
Changed
- Overnight runs now prefer cheaper evidence corroboration before slower
council fan-out, so strong runs stay short and only weak output pays the
extra runtime cost
Fixed
- Headless Claude council spawning and result normalization for Dream
- Overnight close-loop reporting so Dream writes real report artifacts instead of placeholder statuses
- Retrieval-quality release gating so local release checks can fall back to checked-in eval data when a local manifest is missing
Full changelog
Added
- Dream morning packets — Dream can now emit ranked morning work packets with evidence, target files, exact follow-up commands, and queue/bead handoff metadata.
- Dream yield telemetry and long-haul corroboration — overnight reports now record packet-confidence telemetry and can trigger a bounded long-haul corroboration pass when the first pass produces weak morning output.
Changed
- Dream decision flow — overnight runs now prefer cheaper evidence corroboration before slower council fan-out, so strong runs stay short and extended runtime is reserved for genuinely weak output.
Fixed
- Headless Claude Dream council — Dream now uses Claude's working JSON output contract for headless council runs and normalizes the returned envelope before validation.
- Dream close-loop and report surfaces — overnight runs now write real close-loop callbacks and post-loop report artifacts instead of leaving placeholder pending steps.
- Retrieval ratchet release gate fallback — the retrieval-quality release check now falls back to checked-in eval data when a local manifest is absent.
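The release-gate fallback above amounts to a "prefer local, else checked-in" lookup. The Python sketch below illustrates that idea only; the function name resolve_eval_source and the file names are hypothetical, and the real check may resolve its inputs differently.

```python
import tempfile
from pathlib import Path


def resolve_eval_source(local_manifest: Path, checked_in: Path) -> Path:
    """Prefer the local manifest; fall back to checked-in eval data."""
    if local_manifest.is_file():
        return local_manifest
    if checked_in.is_file():
        return checked_in
    # Neither source exists: refuse to pass the gate silently.
    raise FileNotFoundError("no local manifest and no checked-in eval data")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        checked_in = Path(tmp) / "eval.json"        # hypothetical file name
        checked_in.write_text("{}")
        missing = Path(tmp) / "local-manifest.json"  # never created
        print(resolve_eval_source(missing, checked_in).name)
```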
Full Changelog: v2.37.0...v2.37.1
v2.37.0
Highlights
This release pushes AgentOps further toward a repo-native knowledge workspace.
With Karpathy's LLM-wiki getting attention, it's clearer that AgentOps was
already converging on a similar shape. This release takes some of the pieces he
executed better, especially index-first wiki navigation and knowledge
ergonomics, and integrates them into the existing .agents workflow.
At the same time, AgentOps now has a real Windows install path, a first-class
ao compile command, and a local LLM pipeline that can turn session
transcripts into reviewable wiki pages. Dream and evolve also connect more
tightly, so overnight knowledge passes can feed the daytime improvement loop
instead of living beside it.
What's New
- Windows is a real target — PowerShell install plus blocking native Windows smoke coverage.
- Knowledge compilation is a CLI command — ao compile moves the compiler out of skill-only territory.
- Transcript-to-wiki pipeline — ao forge can redact, summarize, review, and store local LLM session pages in a browsable .agents wiki.
- Dream feeds evolve — ao evolve --dream-first and --dream-only let knowledge passes run before or instead of code cycles.
- Stronger planning and retrieval guardrails — beads audit and cluster surfaces, retrieval evaluation fixtures, and a CI ratchet make stale or weak retrieval behavior easier to catch.
All Changes
Added
- Windows installer and native Windows smoke validation for install, doctor, and overnight-sensitive paths
- ao compile command with docs and tests
- Local LLM forge pipeline with redaction, structural review, and Dream integration
- .agents wiki scaffolding with INDEX, LOG, and searchable wiki directories
- New beads audit and clustering helpers, status quality signals, and retrieval quality ratchets
Changed
- Dream can now run as a knowledge-first sub-cycle inside evolve
- Search and injection rank knowledge more intelligently using content dedup, index boosts, and stability weighting
- Public docs and workflow guidance now line up better with the current operational-layer and context-compiler story
Fixed
- Windows overnight liveness detection
- Release retag safety and audit artifact validation
- Post-mortem closure audit handling for evidence-only and path-heavy cases
- Several Codex and RPI reliability edge cases around lifecycle restarts, JSON
writes, proof paths, and bridge validation
Full changelog
Added
- Windows install and smoke coverage — scripts/install-ao.ps1 adds a first-class Windows install path, and the blocking windows-smoke gate exercises PowerShell install, local ao doctor, and Windows-sensitive Go packages.
- Compile command — ao compile makes knowledge compilation a first-class CLI surface with docs and tests.
- Local LLM forge pipeline — ao forge can now redact, summarize, structurally review, and queue transcript-derived wiki pages with Dream worker integration.
- Dream curator and evolve sub-cycle — Dream gained a local curator adapter plus ao evolve --dream-first|--dream-only, allowing overnight knowledge passes to feed the daytime improvement loop.
- .agents wiki surfaces — INDEX, LOG, wiki directories, and search integration formalize .agents/ as a Karpathy-style knowledge wiki with index-first navigation.
- Operational quality surfaces — beads audit/cluster commands, swarm preflight advice, status quality signals, retrieval eval queries, and a retrieval-quality CI ratchet broaden release-time proof.
Changed
- Knowledge scoring and search behavior — inject now deduplicates by content hash, boosts indexed pages, weights stability, and search can pull Dream vault and wiki sources with stronger local recall.
- Overnight and RPI internals — overnight, lifecycle, search, inject, harvest, and RPI flows were decomposed into smaller helpers while tightening proof paths, mixed-mode provenance, and worktree cleanup.
- Public framing and contributor docs — README, philosophy, planning/post-mortem docs, and reference surfaces now better match the context-compiler and operational-layer story.
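Deduplicating by content hash, as mentioned for inject above, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the AgentOps implementation: the function name, the choice of SHA-256, and the whitespace normalization before hashing are all hypothetical.

```python
import hashlib


def dedupe_by_content_hash(pages: list[str]) -> list[str]:
    """Keep the first page for each distinct content hash, preserving order."""
    seen: set[str] = set()
    unique: list[str] = []
    for page in pages:
        # Assumption: collapse whitespace so trivially reflowed copies collide.
        normalized = " ".join(page.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(page)
    return unique
```

Hashing the content rather than comparing file paths means the same note stored in two locations is injected only once.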
Fixed
- Windows overnight liveness — Windows process checks no longer rely on Unix signal(0) semantics.
- Dream RunLoop status invariants — live-tree hash coverage now exercises every terminal RunLoop status, and degraded reflects the current rollback semantics.
- Release retag safety — release tooling now preserves annotated tags, validates audit artifact manifests and refs, and cancels stale reruns before duplicate publish attempts.
- Post-mortem and closure audits — metadata links, evidence-only closure packets, parser-path handling, and closure packet evidence modes were normalized.
- Codex and runtime reliability — same-thread lifecycle restart, root-scoped fallback reads, JSON config writes, bridge contract validation, and next-work proof-path handling were hardened.
Full Changelog: v2.36.0...v2.37.0
v2.36.0
Highlights
This release turns Dream from a concept into a usable operator surface.
AgentOps can now run private overnight cycles against your real local
.agents/ corpus, produce a morning report, compare runner outputs, and
bootstrap bedtime setup without pretending GitHub Actions is your personal
memory engine. The public docs and onboarding story were also rebuilt around
the operational-layer framing, so the way AgentOps explains itself now matches
the product you install.
The 2026-04-11 refresh retags v2.36.0 to include the recovered RPI wave and the
new evolve/autodev v2 operator surface. The release now includes ao evolve,
root PROGRAM.md execution policy support, stale-scope bead tooling, RPI
discovery artifacts, Dream RunLoop invariant tests, and the release/CI hardening
needed to ship those changes cleanly.
What's New
- Evolve v2 operator command — ao evolve exposes the autonomous improvement loop directly in the CLI with cycle limits, pinned queues, beads-only mode, quality mode, compile warmup, and strict-quality passthrough.
- Autodev program contracts — root PROGRAM.md gives the loop a repo-local contract for mutable scope, immutable scope, validation commands, escalation policy, and stop conditions.
- RPI recovery and stale-scope tooling — ao beads verify|lint|harvest, RPI discovery artifacts, and stale-scope checks make recovered or prior-session plans harder to execute against stale evidence.
- Private overnight runs — ao overnight start, run, and report give you a real local Dream loop with morning summaries instead of a CI-only placeholder.
- Dream setup and council scaffolding — ao overnight setup helps bootstrap keep-awake and scheduler assistance, while the report contract supports runner comparison, tension, and next-action synthesis.
- Live flywheel proof — the nightly dream-cycle now records retrieval-bench results and can draft skill candidates from repeated patterns, so compounding is visible instead of implied.
- Cleaner first-run experience — fresh repos now get an onboarding welcome that routes them into research, implementation, or validation faster.
- Sharper public story — README, docs, comparisons, and linked surfaces now consistently explain AgentOps as bookkeeping, validation, primitives, and flows for coding agents.
All Changes
Added
- Evolve v2 CLI command and autodev PROGRAM.md operating contract
- Beads stale-scope verification, linting, and harvesting commands
- RPI discovery artifact support and recovered-wave integration points
- Dream RunLoop invariant and failed-summary contract regression coverage
- Private overnight Dream commands, shared Dream config, and morning report contracts
- Live retrieval proof in the nightly dream-cycle, plus skill-draft generation from repeated patterns
- Fresh-repo onboarding welcome, docs-site navigation, and comparison pages
- Behavioral-discipline guidance and strategic-doc validation references
Changed
- Plan and pre-mortem skills now invoke bead stale-scope verification for aged, full-complexity, or prior-session bead inputs
- Council --mixed documentation now makes silent fallback a hard contract violation when Codex is unavailable
- Public framing across README, onboarding, docs, and comparisons now matches the operational-layer story
- Dream docs now separate the private local engine from the public GitHub proof harness
- Codex packaging, runtime smoke coverage, and checked-in skill artifacts are aligned around native hooks
Fixed
- Windows Codex installer coverage, golangci-lint v2 pinning, and recovered RPI validation blockers
- Security-toolchain false positives around deterministic seeded fixture generation
- Shared stale-scope reference placement for strict skill integrity validation
- Release-gate regressions across Pages docs, compile-skill headless instructions, pre-push shim tests, and headless runtime smoke
- Stale Codex install references, compile-skill artifact drift, and plugin metadata mismatches
- A few runtime-proof rough edges that blocked cleaner nightly and smoke-test evidence
Full changelog
Added
- Evolve operator command — ao evolve now exposes the v2 autonomous improvement loop directly in the CLI, including --max-cycles, --queue, --beads-only, --quality, --compile, and strict-quality passthrough flags.
- Autodev program contract — root PROGRAM.md gives evolve/autodev a repo-local operating contract with mutable and immutable scope, validation commands, escalation policy, and stop conditions.
- Beads stale-scope tooling — ao beads verify|lint|harvest adds first-class stale-citation checks for bead-driven planning and RPI recovery.
- RPI discovery artifacts — RPI can now persist and consume discovery artifacts, with tests and docs covering the --discovery-artifact path.
- Dream RunLoop invariant coverage — TestRunLoop_LiveTreeHashInvariant_AllStatuses locks the IsCorpusCompounded() and live-tree mutation invariant across deterministically reproducible terminal statuses, with remaining fixture statuses tracked in na-1iv.
- Dream failed-summary contract coverage — regression tests now lock the finalizeOvernightSummary contract for MEASURE consecutive-failure halts and persisted iteration history.
- Dream operator mode — ao overnight start|run|report|setup adds a private overnight lane with shared dream.* config, keep-awake defaults, scheduler/bootstrap guidance, council-ready runner packets, and DreamScape-style morning summaries
- Nightly live retrieval proof — the dream-cycle now runs ao retrieval-bench --live --json, emits retrieval proof in nightly summaries, and keeps a visible artifact trail for flywheel health
- Pattern-to-skill drafts — repeated patterns can now generate review-only skill drafts under .agents/skill-drafts/ during flywheel close-loop
- Fresh-repo onboarding welcome — new session-start routing helps first-time repos enter discovery, implementation, or validation without needing the full RPI lane first
- Docs-site and contribution proof surfaces — GitHub Pages navigation, comparison pages, behavioral-discipline guidance, strategic-doc validation patterns, and a first-skill guide expand the public proof surface
Changed
- RPI wave recovery integrated — recovered RPI wave work landed across Dream, council, stale-scope planning, discovery artifacts, CI hardening, and Codex runtime surfaces.
- Council --mixed strict contract documented — skills/council/references/cli-spawning.md documents that /council --mixed requires Codex CLI and emits a hard error instead of silently falling back to Claude-only.
- Plan and pre-mortem skill bodies decomposed — focused reference files now carry the detailed pre-decomposition, scope-mode, mandatory-check, output, wave-matrix, and task-creation guidance while keeping the top-level skills within lint budgets.
- Bead-input pre-flight wired into planning skills — /plan and /pre-mortem invoke ao beads verify <bead-id> for full-complexity, aged, or prior-session bead inputs before decomposition or validation.
- Operational-layer framing — README, onboarding, docs, comparisons, and linked surfaces now consistently explain AgentOps as bookkeeping, validation, primitives, and flows for coding agents
- Dream runtime positioning — the public GitHub nightly is now documented as a proof harness, while ao overnight is documented as the private local compounding engine
- Codex default path — native hooks, install copy, runtime smoke coverage, and checked-in Codex artifacts are aligned around the native-plugin path on supported Codex versions
- Validation guidance — behavioral-discipline and strategic-doc review are now first-class references alongside code review and runtime validation
Fixed
- Windows Codex installer — Codex installation now has a Windows path instead of assuming Unix shell behavior.
- golangci-lint v2 contract — the local lint wrapper and CI configuration now pin the v2 behavior expected by the repository.
- security-toolchain-gate CI — deterministic fixture generation in cli/internal/overnight/fixture/gen_fixture.go is annotated as a non-cryptographic seeded-random use, avoiding a false-positive semgrep blocker.
- Recovered RPI validation blockers — validation drift from the recovered RPI wave was cleared before retagging the release.
- Stale-scope reference placement — shared stale-scope validation guidance now lives under skills/shared/references/ so heal.sh --strict can resolve it consistently.
- Release and CI drift — resolved docs-site Liquid/frontmatter issues, headless runtime smoke portability problems, pre-push shim test drift, and compile-skill headless command drift caught during release prep
- Codex install and artifact drift — fixed stale slash-command references, refreshed checked-in artifact metadata, added a Codex compile wrapper, and corrected plugin/marketplace mismatches exercised by smoke coverage
- Runtime proof stability — promoted Codex runtime smoke into the blocking smoke path and fixed related shellcheck and install-surface rough edges
Removed
- DevOps-rooted tagline — public framing no longer leads with the old DevOps-layer tagline; the Three Ways lineage remains supporting doctrine instead of the category label
...
v2.35.0
Added
- Codex native hooks — AgentOps hooks now install natively into Codex CLI v0.115.0+ via ~/.codex/hooks.json; 8 hooks wired (session-start, inject, flywheel-close, prompt-nudge, quality-signals, go-test-precommit, commit-review, ratchet-advance); installer enables codex_hooks feature and upgrades from hookless fallback to native hook runtime
- Knowledge compiler skill — renamed athena → /compile with Karpathy-style incremental compilation, pluggable LLM backend (AGENTOPS_COMPILE_RUNTIME=ollama|claude), interlinked markdown wiki output at .agents/compiled/
- App struct dependency injection — App struct carries ExecCommand, LookPath, RandReader, Stdout, Stderr seams; gc bridge, events, executor, context relevance, tracker health, and stream modules accept injected dependencies instead of mutable package-level vars
- Test shuffle in CI — -shuffle=on added to validate.yml and Makefile test targets, exposing and fixing 6 ordering-dependent tests (cobra flag leaks, maturity var leaks, env var leaks)
Changed
- CLI internal extraction (waves 5-13) — business logic extracted from cmd/ao monolith into 15 internal/ domain packages (rpi, search, context, quality, goals, lifecycle, bridge, forge, mine, plans, knowledge, storage, pool, taxonomy, worker) using Options struct pattern for dependency injection
- Goals test migration — 7 goals test files moved from cmd/ao to internal/goals as external test package (goals_test) with t.Parallel() and direct goals.Run*() calls replacing cobra command wiring
- Test isolation — resetCommandState now saves/restores 10 maturity globals; resetFlagChangesRecursive resets flag values to defaults; RPILoop and toolchain tests clear AGENTOPS_RPI_RUNTIME* env vars via t.Setenv
Fixed
- Defrag test flag leak — TestDefragOutputDirFlag used cmd.Flags().Lookup("output") which matched the root persistent --output flag; changed to cmd.LocalFlags().Lookup("output")
- Goroutine leak false positive — TestRunGoals_GoroutineLeak used goleak.VerifyNone which caught goroutines from parallel tests; switched to goleak.IgnoreCurrent() to only detect leaks within the test itself
- Secret scan false positives — excluded .gc/ directory and Getenv/os.Environ patterns from secret pattern scan
- Codex skill validation — added output_contract as valid schema key, cross-vendor/knowledge as valid tiers, fixed $/ prefix in codex forge/post-mortem/scenario skills
- Scenario CLI snippets — replaced non-existent --source/--scope flags with valid --status variants
Removed
- Coverage percentage CI gates — removed coverage-ratchet job, check-cmdao-coverage-floor.sh, .coverage-baseline.json, and associated BATS tests; percentage gates blocked CI during architectural refactors without catching bugs
- fire.go — FIRE loop (find-ignite-reap-escalate) superseded by gc sling + bead dispatch; formatAge helper moved to inject_predecessor.go
- rpi_workers.go — per-worker health display superseded by gc agent health patrol; ao rpi workers subcommand removed from CLI and docs
Full Changelog: v2.34.0...v2.35.0
v2.34.0
Added
- Stage 4 Behavioral Validation — new validation tier between council/vibe and production:
  - Holdout scenarios stored in .agents/holdout/ with PreToolUse isolation hook preventing implementing agents from seeing evaluation criteria
  - Satisfaction scoring (0.0-1.0 probabilistic) in verdict schema v4, replacing boolean-only PASS/FAIL
  - Agent-built behavioral specs generated during /implement Step 5c
  - /scenario skill for authoring and managing holdout scenarios
  - ao scenario init|list|validate CLI commands (4 subcommands, 11 tests)
  - STEP 1.8 in /validation pipeline evaluating holdout scenarios + agent specs
  - schemas/scenario.v1.schema.json defining the holdout scenario format
- Flywheel gate command — ao flywheel gate checks readiness for retrieval-expansion work (research closure, rho threshold, holdout precision@K)
- Citation confidence scoring — citationEventIsHighConfidence with bucketed confidence (0/0.5/0.7/0.9) gates MemRL rewards on match quality
- Retrieval bench refactor — train/holdout splits, section-aware scoring (scoreBenchSections), manifest-based benchmark cases
- Proof-backed next-work visibility — classifyNextWorkCompletionProof unifies completed-run, execution-packet, and evidence-only-closure proof types; context explain and stigmergic packet now report proof-backed suppressions
- Three-gap contract proof gates — lifecycle gap mapping gates added to GOALS.md
- Cross-vendor execution — --mixed flag for Claude + Codex council judges
- Gas City bridge — gc as default executor for RPI phase execution with L1-L3 tests
- 149 L2 integration tests — AI-native test shape ("L2 first, L1 always") validated at scale; coverage floor raised 78.8% → 81.0%
- Test coverage hardening — GPG commit-signing fixes, root-skip guards for containerized CI, 350+ lines of vibecheck detector/metrics tests, maturity.go empty-content bugfix
Changed
- Codex parity hook — codex-parity-warn.sh now supports opt-in blocking mode via AGENTOPS_CODEX_PARITY_BLOCK=1 (exit 2 instead of advisory)
- 12-factor doctrine — compressed from 474 to 114 lines, reframed as supporting lens rather than product definition
- Skill count — 65 → 66 (added /scenario)
- Research skill — now persists reusable findings to .agents/findings/registry.jsonl with finding-compiler refresh
- Closure integrity audit — accepts durable closure packets without scoped-file sections as valid evidence
- Proof-backed legacy entries — shouldSkipLegacyFailedEntry uses CompletionEvidence field (proof-only, no heuristic fallback)
- readQueueEntries — returns all non-consumed entries; proof filtering is downstream via shouldSkipLegacyFailedEntry
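The opt-in blocking mode of the Codex parity hook can be pictured as a small decision on the exit code. The real hook is a shell script (codex-parity-warn.sh); the Python sketch below only illustrates the logic. The function name parity_exit_code is hypothetical, and treating advisory mode as exit 0 is an assumption, since the notes state only that blocking mode exits 2.

```python
def parity_exit_code(parity_ok: bool, env: dict[str, str]) -> int:
    """Return the exit code a parity hook might use for a given result."""
    if parity_ok:
        return 0
    # Blocking is opt-in: only the value "1" turns the warning into a failure.
    if env.get("AGENTOPS_CODEX_PARITY_BLOCK") == "1":
        return 2
    # Advisory mode: warn but do not block (exit 0 is an assumption here).
    return 0
```

In a real hook the env argument would typically be os.environ; passing it in explicitly keeps the sketch easy to test.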
Fixed
- 6 CI failure categories resolved in one commit (f1b83b2)
- Cobra test registration — scenario and flywheel gate added to expectedCmds
- Citation feedback test — assertion corrected for recorded confidence preference (0.5 not 0.7)
- RPI hardening — UAT version pre-flight, goals history filter, proof-backed suppression, fail-closed gates, cross-epic handoff contamination, bare ag- prefix guard
- Branch consolidation — 10 stale Codex branches analyzed, cherry-picked (9 commits, ~3,500 lines), and deleted; 25 orphaned worktrees pruned
- git rerere enabled — conflict resolution memory for future merges
Full Changelog: v2.33.0...v2.34.0
v2.33.0
Highlights
This release tightens execution hygiene, retrieval quality, and release operations. You can now benchmark retrieval quality from the CLI, run persona-based adversarial validation with /red-team, and let Crank surface stale or mergeable bead backlogs before burning worker time. Release prep is also less opinionated now that the enforced cadence gate is gone.
What's New
- Backlog hygiene gates — Crank and related scripts now surface stale or mergeable bead backlogs before execution starts.
- Retrieval benchmarking — ao retrieval-bench adds benchmark corpora, live mode, global scope, and nightly regression coverage.
- Adversarial validation — /red-team adds persona-based probing for docs and skills, with checked-in Codex runtime artifacts.
- Software factory lane — the CLI and startup flow now expose a dedicated software-factory operator surface.
- Release timing freedom — release prep no longer blocks on a minimum wait between tags.
For the full categorized diff, expand the changelog section below.
Full changelog
Added
- Backlog hygiene gates — added bd-audit.sh, bd-cluster.sh, and Crank/Codex guidance for cleaning stale or mergeable beads before execution
- Retrieval benchmarking and global scope — added ao retrieval-bench, benchmark corpora, --live, --global, and nightly IR regression coverage
- /red-team adversarial validation — added a persona-based validation skill plus checked-in Codex runtime artifacts
- Software factory operator lane — added a CLI/operator surface and Claude factory startup routing for software-factory workflows
- Flywheel maintenance utilities — added global garbage purge tooling and nightly retrieval benchmarking for knowledge quality tracking
Changed
- Release policy — removed the enforced release cadence gate so releases no longer block on a minimum wait between tags
- Knowledge operator surfaces — plan and validation now wire knowledge operator surfaces directly into execution flow
- Proof and runtime docs — goals, RPI docs, and contributor guidance now reflect the expanded proof surfaces and hookless runtime behavior
Fixed
- Codex artifact parity — restored checked-in Codex parity for red-team and cleaned Codex runtime metadata/frontmatter drift across crank, forge, post-mortem, release, and swarm artifacts
- Retrieval quality — replaced exact-substring filtering with token-level matching and tuned penalty, deduplication, and OR-fallback behavior
- Harvest metadata preservation — promotion now preserves source metadata and fills missing maturity, utility, and type fields safely
- Release tooling — release artifact directories are created safely and audit artifacts now resolve against release tag names
- Documentation and link drift — repaired the post-mortem Codex link and aligned runtime docs around the newer startup and lifecycle flows
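The retrieval fix above replaces exact-substring filtering with token-level matching. The difference can be shown with a minimal Python sketch; the tokenizer and function names are hypothetical, and the penalty, deduplication, and OR-fallback tuning mentioned in the notes are not modeled.

```python
import re


def tokens(text: str) -> set[str]:
    """Split text into lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def substring_match(query: str, doc: str) -> bool:
    """Old-style filter: the query must appear verbatim in the document."""
    return query.lower() in doc.lower()


def token_match(query: str, doc: str) -> bool:
    """Token-level filter: every query token must appear in the document."""
    return tokens(query) <= tokens(doc)
```

With the query "retry budget" and the document "budget for each retry", substring_match rejects the document because the words are reordered, while token_match accepts it.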
Full Changelog: v2.32.0...v2.33.0
v2.32.0
Highlights
Knowledge activation and session intelligence are the headline features in this release. A new skill and CLI surfaces let the agent consume cross-domain knowledge at runtime — ranking, assembling, and explaining the context it injects into each session. The session intelligence engine adds trust policies and explainability so you can see exactly why certain knowledge was selected. The pre-push gate now runs 9 checks that previously required CI, giving faster local feedback before you push.
What's New
- Knowledge activation — new /knowledge-activation skill and ao CLI surfaces activate cross-domain knowledge at runtime with ranked intelligence context and operator surface consumption
- Session intelligence engine — complete runtime engine with explainability, trust policy enforcement, and ranked context assembly
- Runtime selection — ao rpi serve supports explicit runtime selection for Claude and Codex execution modes
- Faster local validation — 9 CI-only checks migrated to the pre-push gate for immediate feedback
All Changes
Added
- Knowledge activation skill with CLI surfaces and runtime operator consumption
- Session intelligence runtime engine with explainability and ranked context
- Runtime selection for ao rpi serve
- Quality signals hook with telemetry test coverage
- Nine checks shifted from CI-only to local pre-push gate
- Inject stability warnings, signal tests, and status dashboard improvements
Changed
- README rewritten with product-minded gain-framing and Strunk-style prose
- Philosophy doc and observations section added to README
- Repo front doors and codex artifact guidance aligned
- Retry budgets, stability flags, and orchestration patterns applied from Claude Code architecture lessons
- Homebrew formula updated to v2.31.0 with pre-built binaries
Fixed
- Post-mortem closure integrity file parsing normalized
- CI failures resolved across codex refs, test pairing, hook coverage, docs parity, and codex lifecycle
- Lookup now scans nested global knowledge directories
- Test stubs added for new pre-push checks
Dependencies
- codecov/codecov-action bumped from 5 to 6
- DavidAnson/markdownlint-cli2-action bumped from 22 to 23
Full changelog
Added
- Knowledge activation skill — new /knowledge-activation skill and CLI surfaces for activating cross-domain knowledge at runtime, with operator surface consumption and ranked intelligence context
- Session intelligence engine — complete runtime engine with explainability, ranked context assembly, and trust policy enforcement
- Runtime selection for ao rpi serve — serve now supports explicit runtime selection for Claude and Codex execution modes
- Quality signals hook — new quality-signals.sh hook with test coverage for session quality telemetry
- Pre-push gate expansion — 9 checks migrated from CI-only to the local pre-push gate for faster feedback
- Inject stability warnings and status dashboard — closed 3 harvest items with signal tests and dashboard improvements
Changed
- README refresh — product-minded rewrite with gain-framing and Strunk-style prose fixes
- Philosophy doc — new docs/philosophy.md and observations section added to README
- Documentation alignment — repo front doors and codex artifact guidance unified across entry points
- Claude Code architecture lessons — retry budgets, stability flags, quality signals, and orchestration patterns applied to skills
- Homebrew formula — updated to v2.31.0 with pre-built binaries
Fixed
- Post-mortem closure integrity — normalized file parsing for closure integrity audits
- CI reliability — resolved CI failures across codex refs, test pairing, hook coverage, worktree handling, docs parity, hook portability, and codex lifecycle
- Lookup nested scanning — ao lookup now scans nested global knowledge directories correctly
- Pre-push test stubs — added test stubs for new pre-push checks, skip non-shell in shellcheck
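The nested-scanning fix above boils down to walking a directory tree recursively instead of listing only its top level. A minimal sketch of that idea follows; it is not the ao lookup implementation. The function name find_knowledge_files and the "*.md" pattern are assumptions made for illustration.

```python
from pathlib import Path


def find_knowledge_files(root, pattern="*.md"):
    """Return files matching `pattern` under `root` at any depth, sorted.

    Hypothetical helper: the name and the default pattern are assumptions,
    not taken from the ao CLI.
    """
    base = Path(root)
    if not base.is_dir():
        # A missing knowledge directory yields no results rather than an error.
        return []
    # rglob descends into nested directories; glob would stop at the top level.
    return sorted(p for p in base.rglob(pattern) if p.is_file())
```

Swapping rglob for glob in this sketch reproduces the kind of gap the fix describes: files in nested directories would be silently missed.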
Dependencies
- Bumped codecov/codecov-action from 5 to 6
- Bumped DavidAnson/markdownlint-cli2-action from 22 to 23
Full Changelog: v2.31.0...v2.32.0
v2.31.0
Highlights
Nine new lifecycle skills let the agent handle bootstrapping, dependency audits, design reviews, performance analysis, refactoring, code review, scaffolding, and testing without manual invocation. A new ao harvest command pulls learnings from sibling workspaces so knowledge compounds across your entire multi-agent fleet, not just one repo. Context debugging is easier with ao context packet, and the hook system now formally supports both Claude Code and Codex runtimes.
What's New
- 9 lifecycle skills — bootstrap, deps, design, harvest, perf, refactor, review, scaffold, and test are now part of the RPI workflow with automatic invocation and mechanical gates
- Cross-rig knowledge harvesting — ao harvest extracts and catalogs learnings from sibling crew workspaces so insights travel between agents
- Context packet inspector — ao context packet lets you debug what inter-session handoff state the agent actually sees
- Dual-runtime hook support — Hooks now have a formal runtime contract covering Claude Code, Codex, and manual execution modes
All Changes
Added
- Nine lifecycle skills wired into the RPI workflow with auto-invocation
- Cross-rig knowledge consolidation via ao harvest
- Context packet inspection via ao context packet
- Hook runtime contract with Claude/Codex/manual event mapping
- Research provenance tracking on pending learnings
- Context declarations for inject, provenance, and rpi skills
- Evidence-backed output templates for goals and product commands
Changed
- Documentation reframed around three-gap context lifecycle model
- Hook docs updated with runtime modes table for dual-runtime support
Fixed
- Four pre-existing CI failures resolved
- Lookup retrieval gaps that caused empty results
- Embedded file sync on first session start
- Closure integrity with 24h grace window for evidence timing
- Skill lint compliance across vibe, post-mortem, crank, and plan
- Codex tool naming rule and five Claude-era tool references
- ASCII diagram consistency across 23 documentation files
- Fork exhaustion in validation script replaced with lightweight parser
Full changelog
Added
- 9 lifecycle skills — bootstrap, deps, design, harvest, perf, refactor, review, scaffold, and test skills wired into RPI with auto-invocation and mechanical gates
- ao harvest — cross-rig knowledge consolidation extracts and catalogs learnings from sibling crew workspaces
- ao context packet — inspect stigmergic context packets for debugging inter-session handoff state
- Hook runtime contract — formal Claude/Codex/manual event mapping with runtime-aware hook tooling
- Evidence-driven skill enrichment — production meta-knowledge, anti-patterns, flywheel metrics, and normalization defect detection baked into 9 skill reference files
- Research provenance — pending learnings now carry full research provenance for discoverability and citation tracking
- Context declarations — inject, provenance, and rpi skills declare their context requirements explicitly
- Goals and product output templates — /goals and /product produce evidence-backed structured output
Changed
- Three-gap context lifecycle contract — README, PRODUCT.md, positioning docs, and operational guides reframed around the context lifecycle model
- Dual-runtime hook documentation — runtime modes table and troubleshooting updated for Claude + Codex hook coexistence
Fixed
- CI reliability — resolved 4 pre-existing CI failures, restored headless runtime preflight, repaired codex parity drift checks
- ao lookup retrieval — fixed retrieval gaps that caused lookup to return no results
- Embedded sync — using-agentops SKILL.md and .agents/.gitignore now written correctly on first session start
- Skill lint compliance — vibe, post-mortem, crank, and plan skills trimmed or restructured to stay under 800-line limit
- Codex tool naming — added CLAUDE_TOOL_NAMING rule and fixed 5 Claude-era tool references in codex skills
- ASCII diagram consistency — aligned box-drawing characters across 23 documentation files
- Fork exhaustion prevention — replaced jq with awk in validate-go-fast to prevent fork bombs on large repos
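The closure-integrity fix in the list above mentions a 24h grace window for close-before-commit evidence. One plausible reading is that an item closed before its evidence commit landed still passes, provided the commit follows within 24 hours. The sketch below illustrates that reading only. The function name, its parameters, and the inclusive boundary are assumptions, not the shipped audit logic.

```python
from datetime import datetime, timedelta, timezone

# The 24-hour figure comes from the release note; everything else is assumed.
GRACE_WINDOW = timedelta(hours=24)


def evidence_within_grace(closed_at, committed_at, grace=GRACE_WINDOW):
    """Return True if evidence committed at `committed_at` is acceptable
    for an item closed at `closed_at` (hypothetical helper)."""
    if committed_at <= closed_at:
        # Evidence already existed when the item was closed.
        return True
    # Close-before-commit: accept only if the commit landed inside the window.
    # Treating the boundary as inclusive is an assumption of this sketch.
    return committed_at - closed_at <= grace
```

Both timestamps should be timezone-aware (or both naive) so the comparison is well defined, for example datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc).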
Full Changelog: v2.30.0...v2.31.0
v2.30.0
v2.30.0 — Codex hookless lifecycle, PROGRAM.md workflows, and stronger long-running RPI runs
Highlights
AgentOps now handles Codex hookless sessions more cleanly, gives autonomous workflows a clearer PROGRAM.md contract, and makes long-running RPI runs much easier to inspect. This release also hardens the local release and validation path itself, so the same gate stack you rely on for shipping is more trustworthy under headless and generated-artifact-heavy workflows.
What's New
- Hookless Codex lifecycle support — Codex sessions can now run through startup, follow-up, validation, and closeout without depending on legacy hook assumptions.
- PROGRAM.md for autonomous work — Autodev and evolve flows now share a concrete program contract instead of relying on looser ad hoc context.
- Artifact-aware long RPI runs — Mission control now shows run artifacts and evaluator output so you can inspect what happened during multi-phase autonomous runs.
- More reliable release validation — Headless runtime checks, reverse-engineer hygiene, and release-gate coverage are more deterministic.
All Changes
Added
- Hookless Codex lifecycle support across CLI commands and skill orchestration
- A first-class PROGRAM.md contract for autodev and evolve-driven workflows
- Artifact and evaluator visibility for long-running RPI sessions
Changed
- Codex bundle maintenance, lifecycle guidance, and release validation coverage around the expanded Codex execution path
Fixed
- Codex RPI scope and closeout issues that caused follow-up and validation drift
- Release-gate regressions in headless runtime validation and learning coherence
- Reverse-engineer repo scans so generated or temporary trees no longer contaminate detected CLI surfaces
Full changelog
Added
- Codex hookless lifecycle support — ao codex runtime commands, lifecycle fallback, and Codex skill orchestration now cover hookless sessions end to end
- PROGRAM.md autodev contract — Added a first-class PROGRAM.md contract for autodev flows and taught /evolve and related RPI paths to use it
- Long-running RPI artifact visibility — Mission control now exposes run artifacts and evaluator output so long-running RPI sessions are replayable and easier to inspect
Changed
- Codex runtime maintenance flow — Refreshed Codex bundle hashes, lifecycle guards, runtime docs, and release validation coverage around the expanded Codex execution path
Fixed
- Codex RPI scoping and closeout — Tightened objective scope, epic scope, closeout ownership, and validation gaps in the Codex RPI lifecycle
- Release gate reliability — Restored headless runtime coverage, runtime-aware Claude inventory checks, and release-gate coherence validation
- Reverse-engineer repo hygiene — Repo-mode reverse engineer now ignores generated and temp trees when identifying CLI and module surfaces
Full Changelog: v2.29.0...v2.30.0