Releases: boshu2/agentops

v2.37.2

16 Apr 00:22

brew update && brew upgrade agentops · bash <(curl -fsSL https://raw.githubusercontent.com/boshu2/agentops/main/scripts/install.sh) · checksums · verify provenance


Highlights

This hotfix hardens AgentOps' validation and execution surfaces across hooks,
Codex runtime artifacts, and release gating. It also adds proof-backed swarm
evidence checks and safer worker git defaults, while tightening several CLI,
compile, and harvest edges that surfaced during release prep.

What's New

  • Proof-backed swarm validation — release and validation gates now treat
    swarm evidence as a first-class contract.
  • Safer multi-agent git behavior — worker sessions now respect a
    lead-only git guard in the hook chain.
  • More recoverable compile and harvest runs — compile adds reset and
    repair controls, and harvest explains near-miss exclusions instead of
    silently dropping them.

All Changes

Added

  • Swarm-evidence schema and validator, wired into validation and release gates
  • Lead-only worker git guard in the hook chain for multi-agent sessions
  • Compile runtime preference plus --reset and --repair, and harvest
    near-miss reporting

Changed

  • Release, pre-push, and Codex/runtime validation now cover more hook,
    evidence, and artifact surfaces before publish
  • Next-work backlog bookkeeping and Codex/runtime docs were normalized to
    better match shipped behavior

Fixed

  • Pre-mortem gate ambiguity now fails closed instead of open
  • ao rpi serve --run-id, ao mine --dry-run, compile and harvest edge
    handling, CI fixture drift, shellcheck drift, and Codex artifact metadata
    drift

Full changelog

Added

  • Swarm evidence validation — AgentOps now ships a swarm-evidence schema and validator, and wires that proof surface into validation and release gates.
  • Lead-only worker git guard — worker sessions now have an explicit lead-only git guard in the hook chain, reducing accidental write authority in multi-agent runs.
  • Compile and harvest operator controls — ao compile adds runtime preference plus --reset and --repair controls, while harvest now reports excluded low-confidence candidates and top near-misses.

Changed

  • Release and pre-push validation — local release, pre-push, and command coverage gates now validate more of the hook, evidence, and Codex runtime surface before publish.
  • Codex/runtime artifacts and docs — compile, evolve, post-mortem, swarm, and related runtime docs and artifacts were decomposed and synchronized to better match shipped behavior.
  • Flywheel backlog bookkeeping — next-work aggregates, consumed markers, and enum normalization were cleaned up so carry-forward work is recorded consistently.

Fixed

  • Pre-mortem gate ambiguity — the crank pre-mortem gate now denies ambiguous state by default instead of failing open.
  • CLI and shell reliability edges — ao rpi serve --run-id now accepts legacy 8-hex IDs, ao mine --dry-run emits a single clean JSON payload, and bash invocations are sanitized to bypass unsafe shell aliases.
  • Compile, harvest, and release drift — compile repair defaults, malformed frontmatter salvage, YAML parse error surfacing, CI fixture drift, shellcheck drift, and Codex artifact metadata drift were corrected.
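
The fail-closed behavior described for the pre-mortem gate above can be illustrated with a minimal Go sketch. The `GateState` type, its constants, and `Allow` are hypothetical names chosen for illustration, not AgentOps' actual gate code; the point is only that the ambiguous state is the zero value and is denied.

```go
package main

import "fmt"

// GateState is a hypothetical three-valued result of a pre-mortem check.
type GateState int

const (
	GateUnknown GateState = iota // ambiguous or missing evidence (zero value)
	GatePassed                   // explicit pass recorded
	GateFailed                   // explicit failure recorded
)

// Allow fails closed: only an explicit pass permits execution.
// Anything else, including the uninitialized zero value, is denied.
func Allow(s GateState) bool {
	return s == GatePassed
}

func main() {
	for _, s := range []GateState{GateUnknown, GatePassed, GateFailed} {
		fmt.Printf("state=%d allow=%t\n", s, Allow(s))
	}
}
```

Making the ambiguous state the zero value means a forgotten or failed initialization also denies, which is the property a fail-closed gate is meant to guarantee.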

Full Changelog: v2.37.1...v2.37.2

v2.37.1

15 Apr 04:24


Highlights

Dream now leaves behind actionable morning work instead of just a short
overnight summary. This hotfix adds ranked morning packets, an opt-in long-haul
corroboration pass for weak results, and a reliable headless Claude path for
Dream Council, so the overnight loop is both more useful and more dependable.

What's New

  • Actionable morning packets — Dream can turn overnight evidence into
    ranked next steps with suggested commands and handoff metadata.
  • Smarter long-haul mode — extra overnight time is only spent when the
    first pass is weak, and it is used to corroborate the strongest packet.
  • Reliable headless Claude council — Dream Council now works against the
    real Claude JSON contract instead of the broken spawn path that degraded or
    timed out runs.

All Changes

Added

  • Dream morning packets that carry ranked next steps, evidence, target files,
    and queue or bead handoff metadata
  • Yield telemetry and an adaptive long-haul mode that can corroborate weak
    overnight output before handing it off

Changed

  • Overnight runs now prefer cheaper evidence corroboration before slower
    council fan-out, so strong runs stay short and only weak output pays the
    extra runtime cost

Fixed

  • Headless Claude council spawning and result normalization for Dream
  • Overnight close-loop reporting so Dream writes real report artifacts instead
    of placeholder statuses
  • Retrieval-quality release gating so local release checks can fall back to
    checked-in eval data when a local manifest is missing

Full changelog

Added

  • Dream morning packets — Dream can now emit ranked morning work packets with evidence, target files, exact follow-up commands, and queue/bead handoff metadata.
  • Dream yield telemetry and long-haul corroboration — overnight reports now record packet-confidence telemetry and can trigger a bounded long-haul corroboration pass when the first pass produces weak morning output.

Changed

  • Dream decision flow — overnight runs now prefer cheaper evidence corroboration before slower council fan-out, so strong runs stay short and extended runtime is reserved for genuinely weak output.
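
The ordering described above (finish early on strong output, try cheap corroboration first, reserve council fan-out for output that is still weak) can be sketched as a small decision function. This is an illustrative sketch only: the `Step` names, the `confidence` score, and the `threshold` parameter are assumptions, since these notes do not give Dream's real signals or values.

```go
package main

import "fmt"

// Step is a hypothetical label for what an overnight run does next.
type Step string

const (
	Finish      Step = "finish"
	Corroborate Step = "corroborate"
	CouncilFan  Step = "council-fan-out"
)

// NextStep picks the cheapest step that can still improve a weak result.
// confidence and threshold are illustrative inputs, not real Dream config.
func NextStep(confidence, threshold float64, corroborated bool) Step {
	if confidence >= threshold {
		return Finish // strong runs stay short
	}
	if !corroborated {
		return Corroborate // the cheaper evidence pass comes first
	}
	return CouncilFan // slower fan-out only for output that is still weak
}

func main() {
	fmt.Println(NextStep(0.9, 0.7, false))
	fmt.Println(NextStep(0.4, 0.7, false))
	fmt.Println(NextStep(0.4, 0.7, true))
}
```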

Fixed

  • Headless Claude Dream council — Dream now uses Claude's working JSON output contract for headless council runs and normalizes the returned envelope before validation.
  • Dream close-loop and report surfaces — overnight runs now write real close-loop callbacks and post-loop report artifacts instead of leaving placeholder pending steps.
  • Retrieval ratchet release gate fallback — the retrieval-quality release check now falls back to checked-in eval data when a local manifest is absent.

Full Changelog: v2.37.0...v2.37.1

v2.37.0

14 Apr 22:11


Highlights

This release pushes AgentOps further toward a repo-native knowledge workspace.
With Karpathy's LLM-wiki getting attention, it's clearer that AgentOps was
already converging on a similar shape. This release takes some of the pieces he
executed better, especially index-first wiki navigation and knowledge
ergonomics, and integrates them into the existing .agents workflow.

At the same time, AgentOps now has a real Windows install path, a first-class
ao compile command, and a local LLM pipeline that can turn session
transcripts into reviewable wiki pages. Dream and evolve also connect more
tightly, so overnight knowledge passes can feed the daytime improvement loop
instead of living beside it.

What's New

  • Windows is a real target — PowerShell install plus blocking native
    Windows smoke coverage.
  • Knowledge compilation is a CLI command — ao compile moves the compiler
    out of skill-only territory.
  • Transcript-to-wiki pipeline — ao forge can redact, summarize, review,
    and store local LLM session pages in a browsable .agents wiki.
  • Dream feeds evolve — ao evolve --dream-first and --dream-only let
    knowledge passes run before or instead of code cycles.
  • Stronger planning and retrieval guardrails — beads audit and cluster
    surfaces, retrieval evaluation fixtures, and a CI ratchet make stale or weak
    retrieval behavior easier to catch.

All Changes

Added

  • Windows installer and native Windows smoke validation for install, doctor,
    and overnight-sensitive paths
  • ao compile command with docs and tests
  • Local LLM forge pipeline with redaction, structural review, and Dream
    integration
  • .agents wiki scaffolding with INDEX, LOG, and searchable wiki directories
  • New beads audit and clustering helpers, status quality signals, and retrieval
    quality ratchets

Changed

  • Dream can now run as a knowledge-first sub-cycle inside evolve
  • Search and injection rank knowledge more intelligently using content dedup,
    index boosts, and stability weighting
  • Public docs and workflow guidance now line up better with the current
    operational-layer and context-compiler story

Fixed

  • Windows overnight liveness detection
  • Release retag safety and audit artifact validation
  • Post-mortem closure audit handling for evidence-only and path-heavy cases
  • Several Codex and RPI reliability edge cases around lifecycle restarts, JSON
    writes, proof paths, and bridge validation

Full changelog

Added

  • Windows install and smoke coverage — scripts/install-ao.ps1 adds a first-class Windows install path, and the blocking windows-smoke gate exercises PowerShell install, local ao doctor, and Windows-sensitive Go packages.
  • Compile command — ao compile makes knowledge compilation a first-class CLI surface with docs and tests.
  • Local LLM forge pipeline — ao forge can now redact, summarize, structurally review, and queue transcript-derived wiki pages with Dream worker integration.
  • Dream curator and evolve sub-cycle — Dream gained a local curator adapter plus ao evolve --dream-first|--dream-only, allowing overnight knowledge passes to feed the daytime improvement loop.
  • .agents wiki surfaces — INDEX, LOG, wiki directories, and search integration formalize .agents/ as a Karpathy-style knowledge wiki with index-first navigation.
  • Operational quality surfaces — beads audit/cluster commands, swarm preflight advice, status quality signals, retrieval eval queries, and a retrieval-quality CI ratchet broaden release-time proof.

Changed

  • Knowledge scoring and search behavior — inject now deduplicates by content hash, boosts indexed pages, weights stability, and search can pull Dream vault and wiki sources with stronger local recall.
  • Overnight and RPI internals — overnight, lifecycle, search, inject, harvest, and RPI flows were decomposed into smaller helpers while tightening proof paths, mixed-mode provenance, and worktree cleanup.
  • Public framing and contributor docs — README, philosophy, planning/post-mortem docs, and reference surfaces now better match the context-compiler and operational-layer story.
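
To make the content-hash deduplication mentioned in the scoring item above concrete, here is a minimal Go sketch. The `Page` type, the `DedupByContent` name, and the choice to trim whitespace before hashing are assumptions for illustration; the notes do not describe how inject actually normalizes or hashes content.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// Page is a hypothetical knowledge item; the real inject types are not shown in these notes.
type Page struct {
	Path string
	Body string
}

// DedupByContent keeps the first page seen for each distinct body hash.
func DedupByContent(pages []Page) []Page {
	seen := make(map[[sha256.Size]byte]bool)
	var out []Page
	for _, p := range pages {
		// Trimming whitespace before hashing is an assumption of this sketch.
		h := sha256.Sum256([]byte(strings.TrimSpace(p.Body)))
		if seen[h] {
			continue
		}
		seen[h] = true
		out = append(out, p)
	}
	return out
}

func main() {
	pages := []Page{
		{Path: "a.md", Body: "retry budgets"},
		{Path: "b.md", Body: "retry budgets\n"},
		{Path: "c.md", Body: "stability flags"},
	}
	for _, p := range DedupByContent(pages) {
		fmt.Println(p.Path)
	}
}
```

Keying on the hash rather than the path means two files with identical bodies count once, so a duplicate cannot crowd out other ranked results.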

Fixed

  • Windows overnight liveness — Windows process checks no longer rely on Unix signal(0) semantics.
  • Dream RunLoop status invariants — live-tree hash coverage now exercises every terminal RunLoop status, and degraded reflects the current rollback semantics.
  • Release retag safety — release tooling now preserves annotated tags, validates audit artifact manifests and refs, and cancels stale reruns before duplicate publish attempts.
  • Post-mortem and closure audits — metadata links, evidence-only closure packets, parser-path handling, and closure packet evidence modes were normalized.
  • Codex and runtime reliability — same-thread lifecycle restart, root-scoped fallback reads, JSON config writes, bridge contract validation, and next-work proof-path handling were hardened.

Full Changelog: v2.36.0...v2.37.0

v2.36.0

11 Apr 22:15


Highlights

This release turns Dream from a concept into a usable operator surface.
AgentOps can now run private overnight cycles against your real local
.agents/ corpus, produce a morning report, compare runner outputs, and
bootstrap bedtime setup without pretending GitHub Actions is your personal
memory engine. The public docs and onboarding story were also rebuilt around
the operational-layer framing, so the way AgentOps explains itself now matches
the product you install.

The 2026-04-11 refresh retags v2.36.0 to include the recovered RPI wave and the
new evolve/autodev v2 operator surface. The release now includes ao evolve,
root PROGRAM.md execution policy support, stale-scope bead tooling, RPI
discovery artifacts, Dream RunLoop invariant tests, and the release/CI hardening
needed to ship those changes cleanly.

What's New

  • Evolve v2 operator command — ao evolve exposes the autonomous
    improvement loop directly in the CLI with cycle limits, pinned queues,
    beads-only mode, quality mode, compile warmup, and strict-quality passthrough.
  • Autodev program contracts — root PROGRAM.md gives the loop a repo-local
    contract for mutable scope, immutable scope, validation commands, escalation
    policy, and stop conditions.
  • RPI recovery and stale-scope tooling — ao beads verify|lint|harvest,
    RPI discovery artifacts, and stale-scope checks make recovered or prior-session
    plans harder to execute against stale evidence.
  • Private overnight runs — ao overnight start, run, and report give
    you a real local Dream loop with morning summaries instead of a CI-only
    placeholder.
  • Dream setup and council scaffolding — ao overnight setup helps bootstrap
    keep-awake and scheduler assistance, while the report contract supports
    runner comparison, tension, and next-action synthesis.
  • Live flywheel proof — the nightly dream-cycle now records retrieval-bench
    results and can draft skill candidates from repeated patterns, so compounding
    is visible instead of implied.
  • Cleaner first-run experience — fresh repos now get an onboarding welcome
    that routes them into research, implementation, or validation faster.
  • Sharper public story — README, docs, comparisons, and linked surfaces now
    consistently explain AgentOps as bookkeeping, validation, primitives, and
    flows for coding agents.

All Changes

Added

  • Evolve v2 CLI command and autodev PROGRAM.md operating contract
  • Beads stale-scope verification, linting, and harvesting commands
  • RPI discovery artifact support and recovered-wave integration points
  • Dream RunLoop invariant and failed-summary contract regression coverage
  • Private overnight Dream commands, shared Dream config, and morning report
    contracts
  • Live retrieval proof in the nightly dream-cycle, plus skill-draft generation
    from repeated patterns
  • Fresh-repo onboarding welcome, docs-site navigation, and comparison pages
  • Behavioral-discipline guidance and strategic-doc validation references

Changed

  • Plan and pre-mortem skills now invoke bead stale-scope verification for aged,
    full-complexity, or prior-session bead inputs
  • Council --mixed documentation now makes silent fallback a hard contract
    violation when Codex is unavailable
  • Public framing across README, onboarding, docs, and comparisons now matches
    the operational-layer story
  • Dream docs now separate the private local engine from the public GitHub proof
    harness
  • Codex packaging, runtime smoke coverage, and checked-in skill artifacts are
    aligned around native hooks

Fixed

  • Windows Codex installer coverage, golangci-lint v2 pinning, and recovered RPI
    validation blockers
  • Security-toolchain false positives around deterministic seeded fixture
    generation
  • Shared stale-scope reference placement for strict skill integrity validation
  • Release-gate regressions across Pages docs, compile-skill headless
    instructions, pre-push shim tests, and headless runtime smoke
  • Stale Codex install references, compile-skill artifact drift, and plugin
    metadata mismatches
  • A few runtime-proof rough edges that blocked cleaner nightly and smoke-test
    evidence

Full changelog

Added

  • Evolve operator command — ao evolve now exposes the v2 autonomous improvement loop directly in the CLI, including --max-cycles, --queue, --beads-only, --quality, --compile, and strict-quality passthrough flags.
  • Autodev program contract — root PROGRAM.md gives evolve/autodev a repo-local operating contract with mutable and immutable scope, validation commands, escalation policy, and stop conditions.
  • Beads stale-scope tooling — ao beads verify|lint|harvest adds first-class stale-citation checks for bead-driven planning and RPI recovery.
  • RPI discovery artifacts — RPI can now persist and consume discovery artifacts, with tests and docs covering the --discovery-artifact path.
  • Dream RunLoop invariant coverage — TestRunLoop_LiveTreeHashInvariant_AllStatuses locks the IsCorpusCompounded() and live-tree mutation invariant across deterministically reproducible terminal statuses, with remaining fixture statuses tracked in na-1iv.
  • Dream failed-summary contract coverage — regression tests now lock the finalizeOvernightSummary contract for MEASURE consecutive-failure halts and persisted iteration history.
  • Dream operator mode — ao overnight start|run|report|setup adds a private overnight lane with shared dream.* config, keep-awake defaults, scheduler/bootstrap guidance, council-ready runner packets, and DreamScape-style morning summaries
  • Nightly live retrieval proof — the dream-cycle now runs ao retrieval-bench --live --json, emits retrieval proof in nightly summaries, and keeps a visible artifact trail for flywheel health
  • Pattern-to-skill drafts — repeated patterns can now generate review-only skill drafts under .agents/skill-drafts/ during flywheel close-loop
  • Fresh-repo onboarding welcome — new session-start routing helps first-time repos enter discovery, implementation, or validation without needing the full RPI lane first
  • Docs-site and contribution proof surfaces — GitHub Pages navigation, comparison pages, behavioral-discipline guidance, strategic-doc validation patterns, and a first-skill guide expand the public proof surface

Changed

  • RPI wave recovery integrated — recovered RPI wave work landed across Dream, council, stale-scope planning, discovery artifacts, CI hardening, and Codex runtime surfaces.
  • Council --mixed strict contract documented — skills/council/references/cli-spawning.md documents that /council --mixed requires Codex CLI and emits a hard error instead of silently falling back to Claude-only.
  • Plan and pre-mortem skill bodies decomposed — focused reference files now carry the detailed pre-decomposition, scope-mode, mandatory-check, output, wave-matrix, and task-creation guidance while keeping the top-level skills within lint budgets.
  • Bead-input pre-flight wired into planning skills — /plan and /pre-mortem invoke ao beads verify <bead-id> for full-complexity, aged, or prior-session bead inputs before decomposition or validation.
  • Operational-layer framing — README, onboarding, docs, comparisons, and linked surfaces now consistently explain AgentOps as bookkeeping, validation, primitives, and flows for coding agents
  • Dream runtime positioning — the public GitHub nightly is now documented as a proof harness, while ao overnight is documented as the private local compounding engine
  • Codex default path — native hooks, install copy, runtime smoke coverage, and checked-in Codex artifacts are aligned around the native-plugin path on supported Codex versions
  • Validation guidance — behavioral-discipline and strategic-doc review are now first-class references alongside code review and runtime validation

Fixed

  • Windows Codex installer — Codex installation now has a Windows path instead of assuming Unix shell behavior.
  • golangci-lint v2 contract — the local lint wrapper and CI configuration now pin the v2 behavior expected by the repository.
  • security-toolchain-gate CI — deterministic fixture generation in cli/internal/overnight/fixture/gen_fixture.go is annotated as a non-cryptographic seeded-random use, avoiding a false-positive semgrep blocker.
  • Recovered RPI validation blockers — validation drift from the recovered RPI wave was cleared before retagging the release.
  • Stale-scope reference placement — shared stale-scope validation guidance now lives under skills/shared/references/ so heal.sh --strict can resolve it consistently.
  • Release and CI drift — resolved docs-site Liquid/frontmatter issues, headless runtime smoke portability problems, pre-push shim test drift, and compile-skill headless command drift caught during release prep
  • Codex install and artifact drift — fixed stale slash-command references, refreshed checked-in artifact metadata, added a Codex compile wrapper, and corrected plugin/marketplace mismatches exercised by smoke coverage
  • Runtime proof stability — promoted Codex runtime smoke into the blocking smoke path and fixed related shellcheck and install-surface rough edges

Removed

  • DevOps-rooted tagline — public framing no longer leads with the old DevOps-layer tagline; the Three Ways lineage remains supporting doctrine instead of the category label

...

v2.35.0

07 Apr 13:56


Added

  • Codex native hooks — AgentOps hooks now install natively into Codex CLI v0.115.0+ via ~/.codex/hooks.json; 8 hooks wired (session-start, inject, flywheel-close, prompt-nudge, quality-signals, go-test-precommit, commit-review, ratchet-advance); installer enables codex_hooks feature and upgrades from hookless fallback to native hook runtime
  • Knowledge compiler skill — renamed athena → /compile with Karpathy-style incremental compilation, pluggable LLM backend (AGENTOPS_COMPILE_RUNTIME=ollama|claude), interlinked markdown wiki output at .agents/compiled/
  • App struct dependency injection — App struct carries ExecCommand, LookPath, RandReader, Stdout, Stderr seams; gc bridge, events, executor, context relevance, tracker health, and stream modules accept injected dependencies instead of mutable package-level vars
  • Test shuffle in CI — -shuffle=on added to validate.yml and Makefile test targets, exposing and fixing 6 ordering-dependent tests (cobra flag leaks, maturity var leaks, env var leaks)
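
The App struct item above names its seams but not their types. The Go sketch below shows the general pattern under stated assumptions: only `LookPath`, `Stdout`, and `Stderr` are modeled, their field types are guesses, and `NewApp` and `CheckTool` are hypothetical helpers that do not come from the AgentOps codebase.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"os/exec"
)

// App sketches a few of the seams named in the release note.
// Field types are assumptions; the real struct is not shown in these notes.
type App struct {
	LookPath func(file string) (string, error)
	Stdout   io.Writer
	Stderr   io.Writer
}

// NewApp wires the real implementations for production use.
func NewApp() *App {
	return &App{LookPath: exec.LookPath, Stdout: os.Stdout, Stderr: os.Stderr}
}

// CheckTool reports whether a binary is on PATH, using only injected seams.
func (a *App) CheckTool(name string) bool {
	path, err := a.LookPath(name)
	if err != nil {
		fmt.Fprintf(a.Stderr, "%s: not found\n", name)
		return false
	}
	fmt.Fprintf(a.Stdout, "%s: %s\n", name, path)
	return true
}

func main() {
	// A test-style fake: no real PATH lookup happens, and output is captured.
	var out bytes.Buffer
	app := &App{
		LookPath: func(string) (string, error) { return "/fake/bin/git", nil },
		Stdout:   &out,
		Stderr:   io.Discard,
	}
	app.CheckTool("git")
	fmt.Print(out.String())
}
```

Because each instance carries its own seams, tests can run in parallel or in shuffled order without one test's fake leaking into another, which is the failure mode that mutable package-level vars invite.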

Changed

  • CLI internal extraction (waves 5-13) — business logic extracted from cmd/ao monolith into 15 internal/ domain packages (rpi, search, context, quality, goals, lifecycle, bridge, forge, mine, plans, knowledge, storage, pool, taxonomy, worker) using Options struct pattern for dependency injection
  • Goals test migration — 7 goals test files moved from cmd/ao to internal/goals as external test package (goals_test) with t.Parallel() and direct goals.Run*() calls replacing cobra command wiring
  • Test isolation — resetCommandState now saves/restores 10 maturity globals; resetFlagChangesRecursive resets flag values to defaults; RPILoop and toolchain tests clear AGENTOPS_RPI_RUNTIME* env vars via t.Setenv

Fixed

  • Defrag test flag leak — TestDefragOutputDirFlag used cmd.Flags().Lookup("output") which matched the root persistent --output flag; changed to cmd.LocalFlags().Lookup("output")
  • Goroutine leak false positive — TestRunGoals_GoroutineLeak used goleak.VerifyNone which caught goroutines from parallel tests; switched to goleak.IgnoreCurrent() to only detect leaks within the test itself
  • Secret scan false positives — excluded .gc/ directory and Getenv/os.Environ patterns from secret pattern scan
  • Codex skill validation — added output_contract as valid schema key, cross-vendor/knowledge as valid tiers, fixed $/ prefix in codex forge/post-mortem/scenario skills
  • Scenario CLI snippets — replaced non-existent --source/--scope flags with valid --status variants

Removed

  • Coverage percentage CI gates — removed coverage-ratchet job, check-cmdao-coverage-floor.sh, .coverage-baseline.json, and associated BATS tests; percentage gates blocked CI during architectural refactors without catching bugs
  • fire.go — FIRE loop (find-ignite-reap-escalate) superseded by gc sling + bead dispatch; formatAge helper moved to inject_predecessor.go
  • rpi_workers.go — per-worker health display superseded by gc agent health patrol; ao rpi workers subcommand removed from CLI and docs

Full Changelog: v2.34.0...v2.35.0

v2.34.0

05 Apr 17:50


Added

  • Stage 4 Behavioral Validation — new validation tier between council/vibe and production:
    • Holdout scenarios stored in .agents/holdout/ with PreToolUse isolation hook preventing implementing agents from seeing evaluation criteria
    • Satisfaction scoring (0.0-1.0 probabilistic) in verdict schema v4, replacing boolean-only PASS/FAIL
    • Agent-built behavioral specs generated during /implement Step 5c
    • /scenario skill for authoring and managing holdout scenarios
    • ao scenario init|list|validate CLI commands (4 subcommands, 11 tests)
    • STEP 1.8 in /validation pipeline evaluating holdout scenarios + agent specs
    • schemas/scenario.v1.schema.json defining the holdout scenario format
  • Flywheel gate command — ao flywheel gate checks readiness for retrieval-expansion work (research closure, rho threshold, holdout precision@K)
  • Citation confidence scoring — citationEventIsHighConfidence with bucketed confidence (0/0.5/0.7/0.9) gates MemRL rewards on match quality
  • Retrieval bench refactor — train/holdout splits, section-aware scoring (scoreBenchSections), manifest-based benchmark cases
  • Proof-backed next-work visibility — classifyNextWorkCompletionProof unifies completed-run, execution-packet, and evidence-only-closure proof types; context explain and stigmergic packet now report proof-backed suppressions
  • Three-gap contract proof gates — lifecycle gap mapping gates added to GOALS.md
  • Cross-vendor execution — --mixed flag for Claude + Codex council judges
  • Gas City bridge — gc as default executor for RPI phase execution with L1-L3 tests
  • 149 L2 integration tests — AI-native test shape ("L2 first, L1 always") validated at scale; coverage floor raised 78.8% → 81.0%
  • Test coverage hardening — GPG commit-signing fixes, root-skip guards for containerized CI, 350+ lines of vibecheck detector/metrics tests, maturity.go empty-content bugfix
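
The flywheel gate item above lists holdout precision@K as one of its readiness checks. For readers unfamiliar with the metric, here is the standard definition as a short Go sketch. The function name and document IDs are illustrative, and nothing here reflects how ao flywheel gate computes or thresholds the value.

```go
package main

import "fmt"

// PrecisionAtK returns the fraction of the top-k ranked IDs that are relevant.
// Dividing by k (not by the number of results returned) penalizes short result lists.
func PrecisionAtK(ranked []string, relevant map[string]bool, k int) float64 {
	if k <= 0 {
		return 0
	}
	hits := 0
	for i, id := range ranked {
		if i >= k {
			break
		}
		if relevant[id] {
			hits++
		}
	}
	return float64(hits) / float64(k)
}

func main() {
	ranked := []string{"doc-3", "doc-7", "doc-1", "doc-9"}
	relevant := map[string]bool{"doc-3": true, "doc-1": true}
	fmt.Printf("%.2f\n", PrecisionAtK(ranked, relevant, 4))
}
```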

Changed

  • Codex parity hook — codex-parity-warn.sh now supports opt-in blocking mode via AGENTOPS_CODEX_PARITY_BLOCK=1 (exit 2 instead of advisory)
  • 12-factor doctrine — compressed from 474 to 114 lines, reframed as supporting lens rather than product definition
  • Skill count — 65 → 66 (added /scenario)
  • Research skill — now persists reusable findings to .agents/findings/registry.jsonl with finding-compiler refresh
  • Closure integrity audit — accepts durable closure packets without scoped-file sections as valid evidence
  • Proof-backed legacy entries — shouldSkipLegacyFailedEntry uses CompletionEvidence field (proof-only, no heuristic fallback)
  • readQueueEntries — returns all non-consumed entries; proof filtering is downstream via shouldSkipLegacyFailedEntry
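
The opt-in blocking mode described for the Codex parity hook above follows a common pattern: an environment variable upgrades an advisory warning to a failing exit code. The real hook is a shell script; the Go sketch below only illustrates the decision. `parityExitCode` is a hypothetical name, and exit 0 for advisory mode is an assumption (the notes state only that blocking mode exits 2).

```go
package main

import (
	"fmt"
	"os"
)

// parityExitCode maps detected drift plus the opt-in env value to an exit code.
// Exit 2 for blocking mode comes from the release note; exit 0 for advisory
// mode is an assumption of this sketch.
func parityExitCode(driftFound bool, blockEnv string) int {
	if driftFound && blockEnv == "1" {
		return 2
	}
	return 0
}

func main() {
	block := os.Getenv("AGENTOPS_CODEX_PARITY_BLOCK")
	// driftFound is hard-coded here; the real hook would detect it.
	code := parityExitCode(true, block)
	fmt.Println("exit code:", code)
}
```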

Fixed

  • 6 CI failure categories resolved in one commit (f1b83b2)
  • Cobra test registration — scenario and flywheel gate added to expectedCmds
  • Citation feedback test — assertion corrected for recorded confidence preference (0.5 not 0.7)
  • RPI hardening — UAT version pre-flight, goals history filter, proof-backed suppression, fail-closed gates, cross-epic handoff contamination, bare ag- prefix guard
  • Branch consolidation — 10 stale Codex branches analyzed, cherry-picked (9 commits, ~3,500 lines), and deleted; 25 orphaned worktrees pruned
  • git rerere enabled — conflict resolution memory for future merges

Full Changelog: v2.33.0...v2.34.0

v2.33.0

02 Apr 22:56


Highlights

This release tightens execution hygiene, retrieval quality, and release operations. You can now benchmark retrieval quality from the CLI, run persona-based adversarial validation with /red-team, and let Crank surface stale or mergeable bead backlogs before burning worker time. Release prep is also less opinionated now that the enforced cadence gate is gone.

What's New

  • Backlog hygiene gates — Crank and related scripts now surface stale or mergeable bead backlogs before execution starts.
  • Retrieval benchmarking — ao retrieval-bench adds benchmark corpora, live mode, global scope, and nightly regression coverage.
  • Adversarial validation — /red-team adds persona-based probing for docs and skills, with checked-in Codex runtime artifacts.
  • Software factory lane — the CLI and startup flow now expose a dedicated software-factory operator surface.
  • Release timing freedom — release prep no longer blocks on a minimum wait between tags.

For the full categorized diff, expand the changelog section below.


Full changelog

Added

  • Backlog hygiene gates — added bd-audit.sh, bd-cluster.sh, and Crank/Codex guidance for cleaning stale or mergeable beads before execution
  • Retrieval benchmarking and global scope — added ao retrieval-bench, benchmark corpora, --live, --global, and nightly IR regression coverage
  • /red-team adversarial validation — added a persona-based validation skill plus checked-in Codex runtime artifacts
  • Software factory operator lane — added a CLI/operator surface and Claude factory startup routing for software-factory workflows
  • Flywheel maintenance utilities — added global garbage purge tooling and nightly retrieval benchmarking for knowledge quality tracking

Changed

  • Release policy — removed the enforced release cadence gate so releases no longer block on a minimum wait between tags
  • Knowledge operator surfaces — plan and validation now wire knowledge operator surfaces directly into execution flow
  • Proof and runtime docs — goals, RPI docs, and contributor guidance now reflect the expanded proof surfaces and hookless runtime behavior

Fixed

  • Codex artifact parity — restored checked-in Codex parity for red-team and cleaned Codex runtime metadata/frontmatter drift across crank, forge, post-mortem, release, and swarm artifacts
  • Retrieval quality — replaced exact-substring filtering with token-level matching and tuned penalty, deduplication, and OR-fallback behavior
  • Harvest metadata preservation — promotion now preserves source metadata and fills missing maturity, utility, and type fields safely
  • Release tooling — release artifact directories are created safely and audit artifacts now resolve against release tag names
  • Documentation and link drift — repaired the post-mortem Codex link and aligned runtime docs around the newer startup and lifecycle flows
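
The retrieval-quality fix above replaces exact-substring filtering with token-level matching. The Go sketch below contrasts the two approaches under stated assumptions: the tokenizer (lowercase, split on non-alphanumerics) and the all-tokens-must-match rule are illustrative choices, and the penalty, deduplication, and OR-fallback tuning mentioned in the note is not modeled.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize lowercases text and splits on anything that is not a letter or digit.
func tokenize(s string) []string {
	return strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// MatchesAllTokens reports whether every query token appears as a whole token in doc.
func MatchesAllTokens(query, doc string) bool {
	have := make(map[string]bool)
	for _, t := range tokenize(doc) {
		have[t] = true
	}
	q := tokenize(query)
	if len(q) == 0 {
		return false // an empty query matches nothing in this sketch
	}
	for _, t := range q {
		if !have[t] {
			return false
		}
	}
	return true
}

func main() {
	doc := "Release gate: retry budgets for hooks"
	// Substring check: punctuation between the words breaks the match.
	fmt.Println(strings.Contains(strings.ToLower(doc), "gate retry"))
	// Token check: both words are present as tokens, so it matches.
	fmt.Println(MatchesAllTokens("gate retry", doc))
}
```

Token matching is both more forgiving (punctuation and word order do not matter) and stricter (a query for "ate" no longer matches "gate") than a raw substring test.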

Full Changelog: v2.32.0...v2.33.0

v2.32.0

02 Apr 02:00


Highlights

Knowledge activation and session intelligence are the headline features in this release. A new skill and CLI surfaces let the agent consume cross-domain knowledge at runtime — ranking, assembling, and explaining the context it injects into each session. The session intelligence engine adds trust policies and explainability so you can see exactly why certain knowledge was selected. The pre-push gate now runs 9 checks that previously required CI, giving faster local feedback before you push.

What's New

  • Knowledge activation — new /knowledge-activation skill and ao CLI surfaces activate cross-domain knowledge at runtime with ranked intelligence context and operator surface consumption
  • Session intelligence engine — complete runtime engine with explainability, trust policy enforcement, and ranked context assembly
  • Runtime selection — ao rpi serve supports explicit runtime selection for Claude and Codex execution modes
  • Faster local validation — 9 CI-only checks migrated to the pre-push gate for immediate feedback

All Changes

Added

  • Knowledge activation skill with CLI surfaces and runtime operator consumption
  • Session intelligence runtime engine with explainability and ranked context
  • Runtime selection for ao rpi serve
  • Quality signals hook with telemetry test coverage
  • Nine checks shifted from CI-only to local pre-push gate
  • Inject stability warnings, signal tests, and status dashboard improvements

Changed

  • README rewritten with product-minded gain-framing and Strunk-style prose
  • Philosophy doc and observations section added to README
  • Repo front doors and codex artifact guidance aligned
  • Retry budgets, stability flags, and orchestration patterns applied from Claude Code architecture lessons
  • Homebrew formula updated to v2.31.0 with pre-built binaries

Fixed

  • Post-mortem closure integrity file parsing normalized
  • CI failures resolved across codex refs, test pairing, hook coverage, docs parity, and codex lifecycle
  • Lookup now scans nested global knowledge directories
  • Test stubs added for new pre-push checks

Dependencies

  • codecov/codecov-action bumped from 5 to 6
  • DavidAnson/markdownlint-cli2-action bumped from 22 to 23

Full changelog

Added

  • Knowledge activation skill — new /knowledge-activation skill and CLI surfaces for activating cross-domain knowledge at runtime, with operator surface consumption and ranked intelligence context
  • Session intelligence engine — complete runtime engine with explainability, ranked context assembly, and trust policy enforcement
  • Runtime selection for ao rpi serve — serve now supports explicit runtime selection for Claude and Codex execution modes
  • Quality signals hook — new quality-signals.sh hook with test coverage for session quality telemetry
  • Pre-push gate expansion — 9 checks migrated from CI-only to the local pre-push gate for faster feedback
  • Inject stability warnings and status dashboard — closed 3 harvest items with signal tests and dashboard improvements

Changed

  • README refresh — product-minded rewrite with gain-framing and Strunk-style prose fixes
  • Philosophy doc — new docs/philosophy.md and observations section added to README
  • Documentation alignment — repo front doors and codex artifact guidance unified across entry points
  • Claude Code architecture lessons — retry budgets, stability flags, quality signals, and orchestration patterns applied to skills
  • Homebrew formula — updated to v2.31.0 with pre-built binaries

Fixed

  • Post-mortem closure integrity — normalized file parsing for closure integrity audits
  • CI reliability — resolved CI failures across codex refs, test pairing, hook coverage, worktree handling, docs parity, hook portability, and codex lifecycle
  • Lookup nested scanning — ao lookup now scans nested global knowledge directories correctly
  • Pre-push test stubs — added test stubs for the new pre-push checks; shellcheck now skips non-shell files

Dependencies

  • Bumped codecov/codecov-action from 5 to 6
  • Bumped DavidAnson/markdownlint-cli2-action from 22 to 23

Full Changelog: v2.31.0...v2.32.0

v2.31.0

31 Mar 14:29


Highlights

Nine new lifecycle skills let the agent handle bootstrapping, dependency audits, design reviews, performance analysis, refactoring, code review, scaffolding, and testing without manual invocation. A new ao harvest command pulls learnings from sibling workspaces so knowledge compounds across your entire multi-agent fleet, not just one repo. Context debugging is easier with ao context packet, and the hook system now formally supports both Claude Code and Codex runtimes.

What's New

  • 9 lifecycle skills — bootstrap, deps, design, harvest, perf, refactor, review, scaffold, and test are now part of the RPI workflow with automatic invocation and mechanical gates
  • Cross-rig knowledge harvesting — ao harvest extracts and catalogs learnings from sibling crew workspaces so insights travel between agents
  • Context packet inspector — ao context packet lets you debug what inter-session handoff state the agent actually sees
  • Dual-runtime hook support — Hooks now have a formal runtime contract covering Claude Code, Codex, and manual execution modes

All Changes

Added

  • Nine lifecycle skills wired into the RPI workflow with auto-invocation
  • Cross-rig knowledge consolidation via ao harvest
  • Context packet inspection via ao context packet
  • Hook runtime contract with Claude/Codex/manual event mapping
  • Research provenance tracking on pending learnings
  • Context declarations for inject, provenance, and rpi skills
  • Evidence-backed output templates for goals and product commands

Changed

  • Documentation reframed around three-gap context lifecycle model
  • Hook docs updated with runtime modes table for dual-runtime support

Fixed

  • Four pre-existing CI failures resolved
  • Lookup retrieval gaps that caused empty results
  • Embedded file sync on first session start
  • Closure integrity with 24h grace window for evidence timing
  • Skill lint compliance across vibe, post-mortem, crank, and plan
  • Codex tool naming rule and five Claude-era tool references
  • ASCII diagram consistency across 23 documentation files
  • Fork exhaustion in validation script replaced with lightweight parser

Full changelog

Added

  • 9 lifecycle skills — bootstrap, deps, design, harvest, perf, refactor, review, scaffold, and test skills wired into RPI with auto-invocation and mechanical gates
  • ao harvest — cross-rig knowledge consolidation extracts and catalogs learnings from sibling crew workspaces
  • ao context packet — inspect stigmergic context packets for debugging inter-session handoff state
  • Hook runtime contract — formal Claude/Codex/manual event mapping with runtime-aware hook tooling
  • Evidence-driven skill enrichment — production meta-knowledge, anti-patterns, flywheel metrics, and normalization defect detection baked into 9 skill reference files
  • Research provenance — pending learnings now carry full research provenance for discoverability and citation tracking
  • Context declarations — inject, provenance, and rpi skills declare their context requirements explicitly
  • Goals and product output templates — /goals and /product produce evidence-backed structured output

Changed

  • Three-gap context lifecycle contract — README, PRODUCT.md, positioning docs, and operational guides reframed around the context lifecycle model
  • Dual-runtime hook documentation — runtime modes table and troubleshooting updated for Claude + Codex hook coexistence

Fixed

  • CI reliability — resolved 4 pre-existing CI failures, restored headless runtime preflight, repaired codex parity drift checks
  • ao lookup retrieval — fixed retrieval gaps that caused lookup to return no results
  • Embedded sync — using-agentops SKILL.md and .agents/.gitignore now written correctly on first session start
  • Closure integrity — 24h grace window for close-before-commit evidence, normalized file parsing
  • Skill lint compliance — vibe, post-mortem, crank, and plan skills trimmed or restructured to stay under 800-line limit
  • Codex tool naming — added CLAUDE_TOOL_NAMING rule and fixed 5 Claude-era tool references in codex skills
  • ASCII diagram consistency — aligned box-drawing characters across 23 documentation files
  • Fork exhaustion prevention — replaced jq with awk in validate-go-fast to prevent process fork exhaustion on large repos
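
The 24h grace window for close-before-commit evidence mentioned above can be sketched as a simple timestamp comparison. The exact semantics are an assumption here: this sketch accepts evidence committed at any time before the close, or up to 24 hours after it. The function name `evidence_within_grace` is hypothetical and does not come from AgentOps.

```python
from datetime import datetime, timedelta, timezone

# Assumed grace period, taken from the "24h grace window" in the notes.
GRACE = timedelta(hours=24)


def evidence_within_grace(closed_at: datetime, committed_at: datetime) -> bool:
    """Return True if the commit landed before the close,
    or no more than GRACE after it (hypothetical rule)."""
    return committed_at - closed_at <= GRACE


closed = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
late_ok = evidence_within_grace(closed, closed + timedelta(hours=23))
too_late = evidence_within_grace(closed, closed + timedelta(hours=25))
```

Under this rule an item closed shortly before its evidence commit still passes the audit, while a commit arriving more than a day later does not.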

Full Changelog: v2.30.0...v2.31.0

v2.30.0

25 Mar 03:47


v2.30.0 — Codex hookless lifecycle, PROGRAM.md workflows, and stronger long-running RPI runs

Highlights

AgentOps now handles Codex hookless sessions more cleanly, gives autonomous workflows a clearer PROGRAM.md contract, and makes long-running RPI runs much easier to inspect. This release also hardens the local release and validation path itself, so the same gate stack you rely on for shipping is more trustworthy under headless and generated-artifact-heavy workflows.

What's New

  • Hookless Codex lifecycle support — Codex sessions can now run through startup, follow-up, validation, and closeout without depending on legacy hook assumptions.
  • PROGRAM.md for autonomous work — Autodev and evolve flows now share a concrete program contract instead of relying on looser ad hoc context.
  • Artifact-aware long RPI runs — Mission control now shows run artifacts and evaluator output so you can inspect what happened during multi-phase autonomous runs.
  • More reliable release validation — Headless runtime checks, reverse-engineer hygiene, and release-gate coverage are more deterministic.

All Changes

Added

  • Hookless Codex lifecycle support across CLI commands and skill orchestration
  • A first-class PROGRAM.md contract for autodev and evolve-driven workflows
  • Artifact and evaluator visibility for long-running RPI sessions

Changed

  • Codex bundle maintenance, lifecycle guidance, and release validation coverage refreshed around the expanded Codex execution path

Fixed

  • Codex RPI scope and closeout issues that caused follow-up and validation drift
  • Release-gate regressions in headless runtime validation and learning coherence
  • Reverse-engineer repo scans so generated or temporary trees no longer contaminate detected CLI surfaces

Full changelog

Added

  • Codex hookless lifecycle support — ao codex runtime commands, lifecycle fallback, and Codex skill orchestration now cover hookless sessions end to end
  • PROGRAM.md autodev contract — Added a first-class PROGRAM.md contract for autodev flows and taught /evolve and related RPI paths to use it
  • Long-running RPI artifact visibility — Mission control now exposes run artifacts and evaluator output so long-running RPI sessions are replayable and easier to inspect

Changed

  • Codex runtime maintenance flow — Refreshed Codex bundle hashes, lifecycle guards, runtime docs, and release validation coverage around the expanded Codex execution path

Fixed

  • Codex RPI scoping and closeout — Tightened objective scope, epic scope, closeout ownership, and validation gaps in the Codex RPI lifecycle
  • Release gate reliability — Restored headless runtime coverage, runtime-aware Claude inventory checks, and release-gate coherence validation
  • Reverse-engineer repo hygiene — Repo-mode reverse engineer now ignores generated and temp trees when identifying CLI and module surfaces
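
The reverse-engineer hygiene fix above (skipping generated and temporary trees during a repo scan) follows a common pattern that can be sketched as below. The ignore list and the function name `source_files` are hypothetical; these notes do not state which directories AgentOps actually excludes.

```python
import os

# Hypothetical ignore list; the real set is not given in the release notes.
IGNORED_DIRS = {"node_modules", "dist", "build", "tmp", ".git"}


def source_files(root: str) -> list[str]:
    """Walk root and return relative file paths, skipping ignored trees."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into ignored directories.
        dirnames[:] = sorted(d for d in dirnames if d not in IGNORED_DIRS)
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return found
```

Pruning `dirnames` in place, rather than filtering results afterwards, keeps the scan from ever reading the ignored trees, so files inside them cannot be mistaken for CLI or module surfaces.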

Full Changelog: v2.29.0...v2.30.0