Roadmap

Code Agent++ is evolving from “generate files that help an agent read a repo” into a Code Agent Enhancement Layer / Agent Reliability Layer. It does not compete with Codex, Claude Code, Cursor, OpenCode, or MiMoCode. Those tools own code execution. Code Agent++ owns the external reliability layer around them: context, boundaries, evidence, impact, regression protection, hallucination checks, and repair/finalize decisions.

The roadmap is organized around the harness lifecycle:

Before execution -> During execution -> After execution -> Loop improvement

North Star

Make existing coding agents safer, more verifiable, and less regression-prone in complex repositories.

The long-term product shape:

User task
  -> Code Agent++ Context / Boundary / Regression preparation
  -> choose executor: Codex / Claude Code / Cursor / OpenCode / MiMoCode
  -> code agent edits code
  -> Code Agent++ collects diff / trace / test evidence
  -> Guard modules evaluate the run
  -> Loop Guard decides finalize / repair / repack / block / human review

v0.2: Context Guard Foundation

Goal: make agents guess less before editing.

Repository scanner.
Static file index.
Symbol and dependency extraction.
File and module dependency graph.
Importance ranking.
Minimal AGENTS.md generation.
Manual/generated/composed AGENTS.md architecture.
Task plan and task pack.
Related tests detection.
Token savings and actual output token reports.
Readiness score with dimensions and hard caps.

Status: implemented foundation.
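
One way to read "readiness score with dimensions and hard caps" is a weighted average over dimensions that is then clamped by any triggered cap. The sketch below illustrates that reading only; the dimension names, weights, cap rules, and the `readinessScore` function are assumptions, not the scoring model Code Agent++ ships.

```typescript
// Hypothetical sketch: dimension names, weights, and cap rules are assumptions.
interface Dimension {
  name: string;
  score: number;  // 0..100
  weight: number; // relative weight, > 0
}

interface HardCap {
  reason: string;
  maxScore: number;   // ceiling applied when `triggered` is true
  triggered: boolean;
}

function readinessScore(
  dimensions: Dimension[],
  caps: HardCap[],
): { score: number; cappedBy: string[] } {
  const totalWeight = dimensions.reduce((sum, d) => sum + d.weight, 0);
  const weighted =
    totalWeight === 0
      ? 0
      : dimensions.reduce((sum, d) => sum + d.score * d.weight, 0) / totalWeight;

  // A hard cap is a ceiling: strong dimensions cannot compensate for it.
  const active = caps.filter((c) => c.triggered);
  const ceiling = active.reduce((min, c) => Math.min(min, c.maxScore), 100);
  const cappedBy = active.filter((c) => c.maxScore < weighted).map((c) => c.reason);

  return { score: Math.round(Math.min(weighted, ceiling)), cappedBy };
}

const readiness = readinessScore(
  [
    { name: "context", score: 90, weight: 2 },
    { name: "tests", score: 60, weight: 1 },
  ],
  [{ reason: "no related tests detected", maxScore: 70, triggered: true }],
);
console.log(readiness); // weighted average is 80, capped to 70
```

The cap keeps a repository with excellent context but no detectable tests from reporting a high overall score.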

v0.3: Boundary / Evidence / Impact Guards

Goal: make edits bounded, reviewable, and verifiable.

Contracts for architecture, module boundaries, commands, tests, and safety.
code-agent-plusplus validate-contracts.
code-agent-plusplus policy --fail-on forbidden|required|risk.
Execution trace with manual / command / CI evidence.
code-agent-plusplus trace run for command-captured evidence.
Exit code and command evidence recording.
Test selection for files and diffs.
Change impact report with direct and transitive dependents.
code-agent-plusplus verify --diff.
Freshness / drift / manifest checks.

Status: implemented foundation.
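
The change impact report separates direct from transitive dependents. A minimal sketch of that split, assuming the dependency graph is available as a map from each file to the files it imports, is a breadth-first walk over the inverted edges. The `DependencyGraph` type, the `impactOf` function, and the file names are hypothetical.

```typescript
// Hypothetical sketch: graph shape and file names are illustrative only.
type DependencyGraph = Map<string, string[]>; // file -> files it imports

function impactOf(
  changed: string,
  graph: DependencyGraph,
): { direct: string[]; transitive: string[] } {
  // Invert the edges so we can walk from a file to the files that import it.
  const dependents = new Map<string, string[]>();
  for (const [file, imports] of graph) {
    for (const imported of imports) {
      const list = dependents.get(imported) ?? [];
      list.push(file);
      dependents.set(imported, list);
    }
  }

  const direct = [...(dependents.get(changed) ?? [])].sort();

  // Breadth-first search over dependents; `seen` also guards against cycles.
  const seen = new Set<string>([changed]);
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dep of dependents.get(current) ?? []) {
      if (!seen.has(dep)) {
        seen.add(dep);
        queue.push(dep);
      }
    }
  }
  seen.delete(changed);

  const directSet = new Set(direct);
  const transitive = [...seen].filter((f) => !directSet.has(f)).sort();
  return { direct, transitive };
}

const graph: DependencyGraph = new Map([
  ["src/db.ts", []],
  ["src/api.ts", ["src/db.ts"]],
  ["src/cli.ts", ["src/api.ts"]],
]);
const impact = impactOf("src/db.ts", graph);
console.log(impact);
```

In this example `src/api.ts` imports the changed file directly, while `src/cli.ts` is reached only through `src/api.ts`, so it lands in the transitive set.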

v0.4: Loop Guard and Runtime State

Goal: stop trusting the agent’s “done” claim and make the next action explicit.

Runtime state persisted under .agent-context/runs/<task-id>/state.json.
Loop decisions with priority, confidence, blocking state, and signals.
Trace-aware loop controller.
Stale evidence detection after later edits.
Repair planner that can request missing tests, contract repair, context refresh, or wider impact analysis.
Finalize gate through policy and loop reports.

Status: implemented foundation; orchestrate now runs multiple bounded iterations, while richer autonomous repair planning remains ongoing.
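
Stale evidence detection can be pictured as a timestamp comparison: evidence stops counting once a file it covers is edited after the evidence was recorded. The sketch below assumes that model. The `Evidence` and `Edit` shapes and their field names are hypothetical and do not describe the actual state.json schema.

```typescript
// Hypothetical shapes: field names are assumptions, not the state.json schema.
interface Evidence {
  command: string;
  exitCode: number;
  recordedAt: number;     // epoch milliseconds
  coveredFiles: string[]; // files this evidence vouches for
}

interface Edit {
  file: string;
  editedAt: number; // epoch milliseconds
}

// Evidence is stale when any file it covers was edited after it was recorded.
function staleEvidence(evidence: Evidence[], edits: Edit[]): Evidence[] {
  return evidence.filter((e) =>
    edits.some(
      (edit) => edit.editedAt > e.recordedAt && e.coveredFiles.includes(edit.file),
    ),
  );
}

const recorded: Evidence[] = [
  { command: "npm test", exitCode: 0, recordedAt: 100, coveredFiles: ["src/a.ts"] },
  { command: "npm test", exitCode: 0, recordedAt: 200, coveredFiles: ["src/a.ts"] },
];
const laterEdits: Edit[] = [{ file: "src/a.ts", editedAt: 150 }];
const stale = staleEvidence(recorded, laterEdits);
console.log(stale.map((e) => e.recordedAt));
```

Under this model a passing test run from before the last edit cannot justify a finalize decision; only the run recorded after the edit still counts.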

v0.5: Executor Adapter Layer

Goal: make Code Agent++ work as an external control plane for multiple code agents.

AgentExecutor interface for external coding agents:

export interface AgentExecutor {
  // Which agent this adapter wraps; "mock" is the deterministic CI executor.
  name: "opencode" | "mimocode" | "codex" | "claude-code" | "cursor" | "mock";
  run(input: {
    repo: string;
    task: string;
    prompt: string;
    agent?: string;
    outputDir: string;
    env?: Record<string, string>;
  }): Promise<{
    exitCode: number;
    eventsPath?: string;
    finalText?: string;
    changedFiles: string[];
    diffPath: string;
  }>;
}

code-agent-plusplus agent run "<task>" . --executor opencode
code-agent-plusplus agent run "<task>" . --executor mimocode
Mock executor for CI and deterministic tests.
Generic --executor-command adapter for Codex, Claude Code, Cursor, OpenCode, MiMoCode, and other scriptable code agents.
One-shot flow through code-agent-plusplus agent run: pack -> run agent -> collect diff -> policy/tests/impact/verify.
Multi-loop harness flow through code-agent-plusplus orchestrate: pack -> run agent -> evaluate -> repair/repack/finalize/block.

Status: mock executor, generic command adapter, and OpenCode stdout/transcript/fallback event normalizer implemented; MiMoCode, Codex, and Claude native event normalizers planned.
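
To illustrate how the AgentExecutor interface supports deterministic tests, here is a minimal mock sketch. The interface is restated from above so the block stands alone. The `createMockExecutor` helper and the `diff.patch` file name are hypothetical, and this mock returns canned data without touching the file system, which may differ from the shipped mock executor.

```typescript
interface AgentExecutor {
  name: "opencode" | "mimocode" | "codex" | "claude-code" | "cursor" | "mock";
  run(input: {
    repo: string;
    task: string;
    prompt: string;
    agent?: string;
    outputDir: string;
    env?: Record<string, string>;
  }): Promise<{
    exitCode: number;
    eventsPath?: string;
    finalText?: string;
    changedFiles: string[];
    diffPath: string;
  }>;
}

// Hypothetical helper: builds an executor that returns a canned result.
function createMockExecutor(changedFiles: string[], exitCode = 0): AgentExecutor {
  return {
    name: "mock",
    async run(input) {
      return {
        exitCode,
        finalText: `mock executor handled: ${input.task}`,
        changedFiles: [...changedFiles], // copy so callers cannot mutate the fixture
        diffPath: `${input.outputDir}/diff.patch`, // assumed file name
      };
    },
  };
}

const mockExecutor = createMockExecutor(["src/a.ts"]);
const mockRun = mockExecutor.run({
  repo: ".",
  task: "rename helper",
  prompt: "Rename the helper function.",
  outputDir: ".agent-context/runs/demo",
});
mockRun.then((result) => console.log(result.changedFiles, result.diffPath));
```

Because the result is fully determined by the constructor arguments, guard and loop logic can be tested in CI without invoking a real coding agent.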

v0.6: Hallucination Guard

Goal: make repository evidence the source of truth for APIs, commands, config, and conventions.

Implemented MVP checks:

Missing file references.
Missing symbols or exports.
Nonexistent package scripts or test commands.
Nonexistent config keys and environment variables.
Missing dependencies.

Implemented outputs:

.agent-context/hallucination/<task-id>.json
.agent-context/runs/<task-id>/hallucination.md
Policy findings for missing commands, missing symbols, missing local import files, missing dependencies, missing config keys, and missing file references.
Evidence references and repair suggestions.
“Verify existence first” prompts.

Planned expansion:

APIs or paths that contradict local conventions.
Framework-specific route/config checks.
Agent-specific transcript parsers beyond the current OpenCode foundation.

Status: deterministic Hallucination Guard MVP implemented; semantic convention checks remain planned.
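
As an illustration of one deterministic check, the sketch below flags `npm run`, `pnpm run`, or `yarn run` commands whose script name is absent from the `scripts` object of a package.json. The `Finding` shape, the `findMissingScripts` function, and the regular expression are assumptions made for this example, not the shipped implementation.

```typescript
// Hypothetical finding shape; the real report format may differ.
interface Finding {
  kind: "missing-script";
  command: string;
  script: string;
}

function findMissingScripts(
  commands: string[],
  scripts: Record<string, string>, // the "scripts" object from package.json
): Finding[] {
  // Matches `npm run <script>`, `pnpm run <script>`, and `yarn run <script>`.
  const pattern = /^(?:npm|pnpm|yarn)\s+run\s+([\w:.-]+)/;
  const findings: Finding[] = [];
  for (const command of commands) {
    const script = pattern.exec(command.trim())?.[1];
    // hasOwnProperty avoids false negatives from inherited keys like "toString".
    if (script !== undefined && !Object.prototype.hasOwnProperty.call(scripts, script)) {
      findings.push({ kind: "missing-script", command, script });
    }
  }
  return findings;
}

const missingScripts = findMissingScripts(
  ["npm run test", "npm run lint:fix", "git status"],
  { test: "vitest" },
);
console.log(missingScripts);
```

Here only `npm run lint:fix` is reported: `test` exists in the scripts object, and `git status` is not a package script invocation at all.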

v0.7: Regression Guard

Goal: prevent agents from reintroducing old bugs.

Planned inputs:

fix history
issue / PR notes
previous bug patterns
regression tests
fragile modules
historical failure cases

Planned outputs:

anti-regression notes in task packs
required regression tests
historical risk findings
repair prompts when old bug patterns reappear

Status: planned.

v0.8: MCP and Agent-Native Runtime

Goal: let coding agents call Code Agent++ as a native reliability backend.

MCP tools for build, plan, pack, retrieve, tests, impact, verify, evaluate, repair, finalize.
OpenCode / MiMoCode / MiMoCodex MCP usage guide.
Agent-led mode documentation: the code agent calls Code Agent++ tools. The documented limitation is that gates are advisory unless the host agent follows them.
Harness-led mode documentation: Code Agent++ invokes the executor and owns verification.
Codex and Claude Code adapters.
Cursor integration guide.
Unified retriever adapters for static, ripgrep, LightRAG, embedding, and hybrid retrieval.

Status: MCP scaffold and core tools implemented; per-client validation planned.

v0.9: Orchestrator Loop

Goal: make Code Agent++ the runtime controller and the code agent a replaceable executor.

code-agent-plusplus orchestrate "<task>" . --executor opencode --executor-command "opencode run --format json {prompt}" --max-loops 3 --checkpoint git-worktree --fail-on required
code-agent-plusplus orchestrate "<task>" . --executor mimocode --executor-command "mimocode run {prompt}" --max-loops 3 --checkpoint git-worktree --fail-on required
Flow: user task -> plan/pack -> choose executor -> execute -> collect diff/trace/test evidence -> guards -> decision.
Decisions: finalize, repair, repack, block, rollback, require human review.
Multi-iteration loop runner with per-iteration artifacts under .agent-context/runs/<task-id>/iterations/<nnn>/.
Native OpenCode event parsing for opencode run --format json, transcript files, and stdout/stderr fallback.
Native MiMoCode / Codex / Claude event parsing.
Checkpoint patch integration through --checkpoint git-worktree; destructive rollback is intentionally not automatic.

Status: multi-loop orchestrator implemented with mock executor, generic command adapter, OpenCode event normalizer, per-iteration artifacts, decision gates, and checkpoint patch output; MiMoCode, Codex, and Claude event normalizers and isolated executor worktrees remain planned.
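
The bounded loop behind `--max-loops` can be sketched as follows. The decision names mirror the list above, but the `runLoop` function, the `evaluate` callback (a stand-in for "run executor, collect evidence, run guards"), and the choice to escalate to human review when the loop budget runs out are all assumptions made for illustration.

```typescript
type Decision =
  | "finalize"
  | "repair"
  | "repack"
  | "block"
  | "rollback"
  | "human-review";

interface IterationResult {
  iteration: number;
  decision: Decision;
}

// `evaluate` is a hypothetical stand-in for executor run + guard evaluation.
function runLoop(
  maxLoops: number,
  evaluate: (iteration: number) => Decision,
): { iterations: IterationResult[]; outcome: Decision } {
  const iterations: IterationResult[] = [];
  for (let i = 1; i <= maxLoops; i++) {
    const decision = evaluate(i);
    iterations.push({ iteration: i, decision });
    // Only repair and repack continue the loop; every other decision is terminal.
    if (decision !== "repair" && decision !== "repack") {
      return { iterations, outcome: decision };
    }
  }
  // Assumption: an exhausted budget escalates rather than silently finalizing.
  return { iterations, outcome: "human-review" };
}

const converged = runLoop(3, (i) => (i < 2 ? "repair" : "finalize"));
const exhausted = runLoop(2, () => "repair");
console.log(converged.outcome, exhausted.outcome);
```

In the first call the second iteration finalizes, so the loop stops early; in the second call the agent never converges and the run ends with a request for human review after two iterations.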

v1.0: Agent Harness Benchmark

Goal: prove the reliability layer improves coding-agent behavior.

Compare:

no context
AGENTS.md only
context pack
loop-enabled harness
harness + Guard modules

Measure:

wrong file edits
test failures
steps per task
token usage
stale evidence reuse
hallucinated APIs / commands
regression reintroduction
repair loops
human-review blocks

First targets:

OpenCode
MiMoCode / MiMoCodex
Codex CLI
Claude Code
Cursor

Status: deterministic benchmark harness implemented; real-agent benchmark planned.

Longer-Term Language Analysis

Keep TypeScript/JavaScript on the TypeScript Compiler API for project-aware semantics.
Strengthen Python with Tree-sitter plus stdlib ast fallback.
Add Go through tree-sitter-go plus go.mod metadata.
Add Rust through tree-sitter-rust plus Cargo.toml metadata.
Add Java through tree-sitter-java plus Maven/Gradle metadata.
Add C/C++ through tree-sitter-cpp plus compile_commands.json.

Completed Foundation

Repository scanner.
Static file index.
Symbol and dependency extraction.
File and module dependency graph.
Importance ranking.
AGENTS.md generation.
Manual/generated/composed AGENTS.md architecture.
Readiness score.
Token savings.
RAG export and retrieval protocol.
Task context, impact, test selection, and benchmark foundations.
Incremental cache for repeated builds and MCP/editor sessions.
Harness-led orchestrate command.
agent run executor wrapper.
Mock executor and generic executor command adapter.
Multi-loop orchestrator iterations with prompt, executor events, diff, trace, policy, verify, loop, and decision artifacts.
