Code Agent++ is evolving from “generate files that help an agent read a repo” into a Code Agent Enhancement Layer / Agent Reliability Layer. It does not compete with Codex, Claude Code, Cursor, OpenCode, or MiMoCode. Those tools own code execution. Code Agent++ owns the external reliability layer around them: context, boundaries, evidence, impact, regression protection, hallucination checks, and repair/finalize decisions.
The roadmap is organized around the harness lifecycle:
Before execution -> During execution -> After execution -> Loop improvement
Make existing coding agents safer, more verifiable, and less regression-prone in complex repositories.
The long-term product shape:
User task
-> Code Agent++ Context / Boundary / Regression preparation
-> choose executor: Codex / Claude Code / Cursor / OpenCode / MiMoCode
-> code agent edits code
-> Code Agent++ collects diff / trace / test evidence
-> Guard modules evaluate the run
-> Loop Guard decides finalize / repair / repack / block / human review
Goal: make agents guess less before editing.
- Repository scanner.
- Static file index.
- Symbol and dependency extraction.
- File and module dependency graph.
- Importance ranking.
- Minimal `AGENTS.md` generation.
- Manual/generated/composed `AGENTS.md` architecture.
- Task plan and task pack.
- Related tests detection.
- Token savings and actual output token reports.
- Readiness score with dimensions and hard caps.
Status: implemented foundation.
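The "readiness score with dimensions and hard caps" item can be illustrated with a minimal sketch. The dimension names, weights, and cap values below are hypothetical and chosen only for illustration; they are not Code Agent++'s actual scoring model.

```typescript
// Hypothetical dimension scores, each in the range 0..100.
type Dimensions = { context: number; tests: number; contracts: number };

// A hard cap limits the total score whenever its condition holds,
// no matter how high the weighted average is.
interface HardCap {
  reason: string;
  applies: (d: Dimensions) => boolean;
  maxScore: number;
}

// Illustrative weights (assumed); they sum to 1.
const weights: Dimensions = { context: 0.4, tests: 0.4, contracts: 0.2 };

// Illustrative caps (assumed).
const caps: HardCap[] = [
  { reason: "no related tests detected", applies: (d) => d.tests === 0, maxScore: 50 },
  { reason: "no contracts defined", applies: (d) => d.contracts === 0, maxScore: 70 },
];

function readinessScore(d: Dimensions): { score: number; cappedBy: string[] } {
  const weighted =
    d.context * weights.context + d.tests * weights.tests + d.contracts * weights.contracts;
  const active = caps.filter((cap) => cap.applies(d));
  // The lowest active cap becomes the ceiling; 100 when no cap applies.
  const ceiling = Math.min(100, ...active.map((cap) => cap.maxScore));
  return {
    score: Math.round(Math.min(weighted, ceiling)),
    cappedBy: active.map((cap) => cap.reason),
  };
}
```

The point of a hard cap is that a strong result in one dimension cannot hide a missing prerequisite in another: under these assumed values, a repository with no detected tests never scores above 50.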
Goal: make edits bounded, reviewable, and verifiable.
- Contracts for architecture, module boundaries, commands, tests, and safety.
- `code-agent-plusplus validate-contracts`.
- `code-agent-plusplus policy --fail-on forbidden|required|risk`.
- Execution trace with manual / command / CI evidence.
- `code-agent-plusplus trace run` for command-captured evidence.
- Exit code and command evidence recording.
- Test selection for files and diffs.
- Change impact report with direct and transitive dependents.
- `code-agent-plusplus verify --diff`.
- Freshness / drift / manifest checks.
Status: implemented foundation.
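As a rough illustration of the "direct and transitive dependents" in the change impact report, the sketch below inverts a file-level import graph and walks it breadth-first. The `DependencyGraph` shape and the `dependentsOf` function are assumptions made for this example, not the tool's internal data model.

```typescript
// Assumed shape: each file maps to the files it imports.
type DependencyGraph = Map<string, string[]>;

function dependentsOf(
  graph: DependencyGraph,
  changed: string,
): { direct: string[]; transitive: string[] } {
  // Invert the edges: for each imported file, record which files import it.
  const reverse = new Map<string, string[]>();
  graph.forEach((imports, file) => {
    for (const imported of imports) {
      const importers = reverse.get(imported) ?? [];
      importers.push(file);
      reverse.set(imported, importers);
    }
  });

  const direct = reverse.get(changed) ?? [];

  // Breadth-first walk over the reverse edges, starting from the direct dependents.
  const seen = new Set<string>(direct);
  const queue = direct.slice();
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const importer of reverse.get(current) ?? []) {
      if (importer !== changed && !seen.has(importer)) {
        seen.add(importer);
        queue.push(importer);
      }
    }
  }

  // Everything reached that is not a direct dependent counts as transitive.
  const transitive = Array.from(seen).filter((file) => direct.indexOf(file) === -1);
  return { direct, transitive };
}
```

A test selector could then prioritize tests that cover the direct dependents and treat the transitive set as a wider, lower-confidence ring.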
Goal: stop trusting the agent’s “done” claim and make the next action explicit.
- Runtime state persisted under `.agent-context/runs/<task-id>/state.json`.
- Loop decisions with priority, confidence, blocking state, and signals.
- Trace-aware loop controller.
- Stale evidence detection after later edits.
- Repair planner that can request missing tests, contract repair, context refresh, or wider impact analysis.
- Finalize gate through policy and loop reports.
Status: implemented foundation; `orchestrate` now runs multiple bounded iterations, while richer autonomous repair planning remains ongoing.
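One way to picture a loop decision that does not rely on the agent's "done" claim is a small rule chain over recorded signals. The `RunSignals` fields and the rule order below are assumptions for illustration; the real `state.json` schema is not shown in this roadmap.

```typescript
type Decision = "finalize" | "repair" | "repack" | "block" | "human review";

// Hypothetical signal shape (assumed for this sketch).
interface RunSignals {
  forbiddenFindings: number; // policy findings at the "forbidden" level
  contextMissing: boolean;   // the task pack lacked files the agent needed
  testsPassed: boolean;
  lastEditAt: number;        // epoch ms of the most recent edit
  evidenceAt: number;        // epoch ms when the test evidence was captured
  iteration: number;
  maxLoops: number;
}

interface LoopDecision {
  decision: Decision;
  reason: string;
}

// Returns the follow-up action that is still needed, or null when nothing is pending.
function pendingAction(s: RunSignals): LoopDecision | null {
  if (s.contextMissing) return { decision: "repack", reason: "context refresh required" };
  if (s.evidenceAt < s.lastEditAt) {
    return { decision: "repair", reason: "test evidence is stale: edits happened after it was captured" };
  }
  if (!s.testsPassed) return { decision: "repair", reason: "tests failed" };
  return null;
}

function decide(s: RunSignals): LoopDecision {
  if (s.forbiddenFindings > 0) return { decision: "block", reason: "forbidden policy finding" };
  const pending = pendingAction(s);
  if (pending === null) return { decision: "finalize", reason: "all gates passed with fresh evidence" };
  // Work remains, but the loop budget is spent: escalate instead of looping again.
  if (s.iteration >= s.maxLoops) return { decision: "human review", reason: pending.reason };
  return pending;
}
```

Note that passing tests alone do not lead to `finalize` in this sketch: evidence captured before the latest edit is treated as stale and triggers another repair pass.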
Goal: make Code Agent++ work as an external control plane for multiple code agents.
- `AgentExecutor` interface for external coding agents:
export interface AgentExecutor {
name: "opencode" | "mimocode" | "codex" | "claude-code" | "cursor" | "mock";
run(input: { repo: string; task: string; prompt: string; agent?: string; outputDir: string; env?: Record<string, string> }): Promise<{
exitCode: number;
eventsPath?: string;
finalText?: string;
changedFiles: string[];
diffPath: string;
}>;
}
- `code-agent-plusplus agent run "<task>" . --executor opencode`
- `code-agent-plusplus agent run "<task>" . --executor mimocode`
- Mock executor for CI and deterministic tests.
- Generic `--executor-command` adapter for Codex, Claude Code, Cursor, OpenCode, MiMoCode, and other scriptable code agents.
- One-shot flow through `code-agent-plusplus agent run`: `pack -> run agent -> collect diff -> policy/tests/impact/verify`.
- Multi-loop harness flow through `code-agent-plusplus orchestrate`: `pack -> run agent -> evaluate -> repair/repack/finalize/block`.
Status: mock executor, generic command adapter, and OpenCode stdout/transcript/fallback event normalizer implemented; MiMoCode, Codex, and Claude native event normalizers planned.
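To show how the `AgentExecutor` interface might be satisfied, here is a minimal sketch of a deterministic mock executor. The interface is repeated from above so the block is self-contained; the `diff.patch` file name and the `finalText` format are assumptions, and the shipped mock executor may behave differently.

```typescript
// Repeated from the roadmap so this block compiles on its own.
interface AgentExecutor {
  name: "opencode" | "mimocode" | "codex" | "claude-code" | "cursor" | "mock";
  run(input: {
    repo: string;
    task: string;
    prompt: string;
    agent?: string;
    outputDir: string;
    env?: Record<string, string>;
  }): Promise<{
    exitCode: number;
    eventsPath?: string;
    finalText?: string;
    changedFiles: string[];
    diffPath: string;
  }>;
}

// A deterministic executor: it edits nothing and always succeeds,
// which keeps CI runs of the harness reproducible.
const mockExecutor: AgentExecutor = {
  name: "mock",
  run(input) {
    return Promise.resolve({
      exitCode: 0,
      finalText: `mock executor received task: ${input.task}`,
      changedFiles: [],
      // Hypothetical artifact name; no file is actually written in this sketch.
      diffPath: `${input.outputDir}/diff.patch`,
    });
  },
};
```

Because the harness only depends on the interface, swapping `mockExecutor` for a real adapter should not require changes to the verification steps that follow the run.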
Goal: make repository evidence the source of truth for APIs, commands, config, and conventions.
Implemented MVP checks:
- Missing file references.
- Missing symbols or exports.
- Nonexistent package scripts or test commands.
- Nonexistent config keys and environment variables.
- Missing dependencies.
Implemented outputs:
- `.agent-context/hallucination/<task-id>.json`
- `.agent-context/runs/<task-id>/hallucination.md`
- policy findings for missing commands, missing symbols, missing local import files, missing dependencies, missing config keys, and missing file references.
- evidence references and repair suggestions.
- “verify existence first” prompts
Planned expansion:
- APIs or paths that contradict local conventions.
- Framework-specific route/config checks.
- Agent-specific transcript parsers beyond the current OpenCode foundation.
Status: deterministic Hallucination Guard MVP implemented; semantic convention checks remain planned.
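A deterministic check of the "nonexistent package scripts" kind can be sketched as a comparison between the commands an agent mentions and the scripts that `package.json` actually declares. The `Finding` shape, the function names, and the command pattern below are hypothetical and far simpler than a full guard would need.

```typescript
// Hypothetical finding shape for this sketch.
interface Finding {
  kind: "missing-script";
  script: string;
  suggestion: string;
}

// Extract script names from commands such as "npm run build" or "pnpm run test:unit".
// Other command forms (for example "npm test") are ignored in this sketch.
function referencedScripts(commands: string[]): string[] {
  const names: string[] = [];
  for (const command of commands) {
    const match = /^(?:npm|pnpm|yarn)\s+run\s+([\w:.-]+)/.exec(command.trim());
    if (match) names.push(match[1]);
  }
  return names;
}

// Compare referenced scripts against the parsed package.json content.
function checkScripts(
  packageJson: { scripts?: Record<string, string> },
  commands: string[],
): Finding[] {
  const known = packageJson.scripts ?? {};
  return referencedScripts(commands)
    .filter((name) => !Object.prototype.hasOwnProperty.call(known, name))
    .map((name) => ({
      kind: "missing-script" as const,
      script: name,
      suggestion: `verify existence first: "${name}" is not defined in package.json scripts`,
    }));
}
```

The repository file is the only source of truth here: a script counts as existing only if it appears in the parsed `scripts` object, regardless of how plausible the agent's command looks.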
Goal: prevent agents from reintroducing old bugs.
Planned inputs:
- fix history
- issue / PR notes
- previous bug patterns
- regression tests
- fragile modules
- historical failure cases
Planned outputs:
- anti-regression notes in task packs
- required regression tests
- historical risk findings
- repair prompts when old bug patterns reappear
Status: planned.
Goal: let coding agents call Code Agent++ as a native reliability backend.
- MCP tools for build, plan, pack, retrieve, tests, impact, verify, evaluate, repair, finalize.
- OpenCode / MiMoCode / MiMoCodex MCP usage guide.
- Agent-led mode documentation: code agent calls Code Agent++ tools, with documented limitations that gates are advisory unless the host agent follows them.
- Harness-led mode documentation: Code Agent++ invokes the executor and owns verification.
- Codex and Claude Code adapters.
- Cursor integration guide.
- Unified retriever adapters for static, ripgrep, LightRAG, embedding, and hybrid retrieval.
Status: MCP scaffold and core tools implemented; per-client validation planned.
Goal: make Code Agent++ the runtime controller and the code agent a replaceable executor.
- `code-agent-plusplus orchestrate "<task>" . --executor opencode --executor-command "opencode run --format json {prompt}" --max-loops 3 --checkpoint git-worktree --fail-on required`
- `code-agent-plusplus orchestrate "<task>" . --executor mimocode --executor-command "mimocode run {prompt}" --max-loops 3 --checkpoint git-worktree --fail-on required`
- Flow: `user task -> plan/pack -> choose executor -> execute -> collect diff/trace/test evidence -> guards -> decision`.
- Decisions: `finalize`, `repair`, `repack`, `block`, `rollback`, `require human review`.
- Multi-iteration loop runner with per-iteration artifacts under `.agent-context/runs/<task-id>/iterations/<nnn>/`.
- Native OpenCode event parsing for `opencode run --format json`, transcript files, and stdout/stderr fallback.
- Native MiMoCode / Codex / Claude event parsing.
- Checkpoint patch integration through `--checkpoint git-worktree`; destructive rollback is intentionally not automatic.
Status: multi-loop orchestrator implemented with mock executor, generic command adapter, OpenCode event normalizer, per-iteration artifacts, decision gates, and checkpoint patch output; MiMoCode, Codex, Claude event normalizers and isolated executor worktrees remain planned.
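The bounded multi-iteration flow can be sketched as a loop that stops on any terminal decision or when the loop budget runs out. The `Hooks` interface and the choice to escalate to `require human review` when the budget is exhausted are assumptions of this example, not a description of the shipped orchestrator.

```typescript
type Decision =
  | "finalize"
  | "repair"
  | "repack"
  | "block"
  | "rollback"
  | "require human review";

interface IterationRecord {
  iteration: number;
  decision: Decision;
}

// Hypothetical hooks standing in for the executor, the guard modules, and the packer.
interface Hooks {
  runExecutor: (prompt: string, iteration: number) => void;
  evaluate: (iteration: number) => Decision;
  repack: (prompt: string) => string;
}

function orchestrate(
  task: string,
  hooks: Hooks,
  maxLoops: number,
): { outcome: Decision; iterations: IterationRecord[] } {
  const iterations: IterationRecord[] = [];
  let prompt = task;
  for (let i = 1; i <= maxLoops; i++) {
    hooks.runExecutor(prompt, i);
    const decision = hooks.evaluate(i);
    iterations.push({ iteration: i, decision });
    if (decision === "repair") continue; // run again with the same prompt
    if (decision === "repack") {
      prompt = hooks.repack(prompt); // rebuild context before the next run
      continue;
    }
    // finalize, block, rollback, and require human review all end the loop.
    return { outcome: decision, iterations };
  }
  // Budget exhausted while work was still pending (assumed escalation policy).
  return { outcome: "require human review", iterations };
}
```

Keeping the executor behind a hook is what makes it replaceable: the loop never inspects how the code was edited, only the decision the guards return for each iteration.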
Goal: prove the reliability layer improves coding-agent behavior.
Compare:
- no context
- `AGENTS.md` only
- context pack
- loop-enabled harness
- harness + Guard modules
Measure:
- wrong file edits
- test failures
- steps per task
- token usage
- stale evidence reuse
- hallucinated APIs / commands
- regression reintroduction
- repair loops
- human-review blocks
First targets:
- OpenCode
- MiMoCode / MiMoCodex
- Codex CLI
- Claude Code
- Cursor
Status: deterministic benchmark harness implemented; real-agent benchmark planned.
- Keep TypeScript/JavaScript on the TypeScript Compiler API for project-aware semantics.
- Strengthen Python with Tree-sitter plus stdlib `ast` fallback.
- Add Go through `tree-sitter-go` plus `go.mod` metadata.
- Add Rust through `tree-sitter-rust` plus `Cargo.toml` metadata.
- Add Java through `tree-sitter-java` plus Maven/Gradle metadata.
- Add C/C++ through `tree-sitter-cpp` plus `compile_commands.json`.
- Repository scanner.
- Static file index.
- Symbol and dependency extraction.
- File and module dependency graph.
- Importance ranking.
- `AGENTS.md` generation.
- Manual/generated/composed AGENTS architecture.
- Readiness score.
- Token savings.
- RAG export and retrieval protocol.
- Task context, impact, test selection, and benchmark foundations.
- Incremental cache for repeated builds and MCP/editor sessions.
- Harness-led `orchestrate` command.
- `agent run` executor wrapper.
- Mock executor and generic executor command adapter.
- Multi-loop orchestrator iterations with prompt, executor events, diff, trace, policy, verify, loop, and decision artifacts.