2 changes: 2 additions & 0 deletions flake.nix
@@ -64,6 +64,7 @@
workspacePaths = [
./packages/acp-adapter
./packages/agent-core
./packages/agent-core-v2
./packages/server
./packages/server-e2e
./packages/kaos
@@ -85,6 +86,7 @@
workspaceNames = [
"@moonshot-ai/acp-adapter"
"@moonshot-ai/agent-core"
"@moonshot-ai/agent-core-v2"
"@moonshot-ai/server"
"@moonshot-ai/server-e2e"
"@moonshot-ai/kaos"
42 changes: 42 additions & 0 deletions packages/agent-core-v2/AGENTS.md
@@ -0,0 +1,42 @@
# agent-core-v2 Agent Guide

> New agent engine built on the DI Scope architecture — work-in-progress port of `packages/agent-core`. Design: `plan/PLAN.md`. Porting status: `GAP_ANALYSIS.md`.

## Comment conventions

- **Header only, external role only.** Comments live solely in the top-of-file `/** */` block — never beside functions, methods, or statements. Say what the module exposes and the responsibility it owns; the code is the source of truth for how it works, so do not narrate implementation steps, enumerate every export, or note porting / skeleton status.
- **Identity line first.** Start with `` `<domain>` domain (Ln) — <one-line role>. `` Keep an existing `(cross-cutting)` label as-is; barrels omit the layer (`` `<domain>` domain barrel — … ``). Write the role as a responsibility ("drives the turn lifecycle"), not a symbol list ("turn driver + context + loop runner").
- **Impl files add collaborators + scope; contract files add the public contract + scope.** For impls, list every imported cross-domain collaborator as a role ("persists records through `records`") — declared dependencies count even if not yet wired in this WIP port; infrastructure imports (`_base/**`) are not collaborators. Read scope from `registerScopedService(LifecycleScope.X, …)`.

### Examples

Impl (`src/session/sessionService.ts`):

```ts
/**
* `session` domain (L6) — `ISessionService` implementation.
*
* Owns the session's child-agent set and session-level operations; drives
* agent lifecycle through `agent-lifecycle`, broadcasts through `event`,
* persists session metadata through `records`, and records activity through
* `session-activity`. Bound at Session scope.
*/
```

Barrel (`src/session/index.ts`):

```ts
/**
* `session` domain barrel — re-exports the session facade contract
* (`session`) and its scoped service (`sessionService`). Importing this
* barrel registers the `ISessionService` binding into the scope registry.
*/
```

## Docs

Per-domain references live in `docs/`.

- [`docs/flag.md`](docs/flag.md) — Read **before gating behavior behind a feature flag**: defining/registering a flag in `FLAG_DEFINITIONS`, checking `IFlagService.enabled(id)`, wiring the `[experimental]` config section, or deciding whether a flag is Core-scope vs. per-session.
- [`docs/errors.md`](docs/errors.md) — Read **before raising errors from a domain**: defining a co-located `XxxError`, registering a code in `ErrorCodes`/`ERROR_INFO`, translating external errors (provider/HTTP, fs, MCP) at the boundary, or (de)serializing errors across RPC/SDK with `toErrorPayload`/`fromErrorPayload`.
- [`docs/di-testing.md`](docs/di-testing.md) — Read **before writing or touching any DI/Scope test**: picking the right harness (`InstantiationService` vs `TestInstantiationService` vs `createScopedTestHost`), declaring deps with `@IService`, stubbing collaborators, and teardown via `DisposableStore`.
701 changes: 701 additions & 0 deletions packages/agent-core-v2/GAP_ANALYSIS.md

Large diffs are not rendered by default.

117 changes: 117 additions & 0 deletions packages/agent-core-v2/docs/di-testing.md
@@ -0,0 +1,117 @@
# DI testing

> Conventions for testing services built on the DI × Scope architecture. Mirrors the way VS Code tests `src/vs/platform/instantiation` and its consumers: declare dependencies with `@IService` decorators, build the container through the public API, stub collaborators through `TestInstantiationService`.

`@IService` parameter decorators run under vitest (the build uses `experimentalDecorators`), so test fixtures declare dependencies exactly like production code. There is **no** `param()` helper, no manual `(Id as …)(Ctor, '', 0)`, and no capturing `accessor` inside a constructor to synchronously `.get()` a peer; those are all workarounds for a missing decorator transform, and the transform is already in place.

## Three kinds of tests

Pick the helper by *what is under test*, not by habit.

| Under test | Lives in | Helper | Build the container with |
|---|---|---|---|
| The container / Scope machinery itself | `test/di/*` | the plain `InstantiationService` / `Scope` API | flat: `new InstantiationService(new ServiceCollection([Id, new SyncDescriptor(Impl)]))`; scoped: `createCoreScope()` + `registerScopedService()` |
| A real domain service (unit) | `test/<domain>/*` | `TestInstantiationService` | `disposables.add(new TestInstantiationService())` + `ix.stub(...)` + `ix.createInstance(Sut)` |
| Cross-scope wiring (integration) | `test/<domain>/*` or `test/di/*` | `createScopedTestHost` | `createScopedTestHost([[ILog, stub]])` → `host.child(LifecycleScope.Session, 's1', …)` |

Rule of thumb: testing the **container** → use the container; testing a **service** → use `TestInstantiationService`; only reach for the scope host when *which layer a service lives in* is itself the thing being asserted. Never `new` a production service in a unit test and paper over its dependencies with `undefined as never`.

## Declaring dependencies

Always use `@IService` constructor decorators — in fixtures and in production services alike.

```ts
// ✅
class Consumer {
  constructor(@IGreeter private readonly greeter: IGreeter) {}
}

// ❌ no param() helper, no inline cast
class Consumer {
  constructor(private readonly greeter: IGreeter) {}
}
param(IGreeter, Consumer, 0);
```

This holds for cycle tests too. Declare the loop with real constructor dependencies (`ServiceLoop1(@IService2)` ↔ `ServiceLoop2(@IService1)`); do not capture `accessor` inside a constructor and call `.get(peer)` to force an edge.

Because the decorator runs when the class is defined, the `createDecorator` identifier must be initialized **before** the class that uses it. Declare the identifier, then the class:

```ts
const IDep = createDecorator<IDep>('dep');
class Consumer {
  constructor(@IDep private readonly dep: IDep) {}
}
```

For two services that depend on each other (a cycle), declare both identifiers first, then both classes, so neither class references an uninitialized binding.

Declare fixtures at module top, interface + decorator + implementation co-located, and keep `_serviceBrand` on the interface when it represents a real service:

```ts
const IGreeter = createDecorator<IGreeter>('greeter');
interface IGreeter {
  readonly _serviceBrand: undefined;
  greet(): string;
}
class Greeter implements IGreeter {
  declare readonly _serviceBrand: undefined;
  greet(): string { return 'hi'; }
}
```

Pure throwaway fixtures may omit `_serviceBrand`.

## Domain service unit tests

`TestInstantiationService` (from `#/_base/di/test`) is the default harness. It is an `InstantiationService` that also implements `ServicesAccessor`, so you can `.get()` directly, and it owns sinon so `dispose()` restores stubs.

```ts
import { afterEach, beforeEach, describe, expect, it } from 'vitest';
import { DisposableStore } from '#/_base/di/lifecycle';
import { TestInstantiationService } from '#/_base/di/test';
// Imports for ILogService, IConfigService, and FlagService are omitted here.

describe('FlagService', () => {
  let disposables: DisposableStore;
  let ix: TestInstantiationService;

  beforeEach(() => {
    disposables = new DisposableStore();
    ix = disposables.add(new TestInstantiationService());
    ix.stub(ILogService, { log() {} });
    ix.stub(IConfigService, { get: () => ({}), onDidChange: () => () => {} });
  });
  afterEach(() => disposables.dispose());

  it('reads a flag', () => {
    const svc = disposables.add(ix.createInstance(FlagService));
    // `enabled(id)` is the method name on the `IFlagService` public surface.
    expect(svc.enabled('x')).toBe(false);
  });
});
```

Stubbing:

- stub a whole service with a partial: `ix.stub(IId, { method() { return … } })`;
- stub / assert a single method: `ix.stub(IId, 'method', value)` returns a sinon stub; `ix.spy(IId, 'method')` returns a spy;
- replace with a prebuilt instance or descriptor: `ix.set(IId, instance)` / `ix.set(IId, new SyncDescriptor(Impl))`;
- when a collaborator's behavior must vary per test, model it as a `Test*Service` subclass whose methods read suite-scoped `let` variables (the `configurationValue` / `updateArgs` pattern in VS Code) rather than rebuilding the container each test.
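
The last bullet can be sketched without any DI machinery. This is a minimal illustration, not code from the repository: `IConfigLike`, `TestConfigService`, and `TimeoutReader` are hypothetical names, and only the `configurationValue` / `updateArgs` variable names come from the pattern mentioned above.

```typescript
// Hypothetical collaborator contract, for illustration only.
interface IConfigLike {
  get(key: string): unknown;
  update(key: string, value: unknown): void;
}

// Suite-scoped state that each test reassigns before acting.
let configurationValue: Record<string, unknown> = {};
let updateArgs: Array<[string, unknown]> = [];

// The test double reads the suite-scoped variables on every call,
// so the container holding it never needs to be rebuilt.
class TestConfigService implements IConfigLike {
  get(key: string): unknown {
    return configurationValue[key];
  }
  update(key: string, value: unknown): void {
    updateArgs.push([key, value]);
  }
}

// Hypothetical system under test that consults the collaborator lazily.
class TimeoutReader {
  private readonly config: IConfigLike;
  constructor(config: IConfigLike) {
    this.config = config;
  }
  timeoutMs(): number {
    const raw = this.config.get("timeoutMs");
    return typeof raw === "number" ? raw : 1000;
  }
  reset(): void {
    this.config.update("timeoutMs", 1000);
  }
}

// Built once per suite; individual tests only reassign the variables above.
const reader = new TimeoutReader(new TestConfigService());
```

In a real suite the double would presumably be registered once with `ix.set(IId, new TestConfigService())`, and each `it` would assign `configurationValue` before exercising the system under test.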

Create the system-under-test through DI (`ix.createInstance(Sut)`) so its `@IService` dependencies are resolved from the container, exactly as in production.

## Lifecycle / teardown

One `DisposableStore` per suite. Add the container, the system-under-test, and any event subscriptions to it; dispose in `afterEach`.

```ts
beforeEach(() => { disposables = new DisposableStore(); /* … */ });
afterEach(() => disposables.dispose());
```

Scope-host tests call `host.dispose()` in `afterEach` (or at the end of the `it`). Do not scatter bare `ix.dispose()` / `core.dispose()` calls through test bodies — route teardown through the store so ordering is deterministic and nothing leaks when a test fails mid-way.

## Assertions and naming

- One behavior per `it`; describe observable behavior (`child shadows parent registration`), not implementation (`calls _getOrCreateServiceInstance`).
- For cycles, assert `CyclicDependencyError` and its `path` array (e.g. `['A', 'B', 'A']`), not merely `toThrow`.
- For disposal order, capture events in an array and assert the sequence (`['C', 'B', 'A']` — children before parents).
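
The disposal-order bullet can be illustrated with a small stand-in. `FakeScope` below is a hypothetical class written for this sketch and is not the real `Scope` API; it only shows the shape of the assertion, where a shared array records the order and the test compares the whole sequence.

```typescript
// Hypothetical stand-in for a scope tree; not the real `Scope` API.
class FakeScope {
  private readonly children: FakeScope[] = [];
  private readonly name: string;
  private readonly log: string[];

  constructor(name: string, log: string[]) {
    this.name = name;
    this.log = log;
  }

  child(name: string): FakeScope {
    const created = new FakeScope(name, this.log);
    this.children.push(created);
    return created;
  }

  dispose(): void {
    // Children go first (most recently created first), then this scope.
    for (const c of [...this.children].reverse()) c.dispose();
    this.log.push(this.name);
  }
}

// Capture disposal events in one array shared by the whole tree.
const events: string[] = [];
const a = new FakeScope("A", events);
const b = a.child("B");
b.child("C");
a.dispose();
// events is now ['C', 'B', 'A']
```

In a vitest suite the final check would read `expect(events).toEqual(['C', 'B', 'A'])`, which fails on any reordering rather than only on a missing event.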
67 changes: 67 additions & 0 deletions packages/agent-core-v2/docs/errors.md
@@ -0,0 +1,67 @@
# errors

> Error infrastructure for agent-core-v2: base classes, the public code registry, wire serialization, and the conventions domains follow when raising errors.

The design is borrowed from VS Code (`src/vs/base/common/errors.ts` + per-service error classes) with one addition: a central **code registry** for the RPC/SDK boundary. The mechanism is centralized in `_base/errors`; error *classes* are decentralized (co-located per domain); error *codes* are also defined per domain but aggregated into one registry with metadata.

## Where things live

- `src/_base/errors/errors.ts`: base classes — `KimiError`, `CancellationError`, `ExpectedError`, `ErrorNoTelemetry`, `BugIndicatingError`, `NotImplementedError`.
- `src/_base/errors/codes.ts`: `ErrorCodes` registry, `ErrorCode` type, `ERROR_INFO` metadata (`title` / `retryable` / `public` / `action`), `errorInfo(code)`.
- `src/_base/errors/serialize.ts`: `ErrorPayload`, `isCodedError`, `toErrorPayload`, `fromErrorPayload`, `makeErrorPayload`.
- `src/_base/errors/errorMessage.ts`: `toErrorMessage(error, verbose?)` for logs/CLI.
- `src/_base/errors/unexpectedError.ts`: `onUnexpectedError` / `setUnexpectedErrorHandler` / `safelyCallListener` (global handler).
- `src/_base/di/errors.ts`: DI-only `CyclicDependencyError` (kept separate; as in VS Code, the DI layer exposes no general error taxonomy).

## Conventions (hard rules)

- **Throw a coded error, not a bare string.** Define a domain error that `extends KimiError` and carries a `code`. Reserve `throw new Error('x')` for unreachable guards; use `NotImplementedError('feature')` for stubs.
- **Co-locate the error class with the domain's interfaces.** `ToolError` lives in `tool/tool.ts` next to `IToolService`, not in a separate `*Errors.ts` and not in `_base/errors`.
- **One `code` per failure mode.** Codes read `domain.reason` (e.g. `tool.unknown_tool`). Adding a code is a minor change; renaming or removing one is a major change because it breaks SDK clients.
- **Register codes centrally.** After defining a domain's `XxxErrorCode` const, spread it into `ErrorCodes` in `codes.ts` and add an `ERROR_INFO` entry per code.
- **Translate foreign errors at the boundary.** Provider/HTTP, fs, MCP errors are caught at the domain boundary and re-thrown as the domain's coded error. `_base/errors` never imports a business domain.
- **Branch on `code`, never `instanceof`, across the wire.** Class identity does not survive serialization. In-process, `instanceof KimiError` / `isCodedError` are fine.
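
The last two rules can be sketched together. Everything below is illustrative: `CodedError` is a simplified stand-in for `KimiError` (whose real constructor takes more arguments), and `WorkspaceErrorCode`, `readThroughBoundary`, and `describeFailure` are hypothetical names that do not exist in the package.

```typescript
// Simplified stand-in for the real `KimiError` base class.
class CodedError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.name = "CodedError";
    this.code = code;
  }
}

// Hypothetical domain codes following the `domain.reason` convention.
const WorkspaceErrorCode = {
  NotFound: "workspace.not_found",
  ReadFailed: "workspace.read_failed",
} as const;

// Translate at the boundary: callers only ever see the domain's coded error.
function readThroughBoundary(read: () => string): string {
  try {
    return read();
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // Node-style fs errors carry a string `code` such as "ENOENT".
    const foreignCode = (err as { code?: unknown } | null)?.code;
    const code =
      foreignCode === "ENOENT"
        ? WorkspaceErrorCode.NotFound
        : WorkspaceErrorCode.ReadFailed;
    throw new CodedError(code, message);
  }
}

// Across the wire, branch on `code`; class identity is gone after JSON.
function describeFailure(payload: { code: string }): string {
  switch (payload.code) {
    case WorkspaceErrorCode.NotFound:
      return "missing";
    default:
      return "failed";
  }
}
```

A consumer that received the payload over RPC would call `describeFailure(JSON.parse(raw))` and never needs access to the error class that produced it.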

## Adding a domain error (recipe)

In `<domain>/<domain>.ts`:

```ts
import { KimiError, type ErrorCode } from '#/_base/errors';

export const ToolErrorCode = {
  UnknownTool: 'tool.unknown_tool',
  ExecutionFailed: 'tool.execution_failed',
} as const;
export type ToolErrorCode = (typeof ToolErrorCode)[keyof typeof ToolErrorCode];

export class ToolError extends KimiError {
  constructor(code: ToolErrorCode, message: string, details?: Record<string, unknown>) {
    super(code as ErrorCode, message, { details });
    this.name = 'ToolError';
  }
}
```

Then in `src/_base/errors/codes.ts`, spread `...ToolErrorCode` into `ErrorCodes` and add an `ERROR_INFO` entry for `tool.unknown_tool` and `tool.execution_failed`.
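
The registration step might look roughly like the sketch below. It is an assumption-laden outline rather than the contents of `codes.ts`: `BaseErrorCode` is a hypothetical name for the built-in codes, the `ErrorInfo` field names come from the metadata list above but their types are guessed, and the titles and `retryable` values are placeholders.

```typescript
// Domain-owned codes, as defined in the recipe.
const ToolErrorCode = {
  UnknownTool: "tool.unknown_tool",
  ExecutionFailed: "tool.execution_failed",
} as const;

// Hypothetical container for built-in codes; only the two code strings
// are taken from the serialization notes in this document.
const BaseErrorCode = {
  Internal: "internal",
  Canceled: "canceled",
} as const;

// Aggregate every domain's codes into one registry by spreading.
const ErrorCodes = { ...BaseErrorCode, ...ToolErrorCode } as const;
type ErrorCode = (typeof ErrorCodes)[keyof typeof ErrorCodes];

// Field names follow the `ERROR_INFO` metadata list; the types are assumed.
interface ErrorInfo {
  title: string;
  retryable: boolean;
  public: boolean;
  action?: string;
}

// `Record<ErrorCode, ErrorInfo>` turns a missing entry into a compile error.
const ERROR_INFO: Record<ErrorCode, ErrorInfo> = {
  internal: { title: "Internal error", retryable: false, public: false },
  canceled: { title: "Canceled", retryable: false, public: true },
  "tool.unknown_tool": { title: "Unknown tool", retryable: false, public: true },
  "tool.execution_failed": { title: "Tool execution failed", retryable: true, public: true },
};

function errorInfo(code: ErrorCode): ErrorInfo {
  return ERROR_INFO[code];
}
```

Typing the metadata map by the derived `ErrorCode` union is one way to keep the two steps (spread the codes, add the entries) from drifting apart; whether the real registry enforces this at the type level is not stated here.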

## Serialization & boundary translation

- `toErrorPayload(error)`: `CancellationError` → `canceled`; any coded error (incl. deserialized shapes) → its code + `retryable` from `ERROR_INFO`; anything else → `internal`.
- `fromErrorPayload(payload)`: rehydrates a `KimiError` for in-process `instanceof` / `isCodedError` use at the SDK/RPC boundary.
- `isCodedError(error)`: structural guard (checks `code` against `ERROR_INFO`), so it works for both `KimiError` instances and plain objects revived from a payload.
- Foreign-error mapping lives in the domain that owns the foreign dependency, e.g. kosong maps `APIStatusError` (429/401/…) → `KosongError` codes at its client boundary. A `registerErrorNormalizer` escape hatch is intentionally **not** provided until a second use case appears.
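
The round trip described by these bullets can be approximated as follows. The functions reuse the documented names for readability, but the bodies are simplified guesses: the real `ErrorPayload` shape, the `KimiError` constructor, and the `ERROR_INFO` contents are assumptions made for this sketch.

```typescript
// Assumed payload shape; the real `ErrorPayload` may carry more fields.
interface ErrorPayload {
  code: string;
  message: string;
  retryable: boolean;
}

// Tiny stand-in registry with placeholder metadata.
const ERROR_INFO: Record<string, { retryable: boolean }> = {
  canceled: { retryable: false },
  internal: { retryable: false },
  "tool.execution_failed": { retryable: true },
};

class CancellationError extends Error {
  constructor() {
    super("Canceled");
    this.name = "CancellationError";
  }
}

// Simplified stand-in for the real base class.
class KimiError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.name = "KimiError";
    this.code = code;
  }
}

// Structural guard: any object whose `code` is registered qualifies,
// so it accepts both class instances and plain revived objects.
function isCodedError(error: unknown): error is { code: string; message?: string } {
  if (typeof error !== "object" || error === null) return false;
  const code = (error as { code?: unknown }).code;
  return typeof code === "string" && Object.prototype.hasOwnProperty.call(ERROR_INFO, code);
}

function toErrorPayload(error: unknown): ErrorPayload {
  if (error instanceof CancellationError) {
    return { code: "canceled", message: error.message, retryable: false };
  }
  if (isCodedError(error)) {
    const retryable = ERROR_INFO[error.code]?.retryable ?? false;
    return { code: error.code, message: error.message ?? "", retryable };
  }
  const message = error instanceof Error ? error.message : String(error);
  return { code: "internal", message, retryable: false };
}

// Rehydration always yields the base class; no subclass revival.
function fromErrorPayload(payload: ErrorPayload): KimiError {
  return new KimiError(payload.code, payload.message);
}
```

After a JSON round trip the revived error is a plain `KimiError` with the original `code`, which is why consumers on the far side branch on `code` and not on the originating subclass.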

## Deliberately omitted

- No `IErrorWithActions` / action buttons — there is no notification surface in agent-core; add when one exists.
- No class registry / revival — payloads carry `code` + data only; rehydration always yields a base `KimiError`.
- No `IllegalArgumentError` / `NotSupportedError` yet — add a base class when a second throw site needs it.

## References

- `packages/agent-core-v2/src/_base/errors/` — implementation.
- `packages/agent-core/src/errors/` — v1 source this was ported from.
- `packages/agent-core-v2/GAP_ANALYSIS.md` §2.2 — gap closure note (`_base/errors`).
- `packages/agent-core-v2/GAP_ANALYSIS.md` §2.6 — RPC/SDK boundary that motivates the code registry.
- VS Code upstream: `src/vs/base/common/errors.ts`, `src/vs/base/common/errorMessage.ts`, `src/vs/platform/files/common/files.ts` (`FileOperationError` + `FileOperationResult`), `src/vs/platform/userDataSync/common/userDataSync.ts` (`UserDataSyncError` + code enum + normalizer).
86 changes: 86 additions & 0 deletions packages/agent-core-v2/docs/flag.md
@@ -0,0 +1,86 @@
# flag

> Experimental feature-flag gating for agent-core-v2 — a Core-scope `IFlagService` resolver plus an exported `FlagRegistry` catalog, backed by the `[experimental]` config section.

Gates not-yet-public features behind `IFlagService.enabled(id)`, per the repository hard rule that unreleased behavior must be flag-gated. Ported from `packages/agent-core/src/flags/**`; v1 was a process-global `FlagResolver` singleton, v2 is a scoped DI service with no implicit global state.

## Layout

- `src/flag/registry.ts` — `FLAG_DEFINITIONS`, `FlagId`, `FlagDefinition`, `FlagRegistry` (catalog), `ExperimentalConfigSchema` / `ExperimentalConfig` (zod).
- `src/flag/flag.ts` — `IFlagService` token + resolver types (`ExperimentalFlagMap`, `ExperimentalFlagConfig`, `ExperimentalFlagSource`, `ExperimentalFeatureState`).
- `src/flag/flagService.ts` — `FlagService` impl + `MASTER_ENV` (`KIMI_CODE_EXPERIMENTAL_FLAG`) + `EXPERIMENTAL_SECTION` (`experimental`); self-registers at Core scope.
- `src/flag/index.ts` — barrel; re-exported by `src/index.ts` at the L3 block.

## Public surface

- `IFlagService` (DI token, Core scope): `enabled(id)`, `explain(id)`, `snapshot()`, `enabledIds()`, `explainAll()`, `setConfigOverrides(overrides)`, `registry`.
- `FlagRegistry`: `get(id)`, `list()`, `definitions` — read-only catalog for hosts/UI to enumerate flags without resolving them.
- `FlagService`: exported for tests and hosts that construct it directly.

## Resolution precedence

Highest wins; env is read live on every call (nothing cached):

1. L1 master env `KIMI_CODE_EXPERIMENTAL_FLAG` truthy → every flag on.
2. L2 per-feature `def.env` (e.g. `KIMI_CODE_EXPERIMENTAL_MICRO_COMPACTION`) → forces on/off.
3. L3 `[experimental]` config section per-flag override.
4. L4 registry `default`.

`explain(id)` returns the winning `source` (`master-env` | `env` | `config` | `default`) plus the effective `configValue`.
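
The precedence list above amounts to a short chain of early returns. The sketch below is not the `FlagService` implementation: `FlagDef`, `parseBool`, and `explainFlag` are hypothetical names, and the accepted env spellings are an assumption because the document only says "truthy".

```typescript
type FlagSource = "master-env" | "env" | "config" | "default";

// Trimmed-down definition shape for this sketch.
interface FlagDef {
  id: string;
  env: string;
  default: boolean;
}

const MASTER_ENV = "KIMI_CODE_EXPERIMENTAL_FLAG";

// Assumed parsing rule; the real parser may accept other spellings.
function parseBool(raw: string | undefined): boolean | undefined {
  if (raw === undefined) return undefined;
  const v = raw.trim().toLowerCase();
  if (v === "1" || v === "true" || v === "on" || v === "yes") return true;
  if (v === "0" || v === "false" || v === "off" || v === "no") return false;
  return undefined;
}

// The env map is passed in and read on every call, so nothing is cached.
function explainFlag(
  def: FlagDef,
  env: Record<string, string | undefined>,
  config: Record<string, boolean | undefined>,
): { enabled: boolean; source: FlagSource } {
  // 1. Master switch: truthy turns every flag on.
  if (parseBool(env[MASTER_ENV]) === true) return { enabled: true, source: "master-env" };
  // 2. Per-feature env var forces on or off.
  const perFeature = parseBool(env[def.env]);
  if (perFeature !== undefined) return { enabled: perFeature, source: "env" };
  // 3. Override from the `[experimental]` config section.
  const override = config[def.id];
  if (override !== undefined) return { enabled: override, source: "config" };
  // 4. Registry default.
  return { enabled: def.default, source: "default" };
}
```

Passing the env map as an argument also mirrors the injected-env style the tests use, so precedence cases can be checked without touching `process.env`.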

## Config integration

- `FlagService` registers the `[experimental]` section into `IConfigRegistry` at construction (`registerSection('experimental', ExperimentalConfigSchema)`) and reads overrides from `IConfigService`.
- It subscribes to `IConfigService.onDidChange` and refreshes overrides whenever the `experimental` domain changes, so config edits apply live.
- `IConfigRegistry.registerSection` throws if a domain is registered twice — `experimental` is owned exclusively by `FlagService`.
- `setConfigOverrides(overrides)` is an imperative escape hatch for tests and hosts without an `IConfigService`; hosts on `IConfigService` should set the `[experimental]` section instead.

Config shape mirrors v1:

```toml
[experimental]
micro_compaction = false
```

Keys are intentionally loose (`z.record(z.string(), z.boolean())`), so obsolete flags stay inert config.

## Add a flag

Append to `FLAG_DEFINITIONS` in `src/flag/registry.ts`:

```ts
{ id: 'my_feature', title: 'My feature', description: '...', env: 'KIMI_CODE_EXPERIMENTAL_MY_FEATURE', default: false, surface: 'both' }
```

- Keep the `as const satisfies` — it derives the `FlagId` union that gives `enabled()` autocomplete and typo-checking.
- `env` must start with `KIMI_CODE_EXPERIMENTAL_`, be unique, and not equal `KIMI_CODE_EXPERIMENTAL_FLAG`.
- `id` must not be `flag`.
- `surface`: `core` | `tui` | `both` (documentation/grouping only; not used in resolution).
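
The rules above can be expressed as a small checker plus a derived id union. This is a sketch under assumptions: `validateFlagDefinitions` is a hypothetical helper, and this document does not say whether the real registry enforces the rules at construction time, in tests, or in the type system. The single entry reuses the example definition shown above.

```typescript
interface FlagDefinition {
  id: string;
  title: string;
  description: string;
  env: string;
  default: boolean;
  surface: "core" | "tui" | "both";
}

const ENV_PREFIX = "KIMI_CODE_EXPERIMENTAL_";
const MASTER_ENV = "KIMI_CODE_EXPERIMENTAL_FLAG";

// `as const satisfies` keeps the literal ids while still type-checking each entry.
const FLAG_DEFINITIONS = [
  {
    id: "my_feature",
    title: "My feature",
    description: "...",
    env: "KIMI_CODE_EXPERIMENTAL_MY_FEATURE",
    default: false,
    surface: "both",
  },
] as const satisfies readonly FlagDefinition[];

// Union of every declared id; this is what makes `enabled('typo')` a compile error.
type FlagId = (typeof FLAG_DEFINITIONS)[number]["id"];

// Hypothetical checker that returns one message per violated rule.
function validateFlagDefinitions(defs: readonly FlagDefinition[]): string[] {
  const problems: string[] = [];
  const seenEnv = new Set<string>();
  for (const def of defs) {
    if (def.id === "flag") problems.push(`id "flag" is reserved`);
    if (!def.env.startsWith(ENV_PREFIX)) problems.push(`${def.id}: env must start with ${ENV_PREFIX}`);
    if (def.env === MASTER_ENV) problems.push(`${def.id}: env must not equal ${MASTER_ENV}`);
    if (seenEnv.has(def.env)) problems.push(`${def.id}: env ${def.env} is not unique`);
    seenEnv.add(def.env);
  }
  return problems;
}

// A typed id accepts only declared flags.
const exampleId: FlagId = "my_feature";
```

Dropping `as const` would widen every `id` to `string`, and `FlagId` would collapse to `string` with it, which is why the first bullet insists on keeping the clause.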

## Consume a flag

Inject `IFlagService` and gate on it. It is resolvable from any scope (Core ancestor):

```ts
constructor(@IFlagService private readonly flags: IFlagService) {}
// ...
if (!this.flags.enabled('micro_compaction')) return;
```

Current consumer: `compaction` (L4) gates `micro_compaction`.

## Layering & scope

- Domain `flag` is registered at **L3** (`scripts/check-domain-layers.mjs` → `['flag', 3]`). It imports only `config` (L2) downward.
- It cannot live in `_base` (L0): registering/reading the config section requires importing `config`, and L0 must not import L2.
- Scope: `Core` (`registerScopedService(LifecycleScope.Core, IFlagService, FlagService, Delayed, 'flag')`). Env + config are process-global inputs, so there is no per-session/agent state.
- Tests construct `FlagService` directly with a real `ConfigRegistry`/`ConfigService` and an injected env map (`test/flag/flag.test.ts`).

## References

- `packages/agent-core-v2/src/flag/` — implementation.
- `packages/agent-core-v2/test/flag/flag.test.ts` — precedence + config subscription tests.
- `packages/agent-core/src/flags/` — v1 source this was ported from.
- `plan/PLAN.md` §2/§3 — domain placement (`flag` at L3, not `_base/flags`).
- `packages/agent-core-v2/GAP_ANALYSIS.md` §2.1 — gap closure note.
- Root `AGENTS.md` — experimental-feature gating rule.