CLAUDE.md — AgentSpec Project Guide

This file is read by Claude Code to understand the project structure, principles, and conventions.

Project Vision

AgentSpec is the universal manifest standard for AI agents. One agent.yaml file captures everything: model, memory, tools, MCP, prompts, guardrails, evaluation, observability, and compliance.

Three properties:

Zero control plane — just a file + SDK, no server required
Extends existing standards — MCP-compatible, AGENTS.md-compatible, A2A/AgentCard exportable
Framework-agnostic — generates LangGraph, CrewAI, Mastra, AutoGen code via adapters

Repository Structure

agentspec/
├── packages/
│   ├── sdk/                    # @agentspec/sdk — core: load, health, audit, generate
│   │   └── src/
│   │       ├── schema/         # Zod schema (single source of truth)
│   │       ├── loader/         # YAML parser + $env/$secret/$file resolvers
│   │       ├── health/         # Health check engine
│   │       ├── audit/          # Compliance rules engine
│   │       └── generate/       # Adapter registry
│   ├── cli/                    # @agentspec/cli — agentspec CLI
│   │   └── src/commands/       # validate, health, audit, init, generate, export
│   └── adapter-langgraph/      # @agentspec/adapter-langgraph
│       └── src/generators/     # agent.py, requirements.txt, guardrails.py
├── schemas/v1/                 # agent.schema.json (IDE autocomplete)
├── examples/gymcoach/          # GymCoach migration example
├── docs/                       # Documentation site
└── CLAUDE.md                   # This file

Design Principles

0. Thin orchestrator + named helpers (preferred style for all new code)

This is the default way to write functions in this codebase. Prefer many small, named functions over one large function — even before the code gets long.

When writing new code, decompose by intent first:

If a block of logic has a name (even just in a comment), make it a function.
Orchestrators read like a pipeline of named steps; they contain no implementation details.
Helpers are pure or near-pure: explicit inputs, explicit output, no side effects on shared state.

Rule: if you can label a code block with a comment like // Phase 3: score results, that label is the function name — extract it.

Template (applied throughout this codebase):

// ── Internal interfaces (module-private) ─────────────────────────────────────
interface PhaseAResult { ... }
interface PhaseBResult { ... }

// ── Private helpers ────────────────────────────────────────────────────────────
function phaseA(input: Input): PhaseAResult { ... }
function phaseB(intermediate: PhaseAResult): PhaseBResult { ... }
function phaseC(a: PhaseAResult, b: PhaseBResult): FinalResult { ... }

// ── Public orchestrator ────────────────────────────────────────────────────────
export function doThing(input: Input): FinalResult {
  const a = phaseA(input)
  const b = phaseB(a)
  return phaseC(a, b)
}

Applied examples in this repo:

File	Orchestrator	Extracted helpers
`sdk/src/audit/index.ts`	`runAudit()`	`resolveActiveRules` · `collectSuppressions` · `executeRuleChecks` · `computeScoring` · `computeProvedScore`
`sdk/src/health/index.ts`	`runHealthCheck()`	`runSubagentChecks` · `runEvalChecks` · `computeHealthStatus`
`cli/src/commands/audit.ts`	action closure	`fetchProofRecords` · `printScoreSummary` · `formatEvidenceBreakdown`
`cli/src/commands/generate.ts`	action closure	`validateFramework` · `handleK8sGeneration` · `handleLLMGeneration` · `writePushModeEnv`
`cli/src/commands/evaluate.ts`	action closure	`resolveChatEndpoint` · `runInference` · `determineCiGateExit`
`cli/src/commands/scan.ts`	action closure	`collectAndValidateSourceFiles` · `validateScanResponse`
`sdk/src/agent/reporter.ts`	`startPushMode()`	`_pushHeartbeat` (private method)

Helpers are always module-private (not exported) unless reuse across files is proven necessary. Internal interface types for inter-helper data shapes are also module-private.

1. Zod as single source of truth

packages/sdk/src/schema/manifest.schema.ts is the canonical definition of the manifest.

Types are inferred from Zod with z.infer<>
JSON Schema for IDE autocomplete is exported from the same Zod schema
Never manually maintain separate TypeScript types — derive them

2. SOLID

Single Responsibility: each module does one thing (load, check, audit, generate)
Open/Closed: new framework adapters = new package, no core changes
Liskov: all FrameworkAdapter implementations are interchangeable
Interface Segregation: HealthCheck, AuditRule, FrameworkAdapter are minimal interfaces
Dependency Inversion: core SDK depends on abstractions, not concrete adapters

3. TDD — tests first

Write tests before implementation:

Write a failing test in src/__tests__/
Implement the minimum code to pass
Refactor

Run tests: pnpm test (workspace-level)

4. No runtime magic

All references ($env:, $secret:, $file:, $func:) are resolved explicitly via resolveRef(). No implicit global state. No singletons. No hidden configuration.
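A minimal sketch of what explicit resolution could look like. The real resolveRef() lives in packages/sdk/src/loader/resolvers.ts and its signature may differ; the ResolveContext shape and the injected readFile and now hooks are assumptions made for illustration. The readFile hook is assumed to resolve paths relative to agent.yaml, and $secret: is left unimplemented because it depends on the secret manager in use.

```typescript
// Hypothetical context: everything the resolver needs is passed in explicitly,
// so there is no hidden global state.
interface ResolveContext {
  env: Record<string, string | undefined>
  readFile: (relativePath: string) => string
  now: () => Date
}

function resolveRef(value: string, ctx: ResolveContext): string {
  if (value.startsWith('$env:')) {
    const name = value.slice('$env:'.length)
    const resolved = ctx.env[name]
    if (resolved === undefined) {
      throw new Error(`Missing env var ${name}. Set it in your shell or .env file.`)
    }
    return resolved
  }
  if (value.startsWith('$file:')) {
    return ctx.readFile(value.slice('$file:'.length))
  }
  if (value.startsWith('$func:')) {
    const name = value.slice('$func:'.length)
    if (name === 'now_iso') return ctx.now().toISOString()
    throw new Error(`Unknown built-in function "${name}"`)
  }
  if (value.startsWith('$secret:')) {
    throw new Error('$secret: resolution is out of scope for this sketch')
  }
  return value // plain literal, no reference prefix
}
```

Because the environment, file access, and clock are all parameters, a test can resolve references deterministically without touching process.env or the file system.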

5. Fail fast and clearly

Missing env vars → throw with clear remediation message
Invalid manifest → ZodError with path and fix suggestion
Missing adapter → throw with install command
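The cases above share one shape: say what is wrong, then say how to fix it. A rough sketch of that shape for the missing-adapter case follows. RemediableError and assertAdapterInstalled are hypothetical names, and the pnpm install command is an assumption based on the @agentspec/adapter-<framework> naming used elsewhere in this guide.

```typescript
// Hypothetical error type: pairs what went wrong with how to fix it.
class RemediableError extends Error {
  remediation: string

  constructor(problem: string, remediation: string) {
    super(`${problem}\n  Fix: ${remediation}`)
    this.name = 'RemediableError'
    this.remediation = remediation
  }
}

// Hypothetical guard: throws as soon as a required adapter is missing.
function assertAdapterInstalled(framework: string, registered: ReadonlySet<string>): void {
  if (registered.has(framework)) return
  throw new RemediableError(
    `No adapter registered for framework "${framework}".`,
    `pnpm add @agentspec/adapter-${framework}, then import it before generating.`,
  )
}

try {
  assertAdapterInstalled('crewai', new Set(['langgraph']))
} catch (err) {
  console.error((err as Error).message)
}
```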

6. Reference syntax (do not change)

Syntax	Meaning
`$env:VAR`	Env var (fails if missing by default)
`$secret:name`	Secret manager (Vault/AWS/GCP/Azure)
`$file:path`	File relative to agent.yaml
`$func:now_iso`	Built-in function

7. agent.yaml is the spec; the SDK makes it live-verifiable (core principle)

The agent.yaml is the single source of truth for everything an agent declares: model, tools, services, memory, guardrails, evaluation, subagents.

Agents that integrate @agentspec/sdk expose a standard introspection endpoint:

GET /agentspec/health → HealthReport (live runtime state)

The sidecar discovers this endpoint and bridges the gap between the declared spec and runtime reality across all diagnostic endpoints:

/health/ready — manifest + live checks merged
/explore — runtime capabilities (live tool/service/model status)
/gap — manifest declarations vs runtime reality (the delta)

Agents that do NOT integrate the SDK continue to work: the sidecar falls back to static manifest analysis. Live SDK data is always preferred when available.

Core invariant: a user should be able to answer these questions from the sidecar alone:

Is the agent healthy? (all declared dependencies reachable, model key valid)
What can it do? (declared tools + their live registration status)
What is wrong? (gap between spec and runtime, with remediation)
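The "what is wrong?" question comes down to a set difference between what the manifest declares and what the runtime reports. The following sketch illustrates that idea for tools only. GapItem and computeToolGap are hypothetical names, and the actual /gap response shape is not specified in this guide.

```typescript
// Hypothetical shape for one entry in a gap report.
interface GapItem {
  tool: string
  issue: 'declared-not-registered' | 'registered-not-declared'
  remediation: string
}

// Compare tool names declared in agent.yaml with those registered at runtime.
function computeToolGap(declared: string[], registered: string[]): GapItem[] {
  const declaredSet = new Set(declared)
  const registeredSet = new Set(registered)

  const missing = declared
    .filter(tool => !registeredSet.has(tool))
    .map((tool): GapItem => ({
      tool,
      issue: 'declared-not-registered',
      remediation: `Register a handler for "${tool}" or remove it from agent.yaml.`,
    }))

  const undeclared = registered
    .filter(tool => !declaredSet.has(tool))
    .map((tool): GapItem => ({
      tool,
      issue: 'registered-not-declared',
      remediation: `Declare "${tool}" in agent.yaml or unregister it.`,
    }))

  return [...missing, ...undeclared]
}
```

An empty result means the declared and live tool sets agree; any entry carries its own remediation text, matching the third question in the list above.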

Adding a New Framework Adapter

To add a new adapter (e.g. CrewAI):

Create packages/adapter-crewai/
Implement FrameworkAdapter interface from @agentspec/sdk
Call registerAdapter(adapter) at module load (side-effect import)
Export from src/index.ts
Users install it and import it before calling generateAdapter()

The adapter MUST produce valid, runnable code from the manifest fields.
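As a rough illustration of the steps above, the sketch below defines a tiny adapter and registers it at module load. Every type here (ManifestLike, GeneratedFile, and this FrameworkAdapter shape) and the local registry are hypothetical stand-ins; the real interface and the real signatures of registerAdapter() and generateAdapter() come from @agentspec/sdk and may differ. The generated file contents are placeholders, not working CrewAI code.

```typescript
// All shapes below are hypothetical stand-ins for the real @agentspec/sdk types.
interface ManifestLike {
  spec: { model: { provider: string } }
}

interface GeneratedFile {
  path: string
  content: string
}

interface FrameworkAdapter {
  framework: string
  generate(manifest: ManifestLike): GeneratedFile[]
}

// Stand-in registry; in the SDK this sits behind registerAdapter().
const registry = new Map<string, FrameworkAdapter>()

function registerAdapter(adapter: FrameworkAdapter): void {
  registry.set(adapter.framework, adapter)
}

function generateAdapter(manifest: ManifestLike, framework: string): GeneratedFile[] {
  const adapter = registry.get(framework)
  if (!adapter) {
    throw new Error(`No adapter for "${framework}". Install @agentspec/adapter-${framework} and import it first.`)
  }
  return adapter.generate(manifest)
}

const crewaiAdapter: FrameworkAdapter = {
  framework: 'crewai',
  generate(manifest) {
    const provider = manifest.spec.model.provider
    return [
      { path: 'agent.py', content: `# TODO: map provider "${provider}" to the framework's LLM class\n` },
      { path: 'requirements.txt', content: 'crewai\n' },
      { path: '.env.example', content: '# API keys required by the generated agent\n' },
    ]
  },
}

// Side-effect registration at module load, as the steps above describe.
registerAdapter(crewaiAdapter)
```

Keeping the adapter behind a small interface is what lets a new framework ship as a new package without touching the core.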

When using Claude Code to generate an adapter:

Read the manifest schema at packages/sdk/src/schema/manifest.schema.ts
Map spec.model.provider → the framework's LLM class
Map spec.tools[] → the framework's tool format
Map spec.memory → the framework's memory/checkpointer
Map spec.guardrails → input/output validation middleware
Always generate requirements.txt / package.json and .env.example

Compliance Rule Packs

Rules live in packages/sdk/src/audit/rules/:

model.rules.ts — model resilience (fallback, version pinning, cost controls)
security.rules.ts — OWASP LLM Top 10
memory.rules.ts — memory hygiene (PII scrub, TTL, audit log)
evaluation.rules.ts — evaluation coverage

To add a new rule:

Add to the appropriate rules file
Implement the AuditRule interface
Set the rule's pack name in AuditRule.pack
Write a test in sdk/src/__tests__/audit.test.ts
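A new rule might look roughly like the sketch below, which checks that a fallback model is declared. The AuditRule and AuditResult shapes, the rule id, and the spec.model.fallback field are all assumptions for illustration; check the real interface in packages/sdk/src/audit/ and the manifest schema before copying this.

```typescript
// Hypothetical slice of the manifest; the fallback field name is an assumption.
interface ManifestLike {
  spec: { model: { provider: string; fallback?: { provider: string } } }
}

// Hypothetical result and rule shapes; the SDK's AuditRule may differ.
interface AuditResult {
  passed: boolean
  message: string
}

interface AuditRule {
  id: string
  pack: string
  check(manifest: ManifestLike): AuditResult
}

const modelFallbackRule: AuditRule = {
  id: 'model-fallback-declared',
  pack: 'model',
  check(manifest) {
    const hasFallback = manifest.spec.model.fallback !== undefined
    return hasFallback
      ? { passed: true, message: 'Fallback model declared.' }
      : { passed: false, message: 'No fallback model declared; add one for resilience.' }
  },
}
```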

Health Check Categories

Checks live in packages/sdk/src/health/checks/:

env.check.ts — env var presence + file refs
model.check.ts — model API HTTP reachability
mcp.check.ts — MCP server connectivity
memory.check.ts — Redis/Postgres TCP connectivity
service.check.ts — spec.requires.services TCP port reachability (no driver deps, uses net.createConnection)

The HealthCheck.category union type (packages/sdk/src/health/index.ts) supports:

Category	Source	Description
`env`	SDK	Env var presence checks
`file`	SDK	File ref resolution checks
`model`	SDK	Model API HTTP reachability
`model-fallback`	SDK	Fallback model reachability
`mcp`	SDK	MCP server connectivity
`memory`	SDK	Memory backend TCP checks
`subagent`	SDK	Sub-agent file/A2A checks
`eval`	SDK	Eval dataset file checks
`service`	SDK	`spec.requires.services` TCP connectivity
`tool`	Reporter	Registered tool handler availability (agent-side)

To add a new check category:

Create packages/sdk/src/health/checks/<category>.check.ts
Export an async run<Category>Checks() function returning HealthCheck[]
Import and call in packages/sdk/src/health/index.ts
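The sketch below shows one way such a check function could be structured. The queue category, the HealthCheck fields, and the injected Probe function are hypothetical; the real HealthCheck type is defined in packages/sdk/src/health/index.ts. Injecting the probe keeps the function testable without opening real connections.

```typescript
// Hypothetical HealthCheck shape; the real one lives in the SDK.
interface HealthCheck {
  category: string
  name: string
  status: 'pass' | 'fail'
  message?: string
}

// The probe is injected so tests can run without network access.
type Probe = (target: string) => Promise<boolean>

// Hypothetical new category "queue": one check per target, run concurrently.
async function runQueueChecks(targets: string[], probe: Probe): Promise<HealthCheck[]> {
  return Promise.all(
    targets.map(async (target): Promise<HealthCheck> => {
      try {
        const reachable = await probe(target)
        return reachable
          ? { category: 'queue', name: target, status: 'pass' }
          : { category: 'queue', name: target, status: 'fail', message: `${target} is not reachable` }
      } catch (err) {
        // A throwing probe becomes a failed check instead of aborting the whole run.
        return { category: 'queue', name: target, status: 'fail', message: String(err) }
      }
    }),
  )
}
```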

CLI Commands

Command	Description
`agentspec validate <file>`	Schema validation only (no I/O)
`agentspec health <file>`	Runtime health checks
`agentspec audit <file>`	Compliance scoring
`agentspec init [dir]`	Interactive manifest wizard
`agentspec generate <file> --framework <fw>`	Code generation
`agentspec export <file> --format agentcard`	Export to A2A/AgentCard

Tech Stack

Concern	Tool
Language	TypeScript (Node 20+)
Monorepo	pnpm workspaces
Schema	Zod v3
YAML	js-yaml
CLI framework	commander + @clack/prompts
Build	tsup
Testing	vitest
HTTP	native fetch (Node 18+)

Operating Modes

Two canonical modes for querying live agent runtime data:

Mode	When	URL	Data
Sidecar	Local dev / per-agent port-forward	`http://localhost:4001`	Live (fresh on each request)
Operator	K8s cluster with Operator deployed	`https://agentspec.mycompany.com`	Stored (last heartbeat)

Sidecar endpoints: GET /gap, GET /proof, GET /health/ready, GET /explore
Operator endpoints: GET /api/v1/agents/{name}/gap, /proof, /health

Key distinction: Operator mode uses one URL for all agents (no per-agent port-forward). Port-forward is a transport detail — in sidecar mode it's per-agent (to port 4001); in operator mode it's per-cluster (to the operator service).
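The difference between the two modes can be summarised as a URL-building rule. The sketch below derives request URLs from the endpoint lists above; the Target type and endpointUrl helper are hypothetical and not part of the SDK.

```typescript
type Endpoint = 'gap' | 'proof' | 'health'

// Hypothetical description of where to send a query.
type Target =
  | { mode: 'sidecar'; baseUrl: string }
  | { mode: 'operator'; baseUrl: string; agentName: string }

function endpointUrl(target: Target, endpoint: Endpoint): string {
  if (target.mode === 'sidecar') {
    // The sidecar serves health at /health/ready; the other paths match the endpoint name.
    const path = endpoint === 'health' ? 'health/ready' : endpoint
    return `${target.baseUrl}/${path}`
  }
  // The operator serves every agent from one base URL, keyed by agent name.
  return `${target.baseUrl}/api/v1/agents/${encodeURIComponent(target.agentName)}/${endpoint}`
}

const sidecar: Target = { mode: 'sidecar', baseUrl: 'http://localhost:4001' }
const operator: Target = {
  mode: 'operator',
  baseUrl: 'https://agentspec.mycompany.com',
  agentName: 'gymcoach',
}
```

With these targets, endpointUrl(sidecar, 'health') yields http://localhost:4001/health/ready, and endpointUrl(operator, 'gap') yields https://agentspec.mycompany.com/api/v1/agents/gymcoach/gap.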

See docs/concepts/operating-modes.md for the full guide, VS Code config, and MCP examples.

Key Files

File	Purpose
`packages/sdk/src/schema/manifest.schema.ts`	Zod schema — single source of truth
`packages/sdk/src/loader/resolvers.ts`	$env/$secret/$file/$func resolution
`packages/sdk/src/health/index.ts`	Health check orchestrator
`packages/sdk/src/audit/index.ts`	Audit rules engine
`packages/sdk/src/generate/index.ts`	Adapter registry
`packages/adapter-langgraph/src/index.ts`	LangGraph adapter (auto-registers)
`packages/cli/src/cli.ts`	CLI entrypoint
`examples/gymcoach/agent.yaml`	Full GymCoach manifest example

Generating Adapters with Claude Code

Claude Code can generate a new framework adapter from scratch. Provide this prompt:

Generate a @agentspec/adapter-<framework> package for AgentSpec. Read the manifest schema at packages/sdk/src/schema/manifest.schema.ts. Follow the same pattern as packages/adapter-langgraph/src/index.ts. Generate files: agent.py (or equivalent), requirements.txt, .env.example. Map all manifest fields: model, tools, memory, guardrails, observability. Auto-register with registerAdapter() on import.

The SDK's CLAUDE.md (in packages/sdk/) has detailed adapter generation instructions.
