Skip to content

feat: unified agent profile abstraction for cross-agent comparison #1134

@christso

Description

@christso

Objective

Add a first-class configuration layer for portable cross-agent comparison — a shared set of knobs that work across agent targets, with fallback to native target-specific configuration.

Problem

AgentV has rich target/provider support, but when comparing agents side-by-side (e.g., Claude Code vs Codex vs Copilot), each target has its own config shape. There's no clean way to say "run all agents with the same model tier and reasoning level" without maintaining separate target configs that may drift.

This matters for:

  • Cross-agent benchmarks — fair apples-to-apples comparison
  • Configuration management — one shared config instead of N per-agent configs
  • Onboarding — new users can start with a portable config and specialize later

Design sketch

Unified profile (shared knobs)

# profiles/balanced.yaml
kind: agent_profile
name: balanced
mode: unified

unified:
  model_tier: high          # maps to each agent's best model
  reasoning_level: medium
  
  skills:
    - path: ./skills/repo-search
  
  workspace_instructions:
    path: ./AGENTS.md        # injected as CLAUDE.md, AGENTS.md, etc. per agent
  
  mcp_servers:
    - name: filesystem
      transport: stdio
      command: ["npx", "-y", "@modelcontextprotocol/server-filesystem", "/workspace"]

Native fallback (agent-specific)

# profiles/claude-custom.yaml
kind: agent_profile
name: claude-custom
mode: native
target: claude

native:
  settings_json: |
    {"model": "opus", "permissions": {"allow_all": true}}
  mcp_json: |
    {"servers": [...]}

Usage in EVAL.yaml

targets:
  - name: claude-balanced
    use_target: claude
    profile: ./profiles/balanced.yaml
  
  - name: codex-balanced
    use_target: codex
    profile: ./profiles/balanced.yaml

Related issues

Key decisions needed

  1. Profile format: YAML file? Inline in targets.yaml? Section in agentv.config.ts?
  2. Model mapping: How to map model_tier: high to each agent's actual model name?
  3. Skills/MCP portability: Which agents support skills and MCP injection?
  4. Scope: Start with model + reasoning only, or include skills/MCP/workspace in v1?

Non-goals

  • Not replacing existing target configuration — profiles are an optional layer on top
  • Not building agent adapter/plugin hooks — use existing provider infrastructure

Acceptance signals

  • A single profile config can be shared across 2+ agent targets in the same eval
  • Each agent target correctly maps unified fields to its native config
  • Fallback to native mode works for agent-specific features not in the unified schema
  • Cross-agent comparison with agentv compare shows meaningful side-by-side results

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    Status

    Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions