sbroenne · Feb 20, 2026
@@ -0,0 +1,114 @@
+# Copilot Instructions for pytest-codingagents
+
+## Build, Test & Lint Commands
+
+```bash
+# Install all dependencies (including dev and docs extras)
+uv sync --all-extras
+
+# Unit tests (fast, no credentials needed)
+uv run pytest tests/unit/ -v
+
+# Run a single unit test file
+uv run pytest tests/unit/test_event_mapper.py -v
+
+# Run a single test by name
+uv run pytest tests/unit/test_result.py::test_name -v
+
+# Integration tests (require GitHub Copilot credentials via GITHUB_TOKEN or `gh` CLI auth)
+uv run pytest tests/ -v -m copilot
+
+# Run one integration test file for a specific model
+uv run pytest tests/test_basic.py -k "gpt-5.2" -v
+
+# Lint
+uv run ruff check src tests
+
+# Format
+uv run ruff format src tests
+
+# Type check
+uv run pyright src
+
+# Multi-file integration run with per-file HTML reports
+uv run python scripts/run_all.py
+```
+
+## Architecture
+
+This is a **pytest plugin** (`pytest11` entry point) that provides a test harness for empirically validating GitHub Copilot agent configurations.
+
+### Data Flow
+
+```
+CopilotAgent (frozen config dataclass)
+  → runner.run_copilot(agent, prompt)
+    → GitHub Copilot SDK client + session
+      → SDK SessionEvent stream
+        → EventMapper.process_event()  (38+ event types → structured data)
+          → Turn / ToolCall accumulation
+            → CopilotResult (turns, success, usage, reasoning, subagents)
+              → copilot_run fixture stashes result for pytest-aitest
+                → HTML report with AI-powered insights
+```
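The accumulation step in the data flow above can be sketched roughly as follows. This is a minimal illustration, not the plugin's `EventMapper`: the class `SketchEventMapper`, the stand-in `Turn` and `ToolCall` fields, and the event type strings are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    """Stand-in for the real ToolCall; the fields are assumptions."""
    name: str
    arguments: dict


@dataclass
class Turn:
    """Stand-in for the real Turn; the fields are assumptions."""
    role: str
    content: str = ""
    tool_calls: list[ToolCall] = field(default_factory=list)


class SketchEventMapper:
    """Hypothetical mapper that folds a stream of event dicts into Turn objects."""

    def __init__(self) -> None:
        self.turns: list[Turn] = []

    def process_event(self, event: dict) -> None:
        kind = event["type"]  # event type names are invented for this sketch
        if kind == "assistant.message":
            self.turns.append(Turn(role="assistant", content=event["content"]))
        elif kind == "tool.call":
            if not self.turns:
                # A tool call with no preceding message still needs a turn to attach to.
                self.turns.append(Turn(role="assistant"))
            self.turns[-1].tool_calls.append(ToolCall(event["name"], event["arguments"]))
        # Any other event type is ignored in this sketch.


events = [
    {"type": "assistant.message", "content": "Creating the file."},
    {"type": "tool.call", "name": "create_file", "arguments": {"path": "a.py"}},
    {"type": "session.idle"},
]
mapper = SketchEventMapper()
for event in events:
    mapper.process_event(event)

# Flatten the tool calls across all turns, similar in spirit to `all_tool_calls`.
called = [call.name for turn in mapper.turns for call in turn.tool_calls]
```

The point of the sketch is the shape of the pipeline: events arrive one at a time, and the mapper keeps just enough state to attach tool calls to the turn they belong to.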
+
+### Key Modules (`src/pytest_codingagents/`)
+
+| Module | Role |
+|--------|------|
+| `plugin.py` | Pytest plugin entry point; registers fixtures and `pytest_aitest_analysis_prompt` hook |
+| `copilot/agent.py` | `CopilotAgent` frozen dataclass; `build_session_config()` maps user fields → SDK TypedDict |
+| `copilot/runner.py` | `run_copilot()` — manages SDK client lifecycle, streams events, returns `CopilotResult` |
+| `copilot/events.py` | `EventMapper` — translates raw SDK events into `Turn`/`ToolCall` objects |
+| `copilot/result.py` | `CopilotResult`, `UsageInfo`; re-exports `Turn`/`ToolCall` from `pytest_aitest`. `SubagentInvocation` is imported from `pytest_aitest.core.result` |
+| `copilot/fixtures.py` | `copilot_run` and `ab_run` pytest fixtures |
+| `copilot/agents.py` | `load_custom_agent()` — parses `.agent.md` YAML frontmatter files |
+| `copilot/optimizer.py` | `optimize_instruction()` — uses pydantic-ai to suggest instruction improvements |
+| `copilot/personas.py` | `VSCodePersona`, `ClaudeCodePersona`, `CopilotCLIPersona`, `HeadlessPersona` — inject IDE context |
+
+### Two Core Fixtures
+
+**`copilot_run(agent, prompt)`** — Executes a single agent run, auto-stashes result for aitest reporting.
+
+**`ab_run(baseline_agent, treatment_agent, task)`** — Runs two agents in isolated `tmp_path` directories and returns `(baseline_result, treatment_result)` for direct comparison.
+
+## Key Conventions
+
+### Every module uses `from __future__ import annotations`
+Enables PEP 563 deferred evaluation of annotations, so forward references work without string quotes. Add it to every new module.
+
+### `CopilotAgent` is a frozen dataclass
+It is immutable and safe to share across parametrized tests. User-friendly field names (e.g., `instructions`) are mapped to SDK internals in `build_session_config()`. Unknown SDK fields go in `extra_config: dict`.
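What the frozen-dataclass convention buys in practice can be shown with a small stand-in. `SketchAgent` below is hypothetical and models only the fields named above (`model`, `instructions`, `extra_config`); the real `CopilotAgent` may have more fields and different defaults.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace


@dataclass(frozen=True)
class SketchAgent:
    """Hypothetical stand-in for CopilotAgent, not the real class."""
    model: str
    instructions: str = ""
    extra_config: dict = field(default_factory=dict)


baseline = SketchAgent(model="gpt-5.2", instructions="Write docstrings.")

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    baseline.model = "claude-opus-4.5"
    mutated = True
except FrozenInstanceError:
    mutated = False

# replace() builds a modified copy and leaves the original untouched,
# which is a convenient way to derive a treatment variant from a baseline.
treatment = replace(baseline, instructions="Write Google-style docstrings.")
```

Because neither test can modify the shared instance, one baseline object can be reused across parametrized cases without cross-test leakage.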
+
+### Async-first
+All SDK interactions are async. Test functions using `copilot_run` or `ab_run` must be `async def`. `asyncio_mode = "auto"` is set in `pyproject.toml`, so no `@pytest.mark.asyncio` decorator is needed.
+
+### Integration tests are parametrized over models
+```python
+from tests.conftest import MODELS
+
+@pytest.mark.parametrize("model", MODELS)
+async def test_something(copilot_run, model):
+    agent = CopilotAgent(model=model, instructions="...")
+```
+`MODELS = ["gpt-5.2", "claude-opus-4.5"]` is defined in `tests/conftest.py`.
+
+### Result introspection methods
+Prefer the typed helper methods over raw field access:
+- `result.success` / `result.error`
+- `result.tool_was_called("create_file")`
+- `result.all_tool_calls` / `result.final_response`
+- `result.file(path)` — reads a file from the agent's working directory
+- `result.usage` — `UsageInfo` with token counts and estimated cost
+
+### Personas inject IDE context post-config
+Apply a persona to a `CopilotAgent` before running to simulate a specific IDE environment (e.g., `VSCodePersona` polyfills `runSubagent`). This is separate from the agent config.
+
+### Custom agents use `.agent.md` files
+Each file is YAML frontmatter followed by a Markdown body and is parsed by `load_custom_agent(path)`. The `mode` frontmatter field controls the agent type.
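The frontmatter-plus-body layout can be illustrated with a standard-library sketch. This is not how `load_custom_agent()` is implemented: `split_frontmatter` is a hypothetical helper, it only understands flat `key: value` lines where a real parser would hand the block to a YAML library, and the sample values (`name: reviewer`, `mode: agent`) are invented.

```python
def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split '---'-delimited frontmatter from the Markdown body.

    Only flat `key: value` lines are handled; nested YAML is out of scope here.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter present
    try:
        end = next(i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---")
    except StopIteration:
        return {}, text  # unterminated frontmatter: treat everything as body
    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


# Invented sample file content; field values are placeholders.
sample = """---
name: reviewer
mode: agent
---
# Reviewer

Review the diff and list concrete issues.
"""
meta, body = split_frontmatter(sample)
```

Here `meta` carries the configuration fields and `body` carries the Markdown instructions that follow the closing `---`.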
+
+### Ruff rules: E, F, B, I — 100 char line length, double quotes
+Enforced by pre-commit hooks and CI. Run `uv run ruff check --fix src tests` before committing.
+
+### Pyright type checking is `basic` mode, scoped to `src/` only
+The `tests/` directory is not type-checked by pyright. Type annotations in `src/` should be complete and valid.
@@ -87,11 +87,11 @@ async def test_docstring_instruction_iterates(ab_run, tmp_path):
 
 ## API Reference
 
-::: pytest_codingagents.copilot.optimizer.optimize_instruction
+::: pytest_aitest.execution.optimizer.optimize_instruction
 
 ---
 
-::: pytest_codingagents.copilot.optimizer.InstructionSuggestion
+::: pytest_aitest.execution.optimizer.InstructionSuggestion
 
 ## Choosing a Model
 

@@ -8,11 +8,11 @@
     options:
       show_source: false
 
-::: pytest_codingagents.optimize_instruction
+::: pytest_aitest.execution.optimizer.optimize_instruction
     options:
       show_source: false
 
-::: pytest_codingagents.InstructionSuggestion
+::: pytest_aitest.execution.optimizer.InstructionSuggestion
     options:
       show_source: false
 

@@ -8,14 +8,22 @@
     options:
       show_source: false
 
-::: pytest_codingagents.copilot.result.SubagentInvocation
+## SubagentInvocation
+
+`SubagentInvocation` is defined in [`pytest_aitest.core.result`](https://sbroenne.github.io/pytest-aitest/reference/result/) and available as:
+
+```python
+from pytest_aitest import SubagentInvocation
+```
+
+::: pytest_aitest.core.result.SubagentInvocation
     options:
       show_source: false
 
 ## Turn and ToolCall
 
-`Turn` and `ToolCall` are re-exported from [`pytest_aitest.core.result`](https://sbroenne.github.io/pytest-aitest/reference/result/) for convenience. See the pytest-aitest documentation for their full API.
+`Turn` and `ToolCall` are defined in [`pytest_aitest.core.result`](https://sbroenne.github.io/pytest-aitest/reference/result/) and available as:
 
 ```python
-from pytest_codingagents.copilot.result import Turn, ToolCall
+from pytest_aitest import Turn, ToolCall
 ```
@@ -28,7 +28,7 @@ classifiers = [
 dependencies = [
     "pytest>=9.0",
     "github-copilot-sdk>=0.1.25",
-    "pytest-aitest>=0.5.6",
+    "pytest-aitest>=0.5.7",
     "azure-identity>=1.25.2",
     "pyyaml>=6.0",
     "pydantic-ai>=1.0",

@@ -2,12 +2,10 @@
 
 from __future__ import annotations
 
+from pytest_aitest.execution.optimizer import InstructionSuggestion, optimize_instruction
+
 from pytest_codingagents.copilot.agent import CopilotAgent
 from pytest_codingagents.copilot.agents import load_custom_agent, load_custom_agents
-from pytest_codingagents.copilot.optimizer import (
-    InstructionSuggestion,
-    optimize_instruction,
-)
 from pytest_codingagents.copilot.personas import (
     ClaudeCodePersona,
     CopilotCLIPersona,

@@ -60,11 +60,11 @@
 import time
 from typing import TYPE_CHECKING, Any
 
+from pytest_aitest.core.result import SubagentInvocation
 from pytest_aitest.execution.cost import estimate_cost
 
 from pytest_codingagents.copilot.result import (
     CopilotResult,
-    SubagentInvocation,
     ToolCall,
     Turn,
     UsageInfo,