[Feature] Supervisor policy system v1 — intent guard, output formatter, playbook, routing, sequencing, approval + plan approval step

## What you want and why

The CUGA supervisor currently has no policy layer of its own. As multi-agent workflows grow in complexity, the supervisor needs the same governance primitives that exist for individual agents — plus orchestration-specific controls that govern how sub-agents are selected, ordered, and approved.

This feature tracks **v1 of the Supervisor Policy System**, implemented entirely through the Python SDK with full test coverage per policy type.

### Policy types in scope

| Policy | Description |
|--------|-------------|
| **Intent Guard** | Block or allow supervisor-level tasks before any sub-agent is dispatched |
| **Output Formatter** | Transform or gate the final response returned by the supervisor to the caller |
| **Playbook** | Define a structured multi-step workflow the supervisor must follow |
| **Agent Routing Policy** | Control which sub-agent(s) are eligible to handle a given task |
| **Agent Sequencing Policy** | Enforce ordering constraints on sub-agent execution (e.g. agent A must run before agent B) |
| **Agent Approval Policy** | Require human or automated approval before dispatching a specific sub-agent |

### Plan approval (separate, non-policy step)

In addition to the policy types above, add a dedicated **plan approval step** to the supervisor loop. This is not a policy — it is a first-class supervisor lifecycle hook that surfaces the sub-agent execution plan to an operator for review/approval before any sub-agent runs. This gives visibility into what the supervisor intends to do before it does it.

## How it could work

### SDK surface (high-level strawman)

```python
from cuga.supervisor import CugaSupervisor
from cuga.supervisor.policies import (
    SupervisorIntentGuard,
    SupervisorOutputFormatter,
    SupervisorPlaybook,
    AgentRoutingPolicy,
    AgentSequencingPolicy,
    AgentApprovalPolicy,
)

# agent_a, agent_b, agent_c are assumed to be previously constructed CUGA agents.
supervisor = CugaSupervisor(
    agents=[agent_a, agent_b, agent_c],
    policies=[
        SupervisorIntentGuard(blocked_patterns=["delete all", "drop table"]),
        SupervisorOutputFormatter(template="..."),
        SupervisorPlaybook(steps=["verify", "execute", "summarize"]),
        AgentRoutingPolicy(rules={"billing": agent_a, "hr": agent_b}),
        AgentSequencingPolicy(order=[agent_a, agent_b]),
        AgentApprovalPolicy(agents=[agent_c], requires_approval=True),
    ],
    plan_approval=True,  # surfaces sub-agent plan before execution
)
```

### Plan approval hook

When `plan_approval=True`, the supervisor pauses after building its execution plan and before dispatching any sub-agent. The plan (list of sub-agents + their inputs) is surfaced via the existing HITL mechanism for operator review. Execution resumes only after approval.
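The gating behaviour described above can be sketched in isolation. This is a minimal illustration, not the CUGA implementation: `PlannedStep`, `PlanRejected`, `run_with_plan_approval`, and the `approve` / `dispatch` callables are all hypothetical names, and the `approve` callable stands in for the existing HITL mechanism.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PlannedStep:
    """One plan entry (hypothetical shape): a sub-agent and its input."""
    agent_name: str
    agent_input: str


class PlanRejected(Exception):
    """Raised when the operator declines the plan (hypothetical)."""


def run_with_plan_approval(
    plan: List[PlannedStep],
    approve: Callable[[List[PlannedStep]], bool],
    dispatch: Callable[[PlannedStep], str],
) -> List[str]:
    # Surface the whole plan before any sub-agent is dispatched.
    # `approve` is a stand-in for the HITL review step.
    if not approve(plan):
        raise PlanRejected("plan rejected; no sub-agent was dispatched")
    # Only reached after approval: dispatch steps in plan order.
    return [dispatch(step) for step in plan]
```

Raising on rejection is one possible design; returning a structured "rejected" result would work equally well and may fit the supervisor loop better.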

### Implementation scope

- New `cuga.supervisor.policies` module with a base class and one concrete class per policy type
- Supervisor loop updated to evaluate policies at the correct lifecycle points (pre-dispatch, post-output, routing, sequencing, approval)
- Plan approval as a separate supervisor lifecycle step (not coupled to the policy module)
- SDK-only for v1. Possibly just show the policy enactment phases (what happened at each step) next to the final answer, as the existing policy system does
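One way the base class and lifecycle points listed above could fit together is sketched below. Everything here is an assumption made for illustration: the `Stage` enum, the `Decision` result type, the `SupervisorPolicy` base class, the `evaluate` signature, and the `evaluate_stage` helper are hypothetical and not part of any existing CUGA API. The intent guard matches blocked patterns as case-insensitive substrings, which is also an assumption.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class Stage(Enum):
    """Lifecycle points named in the scope list (hypothetical enum)."""
    PRE_DISPATCH = "pre_dispatch"
    ROUTING = "routing"
    SEQUENCING = "sequencing"
    APPROVAL = "approval"
    POST_OUTPUT = "post_output"


@dataclass
class Decision:
    allowed: bool
    reason: str = ""


class SupervisorPolicy(ABC):
    """Hypothetical base class: each policy declares the stage it runs at."""
    stage: Stage

    @abstractmethod
    def evaluate(self, payload: str) -> Decision:
        ...


class IntentGuard(SupervisorPolicy):
    stage = Stage.PRE_DISPATCH

    def __init__(self, blocked_patterns: List[str]) -> None:
        self.blocked_patterns = [p.lower() for p in blocked_patterns]

    def evaluate(self, payload: str) -> Decision:
        text = payload.lower()
        for pattern in self.blocked_patterns:
            if pattern in text:
                return Decision(False, f"blocked pattern: {pattern!r}")
        return Decision(True)


def evaluate_stage(
    policies: Iterable[SupervisorPolicy], stage: Stage, payload: str
) -> Decision:
    # Run only the policies registered for this stage; first denial wins.
    for policy in policies:
        if policy.stage is stage:
            decision = policy.evaluate(payload)
            if not decision.allowed:
                return decision
    return Decision(True)
```

With this shape, the supervisor loop would call `evaluate_stage` once per lifecycle point, and plan approval stays outside the policy list as a separate step.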

### Testing

Each policy type gets its own test file/class. Tests exercise the policy exclusively through the SDK (no UI, no integration harness). Coverage should include:

- Happy-path: policy allows execution to proceed
- Block/redirect path: policy halts or reroutes execution
- Edge cases per type (e.g. sequencing violation, routing miss, approval denied)
- Plan approval: plan is surfaced and execution is gated correctly
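As a rough picture of what a per-policy test file might look like, here is a pytest-style sketch for the sequencing case. The `sequencing_violations` helper is a hypothetical stand-in for whatever check `AgentSequencingPolicy` ends up performing; it assumes agents are identified by name and that each agent runs at most once.

```python
from typing import List, Sequence


def sequencing_violations(
    required_order: Sequence[str], executed: Sequence[str]
) -> List[str]:
    """Return one message per adjacent required pair that ran out of order.

    Hypothetical helper; assumes each agent appears at most once in `executed`.
    """
    position = {name: index for index, name in enumerate(executed)}
    violations = []
    for earlier, later in zip(required_order, required_order[1:]):
        if later not in position:
            continue  # the later agent never ran, so nothing to enforce
        if earlier not in position or position[earlier] > position[later]:
            violations.append(f"{earlier} must run before {later}")
    return violations


def test_happy_path_allows_execution():
    assert sequencing_violations(["agent_a", "agent_b"], ["agent_a", "agent_b"]) == []


def test_sequencing_violation_is_reported():
    result = sequencing_violations(["agent_a", "agent_b"], ["agent_b", "agent_a"])
    assert result == ["agent_a must run before agent_b"]


def test_later_agent_without_prerequisite_is_reported():
    result = sequencing_violations(["agent_a", "agent_b"], ["agent_b"])
    assert result == ["agent_a must run before agent_b"]


def test_unconstrained_agents_are_ignored():
    assert sequencing_violations(["agent_a", "agent_b"], ["agent_c"]) == []
```

The same layout (one happy-path test, one block path, a few edge cases) would repeat for routing, approval, and plan approval.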

## Links or extra context

- Existing agent-level policy docs: https://docs.cuga.dev/docs/sdk/policies/
- Epic tracking the policy document parser (upstream): #99
- v1 is SDK-only; a future issue will add UI authoring and document-based policy loading
