feat: evolutionary prompt optimization — Rainbow Teaming QD loop for role system prompts

## Problem

Role system prompts in `src/runtime/roles.zig` are currently static and hand-tuned. There's no systematic way to evaluate whether a prompt is optimal for its role, or to evolve prompts over time based on actual task performance.

## Proposal

Implement an evolutionary prompt optimization loop inspired by quality-diversity (QD) search methods, specifically the Rainbow Teaming approach ([arXiv:2402.16822](https://arxiv.org/abs/2402.16822)).

### Core idea

Cast prompt optimization as a **quality-diversity problem**:
- **Quality**: task success rate, code correctness, fix accuracy
- **Diversity**: coverage across different task types, codebase patterns, failure modes

Use open-ended search to generate prompt variants that are both effective and diverse, maintaining a MAP-Elites style archive of best prompts per niche.
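A minimal sketch of such an archive, in Python for brevity (the project itself is Zig). The niche descriptor `(task category, codebase size)` and the names `Elite`, `Archive`, and `offer` are assumptions made for illustration, not existing code:

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical niche descriptor: (task category, codebase size bucket).
Niche = Tuple[str, str]


@dataclass
class Elite:
    prompt: str
    quality: float  # e.g. task success rate in [0, 1]


@dataclass
class Archive:
    # One elite per niche; niches that were never filled are simply absent.
    cells: Dict[Niche, Elite] = field(default_factory=dict)

    def offer(self, niche: Niche, prompt: str, quality: float) -> bool:
        """Store the candidate if its cell is empty or it beats the incumbent."""
        incumbent = self.cells.get(niche)
        if incumbent is None or quality > incumbent.quality:
            self.cells[niche] = Elite(prompt, quality)
            return True
        return False

    def coverage(self, total_cells: int) -> float:
        """Fraction of niches that currently hold an elite."""
        return len(self.cells) / total_cells
```

Quality decides who wins inside a cell, while coverage tracks how much of the diversity space has been filled; the two are tracked separately.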

### Implementation sketch

1. **Prompt genome**: each role's system prompt is a "genome" that can be mutated
2. **Fitness function**: run the role on a benchmark task suite, measure success metrics
3. **Diversity dimensions**: task category (bug fix, review, search), codebase size, language features used
4. **Selection**: MAP-Elites archive keeps the highest-quality prompt in each cell of the diversity grid (one cell per combination of the dimensions above)
5. **Mutation operators**: LLM-guided rewriting — "make this prompt better at X while keeping Y"
6. **Evolution loop**: generate variants → evaluate on benchmarks → archive best → repeat
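The steps above could be wired together roughly as follows. This is a sketch under assumptions: `mutate` stands in for the LLM-guided rewrite and `evaluate` for the benchmark run, both passed in as plain callables, and the archive is a bare dict. None of these names exist in the codebase, and comparing numeric scores is a simplification chosen for this example.

```python
import random
from typing import Callable, Dict, List, Tuple

Niche = Tuple[str, str]                   # hypothetical: (task category, codebase size)
Archive = Dict[Niche, Tuple[str, float]]  # niche -> (prompt, quality)


def evolve(
    seed_prompt: str,
    niches: List[Niche],
    mutate: Callable[[str, Niche], str],
    evaluate: Callable[[str, Niche], float],
    iterations: int,
    rng: random.Random,
) -> Archive:
    archive: Archive = {}
    for _ in range(iterations):
        # Parent: a random elite once the archive has entries, else the seed prompt.
        parent = rng.choice(list(archive.values()))[0] if archive else seed_prompt
        # Sample a target niche and ask the mutator to specialise the parent toward it.
        target = rng.choice(niches)
        child = mutate(parent, target)
        score = evaluate(child, target)
        # Keep the child only if the cell is empty or it beats the incumbent.
        if target not in archive or score > archive[target][1]:
            archive[target] = (child, score)
    return archive


# Toy stand-ins so the loop can run without an LLM or a benchmark suite.
def stub_mutate(prompt: str, niche: Niche) -> str:
    return f"{prompt} [focus: {niche[0]}/{niche[1]}]"


def stub_evaluate(prompt: str, niche: Niche) -> float:
    # Toy score: more mentions of the task category means higher quality, capped at 1.
    return min(1.0, prompt.count(niche[0]) / 3)
```

Calling `evolve("You are a careful engineer.", niches, stub_mutate, stub_evaluate, 50, random.Random(0))` returns a dict with at most one elite per niche. In a real run the two stubs would be replaced by a model call and the benchmark harness.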

### Integration with grid

The evolved prompts feed back into the grid system:
- `grid.zig` currently maps role → model tier
- Extend to map role → (model tier, prompt variant ID)
- Store winning prompts in `.devswarm/evolved_prompts/` per project
- Fall back to built-in defaults when no evolved prompts exist
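One way the lookup with fallback might work, again sketched in Python rather than Zig. The on-disk layout (`<role>/<variant>.txt` under `.devswarm/evolved_prompts/`), the `GridEntry` type, and `resolve_prompt` are assumptions for illustration; the issue does not specify a file format.

```python
from pathlib import Path
from typing import Dict, NamedTuple, Optional


class GridEntry(NamedTuple):
    model_tier: str
    prompt_variant: Optional[str]  # None means "use the built-in prompt"


def resolve_prompt(
    project_root: Path,
    role: str,
    entry: GridEntry,
    builtin: Dict[str, str],
) -> str:
    """Return the evolved prompt for a role if one is stored, else the built-in default."""
    if entry.prompt_variant is not None:
        # Assumed layout: .devswarm/evolved_prompts/<role>/<variant>.txt
        path = (
            project_root / ".devswarm" / "evolved_prompts" / role
            / f"{entry.prompt_variant}.txt"
        )
        if path.is_file():
            return path.read_text(encoding="utf-8")
    # No variant requested, or the file is missing: fall back to the default.
    return builtin[role]
```

Falling back when the file is missing, not only when the variant ID is unset, means a project with a stale grid entry still gets a working prompt.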

### Connects to

- #274 (evolutionary grid tuning — DGM outer loop)
  - This extends #274 from model selection to prompt content optimization
- #353 (eval framework for prompt selection)

## Why P0

The swarm now has 12 roles with structured prompts (#352). The next leverage point is making those prompts self-improving rather than hand-tuned. This is the foundation for the entire adaptive agent system.
