Merged
19 commits
c197a6c
docs: add prompt adherence design spec
neoneye Apr 9, 2026
818686d
docs: add prompt adherence implementation plan
neoneye Apr 9, 2026
4d8bea0
feat: add FilenameEnum entries for prompt adherence
neoneye Apr 9, 2026
06bfad3
feat: add prompt adherence Pydantic models, prompts, and markdown gen…
neoneye Apr 9, 2026
2ad532f
feat: add PromptAdherenceTask Luigi node
neoneye Apr 9, 2026
59ad512
feat: wire PromptAdherenceTask into pipeline and report
neoneye Apr 9, 2026
de9f0b2
refactor: use directive_index (int) instead of directive_id (str)
neoneye Apr 9, 2026
ee655a9
fix: use human-readable category labels in markdown output
neoneye Apr 9, 2026
4ceff77
fix: use human-readable directive type labels in markdown output
neoneye Apr 9, 2026
3876d99
fix: move Prompt Adherence to last section in report
neoneye Apr 9, 2026
ca653eb
fix: sort summary table by ID, use "Issue N - title" format in issues
neoneye Apr 9, 2026
57af3c6
fix: remove h1 header from prompt adherence markdown
neoneye Apr 9, 2026
95c30c3
refactor: split plan.txt into plan_raw.json + SetupTask template
neoneye Apr 9, 2026
b691e1a
fix: read plan_prompt from plan_raw.json in PromptAdherenceTask
neoneye Apr 9, 2026
ee32bd3
fix: show all non-perfect directives in Issues section
neoneye Apr 9, 2026
12cca36
feat: show adherence score math in markdown report
neoneye Apr 9, 2026
3c8fc52
fix: spell out adherence score formula in markdown
neoneye Apr 9, 2026
b28445c
fix: wrap adherence formula in code block
neoneye Apr 9, 2026
0cbb55d
fix: enable fenced_code in markdown_with_tables rendering
neoneye Apr 9, 2026
683 changes: 683 additions & 0 deletions docs/superpowers/plans/2026-04-09-prompt-adherence.md

Large diffs are not rendered by default.

133 changes: 133 additions & 0 deletions docs/superpowers/specs/2026-04-09-prompt-adherence-design.md
@@ -0,0 +1,133 @@
# Prompt Adherence Check for PlanExe

## Problem

PlanExe's pipeline has a "normalization bias." Each of the ~70 nodes nudges the plan toward what a reasonable project *should* look like, and the cumulative drift over the full pipeline is significant. The user's stated reality gets overridden by the LLM's priors about what's plausible.

This manifests as:
- **Stated facts ignored.** The user says "the East Wing has already been demolished" but the plan includes demolition permitting steps.
- **Requirements softened.** The user says "100% renewable energy" and the plan targets 60-80%.
- **Intent diluted.** The user's tone is "this is happening, execute it" but the plan spends 40% on feasibility studies.
- **Unsolicited caveats.** The plan adds qualifications, risk disclaimers, and scope reductions the user didn't ask for.
- **Generic PM filler.** The plan relies on boilerplate project management language instead of addressing the specific problem.

Existing pipeline steps (Premise Attack, Premortem, Expert Criticism, Self Audit) assess plan *quality* — whether the plan is internally consistent, well-structured, and risk-aware. None of them check whether the plan actually does what the user asked.

## Goal

A pipeline step that checks the final plan against the original user prompt and produces a scored report showing which user directives were honored, softened, or ignored. The user can scan the report and immediately see the degree of prompt drift.

## Architecture

Two-phase LLM approach: extract directives from the prompt, then score each one against the final plan.

### Phase 1 — Extract Directives

Read `plan.txt` (the original user prompt) and extract a structured list of directives. Each directive is one thing the user stated or implied that the plan must respect.

```python
class DirectiveType(str, Enum):
    CONSTRAINT = "constraint"      # "Budget: DKK 500M", "Timeline: 12 months"
    STATED_FACT = "stated_fact"    # "The East Wing has already been demolished"
    REQUIREMENT = "requirement"    # "Build a casino", "Reeducate teachers"
    BANNED = "banned"              # "Banned words: blockchain/NFT"
    INTENT = "intent"              # "I'm not targeting revenue", tone/posture signals
```

Each directive has:
- `directive_id`: "D1", "D2", etc.
- `directive_type`: one of the types above
- `text`: the user's words (short quote or paraphrase)
- `importance_5`: 1 (minor detail) to 5 (core requirement)

The LLM is instructed to extract 5-15 directives, prioritizing things that are easy to dilute: stated facts about the world, hard numbers, explicit scope boundaries, banned words, and the user's posture (execute vs. study).
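The Phase 1 output can be sketched as Pydantic models (the field names follow the spec above; the `Directive` and `DirectiveList` class names are assumptions, the actual code may differ):

```python
from enum import Enum
from pydantic import BaseModel, Field

class DirectiveType(str, Enum):
    CONSTRAINT = "constraint"
    STATED_FACT = "stated_fact"
    REQUIREMENT = "requirement"
    BANNED = "banned"
    INTENT = "intent"

class Directive(BaseModel):
    directive_id: str = Field(description='Identifier such as "D1"')
    directive_type: DirectiveType
    text: str = Field(description="Short quote or paraphrase of the user's words")
    importance_5: int = Field(ge=1, le=5, description="1 = minor detail, 5 = core requirement")

class DirectiveList(BaseModel):
    directives: list[Directive]
```

Structured-output wrappers like this are what the extraction prompt is asked to fill in; the `ge`/`le` bounds reject out-of-range importance values at parse time.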

### Phase 2 — Score Against Final Plan

Read the extracted directives plus the final plan artifacts (executive summary, project plan, consolidated assumptions). For each directive, score adherence.

```python
class AdherenceCategory(str, Enum):
    FULLY_HONORED = "fully_honored"
    PARTIALLY_HONORED = "partially_honored"
    SOFTENED = "softened"                      # requirement weakened
    IGNORED = "ignored"                        # not addressed at all
    CONTRADICTED = "contradicted"              # plan says the opposite
    UNSOLICITED_CAVEAT = "unsolicited_caveat"  # plan adds qualifications user didn't ask for
```

Each scoring result has:
- `directive_id`: references a Phase 1 directive
- `adherence_5`: 1 (ignored/contradicted) to 5 (fully honored)
- `category`: one of the categories above
- `evidence`: direct quote from the plan (under 200 chars)
- `explanation`: how the plan handled this directive and why the score was given
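The Phase 2 result can be modeled the same way (a sketch; the `DirectiveScore` class name is an assumption beyond the fields listed above):

```python
from enum import Enum
from pydantic import BaseModel, Field

class AdherenceCategory(str, Enum):
    FULLY_HONORED = "fully_honored"
    PARTIALLY_HONORED = "partially_honored"
    SOFTENED = "softened"
    IGNORED = "ignored"
    CONTRADICTED = "contradicted"
    UNSOLICITED_CAVEAT = "unsolicited_caveat"

class DirectiveScore(BaseModel):
    directive_id: str                      # references a Phase 1 directive
    adherence_5: int = Field(ge=1, le=5)   # 1 = ignored/contradicted, 5 = fully honored
    category: AdherenceCategory
    evidence: str = Field(max_length=200)  # direct quote from the plan
    explanation: str
```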

### Output Files

- `prompt_adherence_raw.json` — full structured data (directives + scores + metadata)
- `prompt_adherence.md` — human-readable report

### Markdown Report Structure

1. **Summary table** — all directives sorted by severity (`importance_5 * (6 - adherence_5)`, worst offenders first):

```
| ID | Directive | Type | Importance | Adherence | Category |
|----|-----------|------|------------|-----------|----------|
| D3 | "East Wing already demolished" | stated_fact | 5/5 | 1/5 | contradicted |
| D7 | "No feasibility studies" | intent | 4/5 | 2/5 | ignored |
| D1 | "Budget: DKK 500M" | constraint | 5/5 | 3/5 | softened |
```

2. **Overall adherence score** — weighted average: `sum(adherence_5 * importance_5) / sum(5 * importance_5)` as a percentage. A plan that fully honors everything scores 100%.

3. **Detail section** — for each directive scoring adherence_5 ≤ 3, the full explanation and evidence quotes from both the prompt and the plan.
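Using the importance and adherence values from the example rows above, the severity ordering and the overall score can be sketched in a few lines (pure arithmetic, no project code assumed):

```python
# (importance_5, adherence_5) per directive, from the example summary table
rows = {"D3": (5, 1), "D1": (5, 3), "D7": (4, 2)}

def severity(importance_5: int, adherence_5: int) -> int:
    # worst offenders first: high importance, low adherence
    return importance_5 * (6 - adherence_5)

ranked = sorted(rows, key=lambda d: severity(*rows[d]), reverse=True)

def overall_score(rows: dict[str, tuple[int, int]]) -> float:
    # weighted average: sum(adherence_5 * importance_5) / sum(5 * importance_5)
    earned = sum(imp * adh for imp, adh in rows.values())
    possible = sum(5 * imp for imp, _ in rows.values())
    return 100.0 * earned / possible

print(ranked)                      # ['D3', 'D7', 'D1']
print(round(overall_score(rows)))  # 40
```

Note that D7 (severity 4 * 4 = 16) outranks D1 (severity 5 * 3 = 15) even though D1 has higher importance: low adherence on an important directive dominates the ordering.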

### Pipeline Placement

After `self_audit`, before `report`. The task reads:
- `setup` — plan.txt (the original user prompt)
- `executive_summary` — the final plan summary
- `project_plan` — the detailed plan
- `consolidate_assumptions_markdown` — accumulated assumptions that may have drifted

The report task includes `prompt_adherence.md` in the final HTML output.

### FilenameEnum Entries

```python
PROMPT_ADHERENCE_RAW = "prompt_adherence_raw.json"
PROMPT_ADHERENCE_MARKDOWN = "prompt_adherence.md"
```

### Code Structure

```
worker_plan/worker_plan_internal/
    diagnostics/
        prompt_adherence.py    — Phase 1 + Phase 2 logic, Pydantic models, markdown generation
    plan/nodes/
        prompt_adherence.py    — Luigi task (PromptAdherenceTask)
```

Follows the same pattern as `premortem.py` / `nodes/premortem.py`:
- Business logic in `diagnostics/prompt_adherence.py`
- Luigi wiring in `plan/nodes/prompt_adherence.py`
- Pydantic structured output via `llm.as_structured_llm()`
- `LLMExecutor` for model fallback and retry

### Scope Boundaries

**In scope:**
- Extract directives from plan.txt
- Score each directive against the final plan
- Produce JSON + markdown report
- Integrate as a Luigi pipeline step
- Include in the final HTML report

**Out of scope:**
- Fixing the drift (this step surfaces it, doesn't correct it)
- Tracing where in the pipeline drift was introduced (that's RCA's job)
- Judging plan quality (that's self_audit's job)
- Comparing multiple plans against each other
2 changes: 1 addition & 1 deletion worker_plan/app.py
@@ -223,7 +223,7 @@ def create_run_directory(request: StartRunRequest) -> tuple[str, Path]:
     start_time_file.save(run_dir / FilenameEnum.START_TIME.value)

     plan_file = PlanFile.create(vague_plan_description=request.plan_prompt, start_time=start_time)
-    plan_file.save(run_dir / FilenameEnum.INITIAL_PLAN.value)
+    plan_file.save(run_dir / FilenameEnum.INITIAL_PLAN_RAW.value)

     return run_id, run_dir.resolve()
3 changes: 3 additions & 0 deletions worker_plan/worker_plan_api/filenames.py
@@ -2,6 +2,7 @@

 class FilenameEnum(str, Enum):
     START_TIME = "start_time.json"
+    INITIAL_PLAN_RAW = "plan_raw.json"
     INITIAL_PLAN = "plan.txt"
     PLANEXE_METADATA = "planexe_metadata.json"
     SCREEN_PLANNING_PROMPT_RAW = "screen_planning_prompt.json"
@@ -128,6 +129,8 @@ class FilenameEnum(str, Enum):
     PREMORTEM_MARKDOWN = "premortem.md"
     SELF_AUDIT_RAW = "self_audit_raw.json"
     SELF_AUDIT_MARKDOWN = "self_audit.md"
+    PROMPT_ADHERENCE_RAW = "prompt_adherence_raw.json"
+    PROMPT_ADHERENCE_MARKDOWN = "prompt_adherence.md"
     REPORT = "report.html"
     PIPELINE_COMPLETE = "pipeline_complete.txt"

40 changes: 31 additions & 9 deletions worker_plan/worker_plan_api/plan_file.py
@@ -1,28 +1,50 @@
"""
PROMPT> python -m worker_plan_api.plan_file
"""
import json
from datetime import datetime
from dataclasses import dataclass


PLAN_TEMPLATE = "Plan:\n{plan_prompt}\n\nToday's date:\n{pretty_date}\n\nProject start ASAP"


@dataclass
class PlanFile:
content: str
plan_prompt: str
pretty_date: str

@classmethod
def create(cls, vague_plan_description: str, start_time: datetime) -> "PlanFile":
pretty_date = start_time.strftime("%Y-%b-%d")
plan_prompt = (
f"Plan:\n{vague_plan_description}\n\n"
f"Today's date:\n{pretty_date}\n\n"
"Project start ASAP"
)
return cls(plan_prompt)
return cls(plan_prompt=vague_plan_description, pretty_date=pretty_date)

def to_dict(self) -> dict:
return {
"plan_prompt": self.plan_prompt,
"pretty_date": self.pretty_date,
}

@classmethod
def from_dict(cls, data: dict) -> "PlanFile":
return cls(plan_prompt=data["plan_prompt"], pretty_date=data["pretty_date"])

@classmethod
def load(cls, file_path: str) -> "PlanFile":
with open(file_path, "r", encoding="utf-8") as f:
return cls.from_dict(json.load(f))

def save(self, file_path: str) -> None:
with open(file_path, "w", encoding="utf-8") as f:
f.write(self.content)
json.dump(self.to_dict(), f, indent=2)

def to_plan_text(self) -> str:
return PLAN_TEMPLATE.format(plan_prompt=self.plan_prompt, pretty_date=self.pretty_date)


if __name__ == "__main__":
start_time: datetime = datetime.now().astimezone()
plan = PlanFile.create(vague_plan_description="My plan is here!", start_time=start_time)
print(plan.content)
print(json.dumps(plan.to_dict(), indent=2))
print("---")
print(plan.to_plan_text())