# Prompt Adherence Check for PlanExe

## Problem

PlanExe's pipeline has a "normalization bias." Each of the ~70 nodes nudges the plan toward what a reasonable project *should* look like, and the cumulative drift over the full pipeline is significant. The user's stated reality gets overridden by the LLM's priors about what's plausible.

This manifests as:
- **Stated facts ignored.** The user says "the East Wing has already been demolished" but the plan includes demolition permitting steps.
- **Requirements softened.** The user says "100% renewable energy" and the plan targets 60-80%.
- **Intent diluted.** The user's tone is "this is happening, execute it" but the plan spends 40% of its effort on feasibility studies.
- **Unsolicited caveats.** The plan adds qualifications, risk disclaimers, and scope reductions the user didn't ask for.
- **Generic PM filler.** The plan relies on boilerplate project management language instead of addressing the specific problem.

Existing pipeline steps (Premise Attack, Premortem, Expert Criticism, Self Audit) assess plan *quality* — whether the plan is internally consistent, well-structured, and risk-aware. None of them check whether the plan actually does what the user asked.

## Goal

A pipeline step that checks the final plan against the original user prompt and produces a scored report showing which user directives were honored, softened, or ignored. The user can scan the report and immediately see the degree of prompt drift.

## Architecture

Two-phase LLM approach: extract directives from the prompt, then score each one against the final plan.

### Phase 1 — Extract Directives

Read `plan.txt` (the original user prompt) and extract a structured list of directives. Each directive is one thing the user stated or implied that the plan must respect.

```python
from enum import Enum


class DirectiveType(str, Enum):
    CONSTRAINT = "constraint"      # "Budget: DKK 500M", "Timeline: 12 months"
    STATED_FACT = "stated_fact"    # "The East Wing has already been demolished"
    REQUIREMENT = "requirement"    # "Build a casino", "Reeducate teachers"
    BANNED = "banned"              # "Banned words: blockchain/NFT"
    INTENT = "intent"              # "I'm not targeting revenue", tone/posture signals
```

Each directive has:
- `directive_id`: "D1", "D2", etc.
- `directive_type`: one of the types above
- `text`: the user's words (short quote or paraphrase)
- `importance_5`: 1 (minor detail) to 5 (core requirement)

The LLM is instructed to extract 5-15 directives, prioritizing things that are easy to dilute: stated facts about the world, hard numbers, explicit scope boundaries, banned words, and the user's posture (execute vs. study).
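
A minimal Pydantic sketch of how these fields could be modeled for structured extraction. `Directive` and `DirectiveList` are assumed names, and the schema-level 5-15 bound is just one way to mirror the extraction guidance:

```python
from pydantic import BaseModel, Field

# Assumes the DirectiveType enum from the block above is in scope.


class Directive(BaseModel):
    directive_id: str = Field(description='Stable identifier, e.g. "D1"')
    directive_type: DirectiveType
    text: str = Field(description="The user's words (short quote or paraphrase)")
    importance_5: int = Field(ge=1, le=5, description="1 = minor detail, 5 = core requirement")


class DirectiveList(BaseModel):
    # The 5-15 range mirrors the prompt guidance; enforcing it in the schema is optional.
    directives: list[Directive] = Field(min_length=5, max_length=15)
```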

### Phase 2 — Score Against Final Plan

Read the extracted directives plus the final plan artifacts (executive summary, project plan, consolidated assumptions). For each directive, score adherence.

```python
from enum import Enum


class AdherenceCategory(str, Enum):
    FULLY_HONORED = "fully_honored"
    PARTIALLY_HONORED = "partially_honored"
    SOFTENED = "softened"                      # requirement weakened
    IGNORED = "ignored"                        # not addressed at all
    CONTRADICTED = "contradicted"              # plan says the opposite
    UNSOLICITED_CAVEAT = "unsolicited_caveat"  # plan adds qualifications user didn't ask for
```

Each scoring result has:
- `directive_id`: references a Phase 1 directive
- `adherence_5`: 1 (ignored/contradicted) to 5 (fully honored)
- `category`: one of the categories above
- `evidence`: direct quote from the plan (under 200 chars)
- `explanation`: how the plan handled this directive and why the score was given
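
The same treatment works for the scoring output; `AdherenceResult` and `AdherenceReport` are assumed names:

```python
from pydantic import BaseModel, Field

# Assumes the AdherenceCategory enum from the block above is in scope.


class AdherenceResult(BaseModel):
    directive_id: str = Field(description='References a Phase 1 directive, e.g. "D3"')
    adherence_5: int = Field(ge=1, le=5, description="1 = ignored/contradicted, 5 = fully honored")
    category: AdherenceCategory
    evidence: str = Field(max_length=200, description="Direct quote from the plan")
    explanation: str = Field(description="How the plan handled the directive and why this score")


class AdherenceReport(BaseModel):
    results: list[AdherenceResult]
```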

### Output Files

- `prompt_adherence_raw.json` — full structured data (directives + scores + metadata; assembly sketched below)
- `prompt_adherence.md` — human-readable report
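
One plausible way to assemble the raw file from the models sketched above; the metadata keys and the function name are illustrative:

```python
import json
from datetime import datetime, timezone


def write_raw_json(path: str, directives: DirectiveList, report: AdherenceReport, model_name: str) -> None:
    # Bundle Phase 1 directives, Phase 2 scores, and run metadata into one file.
    payload = {
        "metadata": {
            "generated_at": datetime.now(timezone.utc).isoformat(),
            "llm_model": model_name,
        },
        "directives": [d.model_dump(mode="json") for d in directives.directives],
        "scores": [r.model_dump(mode="json") for r in report.results],
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(payload, f, indent=2)
```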

### Markdown Report Structure

1. **Summary table** — all directives sorted by severity (`importance_5 * (6 - adherence_5)`, worst offenders first):

```
| ID | Directive | Type | Importance | Adherence | Category |
|----|-----------|------|------------|-----------|----------|
| D3 | "East Wing already demolished" | stated_fact | 5/5 | 1/5 | contradicted |
| D1 | "Budget: DKK 500M" | constraint | 5/5 | 3/5 | softened |
| D7 | "No feasibility studies" | intent | 4/5 | 2/5 | ignored |
```

2. **Overall adherence score** — weighted average: `sum(adherence_5 * importance_5) / sum(5 * importance_5)` as a percentage. A plan that fully honors everything scores 100%. (Both the severity sort key and this score are sketched in code after this list.)

3. **Detail section** — for each directive scoring `adherence_5` ≤ 3, the full explanation and evidence quotes from both the prompt and the plan.
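
A sketch of the severity sort key and the overall score, matching the formulas above:

```python
def severity(directive: Directive, result: AdherenceResult) -> int:
    # Sort key for the summary table: high importance plus low adherence floats to the top.
    return directive.importance_5 * (6 - result.adherence_5)


def overall_adherence_pct(directives: DirectiveList, report: AdherenceReport) -> float:
    # Weighted average as a percentage; fully honoring everything yields 100.0.
    by_id = {d.directive_id: d for d in directives.directives}
    num = sum(r.adherence_5 * by_id[r.directive_id].importance_5 for r in report.results)
    den = sum(5 * by_id[r.directive_id].importance_5 for r in report.results)
    return 100.0 * num / den if den else 0.0
```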

### Pipeline Placement

After `self_audit`, before `report`. The task reads four upstream outputs (wiring sketched after this list):
- `setup` — plan.txt (the original user prompt)
- `executive_summary` — the final plan summary
- `project_plan` — the detailed plan
- `consolidate_assumptions_markdown` — accumulated assumptions that may have drifted
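
A hypothetical sketch of that wiring. The upstream task classes, the `run_dir` parameter, and the base class are assumed names, not confirmed PlanExe APIs:

```python
import luigi


class PromptAdherenceTask(luigi.Task):  # PlanExe likely derives from a shared base task instead
    run_dir = luigi.Parameter()

    def requires(self):
        # Upstream task names are illustrative placeholders for the four inputs.
        return {
            "setup": SetupTask(run_dir=self.run_dir),
            "executive_summary": ExecutiveSummaryTask(run_dir=self.run_dir),
            "project_plan": ProjectPlanTask(run_dir=self.run_dir),
            "consolidate_assumptions_markdown": ConsolidateAssumptionsMarkdownTask(run_dir=self.run_dir),
        }

    def output(self):
        return {
            "raw": luigi.LocalTarget(f"{self.run_dir}/{FilenameEnum.PROMPT_ADHERENCE_RAW.value}"),
            "markdown": luigi.LocalTarget(f"{self.run_dir}/{FilenameEnum.PROMPT_ADHERENCE_MARKDOWN.value}"),
        }
```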

The report task includes `prompt_adherence.md` in the final HTML output.

### FilenameEnum Entries

```python
PROMPT_ADHERENCE_RAW = "prompt_adherence_raw.json"
PROMPT_ADHERENCE_MARKDOWN = "prompt_adherence.md"
```

### Code Structure

```
worker_plan/worker_plan_internal/
  diagnostics/
    prompt_adherence.py — Phase 1 + Phase 2 logic, Pydantic models, markdown generation
  plan/nodes/
    prompt_adherence.py — Luigi task (PromptAdherenceTask)
```

Follows the same pattern as `premortem.py` / `nodes/premortem.py` (core flow sketched after this list):
- Business logic in `diagnostics/prompt_adherence.py`
- Luigi wiring in `plan/nodes/prompt_adherence.py`
- Pydantic structured output via `llm.as_structured_llm()`
- `LLMExecutor` for model fallback and retry
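
A sketch of the diagnostics-side flow, assuming a llama-index LLM handle (`as_structured_llm()` is the standard llama-index structured-output API); the function name and prompt templates are illustrative:

```python
# diagnostics/prompt_adherence.py (sketch)

EXTRACT_PROMPT = "Extract 5-15 directives the plan must respect:\n\n{prompt}"
SCORE_PROMPT = "Score each directive against the final plan:\n\n{directives}\n\nPLAN:\n{plan}"


def run_prompt_adherence(llm, user_prompt: str, plan_text: str) -> tuple[DirectiveList, AdherenceReport]:
    # Phase 1: structured extraction of directives from the original prompt.
    extractor = llm.as_structured_llm(DirectiveList)
    directives: DirectiveList = extractor.complete(EXTRACT_PROMPT.format(prompt=user_prompt)).raw

    # Phase 2: score every directive against the final plan artifacts.
    scorer = llm.as_structured_llm(AdherenceReport)
    report: AdherenceReport = scorer.complete(
        SCORE_PROMPT.format(directives=directives.model_dump_json(), plan=plan_text)
    ).raw

    return directives, report
```

In the actual task, the `llm` handle would be obtained through `LLMExecutor` so both phases inherit model fallback and retry.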

### Scope Boundaries

**In scope:**
- Extract directives from plan.txt
- Score each directive against the final plan
- Produce JSON + markdown report
- Integrate as a Luigi pipeline step
- Include in the final HTML report

**Out of scope:**
- Fixing the drift (this step surfaces it, doesn't correct it)
- Tracing where in the pipeline drift was introduced (that's RCA's job)
- Judging plan quality (that's self_audit's job)
- Comparing multiple plans against each other