diff --git a/prompts/phase3-classification.md b/prompts/phase3-classification.md index e798027..04dd1f8 100644 --- a/prompts/phase3-classification.md +++ b/prompts/phase3-classification.md @@ -57,6 +57,32 @@ Every classification MUST include a rationale explaining the chosen label and confidence. Stored in the audit store only; referenced by id in general logs (FR-120). +## Round-0 contract (opening call, empty transcript) + +When `round_count = 0` AND the `## Transcript` section is empty +(or contains the literal marker `(no replies yet — round 0)`): + +- `suggested_action` MUST be `ask_clarification`. Round 0 is the + call that opens the chat with both parties — picking `summarize` + or `escalate` here aborts the session before Serbero ever talks + to anyone, and Mostro never sees Serbero take the dispute. There + is intentionally nothing to summarize or escalate yet; your job + is to start the conversation. +- `classification` MUST be `unclear` and `confidence` MUST be low + (≤ 0.3). The policy layer applies a round-0 bypass that accepts + low-confidence `ask_clarification` so the chat can open. Do not + fabricate a higher confidence — the round-0 bypass is the + designed path, not a workaround. +- Both `buyer_clarification` and `seller_clarification` MUST be + populated using the "First Clarifying Question" template from + the message-templates bundle, with the `[SPECIFIC_QUESTION]` + token replaced by the role-appropriate generic opener (buyer: + fiat sent? proof of transfer; seller: fiat received? if not, + what proof the buyer shared). The "If you cannot produce a + useful question" escape hatch in the Hard Rules below does NOT + apply on round 0 — on round 0 a generic opener IS the useful + question, because no transcript exists yet. + ## Clarifying Questions (per-party) When `suggested_action = ask_clarification`, emit TWO distinct @@ -83,10 +109,21 @@ Hard rules: "Seller:". 
The transport layer handles recipient routing; these prefixes only leak confusion into the other party's chat if they ever land on the wrong side. +- Do NOT begin the question with a greeting or self-introduction + (e.g. "Hello, I'm Serbero, an automated mediation assistance + system..."), and do NOT append a sign-off or signature. The runtime + discloses Serbero's identity exactly once, by hard-prefixing a + one-line introduction on the very first outbound of the session + (`mediation::draft_and_send_initial_message`); every subsequent + clarification round must open directly with the question itself. + Repeating the greeting on top of the runtime prefix duplicates the + introduction inside a single chat message and is a defect. - Both strings MUST be non-empty. If you cannot produce a useful - question for one side, pick a different `suggested_action` - (`summarize` or `escalate`) instead of emitting a half-populated - clarification. + question for one side at `round_count >= 1` (because the existing + transcript already answered everything you would ask), pick a + different `suggested_action` (`summarize` or `escalate`) instead + of emitting a half-populated clarification. This escape hatch + does NOT apply on round 0 — see the Round-0 contract above. - Each question stands on its own — don't cross-reference the other party's text, since each party only ever sees theirs. diff --git a/prompts/phase3-message-templates.md b/prompts/phase3-message-templates.md index cc654e1..c576d84 100644 --- a/prompts/phase3-message-templates.md +++ b/prompts/phase3-message-templates.md @@ -11,21 +11,30 @@ layer. ## First Clarifying Question -"Hello, I'm Serbero, an automated mediation assistant helping the -assigned solver review this dispute. I'd like to understand your -perspective. Could you please describe what happened from your point -of view? Specifically: [SPECIFIC_QUESTION]" +"I'd like to understand your perspective. 
Could you please describe +what happened from your point of view? Specifically: [SPECIFIC_QUESTION]" — Replace `[SPECIFIC_QUESTION]` with one concrete, dispute-specific question. Do not return the literal token `[SPECIFIC_QUESTION]`. +The one-time "Hello, I'm Serbero, an automated mediation assistant +helping the assigned solver review this dispute. " self-introduction +is added by the runtime (`mediation::draft_and_send_initial_message`) +exactly once at session open, so the model MUST NOT repeat it inside +the clarification body. Repeating it produces a duplicated greeting in +a single chat message and is a defect. + ## Follow-Up Clarification -"Thank you for your response. To help the solver make a well-informed -decision, I have a follow-up question: [SPECIFIC_QUESTION]" +"[SPECIFIC_QUESTION]" -— Same rule: substitute a concrete follow-up question; never emit the -bracketed token. +— Substitute a concrete follow-up question and emit nothing else: no +greeting, no self-introduction, no "Thank you for your response" +preamble, no sign-off. Round 2+ messages travel through +`mediation::draft_and_send_followup_message`, which does NOT prefix a +greeting; the entire user-visible body is whatever the model returned. +Adding a preamble or signature here will surface verbatim in the +party's chat. ## Cooperative Summary Preamble diff --git a/prompts/phase3-self-resolution.md b/prompts/phase3-self-resolution.md index e8c15db..8211e05 100644 --- a/prompts/phase3-self-resolution.md +++ b/prompts/phase3-self-resolution.md @@ -35,13 +35,13 @@ the detected code has no matching `[xx]` section in this file. fallback_language = "en" [en] -template = "Thanks for the update — it sounds like the two of you may be close to coordinating the next step between yourselves. I'll keep monitoring this conversation in case anything changes." +template = "From what each of you has shared, it sounds like the two of you may be close to coordinating the next step between yourselves. 
I'll keep monitoring this conversation in case anything changes." human_assistance_optin = "If you'd prefer human assistance instead, just let me know in this chat and I'll route you to the assigned solver." [es] -template = "Gracias por la actualización: parece que ustedes dos podrían estar cerca de coordinar el siguiente paso entre sí. Sigo atento a esta conversación por si algo cambia." +template = "Por lo que cada uno ha compartido, parece que ustedes dos podrían estar cerca de coordinar el siguiente paso entre sí. Sigo atento a esta conversación por si algo cambia." human_assistance_optin = "Si prefieres asistencia humana, dímelo en este chat y te conecto con la persona asignada al caso." [pt] -template = "Obrigado pela atualização — parece que vocês dois podem estar perto de coordenar o próximo passo entre si. Continuo acompanhando esta conversa caso algo mude." +template = "Pelo que cada um compartilhou, parece que vocês dois podem estar perto de coordenar o próximo passo entre si. Continuo acompanhando esta conversa caso algo mude." human_assistance_optin = "Se preferir assistência humana, me avise neste chat e eu encaminho você para a pessoa designada." diff --git a/prompts/phase3-system.md b/prompts/phase3-system.md index ff0e356..b22b040 100644 --- a/prompts/phase3-system.md +++ b/prompts/phase3-system.md @@ -13,8 +13,14 @@ limits, and honesty discipline. These rules apply to every reasoning call. from both parties and drafting a clear, neutral summary. - You do NOT have authority over the dispute outcome. The human solver makes the final decision. -- Always identify yourself as an assistance system. Never claim to be - a human, mediator, judge, arbitrator, or solver. +- Never claim to be a human, mediator, judge, arbitrator, or solver. + If a party directly asks who or what you are, answer truthfully that + you are Serbero, an automated mediation assistance system. 
The + one-time identity disclosure that opens the session is handled by + the runtime, not by you — do NOT prefix every clarification, every + summary, or every cooperative invitation with a "Hello, I'm + Serbero..." preamble. Repeating the introduction inside a chat + message that already carries Serbero's voice is a defect. ## Authority Limits diff --git a/src/db/mediation.rs b/src/db/mediation.rs index 336aabd..cc00b24 100644 --- a/src/db/mediation.rs +++ b/src/db/mediation.rs @@ -274,11 +274,20 @@ pub struct LiveSession { } /// List all mediation sessions that are NOT in a terminal or -/// handed-off state. Same exclusion set as -/// [`latest_open_session_for`]: `closed`, `summary_delivered`, -/// `escalation_recommended`, `superseded_by_human`. The engine uses -/// this to decide which sessions to poll for inbound replies on each -/// tick and to rebuild in-memory chat material at startup. +/// handed-off state. Base exclusion set: `closed`, +/// `summary_delivered`, `escalation_recommended`, +/// `superseded_by_human`. The engine uses this to decide which +/// sessions to poll for inbound replies on each tick and to rebuild +/// in-memory chat material at startup. +/// +/// **Diverges intentionally from [`latest_open_session_for`]** via the +/// Feature 005 carve-out below: the ingest tick must keep watching +/// post-invitation `summary_delivered` sessions so a later party reply +/// can still trigger the `PartyRequestedHuman` opt-in. The +/// dispute_resolved handler uses `latest_open_session_for` (no +/// carve-out) so summarized sessions take the legal +/// `summary_delivered → closed` direct transition instead of the +/// illegal SupersededByHuman walk. pub fn list_live_sessions(conn: &Connection) -> Result<Vec<LiveSession>> { use std::str::FromStr; @@ -393,12 +402,17 @@ pub fn set_session_state( /// `superseded_by_human`) are excluded — a dispute that was closed /// or escalated earlier must not block a later session open. 
/// -/// Feature 005 carve-out (mirrors [`list_live_sessions`]): a session -/// in `summary_delivered` that received the cooperative-self-resolution -/// invitation is still considered live so the human-assistance opt-in -/// path can fire on a later party reply. The carve-out is scoped by -/// the `self_resolution_offered` audit row, so legacy -/// `summary_delivered` sessions stay terminal. +/// **Diverges intentionally from [`list_live_sessions`].** The ingest +/// tick needs to keep watching post-invitation `summary_delivered` +/// sessions so a later party reply can trigger the +/// `PartyRequestedHuman` opt-in; this lookup, by contrast, gates +/// new-session-open eligibility and the dispute_resolved handler's +/// SupersededByHuman walk. The handler at +/// `src/handlers/dispute_resolved.rs` has a dedicated path that closes +/// `summary_delivered` sessions via the legal direct +/// `summary_delivered → closed` transition; surfacing them here would +/// route them through the illegal `summary_delivered → +/// superseded_by_human` step instead. 
/// /// Used by the engine to gate session opens and, crucially, re-checked /// inside the final open-session DB transaction to close the @@ -410,25 +424,15 @@ pub fn latest_open_session_for( use std::str::FromStr; match conn.query_row( - "SELECT s.session_id, s.state FROM mediation_sessions s - WHERE s.dispute_id = ?1 - AND ( - s.state NOT IN ( - 'closed', - 'summary_delivered', - 'escalation_recommended', - 'superseded_by_human' - ) - OR ( - s.state = 'summary_delivered' - AND EXISTS ( - SELECT 1 FROM mediation_events e - WHERE e.session_id = s.session_id - AND e.kind = 'self_resolution_offered' - ) - ) + "SELECT session_id, state FROM mediation_sessions + WHERE dispute_id = ?1 + AND state NOT IN ( + 'closed', + 'summary_delivered', + 'escalation_recommended', + 'superseded_by_human' ) - ORDER BY s.started_at DESC + ORDER BY started_at DESC LIMIT 1", params![dispute_id], |r| Ok((r.get::<_, String>(0)?, r.get::<_, String>(1)?)), @@ -769,6 +773,86 @@ mod tests { } } + /// Regression guard for the 2026-04-27 panic + /// (`set_session_state: illegal transition summary_delivered -> + /// superseded_by_human`). The two queries diverge on purpose: + /// + /// * `list_live_sessions` keeps a `summary_delivered` row visible + /// when it carries a prior `self_resolution_offered` audit row, + /// so the ingest tick can still observe a later party reply + /// for the `PartyRequestedHuman` opt-in. + /// * `latest_open_session_for` excludes `summary_delivered` + /// unconditionally, so the dispute_resolved handler closes + /// summarized sessions via the legal direct + /// `summary_delivered → closed` transition (the dedicated + /// query at `src/handlers/dispute_resolved.rs` lines 286-322) + /// instead of the illegal `SupersededByHuman` walk. + /// + /// A change that re-aligns the two filters in either direction + /// breaks one of those properties and must be caught here. 
+ #[test] + fn summary_delivered_row_with_invitation_diverges_between_filters() { + let conn = fresh(); + insert_session(&conn, &new_session("pol-hash-divergence")).unwrap(); + conn.execute( + "UPDATE mediation_sessions SET state = 'summary_delivered' + WHERE session_id = 'sess-1'", + [], + ) + .unwrap(); + + // Without a `self_resolution_offered` audit row, `summary_delivered` + // is terminal for BOTH queries (legacy behaviour). + let live = list_live_sessions(&conn).unwrap(); + assert!( + live.is_empty(), + "legacy summary_delivered without invitation must be excluded \ + from list_live_sessions; got {live:?}" + ); + assert!( + latest_open_session_for(&conn, "dispute-xyz") + .unwrap() + .is_none(), + "legacy summary_delivered without invitation must be excluded \ + from latest_open_session_for" + ); + + // Add the `self_resolution_offered` audit row. + conn.execute( + "INSERT INTO mediation_events (session_id, kind, payload_json, occurred_at) + VALUES ('sess-1', 'self_resolution_offered', '{}', 100)", + [], + ) + .unwrap(); + + // Carve-out applies to `list_live_sessions` only. + let live = list_live_sessions(&conn).unwrap(); + assert_eq!( + live.len(), + 1, + "post-invitation summary_delivered must be visible to \ + list_live_sessions so the ingest tick can observe a later \ + party reply for the PartyRequestedHuman opt-in; got {live:?}" + ); + assert_eq!(live[0].session_id, "sess-1"); + assert_eq!(live[0].state, MediationSessionState::SummaryDelivered); + + // ...and MUST NOT apply to `latest_open_session_for`. Surfacing + // this row here is what triggered the 2026-04-27 panic — the + // dispute_resolved handler would walk it through + // `SupersededByHuman → Closed`, but `summary_delivered → + // superseded_by_human` is not a legal state-machine edge. 
+ assert!( + latest_open_session_for(&conn, "dispute-xyz") + .unwrap() + .is_none(), + "post-invitation summary_delivered MUST stay invisible to \ + latest_open_session_for; the dedicated summary_delivered → \ + closed path in dispute_resolved.rs handles closure via the \ + legal direct transition" + ); + } + #[test] fn set_session_state_updates_state_and_transition_ts() { let conn = fresh(); diff --git a/src/mediation/mod.rs b/src/mediation/mod.rs index c5b44c4..d812a9a 100644 --- a/src/mediation/mod.rs +++ b/src/mediation/mod.rs @@ -1326,8 +1326,11 @@ pub async fn deliver_summary( // session at `summary_delivered` blocks re-eligibility // (since `summary_delivered` is treated as live by the // eligibility EXISTS clause) while still being recognised - // as terminal by `list_live_sessions` and - // `latest_open_session_for`. The legal `summary_delivered + // as terminal by `latest_open_session_for`. + // `list_live_sessions` keeps the row visible only when a + // prior `self_resolution_offered` audit row exists, so the + // ingest tick can still observe a later party reply for + // the human-assistance opt-in. The legal `summary_delivered // → closed` transition is taken later by the // `dispute_resolved` handler when Mostro closes the // dispute, or stays put indefinitely if the dispute never diff --git a/src/reasoning/openai.rs b/src/reasoning/openai.rs index 4ddaeea..b4d90d7 100644 --- a/src/reasoning/openai.rs +++ b/src/reasoning/openai.rs @@ -481,12 +481,27 @@ pub(super) fn build_classification_prompt(r: &ClassificationRequest) -> String { // the exact bytes the session's `policy_hash` pins. An auditor // can later grep the git-committed bundle for this hash and // recover the full prompt context. 
- let transcript = r - .transcript - .iter() - .map(|e| format!("[{}] {}: {}", e.inner_event_created_at, e.party, e.content)) - .collect::<Vec<_>>() - .join("\n"); + // + // When the transcript is empty (round 0, before any party reply) + // render a literal marker instead of a blank section so the model + // sees an unambiguous round-0 signal. A blank "## Transcript" + // section reads as "transcript missing", which on `gpt-5.4-mini` + // (and similar router models) pushed the classifier toward + // `suggested_action = summarize | escalate` instead of + // `ask_clarification` — observed 2026-04-28 with the + // Alice/Bob test trade staying at `initiated` because the + // opening classify+take pre-empted itself with + // `Escalate(LowConfidence)`. The Round-0 contract section in + // `prompts/phase3-classification.md` keys off this marker. + let transcript = if r.transcript.is_empty() { + "(no replies yet — round 0)".to_string() + } else { + r.transcript + .iter() + .map(|e| format!("[{}] {}: {}", e.inner_event_created_at, e.party, e.content)) + .collect::<Vec<_>>() + .join("\n") + }; // Feature 005: only ask for the `human_requested` field on rounds // following a `self_resolution_offered` audit row. Asking on every // round wastes tokens and risks false positives (a buggy provider