43 changes: 40 additions & 3 deletions prompts/phase3-classification.md
@@ -57,6 +57,32 @@ Every classification MUST include a rationale explaining the chosen
label and confidence. Stored in the audit store only; referenced by
id in general logs (FR-120).

+## Round-0 contract (opening call, empty transcript)
+
+When `round_count = 0` AND the `## Transcript` section is empty
+(or contains the literal marker `(no replies yet — round 0)`):
+
+- `suggested_action` MUST be `ask_clarification`. Round 0 is the
+  call that opens the chat with both parties — picking `summarize`
+  or `escalate` here aborts the session before Serbero ever talks
+  to anyone, and Mostro never sees Serbero take the dispute. There
+  is intentionally nothing to summarize or escalate yet; your job
+  is to start the conversation.
+- `classification` MUST be `unclear` and `confidence` MUST be low
+  (≤ 0.3). The policy layer applies a round-0 bypass that accepts
+  low-confidence `ask_clarification` so the chat can open. Do not
+  fabricate a higher confidence — the round-0 bypass is the
+  designed path, not a workaround.
+- Both `buyer_clarification` and `seller_clarification` MUST be
+  populated using the "First Clarifying Question" template from
+  the message-templates bundle, with the `[SPECIFIC_QUESTION]`
+  token replaced by the role-appropriate generic opener (buyer:
+  fiat sent? proof of transfer; seller: fiat received? if not,
+  what proof the buyer shared). The "If you cannot produce a
+  useful question" escape hatch in the Hard Rules below does NOT
+  apply on round 0 — on round 0 a generic opener IS the useful
+  question, because no transcript exists yet.
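
Reviewer sketch (not part of the diff): the three Round-0 requirements above could be checked mechanically on the model's structured output. The struct and function names below are hypothetical; only the field names and literal values come from the prompt text, and the real policy layer may be organised quite differently.

```rust
/// Hypothetical mirror of the classifier's structured output.
/// Field names follow the prompt; the struct itself is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub struct ClassifierOutput {
    pub suggested_action: String,
    pub classification: String,
    pub confidence: f64,
    pub buyer_clarification: Option<String>,
    pub seller_clarification: Option<String>,
}

/// Returns every Round-0 contract violation found; an empty vector
/// means the output satisfies the contract as written in the prompt.
pub fn round0_violations(out: &ClassifierOutput) -> Vec<&'static str> {
    let mut violations = Vec::new();
    if out.suggested_action != "ask_clarification" {
        violations.push("suggested_action must be ask_clarification");
    }
    if out.classification != "unclear" {
        violations.push("classification must be unclear");
    }
    // A NaN confidence also fails this range check.
    if !(0.0..=0.3).contains(&out.confidence) {
        violations.push("confidence must be <= 0.3");
    }
    let sides = [
        (&out.buyer_clarification, "buyer_clarification must be populated"),
        (&out.seller_clarification, "seller_clarification must be populated"),
    ];
    for (text, message) in sides {
        // Populated means: present, non-blank, and with the template
        // token actually substituted.
        let populated = text
            .as_deref()
            .map_or(false, |s| !s.trim().is_empty() && !s.contains("[SPECIFIC_QUESTION]"));
        if !populated {
            violations.push(message);
        }
    }
    violations
}
```

Returning a list instead of a single boolean keeps each violated clause visible, which may be useful if the result ends up in the audit store alongside the rationale.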

## Clarifying Questions (per-party)

When `suggested_action = ask_clarification`, emit TWO distinct
@@ -83,10 +109,21 @@ Hard rules:
"Seller:". The transport layer handles recipient routing; these
prefixes only leak confusion into the other party's chat if they
ever land on the wrong side.
+- Do NOT begin the question with a greeting or self-introduction
+  (e.g. "Hello, I'm Serbero, an automated mediation assistance
+  system..."), and do NOT append a sign-off or signature. The runtime
+  discloses Serbero's identity exactly once, by hard-prefixing a
+  one-line introduction on the very first outbound of the session
+  (`mediation::draft_and_send_initial_message`); every subsequent
+  clarification round must open directly with the question itself.
+  Repeating the greeting on top of the runtime prefix duplicates the
+  introduction inside a single chat message and is a defect.
- Both strings MUST be non-empty. If you cannot produce a useful
-  question for one side, pick a different `suggested_action`
-  (`summarize` or `escalate`) instead of emitting a half-populated
-  clarification.
+  question for one side at `round_count >= 1` (because the existing
+  transcript already answered everything you would ask), pick a
+  different `suggested_action` (`summarize` or `escalate`) instead
+  of emitting a half-populated clarification. This escape hatch
+  does NOT apply on round 0 — see the Round-0 contract above.
- Each question stands on its own — don't cross-reference the other
party's text, since each party only ever sees theirs.
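
Reviewer sketch (not part of the diff): the hard rules in this hunk are written for the model, but a cheap runtime lint could catch the most visible violations before a message is sent. The function below is hypothetical. It assumes the truncated rule above forbids both a `Buyer:` and a `Seller:` prefix, and it detects greetings with a simple substring heuristic rather than anything the repository is known to do.

```rust
/// Lints a single clarification string against the hard rules quoted
/// in this hunk. Hypothetical helper; the heuristics are illustrative.
pub fn hard_rule_violations(question: &str) -> Vec<&'static str> {
    let trimmed = question.trim_start();
    let lower = trimmed.to_lowercase();
    let mut violations = Vec::new();

    // "Both strings MUST be non-empty."
    if trimmed.is_empty() {
        violations.push("empty");
    }
    // Role prefixes are the transport layer's job, not the model's.
    if lower.starts_with("buyer:") || lower.starts_with("seller:") {
        violations.push("role prefix");
    }
    // The runtime adds the one-time introduction; the body must not.
    if lower.starts_with("hello") || lower.contains("i'm serbero") {
        violations.push("greeting or self-introduction");
    }
    violations
}
```

A lint like this cannot judge whether a question is useful or stands on its own; it only guards the failure modes that would be plainly visible in a party's chat.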

25 changes: 17 additions & 8 deletions prompts/phase3-message-templates.md
@@ -11,21 +11,30 @@ layer.

## First Clarifying Question

-"Hello, I'm Serbero, an automated mediation assistant helping the
-assigned solver review this dispute. I'd like to understand your
-perspective. Could you please describe what happened from your point
-of view? Specifically: [SPECIFIC_QUESTION]"
+"I'd like to understand your perspective. Could you please describe
+what happened from your point of view? Specifically: [SPECIFIC_QUESTION]"

— Replace `[SPECIFIC_QUESTION]` with one concrete, dispute-specific
question. Do not return the literal token `[SPECIFIC_QUESTION]`.

+The one-time "Hello, I'm Serbero, an automated mediation assistant
+helping the assigned solver review this dispute. " self-introduction
+is added by the runtime (`mediation::draft_and_send_initial_message`)
+exactly once at session open, so the model MUST NOT repeat it inside
+the clarification body. Repeating it produces a duplicated greeting in
+a single chat message and is a defect.

## Follow-Up Clarification

-"Thank you for your response. To help the solver make a well-informed
-decision, I have a follow-up question: [SPECIFIC_QUESTION]"
+"[SPECIFIC_QUESTION]"

-— Same rule: substitute a concrete follow-up question; never emit the
-bracketed token.
+— Substitute a concrete follow-up question and emit nothing else: no
+greeting, no self-introduction, no "Thank you for your response"
+preamble, no sign-off. Round 2+ messages travel through
+`mediation::draft_and_send_followup_message`, which does NOT prefix a
+greeting; the entire user-visible body is whatever the model returned.
+Adding a preamble or signature here will surface verbatim in the
+party's chat.
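
Reviewer sketch (not part of the diff): the division of labour described in these templates is that the runtime prefixes the introduction once and the model body never carries it. The function names below are hypothetical stand-ins for `mediation::draft_and_send_initial_message` and `mediation::draft_and_send_followup_message`, whose real signatures are not shown in this diff. The greeting text is quoted from the template file.

```rust
/// Introduction text as quoted in the template file. Where the runtime
/// actually stores it is an assumption.
pub const INTRO: &str = "Hello, I'm Serbero, an automated mediation assistant \
helping the assigned solver review this dispute. ";

/// Hypothetical composition step: only the first outbound message of a
/// session gets the hard-prefixed introduction; later rounds are sent
/// verbatim as the model returned them.
pub fn compose_outbound(is_first_outbound: bool, model_body: &str) -> String {
    if is_first_outbound {
        format!("{}{}", INTRO, model_body)
    } else {
        model_body.to_string()
    }
}

/// Counts greeting openers in a composed message. More than one means
/// the model repeated the introduction on top of the runtime prefix,
/// which the templates classify as a defect.
pub fn greeting_count(message: &str) -> usize {
    message.matches("Hello, I'm Serbero").count()
}
```

Under this split, `greeting_count` on a first-round message is 1 when the model followed the template and 2 when it repeated the greeting.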

## Cooperative Summary Preamble

6 changes: 3 additions & 3 deletions prompts/phase3-self-resolution.md
@@ -35,13 +35,13 @@ the detected code has no matching `[xx]` section in this file.
fallback_language = "en"

[en]
-template = "Thanks for the update — it sounds like the two of you may be close to coordinating the next step between yourselves. I'll keep monitoring this conversation in case anything changes."
+template = "From what each of you has shared, it sounds like the two of you may be close to coordinating the next step between yourselves. I'll keep monitoring this conversation in case anything changes."
human_assistance_optin = "If you'd prefer human assistance instead, just let me know in this chat and I'll route you to the assigned solver."

[es]
-template = "Gracias por la actualización: parece que ustedes dos podrían estar cerca de coordinar el siguiente paso entre sí. Sigo atento a esta conversación por si algo cambia."
+template = "Por lo que cada uno ha compartido, parece que ustedes dos podrían estar cerca de coordinar el siguiente paso entre sí. Sigo atento a esta conversación por si algo cambia."
human_assistance_optin = "Si prefieres asistencia humana, dímelo en este chat y te conecto con la persona asignada al caso."

[pt]
-template = "Obrigado pela atualização — parece que vocês dois podem estar perto de coordenar o próximo passo entre si. Continuo acompanhando esta conversa caso algo mude."
+template = "Pelo que cada um compartilhou, parece que vocês dois podem estar perto de coordenar o próximo passo entre si. Continuo acompanhando esta conversa caso algo mude."
human_assistance_optin = "Se preferir assistência humana, me avise neste chat e eu encaminho você para a pessoa designada."
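
Reviewer sketch (not part of the diff): the file's header suggests `fallback_language` is used when the detected language code has no matching `[xx]` section. Assuming the sections are loaded into a map keyed by language code, the lookup could be as small as the following. The type and function names are hypothetical, and the loading step is omitted.

```rust
use std::collections::HashMap;

/// One `[xx]` section of the self-resolution template file.
/// Field names match the TOML keys; the struct is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub struct LocalizedTemplate {
    pub template: String,
    pub human_assistance_optin: String,
}

/// Picks the section for the detected language, falling back to
/// `fallback_language` when no such section exists. Returns `None`
/// only if the fallback section is missing as well.
pub fn pick_template<'a>(
    sections: &'a HashMap<String, LocalizedTemplate>,
    detected: &str,
    fallback_language: &str,
) -> Option<&'a LocalizedTemplate> {
    sections
        .get(detected)
        .or_else(|| sections.get(fallback_language))
}
```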
10 changes: 8 additions & 2 deletions prompts/phase3-system.md
@@ -13,8 +13,14 @@ limits, and honesty discipline. These rules apply to every reasoning call.
from both parties and drafting a clear, neutral summary.
- You do NOT have authority over the dispute outcome. The human solver
makes the final decision.
-- Always identify yourself as an assistance system. Never claim to be
-  a human, mediator, judge, arbitrator, or solver.
+- Never claim to be a human, mediator, judge, arbitrator, or solver.
+  If a party directly asks who or what you are, answer truthfully that
+  you are Serbero, an automated mediation assistance system. The
+  one-time identity disclosure that opens the session is handled by
+  the runtime, not by you — do NOT prefix every clarification, every
+  summary, or every cooperative invitation with a "Hello, I'm
+  Serbero..." preamble. Repeating the introduction inside a chat
+  message that already carries Serbero's voice is a defect.

## Authority Limits

142 changes: 113 additions & 29 deletions src/db/mediation.rs
@@ -274,11 +274,20 @@ pub struct LiveSession {
}

/// List all mediation sessions that are NOT in a terminal or
-/// handed-off state. Same exclusion set as
-/// [`latest_open_session_for`]: `closed`, `summary_delivered`,
-/// `escalation_recommended`, `superseded_by_human`. The engine uses
-/// this to decide which sessions to poll for inbound replies on each
-/// tick and to rebuild in-memory chat material at startup.
+/// handed-off state. Base exclusion set: `closed`,
+/// `summary_delivered`, `escalation_recommended`,
+/// `superseded_by_human`. The engine uses this to decide which
+/// sessions to poll for inbound replies on each tick and to rebuild
+/// in-memory chat material at startup.
+///
+/// **Diverges intentionally from [`latest_open_session_for`]** via the
+/// Feature 005 carve-out below: the ingest tick must keep watching
+/// post-invitation `summary_delivered` sessions so a later party reply
+/// can still trigger the `PartyRequestedHuman` opt-in. The
+/// dispute_resolved handler uses `latest_open_session_for` (no
+/// carve-out) so summarized sessions take the legal
+/// `summary_delivered → closed` direct transition instead of the
+/// illegal SupersededByHuman walk.
pub fn list_live_sessions(conn: &Connection) -> Result<Vec<LiveSession>> {
use std::str::FromStr;

@@ -393,12 +402,17 @@ pub fn set_session_state(
/// `superseded_by_human`) are excluded — a dispute that was closed
/// or escalated earlier must not block a later session open.
///
-/// Feature 005 carve-out (mirrors [`list_live_sessions`]): a session
-/// in `summary_delivered` that received the cooperative-self-resolution
-/// invitation is still considered live so the human-assistance opt-in
-/// path can fire on a later party reply. The carve-out is scoped by
-/// the `self_resolution_offered` audit row, so legacy
-/// `summary_delivered` sessions stay terminal.
+/// **Diverges intentionally from [`list_live_sessions`].** The ingest
+/// tick needs to keep watching post-invitation `summary_delivered`
+/// sessions so a later party reply can trigger the
+/// `PartyRequestedHuman` opt-in; this lookup, by contrast, gates
+/// new-session-open eligibility and the dispute_resolved handler's
+/// SupersededByHuman walk. The handler at
+/// `src/handlers/dispute_resolved.rs` has a dedicated path that closes
+/// `summary_delivered` sessions via the legal direct
+/// `summary_delivered → closed` transition; surfacing them here would
+/// route them through the illegal `summary_delivered →
+/// superseded_by_human` step instead.
///
/// Used by the engine to gate session opens and, crucially, re-checked
/// inside the final open-session DB transaction to close the
@@ -410,25 +424,15 @@ pub fn latest_open_session_for(
use std::str::FromStr;

match conn.query_row(
-        "SELECT s.session_id, s.state FROM mediation_sessions s
-         WHERE s.dispute_id = ?1
-           AND (
-             s.state NOT IN (
-               'closed',
-               'summary_delivered',
-               'escalation_recommended',
-               'superseded_by_human'
-             )
-             OR (
-               s.state = 'summary_delivered'
-               AND EXISTS (
-                 SELECT 1 FROM mediation_events e
-                 WHERE e.session_id = s.session_id
-                   AND e.kind = 'self_resolution_offered'
-               )
-             )
+        "SELECT session_id, state FROM mediation_sessions
+         WHERE dispute_id = ?1
+           AND state NOT IN (
+             'closed',
+             'summary_delivered',
+             'escalation_recommended',
+             'superseded_by_human'
)
-         ORDER BY s.started_at DESC
+         ORDER BY started_at DESC
LIMIT 1",
params![dispute_id],
|r| Ok((r.get::<_, String>(0)?, r.get::<_, String>(1)?)),
@@ -769,6 +773,86 @@ mod tests {
}
}

+    /// Regression guard for the 2026-04-27 panic
+    /// (`set_session_state: illegal transition summary_delivered ->
+    /// superseded_by_human`). The two queries diverge on purpose:
+    ///
+    /// * `list_live_sessions` keeps a `summary_delivered` row visible
+    ///   when it carries a prior `self_resolution_offered` audit row,
+    ///   so the ingest tick can still observe a later party reply
+    ///   for the `PartyRequestedHuman` opt-in.
+    /// * `latest_open_session_for` excludes `summary_delivered`
+    ///   unconditionally, so the dispute_resolved handler closes
+    ///   summarized sessions via the legal direct
+    ///   `summary_delivered → closed` transition (the dedicated
+    ///   query at `src/handlers/dispute_resolved.rs` lines 286-322)
+    ///   instead of the illegal `SupersededByHuman` walk.
+    ///
+    /// A change that re-aligns the two filters in either direction
+    /// breaks one of those properties and must be caught here.
+    #[test]
+    fn summary_delivered_row_with_invitation_diverges_between_filters() {
+        let conn = fresh();
+        insert_session(&conn, &new_session("pol-hash-divergence")).unwrap();
+        conn.execute(
+            "UPDATE mediation_sessions SET state = 'summary_delivered'
+             WHERE session_id = 'sess-1'",
+            [],
+        )
+        .unwrap();
+
+        // Without a `self_resolution_offered` audit row, `summary_delivered`
+        // is terminal for BOTH queries (legacy behaviour).
+        let live = list_live_sessions(&conn).unwrap();
+        assert!(
+            live.is_empty(),
+            "legacy summary_delivered without invitation must be excluded \
+             from list_live_sessions; got {live:?}"
+        );
+        assert!(
+            latest_open_session_for(&conn, "dispute-xyz")
+                .unwrap()
+                .is_none(),
+            "legacy summary_delivered without invitation must be excluded \
+             from latest_open_session_for"
+        );
+
+        // Add the `self_resolution_offered` audit row.
+        conn.execute(
+            "INSERT INTO mediation_events (session_id, kind, payload_json, occurred_at)
+             VALUES ('sess-1', 'self_resolution_offered', '{}', 100)",
+            [],
+        )
+        .unwrap();
+
+        // Carve-out applies to `list_live_sessions` only.
+        let live = list_live_sessions(&conn).unwrap();
+        assert_eq!(
+            live.len(),
+            1,
+            "post-invitation summary_delivered must be visible to \
+             list_live_sessions so the ingest tick can observe a later \
+             party reply for the PartyRequestedHuman opt-in; got {live:?}"
+        );
+        assert_eq!(live[0].session_id, "sess-1");
+        assert_eq!(live[0].state, MediationSessionState::SummaryDelivered);
+
+        // ...and MUST NOT apply to `latest_open_session_for`. Surfacing
+        // this row here is what triggered the 2026-04-27 panic — the
+        // dispute_resolved handler would walk it through
+        // `SupersededByHuman → Closed`, but `summary_delivered →
+        // superseded_by_human` is not a legal state-machine edge.
+        assert!(
+            latest_open_session_for(&conn, "dispute-xyz")
+                .unwrap()
+                .is_none(),
+            "post-invitation summary_delivered MUST stay invisible to \
+             latest_open_session_for; the dedicated summary_delivered → \
+             closed path in dispute_resolved.rs handles closure via the \
+             legal direct transition"
+        );
+    }

#[test]
fn set_session_state_updates_state_and_transition_ts() {
let conn = fresh();
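
Reviewer sketch (not part of the diff): the intended divergence between the two SQL filters can be restated as two in-memory predicates, which makes it easier to see which rows each caller observes. Everything below is a hypothetical model, not code from `src/db/mediation.rs`. The `AwaitingReply` variant stands in for any non-terminal state, and the boolean field stands in for the `EXISTS` subquery on `mediation_events`.

```rust
/// Simplified session states. `AwaitingReply` is a hypothetical
/// placeholder for "any non-terminal state"; the other four are the
/// base exclusion set named in the doc comments.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    AwaitingReply,
    SummaryDelivered,
    EscalationRecommended,
    SupersededByHuman,
    Closed,
}

/// Minimal row model: the state plus whether a
/// `self_resolution_offered` audit row exists for the session.
#[derive(Debug, Clone, Copy)]
pub struct SessionRow {
    pub state: SessionState,
    pub has_self_resolution_offered: bool,
}

fn in_base_exclusion_set(state: SessionState) -> bool {
    matches!(
        state,
        SessionState::Closed
            | SessionState::SummaryDelivered
            | SessionState::EscalationRecommended
            | SessionState::SupersededByHuman
    )
}

/// Models `list_live_sessions`: base exclusion set plus the
/// Feature 005 carve-out for post-invitation `summary_delivered` rows.
pub fn visible_to_list_live_sessions(row: SessionRow) -> bool {
    !in_base_exclusion_set(row.state)
        || (row.state == SessionState::SummaryDelivered && row.has_self_resolution_offered)
}

/// Models `latest_open_session_for` after this change: base exclusion
/// set only, with no carve-out.
pub fn visible_to_latest_open_session_for(row: SessionRow) -> bool {
    !in_base_exclusion_set(row.state)
}
```

The only row on which the two predicates disagree is a `summary_delivered` session with the invitation audit row, which is the case the regression test above pins down.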
7 changes: 5 additions & 2 deletions src/mediation/mod.rs
@@ -1326,8 +1326,11 @@ pub async fn deliver_summary(
// session at `summary_delivered` blocks re-eligibility
// (since `summary_delivered` is treated as live by the
// eligibility EXISTS clause) while still being recognised
-// as terminal by `list_live_sessions` and
-// `latest_open_session_for`. The legal `summary_delivered
+// as terminal by `latest_open_session_for`.
+// `list_live_sessions` keeps the row visible only when a
+// prior `self_resolution_offered` audit row exists, so the
+// ingest tick can still observe a later party reply for
+// the human-assistance opt-in. The legal `summary_delivered
// → closed` transition is taken later by the
// `dispute_resolved` handler when Mostro closes the
// dispute, or stays put indefinitely if the dispute never
27 changes: 21 additions & 6 deletions src/reasoning/openai.rs
@@ -481,12 +481,27 @@ pub(super) fn build_classification_prompt(r: &ClassificationRequest) -> String {
// the exact bytes the session's `policy_hash` pins. An auditor
// can later grep the git-committed bundle for this hash and
// recover the full prompt context.
-    let transcript = r
-        .transcript
-        .iter()
-        .map(|e| format!("[{}] {}: {}", e.inner_event_created_at, e.party, e.content))
-        .collect::<Vec<_>>()
-        .join("\n");
+    //
+    // When the transcript is empty (round 0, before any party reply)
+    // render a literal marker instead of a blank section so the model
+    // sees an unambiguous round-0 signal. A blank "## Transcript"
+    // section reads as "transcript missing", which on `gpt-5.4-mini`
+    // (and similar router models) pushed the classifier toward
+    // `suggested_action = summarize | escalate` instead of
+    // `ask_clarification` — observed 2026-04-28 with the
+    // Alice/Bob test trade staying at `initiated` because the
+    // opening classify+take pre-empted itself with
+    // `Escalate(LowConfidence)`. The Round-0 contract section in
+    // `prompts/phase3-classification.md` keys off this marker.
+    let transcript = if r.transcript.is_empty() {
+        "(no replies yet — round 0)".to_string()
+    } else {
+        r.transcript
+            .iter()
+            .map(|e| format!("[{}] {}: {}", e.inner_event_created_at, e.party, e.content))
+            .collect::<Vec<_>>()
+            .join("\n")
+    };
// Feature 005: only ask for the `human_requested` field on rounds
// following a `self_resolution_offered` audit row. Asking on every
// round wastes tokens and risks false positives (a buggy provider
Expand Down
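
Reviewer sketch (not part of the diff): the empty-transcript branch in `build_classification_prompt` can be exercised in isolation. The struct below is a hypothetical stand-in for the real transcript entry type; its field types are assumptions, since the diff only shows that the three fields are formatted with `{}`. The marker string and line format are copied from the change. `is_round0_prompt` is likewise hypothetical and restates the condition from the Round-0 contract.

```rust
/// Stand-in for the real transcript entry. Field names come from the
/// diff; the concrete types are assumptions.
pub struct TranscriptEntry {
    pub inner_event_created_at: i64,
    pub party: String,
    pub content: String,
}

/// Literal marker the Round-0 contract in the classification prompt
/// keys off.
pub const ROUND0_MARKER: &str = "(no replies yet — round 0)";

/// Renders the transcript section: the marker when there are no
/// entries, otherwise one `[timestamp] party: content` line per entry.
pub fn render_transcript(entries: &[TranscriptEntry]) -> String {
    if entries.is_empty() {
        return ROUND0_MARKER.to_string();
    }
    entries
        .iter()
        .map(|e| format!("[{}] {}: {}", e.inner_event_created_at, e.party, e.content))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Restates the contract's trigger: round 0 with an empty transcript
/// section or one that holds only the marker.
pub fn is_round0_prompt(round_count: u32, transcript_section: &str) -> bool {
    let section = transcript_section.trim();
    round_count == 0 && (section.is_empty() || section == ROUND0_MARKER)
}
```

Keeping the marker in a single constant shared by the renderer and any check would avoid the two drifting apart, though the prompt file still has to quote the same literal by hand.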