Skip to content

fix(mediation): duplicate greeting, biased invitation, summary→supersede panic#49

Merged
grunch merged 3 commits intomainfrom
fix/phase3-greeting-and-summary-close
Apr 28, 2026
Merged

fix(mediation): duplicate greeting, biased invitation, summary→supersede panic#49
grunch merged 3 commits intomainfrom
fix/phase3-greeting-and-summary-close

Conversation

@grunch
Copy link
Copy Markdown
Member

@grunch grunch commented Apr 28, 2026

Summary

Three production bugs surfaced in the 2026-04-27 Alice/Bob test trade after PR #48 (cooperative self-resolution) shipped. Fixes are isolated to prompt files plus one DB-helper revert; no schema change.

1. Duplicated greeting inside a single chat message

mediation::draft_and_send_initial_message hard-prefixes "Hello, I'm Serbero, an automated mediation assistant…" on the first outbound. The model was also emitting the same line at the start of buyer_clarification / seller_clarification because prompts/phase3-message-templates.md told it to and phase3-system.md's "always identify yourself" rule reinforced it. Result: greeting appeared twice in Alice's chat (screenshot in the production report).

  • phase3-message-templates.md: removed the greeting from "First Clarifying Question" — runtime owns the introduction; stripped the "Thank you for your response" preamble from "Follow-Up Clarification".
  • phase3-system.md: narrowed "always identify yourself" to "answer truthfully if asked" plus an explicit prohibition on prefixing every clarification with a self-introduction.
  • phase3-classification.md: new hard rule under "Clarifying Questions (per-party)" forbidding greetings, self-introductions, and sign-offs in either *_clarification field.

2. Cooperative invitation presumed the recipient gave the update

prompts/phase3-self-resolution.md opened each language section with "Thanks for the update" / "Gracias por la actualización" / "Obrigado pela atualização", correct for the party who just replied but nonsensical for a counterparty who only said "hola". Reworded to a neutral lead-in ("From what each of you has shared, …" / "Por lo que cada uno ha compartido, …" / "Pelo que cada um compartilhou, …") that applies symmetrically. Banned-keyword audit and opt-in invariants are unaffected; tests/phase3_self_resolution_template_audit.rs continues to pass.

3. Daemon panic: summary_delivered → superseded_by_human illegal

After Alice released funds, the daemon panicked with:

set_session_state: illegal transition summary_delivered -> superseded_by_human for session_id=89bf0e36-…

This is a regression from the earlier carve-out that added the self_resolution_offered EXISTS clause to both list_live_sessions and latest_open_session_for. The dispute_resolved handler at src/handlers/dispute_resolved.rs has two distinct paths:

  • Lines 156-271 walk live sessions through SupersededByHuman → Closed.
  • Lines 286-322 walk summary_delivered sessions through the legal direct summary_delivered → Closed transition.

Surfacing post-invitation summary_delivered rows on the SupersededByHuman path tripped the debug-asserted state-machine guard.

Fix: reverted only the latest_open_session_for carve-out; left the list_live_sessions carve-out in place because the ingest tick still needs to observe a later party reply on a post-invitation session for the PartyRequestedHuman opt-in. Doc comments on both functions now spell out the intentional divergence, and the inline comment in mediation::mod.rs was rewritten to match.

Test plan

  • cargo test — 295 lib + integration tests pass.
  • cargo clippy --all-targets -- -D warnings — clean.
  • cargo fmt --all -- --check — clean.
  • Re-run the Alice/Bob test trade scenario in staging:
    • First clarification message contains the runtime greeting exactly once.
    • Cooperative invitation reads neutrally for the party who did not give the update.
    • Releasing funds after invitation closes the session via the legal summary_delivered → closed path; no panic.

Summary by CodeRabbit

  • Bug Fixes

    • Improved message formatting in clarification questions to eliminate duplicate greetings and introductions.
    • Removed redundant preambles from follow-up clarification messages for cleaner conversations.
    • Updated self-resolution message wording to reflect shared information more clearly.
  • Documentation

    • Clarified system identity guidance to prevent unnecessary repetition in chat messages.

…trade

1. Duplicated "Hello, I'm Serbero..." greeting in clarification messages.

   The runtime's `draft_and_send_initial_message` already prefixes a
   one-line self-introduction at session open, but the model was also
   producing the same greeting inside its `*_clarification` text — both
   `phase3-message-templates.md` instructed it to and
   `phase3-system.md` told it to "always identify yourself", which it
   was reasonably interpreting as "greet on every message." Result:
   the greeting appeared twice in a single chat message.

   Fix at the prompt layer: drop the greeting from the "First
   Clarifying Question" template (the runtime owns it), strip the
   "Thank you for your response" preamble from the follow-up template,
   narrow the system-prompt identity rule to "answer truthfully if
   asked" instead of "always identify yourself", and add an explicit
   no-greeting / no-self-introduction hard rule to the classifier
   prompt's clarifying-question section.

2. Cooperative self-resolution template wording presumed the recipient
   gave the update.

   The `[en]/[es]/[pt]` sections opened with "Thanks for the update" /
   "Gracias por la actualización" / "Obrigado pela atualização", which
   is correct for the party who replied but jarring for the
   counterparty who only said "hola". Reworded to a neutral lead-in
   ("From what each of you has shared, ...") that applies symmetrically
   to both parties. Banned-keyword audit and opt-in invariants are
   unaffected.

3. PANIC: `set_session_state: illegal transition summary_delivered ->
   superseded_by_human`.

   Regression introduced by the carve-out commit that added the EXISTS
   clause for post-invitation `summary_delivered` sessions to
   `latest_open_session_for`. The dispute_resolved handler at
   `src/handlers/dispute_resolved.rs` has two distinct paths: lines
   156-271 walk live sessions through `SupersededByHuman → Closed`,
   and lines 286-322 walk `summary_delivered` sessions through the
   legal direct `summary_delivered → Closed` transition. The
   carve-out broke that contract by surfacing post-invitation
   summary_delivered rows on the SupersededByHuman path, where the
   debug-asserted state-machine guard panics.

   Reverted only the `latest_open_session_for` carve-out; left the
   `list_live_sessions` carve-out in place because the ingest tick
   still needs to observe a later party reply on a post-invitation
   session for the `PartyRequestedHuman` opt-in. Updated doc
   comments on both functions to make the intentional divergence
   explicit, and rewrote the inline comment in `mediation::mod.rs`
   that described the old (uniform-exclusion) behaviour.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 28, 2026

Warning

Rate limit exceeded

@grunch has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 25 minutes and 40 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cd23497a-749c-44a1-822a-1a8468a542ce

📥 Commits

Reviewing files that changed from the base of the PR and between c261072 and e172ef7.

📒 Files selected for processing (3)
  • prompts/phase3-classification.md
  • src/db/mediation.rs
  • src/reasoning/openai.rs

Walkthrough

This PR updates prompt templates and mediation logic documentation to refine clarification-question formatting policies (preventing duplicate greetings and sign-offs), clarify session eligibility semantics for summary-delivered states, and standardize self-resolution template phrasing across locales.

Changes

Cohort / File(s) Summary
Prompt Template Clarification Rules
prompts/phase3-classification.md, prompts/phase3-message-templates.md
Tightens formatting policies for clarification questions: removes greetings/sign-offs from question text (identity provided once by runtime at session open) and ensures follow-up clarification messages contain only the substituted question without preambles or sign-offs.
Self-Resolution Template Localization
prompts/phase3-self-resolution.md
Updates template copy in English, Spanish, and Portuguese to reframe opening from "thanks for the update" to "from what each of you has shared" while preserving neutral meaning and message structure.
Session Eligibility Logic Documentation
src/db/mediation.rs, src/mediation/mod.rs
Clarifies that list_live_sessions and latest_open_session_for apply different eligibility logic around summary_delivered states; updates SQL in latest_open_session_for to unconditionally exclude summary_delivered (removing prior self_resolution_offered gating); documents that list_live_sessions retains Feature 005 behavior for later-party replies.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • mostronatorcoder

Poem

🐰 Questions were greeting themselves twice,
But now they're concise, oh so nice!
No "hello," no "thanks"—just the query,
Sessions bloom clear (no more worry),
Whiskers twitch ✨—clarity's spice!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly addresses the three main bugs fixed: duplicate greeting in clarifications, biased wording in self-resolution, and a daemon panic on state transition. Each component corresponds to actual changes in the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/phase3-greeting-and-summary-close

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/db/mediation.rs (1)

427-435: Add a regression test that locks the intentional helper divergence.

Given this bug just regressed, it’s worth pinning a test that verifies:

  1. list_live_sessions includes summary_delivered only with self_resolution_offered, and
  2. latest_open_session_for still returns None for that same row.

That will prevent accidental reintroduction of the panic path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/db/mediation.rs` around lines 427 - 435, Add a regression test around
mediation session filtering to lock the behavior that caused the regression:
create a test that inserts a session row with state = "summary_delivered" and
self_resolution_offered = true and assert that list_live_sessions returns that
session, then assert that latest_open_session_for returns None for that same
row; conversely insert a "summary_delivered" row with self_resolution_offered =
false and assert it is excluded from list_live_sessions. Target your test to
call the existing functions list_live_sessions and latest_open_session_for (and
use the same DB setup helpers used by other mediation tests) so it fails if the
SQL filtering in list_live_sessions or latest_open_session_for regresses.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@prompts/phase3-classification.md`:
- Around line 86-94: The current guidance block "Do NOT begin the question with
a greeting..." contains a stale reference to an “always identify yourself”
system rule; remove that sentence and rephrase the block to reflect the runtime
identity disclosure policy (identity is provided once by the runtime in the
first message), e.g., state that subsequent clarification rounds must open
directly with the question because identity is already provided by the runtime's
initial message, and ensure the prohibition on repeated greetings/sign-offs
remains clear (update the block that starts "Do NOT begin the question with a
greeting..." and remove the phrase "always identify yourself").

---

Nitpick comments:
In `@src/db/mediation.rs`:
- Around line 427-435: Add a regression test around mediation session filtering
to lock the behavior that caused the regression: create a test that inserts a
session row with state = "summary_delivered" and self_resolution_offered = true
and assert that list_live_sessions returns that session, then assert that
latest_open_session_for returns None for that same row; conversely insert a
"summary_delivered" row with self_resolution_offered = false and assert it is
excluded from list_live_sessions. Target your test to call the existing
functions list_live_sessions and latest_open_session_for (and use the same DB
setup helpers used by other mediation tests) so it fails if the SQL filtering in
list_live_sessions or latest_open_session_for regresses.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 724ae1aa-5c9f-41dc-9d42-00655995e02e

📥 Commits

Reviewing files that changed from the base of the PR and between 10a4fd2 and c261072.

📒 Files selected for processing (6)
  • prompts/phase3-classification.md
  • prompts/phase3-message-templates.md
  • prompts/phase3-self-resolution.md
  • prompts/phase3-system.md
  • src/db/mediation.rs
  • src/mediation/mod.rs

Comment thread prompts/phase3-classification.md Outdated
grunch added 2 commits April 28, 2026 12:41
… divergence

Two follow-ups against the previous commit, raised in the PR review:

1. `prompts/phase3-classification.md`: the new no-greeting hard rule
   pointed at the system prompt's "always identify yourself" rule, but
   that phrase was rewritten in the same commit (to "answer truthfully
   if asked") so the cross-reference dangled. Reworded the block to
   point at the runtime owner of the introduction
   (`mediation::draft_and_send_initial_message`) directly.

2. `src/db/mediation.rs`: added a regression test that locks the
   intentional divergence between `list_live_sessions` (post-invitation
   `summary_delivered` rows STAY visible so the ingest tick can observe
   a later party reply for `PartyRequestedHuman`) and
   `latest_open_session_for` (post-invitation `summary_delivered` rows
   STAY invisible so the dispute_resolved handler closes them via the
   legal direct `summary_delivered → closed` transition instead of the
   illegal `SupersededByHuman` walk). A change that re-aligns the two
   filters in either direction will now fail this test.
…ipts

Observed 2026-04-28: a freshly-opened Alice/Bob dispute (event
6380cdd2…, dispute 4c64e6ab-…) stayed at `initiated` because the
opening classify+take call returned `Escalate(LowConfidence)`
before any chat-transport step. Mostro never saw Serbero take the
dispute, so the dispute lifecycle did not advance. Log signature:

    classify_for_start: rationale persisted dispute-scoped
      classification=unclear confidence=0.12
      decision=Escalate(LowConfidence)
    opening-call escalate: dispute-scoped handoff recorded, no take
      trigger=low_confidence

The round-0 bypass in `policy::classify_to_decision` accepts
low-confidence classifications iff `suggested_action =
ask_clarification`. With confidence 0.12 and decision LowConfidence
the bypass did not engage, so the model returned
`suggested_action ∈ { summarize, escalate }` instead of
`ask_clarification`. Two contributing causes, both fixed here:

1. **Blank transcript section.** With `transcript = ""` the user
   prompt rendered "## Transcript\n\n## Output contract", which the
   `gpt-5.4-mini` router model read as "transcript missing" rather
   than "transcript empty because no replies yet". The model fell
   back to a defensive `summarize | escalate`. Fix: emit a literal
   `(no replies yet — round 0)` marker when `transcript.is_empty()`,
   so the round-0 signal is unambiguous in the prompt body.

2. **Hard-rule escape hatch interpreted too broadly.** The
   classifier prompt's "If you cannot produce a useful question
   for one side, pick a different `suggested_action` (`summarize`
   or `escalate`)" rule had no round-0 carve-out. With an empty
   transcript a literal reading is "I have no information about
   this dispute, so I cannot produce a useful question for either
   side, so I should escalate." Fix: added an explicit Round-0
   contract section to `prompts/phase3-classification.md` stating
   that round 0 MUST return `ask_clarification` with both opening
   questions, and narrowed the escape-hatch wording to
   `round_count >= 1` — round 0's job is to start the chat, not
   classify or escalate.

Pre-fix the "First Clarifying Question" template carried the
"Hello, I'm Serbero, …" greeting, which apparently anchored the
model toward `ask_clarification` on round 0 even without an explicit
contract section. Removing that greeting (necessary to fix the
duplicated-greeting bug) exposed this latent fragility.

Lib + integration tests pass; no schema change.
@grunch grunch merged commit 69f7e9f into main Apr 28, 2026
2 checks passed
@grunch grunch deleted the fix/phase3-greeting-and-summary-close branch April 28, 2026 16:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant