feat(mediation): cooperative self-resolution invitation branch#48
feat(mediation): cooperative self-resolution invitation branch#48
Conversation
Feature 005 — Cooperative Self-Resolution Nudge. When Serbero classifies a mediation case as coordination_failure_resolvable with high confidence, send a templated, language-matched invitation to each party in the existing chat transport, inviting them to coordinate the resolution among themselves with an explicit human-assistance opt-in. The invitation does not name, instruct, or imply any fund-moving action; the Phase 3 policy boundary remains intact. Artifacts (specs/005-cooperative-self-resolution/): - spec.md — 14 FRs across 3 user stories (US1 cooperative happy path P1, US2 human-assistance opt-in P1, US3 no-lock-in P2), 7 success criteria, edge cases, assumptions, dependencies, out-of-scope. - plan.md — Technical Context, Constitution Check (all 13 principles pass), Project Structure (one new prompt bundle, three Rust enum variants, two config keys, one state-machine edge, six test files; no migration, no new crates). - research.md — Phase 0 design decisions and best-practice references (templates static not LLM-authored, bundle integration via existing PromptBundle loader, threshold global, one-shot guard via audit row, opt-in via classifier- output extension, summary fires immediately, threshold inclusivity, multilingual append-only). - data-model.md — Phase 1 data shapes (zero schema changes; +3 enum variants, +2 config keys, +1 state-machine edge, +1 audit payload schema). - quickstart.md — operator walkthrough per user story plus SC-007 byte-for-byte verification recipe. - contracts/template-bundle.md — bundle file format, banned-substring matrix per language, recipe for adding a language. - contracts/audit-events.md — sequence invariants for self_resolution_offered + reuse of summary_generated, escalation_recommended. - contracts/config.md — operator-facing TOML keys, validation, recipes, downgrade path. - contracts/classifier-output.md — additive human_requested field, prompt instruction, provider-coverage matrix, backward compat. - checklists/requirements.md — spec quality checklist (all items pass). CLAUDE.md and .specify/feature.json regenerated to reflect the new active feature (005).
When the classifier marks a session as `coordination_failure_resolvable`
with confidence ≥ the configured threshold (default 0.75), Serbero now
sends a templated, language-matched invitation to both parties inviting
them to coordinate the next step among themselves, with an explicit
human-assistance opt-in. Static repo templates in
`prompts/phase3-self-resolution.md` (en/es/pt initially) — the LLM only
decides whether the branch fires, never what to say. The fund-action
prohibition is enforced by a keyword-audit unit test that refuses to
merge a bundle containing forbidden verbs in any language.
Key surface changes:
- `MediationEventKind::SelfResolutionOffered` — new audit kind, written
before outbound dispatch with payload `{confidence, languages}` and a
rationale-id reference (FR-120 / no inlined text).
- `EscalationTrigger::PartyRequestedHuman` — fires when the model
detects an explicit human-assistance request on a round following a
`self_resolution_offered` event.
- `PolicyDecision::SuggestSelfResolutionWithSummary { confidence }` —
new dispatch arm; falls through to the legacy `Summarize` when any
pre-condition fails (kill-switch off, sub-threshold, prior offered).
- State-machine edge `SummaryDelivered → EscalationRecommended` so the
opt-in can lift the session out of the post-summary state.
- Two `[mediation]` config keys: `self_resolution_threshold` (f32,
default 0.75) and `self_resolution_enabled` (bool, default true,
master kill-switch).
- `ClassificationResponse` extended with three additive serde-defaulted
fields: `human_requested`, `buyer_language`, `seller_language`. The
language fields ride through the structured response so dispatch
doesn't need a separate Rust-side detector.
- Operational tracing: `cooperative_case_detected` (on dispatch) and
`cooperative_case_closed_externally` (on `dispute_resolved` for
sessions that received the invitation) for SC-001 / SC-002 baseline.
Backwards compatibility:
- No SQL migration. The new audit kind is unconstrained TEXT; old
sessions are unaffected.
- Both reasoning adapters (OpenAI-compatible + Anthropic) reuse the
shared `build_classification_prompt` / `parse_classification` from
the OpenAI module, so the prompt + parser changes land in both.
- Daemon upgrades that haven't shipped `phase3-self-resolution.md` yet
log a warning at startup and inert the branch (placeholder rendered
string) instead of refusing to start.
Test coverage:
- 8 new unit tests in `policy::evaluate` covering the cooperative
branch (fires / threshold-inclusive / sub-threshold / kill-switch /
one-shot / human-requested short-circuit / defence-in-depth /
no-lock-in).
- New `phase3_self_resolution_template_audit` test asserts the bundle
parses, every language is covered by the banned-substring matrix,
and rendered strings carry no fund-action keywords + always include
the opt-in sentence.
- Unit tests for `SelfResolutionTemplates::render_for` (5 cases) and
the bundle parser (6 cases).
- All 287 existing lib tests + integration suite pass with zero
regressions.
|
Warning Rate limit exceeded
To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
WalkthroughAdds Feature 005 (Cooperative Self-Resolution): introduces config knobs and validation, prompt/template bundle + parser, classifier output extensions, policy/follow‑up/control‑flow changes to emit per‑party invitations and summaries, new audit-event kind and DB helpers, state-machine extensions, many test updates, and supporting specs/docs. Changes
Sequence Diagram(s)sequenceDiagram
participant Party
participant Engine as MediationEngine
participant Classifier
participant Policy
participant Templates
participant DB as Database
participant SummaryNotif
Party->>Engine: tick / session activity
Engine->>Classifier: classify(round, ctx)
Classifier-->>Engine: ClassificationResponse(confidence, human_requested, buyer_language, seller_language)
Engine->>Policy: evaluate(classification, &mediation_cfg)
Policy->>DB: EXISTS mediation_events.self_resolution_offered?
DB-->>Policy: prior_offered (true/false)
alt enabled AND confidence >= threshold AND no prior_offered AND coordination_failure_resolvable
Policy-->>Engine: SuggestSelfResolutionWithSummary
Engine->>Templates: render_for(buyer_language)
Templates-->>Engine: buyer_invite
Engine->>Templates: render_for(seller_language)
Templates-->>Engine: seller_invite
Engine->>DB: record_self_resolution_offered(..., classification_confidence, rationale_id, buyer_language, seller_language, prompt_bundle_id, policy_hash, occurred_at)
Engine->>Party: deliver buyer_invite
Engine->>Party: deliver seller_invite
Engine->>SummaryNotif: deliver_summary(suggested_next_step="self_resolution_offered_to_parties")
Party->>Engine: may opt-in (human_requested)
Engine->>Classifier: classify(follow-up round, request human_requested)
Classifier-->>Engine: ClassificationResponse(human_requested=true)
Engine->>Policy: evaluate(classification, &mediation_cfg)
alt human_requested == true AND prior_offered == true
Policy-->>Engine: Escalate(PartyRequestedHuman)
Engine->>DB: record_escalation_recommended(...)
end
else
Policy-->>Engine: legacy decision (Summarize/Escalate)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ebb94756bb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Actionable comments posted: 10
🧹 Nitpick comments (3)
src/prompts/hash.rs (1)
38-69: Consider adding unit tests forpolicy_hash_v2.The implementation correctly extends the v1 scheme, but the existing tests only cover
policy_hash(v1). Adding similar tests forpolicy_hash_v2would ensure:
- Identical inputs produce identical hashes
- Any byte change in
self_resolutionflips the hash- The v2 hash differs from v1 for the same first five inputs
🧪 Suggested test additions
#[test] fn policy_hash_v2_identical_inputs_produce_identical_hashes() { let a = policy_hash_v2("s", "c", "e", "m", "t", "sr"); let b = policy_hash_v2("s", "c", "e", "m", "t", "sr"); assert_eq!(a, b); assert_eq!(a.len(), 64); } #[test] fn policy_hash_v2_self_resolution_change_flips_hash() { let base = policy_hash_v2("s", "c", "e", "m", "t", "sr"); assert_ne!(base, policy_hash_v2("s", "c", "e", "m", "t", "SR")); } #[test] fn policy_hash_v2_differs_from_v1() { let v1 = policy_hash("s", "c", "e", "m", "t"); let v2 = policy_hash_v2("s", "c", "e", "m", "t", ""); assert_ne!(v1, v2, "v2 must differ even with empty self_resolution"); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/prompts/hash.rs` around lines 38 - 69, Add unit tests for policy_hash_v2 mirroring the existing v1 tests: create tests named like policy_hash_v2_identical_inputs_produce_identical_hashes (assert identical inputs produce equal 64-char hashes), policy_hash_v2_self_resolution_change_flips_hash (assert changing self_resolution changes the hash), and policy_hash_v2_differs_from_v1 (call policy_hash and policy_hash_v2 with the same first five args and an empty self_resolution and assert they differ); place them alongside the existing policy_hash tests and use the policy_hash_v2 and policy_hash symbols in assertions.src/mediation/follow_up.rs (1)
683-694: Consider storing NULL instead of empty string for missing rationale_id.When
rationale_idisNone, the code passes an empty string (""). This gets stored as a non-NULL value in the database, which differs from the "reference is absent" semantic. For forensic consistency, consider propagatingOption<&str>through torecord_self_resolution_offeredand storingNULLwhen the rationale is unavailable.♻️ Suggested approach
Change the signature of
record_self_resolution_offeredto acceptOption<&str>forrationale_id, and adjust the call site:db::mediation_events::record_self_resolution_offered( &tx, session_id, - rationale_id.unwrap_or(""), + rationale_id.as_deref(), confidence,And in
mediation_events.rs:pub fn record_self_resolution_offered( conn: &Connection, session_id: &str, - rationale_id: &str, + rationale_id: Option<&str>, ... ) -> Result<i64> {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mediation/follow_up.rs` around lines 683 - 694, Change the call and API to propagate Option for missing rationale IDs: update the signature of db::mediation_events::record_self_resolution_offered to accept Option<&str> for rationale_id (instead of &str), update its implementation in mediation_events.rs so the SQL insert/bind writes NULL when rationale_id is None, and modify the caller in follow_up.rs to pass rationale_id (not rationale_id.unwrap_or("")) so None becomes NULL in the DB; ensure any downstream call sites are updated to the new signature.specs/005-cooperative-self-resolution/contracts/audit-events.md (1)
81-87: Minor: Add language specifier to fenced code blocks.The audit-row sequence diagrams would benefit from a language specifier (e.g.,
textorplaintext) to satisfy markdown linting and improve rendering consistency.📝 Suggested fix
-``` +```text 1. session_opened (existing — emitted at session open)Apply similarly to the code block at line 98.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@specs/005-cooperative-self-resolution/contracts/audit-events.md` around lines 81 - 87, The fenced code blocks showing the audit-row sequence (the block containing lines like "session_opened", "classification_produced", "self_resolution_offered", "summary_generated", "session_closed") need a language specifier to satisfy markdown linting and rendering; update each triple-backtick fence (including the other block around line with "session_opened" and the later block at line ~98) to use a language tag such as ```text or ```plaintext so the blocks read e.g. ```text and leave the contents unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config.sample.toml`:
- Around line 86-87: Update the comment that claims "Setting this to 1.0
effectively disables the branch" to clarify that a confidence of 1.0 does not
strictly disable the branch (it can still pass the gate); change the phrasing to
something like "nearly disables" or "effectively reduces probability to pass"
and explicitly state that the kill-switch is the only strict disable mechanism,
referencing the confidence value "1.0" and the "kill-switch" so readers know
which settings are being compared.
In `@specs/005-cooperative-self-resolution/contracts/classifier-output.md`:
- Around line 14-24: The example JSON in the classifier output schema is missing
the additive fields buyer_language and seller_language; update the example
object in the classifier-output.md schema example (the JSON block under
classification/confidence/suggested_action) to include buyer_language and
seller_language with representative values (e.g., two-letter language codes or
placeholder strings) so it matches the implemented classifier surface and
adapters consuming fields like buyer_language and seller_language.
In `@specs/005-cooperative-self-resolution/contracts/template-bundle.md`:
- Around line 4-5: The "Loaded by" note in the template-bundle.md still points
to the old bundle loader; update that header to reference the current loader
location in the prompts module (prompts::mod) so it accurately reflects where
the bundle parsing is wired up (look for the loader wiring in the prompts module
code and replace the stale `Loaded by` line accordingly).
In `@specs/005-cooperative-self-resolution/plan.md`:
- Around line 44-48: Clarify the lifecycle sentence to resolve the
contradiction: update the sentence that currently says the session lifecycle
"still ends at `summary_delivered`" to state that the default/happy path still
ends at `summary_delivered`, while an explicit human-assistance opt-in path
(referenced later as `SummaryDelivered → EscalationRecommended`) allows
post-summary escalation; keep references to the `Summarize` decision path and
`summary_delivered` token so readers can find the related flow.
In `@specs/005-cooperative-self-resolution/quickstart.md`:
- Around line 18-21: Update the quickstart bullet that currently reads "A
reasoning provider that emits the `human_requested` field on round N+1" and its
surrounding historical phrasing so it states the current runtime expectation:
that adapters (OpenAI-compatible and Anthropic) now emit the `human_requested`
field on round N+1. Remove the "neither adapter emits it" and "Phase 2"
plan-time notes and make the sentence present-tense and concise to reflect
shipped behavior.
In `@specs/005-cooperative-self-resolution/tasks.md`:
- Around line 19-20: Replace the machine-local absolute path string
"/home/negrunch/dev/cancerbero/" with a repository-relative guideline (e.g., use
"./" or a placeholder like "<repo-root>" or "$REPO_ROOT") in the tasks text so
instructions are portable; update the sentence that currently reads "All paths
are absolute under the repository root `/home/negrunch/dev/cancerbero/`" to
instead state that all paths are relative to the repository root and show the
chosen placeholder format for examples.
In `@src/mediation/policy.rs`:
- Around line 271-325: The read-then-write race around self_resolution_offered
can allow duplicate invitations; update the mediation_events schema or insertion
logic so (session_id, kind) is unique (or make the insert idempotent/IGNORE ON
CONFLICT) to prevent concurrent inserts—follow the escalation_dispatches v5
pattern (UNIQUE on handoff_prepared) used for handoff_prepared; specifically
change the mediation_events table to add UNIQUE(session_id, kind) or modify the
commit in the dispatch path that writes self_resolution_offered (the code that
runs after SuggestSelfResolutionWithSummary and the helper
session_has_self_resolution_offered) to perform an atomic upsert or gracefully
ignore duplicate-key errors, or alternatively serialize per-session writes with
a lock to ensure the “fire at most once per session” guarantee.
In `@src/mediation/self_resolution.rs`:
- Around line 101-114: The fallback in render_for currently returns an internal
diagnostic string to users; change its signature from pub fn
render_for(language_code: Option<&str>, templates: &SelfResolutionTemplates) ->
String to return Option<String> (or Result<String, SelfResolutionError>) and
have render_for return None (or Err) when templates.entry_for(language_code) is
None instead of the diagnostic placeholder; update callers of render_for (the
self-resolution dispatch path) to detect None/Err, log the diagnostic/audit
message there and inert/skip the self-resolution branch. Ensure references:
modify render_for, its use of entry_for on SelfResolutionTemplates, and update
the dispatch caller to handle the new return type.
In `@src/prompts/mod.rs`:
- Around line 69-87: The code currently treats any std::fs::read_to_string error
as a missing file; change the Err branch to inspect the io::Error kind and only
use the fallback when e.kind() == std::io::ErrorKind::NotFound. For other
errors, return Err(Error::PromptBundleLoad(...)) including the actual error and
path. Update the match that sets (self_resolution, self_resolution_raw) around
read_to_string(&self_resolution_path) so that non-NotFound errors are propagated
(referencing self_resolution_path, self_resolution_parser::parse,
SelfResolutionTemplates::default(), and Error::PromptBundleLoad).
In `@src/prompts/self_resolution_parser.rs`:
- Around line 66-85: The loop over parsed.languages currently lowercases/trims
keys into normalized then unconditionally calls by_language.insert(...), which
silently overwrites duplicates; change the logic to check for duplicates before
inserting (e.g., if by_language.contains_key(&normalized) { return
Err(format!("duplicate language section after normalization: {}",
normalized).into()) }), using the same normalized key and referencing
parsed.languages, normalized, by_language.insert, and
SelfResolutionLanguageEntry so the function returns an error instead of silently
overwriting when a normalized key already exists.
---
Nitpick comments:
In `@specs/005-cooperative-self-resolution/contracts/audit-events.md`:
- Around line 81-87: The fenced code blocks showing the audit-row sequence (the
block containing lines like "session_opened", "classification_produced",
"self_resolution_offered", "summary_generated", "session_closed") need a
language specifier to satisfy markdown linting and rendering; update each
triple-backtick fence (including the other block around line with
"session_opened" and the later block at line ~98) to use a language tag such as
```text or ```plaintext so the blocks read e.g. ```text and leave the contents
unchanged.
In `@src/mediation/follow_up.rs`:
- Around line 683-694: Change the call and API to propagate Option for missing
rationale IDs: update the signature of
db::mediation_events::record_self_resolution_offered to accept Option<&str> for
rationale_id (instead of &str), update its implementation in mediation_events.rs
so the SQL insert/bind writes NULL when rationale_id is None, and modify the
caller in follow_up.rs to pass rationale_id (not rationale_id.unwrap_or("")) so
None becomes NULL in the DB; ensure any downstream call sites are updated to the
new signature.
In `@src/prompts/hash.rs`:
- Around line 38-69: Add unit tests for policy_hash_v2 mirroring the existing v1
tests: create tests named like
policy_hash_v2_identical_inputs_produce_identical_hashes (assert identical
inputs produce equal 64-char hashes),
policy_hash_v2_self_resolution_change_flips_hash (assert changing
self_resolution changes the hash), and policy_hash_v2_differs_from_v1 (call
policy_hash and policy_hash_v2 with the same first five args and an empty
self_resolution and assert they differ); place them alongside the existing
policy_hash tests and use the policy_hash_v2 and policy_hash symbols in
assertions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c3ef2748-f96d-458c-a6d8-32874c9044eb
📒 Files selected for processing (44)
.specify/feature.jsonCLAUDE.mdconfig.sample.tomlprompts/phase3-classification.mdprompts/phase3-escalation-policy.mdprompts/phase3-self-resolution.mdprompts/phase3-system.mdspecs/005-cooperative-self-resolution/checklists/requirements.mdspecs/005-cooperative-self-resolution/contracts/audit-events.mdspecs/005-cooperative-self-resolution/contracts/classifier-output.mdspecs/005-cooperative-self-resolution/contracts/config.mdspecs/005-cooperative-self-resolution/contracts/template-bundle.mdspecs/005-cooperative-self-resolution/data-model.mdspecs/005-cooperative-self-resolution/plan.mdspecs/005-cooperative-self-resolution/quickstart.mdspecs/005-cooperative-self-resolution/research.mdspecs/005-cooperative-self-resolution/spec.mdspecs/005-cooperative-self-resolution/tasks.mdsrc/config.rssrc/db/mediation_events.rssrc/handlers/dispute_resolved.rssrc/mediation/escalation.rssrc/mediation/follow_up.rssrc/mediation/mod.rssrc/mediation/policy.rssrc/mediation/self_resolution.rssrc/mediation/session.rssrc/models/config.rssrc/models/mediation.rssrc/models/reasoning.rssrc/prompts/hash.rssrc/prompts/mod.rssrc/prompts/self_resolution_parser.rssrc/reasoning/openai.rstests/common/mod.rstests/phase3_authority_boundary.rstests/phase3_escalation_triggers.rstests/phase3_followup_reasoning_failure.rstests/phase3_followup_round.rstests/phase3_followup_summary.rstests/phase3_self_resolution_template_audit.rstests/phase3_take_reasoning_coupling.rstests/reasoning_anthropic.rstests/reasoning_ppqai.rs
Code fixes (verified against current behavior): - prompts: keep policy_hash v1 when phase3-self-resolution.md is absent, only switch to v2 when the file is present. This preserves the bundle hash on legacy deployments so live sessions' pinned policy_hash keeps matching after upgrade — startup_resume_pass no longer compares mismatched hashes for healthy sessions. - prompts: only fall back to default templates on io::ErrorKind::NotFound; permission errors and other I/O failures now propagate as Error::PromptBundleLoad. - policy: gate the cooperative branch on a non-empty templates map. When the bundle file is absent, the branch is inert and the legacy Summarize path runs — so the render_for placeholder string can never reach a party-facing chat. - db::mediation::list_live_sessions: include `summary_delivered` sessions that received the cooperative invitation (EXISTS clause on a self_resolution_offered audit row), and relax the advance_session_round state gate accordingly. Without this, the human_requested short-circuit was unreachable: the cooperative dispatch transitions the session to summary_delivered, which the ingest loop excludes, so post-invitation replies were never re-classified. Carve-out is scoped to invited sessions only; legacy summary_delivered sessions stay terminal. - follow_up: when re-classifying a post-invitation summary_delivered session, only Escalate decisions are actionable. Other decisions (AskClarification / Summarize / SuggestSelfResolutionWithSummary) no-op silently — they would otherwise attempt illegal state transitions. - prompts::self_resolution_parser: error on duplicate language keys after case-normalization (e.g. `[en]` + `[EN]`) instead of silently overwriting one with the other. - db::mediation_events::record_self_resolution_offered: signature now takes Option<&str> for rationale_id so a missing classification rationale becomes NULL in the audit row instead of an empty string. Caller in follow_up updated. Skipped (verified non-issue): - The `self_resolution_offered` race-condition concern: the engine is single-threaded per session (ingest tick holds the AsyncMutex on the connection across the predicate + write), and the spec explicitly forbids new SQL migrations. UNIQUE constraint intentionally not added. Documentation fixes: - config.sample.toml: clarify that `self_resolution_threshold = 1.0` only nearly-disables (a classifier emitting exactly 1.0 still passes); the only strict disable is `self_resolution_enabled = false`. - contracts/classifier-output.md: schema example now shows `buyer_language` / `seller_language` alongside `human_requested` to match the implemented surface. - contracts/template-bundle.md: update "Loaded by" to point at `src/prompts/mod.rs::load_bundle` and `src/prompts/self_resolution_parser.rs`; document the v1/v2 hash fallback. - plan.md: clarify the lifecycle. Default / happy path still ends at summary_delivered; the explicit human-assistance opt-in path re-opens via SummaryDelivered → EscalationRecommended (gated on a prior self_resolution_offered audit row). - quickstart.md: rewrite prerequisites in present tense to reflect shipped behavior — both adapters share the prompt + parser pair so the human_requested field is emitted in both, and the prompts/phase3-self-resolution.md bundle is in the deploy. - tasks.md: replace the machine-local absolute path with a repository-relative description. - contracts/audit-events.md: add `text` language tag to the audit-row sequence code fences for markdown-lint compliance. Tests: - 4 new unit tests: parser duplicate-key guard, policy_hash_v2 determinism, policy_hash_v2 sensitivity to self_resolution bytes, policy_hash_v2 vs policy_hash domain separation, and the cooperative-branch inert-when-templates-empty regression. - All 292 lib tests + integration suite still pass.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 7
♻️ Duplicate comments (1)
src/mediation/policy.rs (1)
271-335:⚠️ Potential issue | 🟠 MajorPotential TOCTOU race on the one-shot guard.
The
prior_offeredcheck at lines 272-275 reads the DB before returning the decision, but the actualself_resolution_offeredrow is written later infollow_up.rsduring dispatch. If two concurrent engine ticks evaluate the same session, both could seeprior_offered = falseand both returnSuggestSelfResolutionWithSummary, leading to duplicate invitations.Consider either:
- Adding a
UNIQUE(session_id)partial index onmediation_eventswherekind = 'self_resolution_offered', withINSERT OR IGNOREin the dispatch path- Using
INSERT ... ON CONFLICT DO NOTHINGand checking affected rows to detect the raceThis follows the pattern from the v5 migration's
UNIQUEconstraint onescalation_dispatchesforhandoff_prepared.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mediation/policy.rs` around lines 271 - 335, The check of prior_offered (via db::mediation_events::session_has_self_resolution_offered) is vulnerable to TOCTOU races because the actual write happens later in follow_up.rs when dispatching SuggestSelfResolutionWithSummary; prevent duplicate invitations by making the DB operation atomic: add a UNIQUE partial index on mediation_events for kind='self_resolution_offered' (or use an equivalent constraint) and change the dispatch path to perform INSERT ... ON CONFLICT DO NOTHING (or INSERT OR IGNORE) and treat zero affected rows as a lost race (skip sending); update follow_up.rs where the self-resolution offer row is inserted to use the conflict-safe insert and check the result rather than relying solely on prior_offered.
🧹 Nitpick comments (3)
src/db/mediation_events.rs (1)
672-713: Makerecord_self_resolution_offeredtransaction-only.This row is supposed to land atomically with the two outbound
mediation_messages, but the&Connectionsignature still lets a future caller write it outside a transaction. Matchingrecord_escalation_dispatchedand taking&Transaction<'_>would encode that invariant at the type level.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/db/mediation_events.rs` around lines 672 - 713, Change record_self_resolution_offered to accept a transaction reference instead of a plain connection to enforce atomic writes: update its signature from conn: &Connection to conn: &Transaction<'_> (matching record_escalation_dispatched), adjust any internal calls that currently take &Connection if needed, and update all callers to pass the active Transaction rather than a Connection so the audit row is only written inside a transaction; also add or update necessary use/import for Transaction in the file.src/prompts/mod.rs (1)
11-17: Keep prompt types in the prompt layer.
src/prompts/mod.rsnow depends oncrate::mediation::self_resolution::SelfResolutionTemplatesjust to own parsed template data. Moving that type next tosrc/prompts/self_resolution_parser.rswould keep prompt loading reusable and avoid tightening the prompts → mediation dependency direction.Also applies to: 32-36
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/prompts/mod.rs` around lines 11 - 17, Remove the mediation dependency from src/prompts/mod.rs by moving the SelfResolutionTemplates type definition into the prompts layer (e.g., into src/prompts/self_resolution_parser.rs) so prompts own parsed template data; update imports so src/prompts/mod.rs and self_resolution_parser.rs reference the local SelfResolutionTemplates and change any consumers in crate::mediation::self_resolution to import the type from crate::prompts::self_resolution_parser (or re-export it from src/prompts/mod.rs) and remove the crate::mediation::self_resolution::SelfResolutionTemplates import; ensure PromptsConfig and error Result usages remain unchanged and adjust module visibility (pub use) as needed.src/mediation/policy.rs (1)
320-323: Minor: f64 → f32 cast for threshold comparison.The confidence is stored as
f64but the threshold isf32. The cast(confidence as f32)could lose precision on edge cases near the threshold boundary. For a ≥ comparison this is generally safe, but consider usingf64for the config field for consistency, or documenting that the threshold comparison is performed at f32 precision.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mediation/policy.rs` around lines 320 - 323, The comparison casts confidence (f64) to f32 which can lose precision; update the code so both sides use f64 by changing the config field or the comparison: make mediation_cfg.self_resolution_threshold an f64 (and adjust its deserialization/usage) or cast the threshold to f64 in the conditional instead of casting confidence to f32; locate the conditional using mediation_cfg.self_resolution_enabled, templates_present, confidence and prior_offered and ensure the comparison uses confidence >= mediation_cfg.self_resolution_threshold with matching f64 types.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config.sample.toml`:
- Around line 92-96: The comment for the config key self_resolution_enabled is
incorrect: update the description to state that when self_resolution_enabled =
false the system falls back to the legacy cooperative-summary/summarize path
rather than forcing human review; locate the block describing "Master
kill-switch" and replace the overstated lines about forcing every cooperative
case through human review with a concise note that disabling the flag reverts to
the legacy summarize behavior (use the config key name self_resolution_enabled
in the comment for clarity).
In `@specs/005-cooperative-self-resolution/contracts/audit-events.md`:
- Around line 21-45: record_self_resolution_offered is writing a different shape
than the spec: change src/db/mediation_events.rs::record_self_resolution_offered
to serialize the payload JSON to match the contract (use
"classification_confidence" instead of "confidence", include "rationale_id"
inside payload_json and stop writing it to the dedicated rationale_id column,
and only include languages.buyer and languages.seller (remove
languages.fallback)), keep existing companion columns (session_id,
kind='self_resolution_offered', prompt_bundle_id, policy_hash, occurred_at)
unchanged, and update any SQL/insert logic so the rationale_id column is NULL
while the full rationale_id value is present inside payload_json.
- Around line 89-90: The commit for self_resolution_offered and the two outbound
mediation_messages must occur in the same DB transaction as the subsequent
deliver_summary call; currently src/mediation/follow_up.rs inserts the
mediation_messages and commits self_resolution_offered, then calls
deliver_summary in a separate transaction which violates the contract. Change
the control flow so that the insert/commit of self_resolution_offered and the
mediation_messages and the call to deliver_summary are executed within the same
transaction boundary (e.g. expand the transactional closure or pass the active
transaction/connection into deliver_summary) so that both the outbound rows and
the summary delivery are committed atomically.
In `@specs/005-cooperative-self-resolution/contracts/classifier-output.md`:
- Around line 67-89: The spec claims serde will default human_requested but the
Rust type lacks serde attributes; update
src/models/reasoning.rs::ClassificationResponse to include serde derives and the
field attribute (e.g., add #[derive(Deserialize, Serialize)] on
ClassificationResponse and #[serde(default)] on the human_requested field) so
deserialisation supplies a false fallback, or alternatively update the spec to
require every adapter/parser (the adapter code that constructs
ClassificationResponse) to explicitly set human_requested = false when omitted;
choose one approach and make the contract consistent by either adding the serde
annotations to ClassificationResponse or documenting the adapter responsibility.
In `@specs/005-cooperative-self-resolution/contracts/template-bundle.md`:
- Around line 81-82: The documentation claims banned-substring checks are
"ASCII-folded for the diacritic languages" but the implementation uses
to_ascii_lowercase (preserving diacritics); update the banned substring
normalization routine (the code that currently calls to_ascii_lowercase in the
banned substring check used by the self-resolution template audit — referenced
by the tests/phase3_self_resolution_template_audit.rs comment) to perform
Unicode NFKD decomposition and strip diacritic marks to ASCII then apply
Unicode-aware case folding (e.g., use the unicode-normalization crate's nfkd()
and filter to ASCII followed by to_lowercase() or an equivalent casefold),
ensuring both the input templates and banned substrings are normalized the same
way before substring matching.
In `@src/db/mediation.rs`:
- Around line 285-312: The WHERE predicate used earlier to treat
'summary_delivered' sessions as still live must also be applied in
latest_open_session_for (or centralized); update the SQL in the
latest_open_session_for function so it no longer unconditionally excludes
summary_delivered rows but instead mirrors the same carve-out: allow rows where
s.state = 'summary_delivered' only when EXISTS (SELECT 1 FROM mediation_events e
WHERE e.session_id = s.session_id AND e.kind = 'self_resolution_offered'),
matching the OR/EXISTS logic used in the other query on mediation_sessions so
both queries remain consistent.
In `@src/mediation/follow_up.rs`:
- Around line 313-321: The branch in advance_session_round that returns early
when is_post_invitation_summary_delivered && !matches!(decision,
policy::PolicyDecision::Escalate(_)) fails to advance the evaluator marker,
causing repeated reclassification; before the early return, set
info.round_count_last_evaluated = info.round_count (or otherwise advance the
marker to the current round) so the inbound is marked evaluated; make the change
in the same branch (where debug! is logged) referencing the info struct and
round_count_last_evaluated/round_count fields so the tick will not reprocess the
same message.
---
Duplicate comments:
In `@src/mediation/policy.rs`:
- Around line 271-335: The check of prior_offered (via
db::mediation_events::session_has_self_resolution_offered) is vulnerable to
TOCTOU races because the actual write happens later in follow_up.rs when
dispatching SuggestSelfResolutionWithSummary; prevent duplicate invitations by
making the DB operation atomic: add a UNIQUE partial index on mediation_events
for kind='self_resolution_offered' (or use an equivalent constraint) and change
the dispatch path to perform INSERT ... ON CONFLICT DO NOTHING (or INSERT OR
IGNORE) and treat zero affected rows as a lost race (skip sending); update
follow_up.rs where the self-resolution offer row is inserted to use the
conflict-safe insert and check the result rather than relying solely on
prior_offered.
---
Nitpick comments:
In `@src/db/mediation_events.rs`:
- Around line 672-713: Change record_self_resolution_offered to accept a
transaction reference instead of a plain connection to enforce atomic writes:
update its signature from conn: &Connection to conn: &Transaction<'_> (matching
record_escalation_dispatched), adjust any internal calls that currently take
&Connection if needed, and update all callers to pass the active Transaction
rather than a Connection so the audit row is only written inside a transaction;
also add or update necessary use/import for Transaction in the file.
In `@src/mediation/policy.rs`:
- Around line 320-323: The comparison casts confidence (f64) to f32 which can
lose precision; update the code so both sides use f64 by changing the config
field or the comparison: make mediation_cfg.self_resolution_threshold an f64
(and adjust its deserialization/usage) or cast the threshold to f64 in the
conditional instead of casting confidence to f32; locate the conditional using
mediation_cfg.self_resolution_enabled, templates_present, confidence and
prior_offered and ensure the comparison uses confidence >=
mediation_cfg.self_resolution_threshold with matching f64 types.
In `@src/prompts/mod.rs`:
- Around line 11-17: Remove the mediation dependency from src/prompts/mod.rs by
moving the SelfResolutionTemplates type definition into the prompts layer (e.g.,
into src/prompts/self_resolution_parser.rs) so prompts own parsed template data;
update imports so src/prompts/mod.rs and self_resolution_parser.rs reference the
local SelfResolutionTemplates and change any consumers in
crate::mediation::self_resolution to import the type from
crate::prompts::self_resolution_parser (or re-export it from src/prompts/mod.rs)
and remove the crate::mediation::self_resolution::SelfResolutionTemplates
import; ensure PromptsConfig and error Result usages remain unchanged and adjust
module visibility (pub use) as needed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 88d8fb4f-728a-419e-9878-e6095cb11cbd
📒 Files selected for processing (14)
config.sample.tomlspecs/005-cooperative-self-resolution/contracts/audit-events.mdspecs/005-cooperative-self-resolution/contracts/classifier-output.mdspecs/005-cooperative-self-resolution/contracts/template-bundle.mdspecs/005-cooperative-self-resolution/plan.mdspecs/005-cooperative-self-resolution/quickstart.mdspecs/005-cooperative-self-resolution/tasks.mdsrc/db/mediation.rssrc/db/mediation_events.rssrc/mediation/follow_up.rssrc/mediation/policy.rssrc/prompts/hash.rssrc/prompts/mod.rssrc/prompts/self_resolution_parser.rs
Verified each finding against the current code; fixed real bugs and
spec-vs-code mismatches, skipped one non-issue with justification.
Real bugs fixed:
- db::mediation::latest_open_session_for: mirrored the Feature 005
carve-out from `list_live_sessions`. A `summary_delivered`
session that received the cooperative invitation now also surfaces
through the open-session lookup, so a re-take or a re-mediation
attempt observes the same session-state semantics consistently
across both queries. Legacy `summary_delivered` sessions stay
terminal (no `self_resolution_offered` row).
- mediation::follow_up: when the post-invitation early-return arm
fires (re-classified summary_delivered session whose decision is
not `Escalate`), advance `round_count_last_evaluated` before
returning. Without the advance the FR-127 idempotency gate would
re-classify the same fresh inbound on every subsequent tick,
burning reasoning-provider budget on a session that's already
settled into "wait silently for a possible human-assistance
request".
- mediation::policy: switch the cooperative threshold comparison to
`confidence >= f64::from(self_resolution_threshold)` so the
classifier's f64 confidence is no longer cast down to f32 before
comparison. Threshold storage stays f32 per the config contract;
the comparison is now precision-honest.
Spec-vs-code mismatches fixed (audit-row payload):
- db::mediation_events::record_self_resolution_offered: payload
shape now matches `contracts/audit-events.md`:
* `classification_confidence` (renamed from `confidence`)
* `rationale_id` lives inside `payload_json` (the dedicated
`mediation_events.rationale_id` column stays NULL on this kind
— the column convention is reserved for rationales whose
lifecycle is owned by the audit row itself; here the rationale
is owned by the round-N classification call).
* `languages` carries only `buyer` and `seller` codes (dropped
`fallback`, which was code-internal noise).
* Signature changed from `&Connection` to `&Transaction<'_>` so
the audit row can only be written inside an outer transaction
— same shape and discipline as `record_escalation_dispatched`.
The caller in `follow_up.rs` already passes `&tx`, so this
enforcement is purely additive.
Documentation fixes (verified against shipped code):
- config.sample.toml: clarified the `self_resolution_enabled = false`
description. With the kill-switch off, Serbero falls back to the
legacy cooperative-summary path (LLM still summarizes for the
solver); it does NOT force every cooperative case through human
review as the previous comment overstated.
- contracts/classifier-output.md: clarified the serde-defaults
story. `ClassificationResponse` itself has no serde derives — it
is the adapter-facing public shape, not a wire type. The actual
wire-format deserialization lives in each adapter's intermediate
struct (`ClassificationJson` in src/reasoning/openai.rs), which
carries the `#[serde(default)]` annotations on `human_requested`
/ `buyer_language` / `seller_language`. Adapters are responsible
for translating the wire struct into a `ClassificationResponse`.
- contracts/template-bundle.md: corrected the "ASCII-folded" claim.
The keyword-audit comparison uses `str::to_ascii_lowercase` on
both sides, which folds ASCII case but preserves diacritics
(`ñ`, `ç`, `á`). The banned list enumerates diacritic-bearing
forms verbatim — adding Unicode normalization would pull a new
dependency for negligible coverage gain. The representative
banned-list table in the spec now matches the canonical
`BANNED` matrix in the test file.
- contracts/audit-events.md: relaxed the "Steps 3 and 4 land in the
same transaction" claim to accurately describe the
transactional-outbox behavior. Step 3 (`self_resolution_offered`)
+ the two outbound `mediation_messages` rows commit atomically;
step 4 (`summary_generated`) lands in a subsequent transaction
because the summarizer makes an LLM HTTP call between the two.
A crash between TX1 and TX2 leaves the audit row + outbound rows
in place, and the next tick's idempotency check sees the
`self_resolution_offered` row + falls through to the legacy
Summarize path to recover the still-pending summary.
Skipped (verified non-issue):
- The TOCTOU race re-flagged on the `prior_offered` predicate in
`policy::evaluate` vs. the later write in `follow_up::dispatch`.
In single-process operation the engine holds the
`Arc<AsyncMutex<Connection>>` across the predicate read and the
eventual write inside `draft_and_send_self_resolution_invitation`'s
transaction, so two concurrent `advance_session_round` calls
cannot both pass the predicate. Adding a UNIQUE partial index
would require a new SQL migration which `plan.md` forbids; an
HA / multi-process deploy is out of scope for this feature.
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
src/mediation/policy.rs (1)
271-339:⚠️ Potential issue | 🟠 MajorThe one-shot cooperative guard is still non-atomic.
prior_offeredis read here, but theself_resolution_offeredrow is inserted later insrc/mediation/follow_up.rs::draft_and_send_self_resolution_invitation. Without a DB-level uniqueness/idempotency guard for this kind, two overlapping evaluations can both returnSuggestSelfResolutionWithSummaryand send duplicate invitations for the same session. This also makes the later rationale lookup race with whichever classification commit lands last.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mediation/policy.rs` around lines 271 - 339, The check using prior_offered is racy because the self_resolution_offered row is inserted later in draft_and_send_self_resolution_invitation, so two concurrent evaluations can both choose PolicyDecision::SuggestSelfResolutionWithSummary and send duplicate invites; fix by making the existence check and the insertion atomic at the DB level (e.g. add a uniqueness constraint on the self_resolution_offered/session_id key and perform the check+insert inside a single transaction or use INSERT ... ON CONFLICT DO NOTHING and treat a no-op insert as “already offered”), or use a DB advisory/row lock around the session id so classify_to_decision + draft_and_send_self_resolution_invitation cannot interleave; update references in this flow (prior_offered, classify_to_decision -> SuggestSelfResolutionWithSummary branch, and draft_and_send_self_resolution_invitation) to rely on the atomic DB guard and honor the insert result to avoid duplicate invitations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@specs/005-cooperative-self-resolution/contracts/template-bundle.md`:
- Around line 49-51: The fenced code block containing "{template}
{human_assistance_optin}" should include a language tag to satisfy markdownlint
MD040; update that block by adding a language identifier such as "text" (i.e.,
change the opening fence from ``` to ```text) so the block reads with the
language tag and keeps the contract doc lint-clean.
In `@src/mediation/follow_up.rs`:
- Around line 659-660: The audit currently stores the raw classifier outputs
(buyer_language/seller_language) even when render_for() falls back to
fallback_language; change the code so you first resolve the effective template
language using the same resolver render_for uses (e.g., call the internal
language-resolution helper used by self_resolution::render_for or add a
self_resolution::resolve_language(buyer_language, fallback_language)), assign
that resolved value to variables (e.g., buyer_effective_language,
seller_effective_language), pass those into self_resolution::render_for to
produce buyer_msg/seller_msg, and persist
buyer_effective_language/seller_effective_language into
self_resolution_offered.payload_json.languages (also apply the same change to
the other block at 731-741) so the audit records the actual template language
delivered rather than the raw classifier output.
---
Duplicate comments:
In `@src/mediation/policy.rs`:
- Around line 271-339: The check using prior_offered is racy because the
self_resolution_offered row is inserted later in
draft_and_send_self_resolution_invitation, so two concurrent evaluations can
both choose PolicyDecision::SuggestSelfResolutionWithSummary and send duplicate
invites; fix by making the existence check and the insertion atomic at the DB
level (e.g. add a uniqueness constraint on the
self_resolution_offered/session_id key and perform the check+insert inside a
single transaction or use INSERT ... ON CONFLICT DO NOTHING and treat a no-op
insert as “already offered”), or use a DB advisory/row lock around the session
id so classify_to_decision + draft_and_send_self_resolution_invitation cannot
interleave; update references in this flow (prior_offered, classify_to_decision
-> SuggestSelfResolutionWithSummary branch, and
draft_and_send_self_resolution_invitation) to rely on the atomic DB guard and
honor the insert result to avoid duplicate invitations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 84522757-2ee7-44d1-800a-dec2d6e76f67
📒 Files selected for processing (8)
config.sample.tomlspecs/005-cooperative-self-resolution/contracts/audit-events.mdspecs/005-cooperative-self-resolution/contracts/classifier-output.mdspecs/005-cooperative-self-resolution/contracts/template-bundle.mdsrc/db/mediation.rssrc/db/mediation_events.rssrc/mediation/follow_up.rssrc/mediation/policy.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- specs/005-cooperative-self-resolution/contracts/classifier-output.md
- src/db/mediation_events.rs
Verified each finding against the current code; fixed two real
issues, skipped a re-flagged TOCTOU concern with the same
justification as before.
Real fixes:
- mediation::self_resolution: add a public
`SelfResolutionTemplates::resolve_effective_language(...)` helper
that returns the language code `render_for` will actually use
(the input when the bundle has a matching section; the
`fallback_language` otherwise; `None` when even the fallback is
missing). The existing `entry_for(...)` is rewritten to delegate
to this resolver so the lookup logic has one source of truth.
- mediation::follow_up::draft_and_send_self_resolution_invitation:
resolve the effective language for each party BEFORE rendering
+ recording the audit row, and use those resolved codes in:
* the `self_resolution_offered` audit row's
`payload_json.languages.{buyer,seller}` fields, and
* the `cooperative_case_detected` tracing event (alongside the
raw classifier output, so operators can see when the fallback
kicked in).
Without this fix, the audit recorded the raw classifier output
(e.g. `"de"`) while `render_for` had silently fallen back to the
bundle's `fallback_language` (e.g. `"en"`). Forensic replay (per
`quickstart.md`) re-renders the message body from the audit row's
`languages.<party>` field — using the raw value would reproduce a
different message than the one the party actually saw.
Documentation fix:
- contracts/template-bundle.md: added a `text` language tag to the
`{template} {human_assistance_optin}` fenced block (markdownlint
MD040).
Skipped (re-flagged for the third time, same answer as before):
- The TOCTOU concern on `policy::evaluate`'s `prior_offered`
predicate vs. the later write in
`draft_and_send_self_resolution_invitation`. In single-process
operation the engine serialises `advance_session_round` calls
through `Arc<AsyncMutex<Connection>>` (the predicate read and
the eventual write inside `draft_and_send_self_resolution_invitation`'s
transaction both run under the same async task), so two
concurrent calls cannot both pass the predicate and both write.
Adding a UNIQUE partial index to defend against multi-process /
HA deploys would require a new SQL migration which the spec's
`plan.md` explicitly forbids; HA is out of scope for this
feature.
Tests:
- 3 new unit tests on `resolve_effective_language` (known code,
unknown code falls back, structurally-invalid bundle returns
`None`). Total 295 lib tests + integration suite pass; clippy
clean.
This finding has been re-flagged four times now. The predicate
read + write window is not racy in single-process operation, and
the structural invariant that makes it safe lives several call
sites away from the predicate itself — making it easy to miss on
review. Add an inline comment at the predicate site explaining:
1. The mediation engine spawns exactly one tokio task that runs
ticks sequentially.
2. `run_ingest_tick` processes sessions serially via
`while let Some(res) = fetchers.join_next().await { ...
advance_session_round(...).await }`.
3. A per-session UNIQUE partial index would be the belt-and-braces
defence for HA / multi-process, but HA is out of scope per
`plan.md` and that file also forbids new SQL migrations as a
feature goal.
The reviewer's referenced "escalation_dispatches v5 pattern" is
materially different: v5's UNIQUE INDEX is on the
`escalation_dispatches` side-table (the dispatcher's own bookkeeping),
not on the `mediation_events` audit log. The matching audit row
(`escalation_dispatched`) has no UNIQUE constraint of its own. There
is no analogous side-table for self-resolution to apply that
pattern to. A blanket `UNIQUE(session_id, kind)` on `mediation_events`
would break other kinds that legitimately occur multiple times
(`classification_produced` per round, `inbound_ingested` per
inbound, `state_transition` per transition). A partial UNIQUE
index scoped to `kind = 'self_resolution_offered'` would work but
needs a migration `plan.md` forbids.
No code change beyond the comment. Tests still pass (295 lib +
integration suite); clippy clean.
There was a problem hiding this comment.
♻️ Duplicate comments (2)
src/mediation/follow_up.rs (1)
513-556:⚠️ Potential issue | 🟠 MajorKeep the session retryable if post-invitation summary delivery fails.
Line 517 moves the session to
Classifiedbeforedeliver_summary(), and Lines 542-556 return on error without a compensating transition. The gate at Lines 148-150 only revisitsAwaitingResponseor post-invitationSummaryDelivered, so one summary failure leaves this session permanently skipped on later ticks.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mediation/follow_up.rs` around lines 513 - 556, The code sets the session to MediationSessionState::Classified before calling deliver_summary(), but on deliver_summary() error it returns without reverting state, leaving the session non-retryable; update the Err(e) branch so before calling handle_reasoning_failure() (or before returning) you acquire the same conn lock and call db::mediation::set_session_state(...) to restore the session to MediationSessionState::AwaitingResponse (using super::current_ts_secs()?) so the gate that revisits AwaitingResponse can retry; reference the functions/ids set_session_state, deliver_summary, handle_reasoning_failure, and MediationSessionState::Classified/AwaitingResponse when making the change.src/mediation/self_resolution.rs (1)
134-147:⚠️ Potential issue | 🟠 MajorDon’t relay bundle diagnostics to parties.
If
entry_for()returnsNone, Lines 137-146 turn an internal invariant/configuration failure into user-visible chat content. The parser already rejects malformed bundles, and the missing-template path is supposed to inert this feature, sorender_for()should returnNone/Errhere and let the dispatch layer log + skip instead of publishing the repository path to a buyer/seller.Possible shape
-pub fn render_for(language_code: Option<&str>, templates: &SelfResolutionTemplates) -> String { - match templates.entry_for(language_code) { - Some(entry) => format!("{} {}", entry.template, entry.human_assistance_optin), - None => { - String::from( - "[serbero: self-resolution template bundle is missing the configured fallback language; \ - please ask the operator to verify prompts/phase3-self-resolution.md]", - ) - } - } +pub fn render_for( + language_code: Option<&str>, + templates: &SelfResolutionTemplates, +) -> Option<String> { + templates + .entry_for(language_code) + .map(|entry| format!("{} {}", entry.template, entry.human_assistance_optin)) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/mediation/self_resolution.rs` around lines 134 - 147, Change render_for to stop emitting internal diagnostics to users: make pub fn render_for(language_code: Option<&str>, templates: &SelfResolutionTemplates) -> Option<String> (or Result<Option<String>, _> if you prefer errors) and return Some(format!(...)) when templates.entry_for(language_code) yields Some(entry), but return None when entry_for returns None. Update the callers of render_for (the dispatch/engine layer that previously forwarded the string) to treat None as "feature inert" — log the configuration invariant locally with context and skip sending any user-visible content — so no internal bundle paths or diagnostics are relayed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/mediation/follow_up.rs`:
- Around line 513-556: The code sets the session to
MediationSessionState::Classified before calling deliver_summary(), but on
deliver_summary() error it returns without reverting state, leaving the session
non-retryable; update the Err(e) branch so before calling
handle_reasoning_failure() (or before returning) you acquire the same conn lock
and call db::mediation::set_session_state(...) to restore the session to
MediationSessionState::AwaitingResponse (using super::current_ts_secs()?) so the
gate that revisits AwaitingResponse can retry; reference the functions/ids
set_session_state, deliver_summary, handle_reasoning_failure, and
MediationSessionState::Classified/AwaitingResponse when making the change.
In `@src/mediation/self_resolution.rs`:
- Around line 134-147: Change render_for to stop emitting internal diagnostics
to users: make pub fn render_for(language_code: Option<&str>, templates:
&SelfResolutionTemplates) -> Option<String> (or Result<Option<String>, _> if you
prefer errors) and return Some(format!(...)) when
templates.entry_for(language_code) yields Some(entry), but return None when
entry_for returns None. Update the callers of render_for (the dispatch/engine
layer that previously forwarded the string) to treat None as "feature inert" —
log the configuration invariant locally with context and skip sending any
user-visible content — so no internal bundle paths or diagnostics are relayed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 7dfcefaa-b2ad-4f2d-b645-6aafeb8149a0
📒 Files selected for processing (3)
specs/005-cooperative-self-resolution/contracts/template-bundle.mdsrc/mediation/follow_up.rssrc/mediation/self_resolution.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- specs/005-cooperative-self-resolution/contracts/template-bundle.md
Even though single-process operation is safe (sequential engine
task → sequential per-session loop → AsyncMutex on Connection),
the previous fix lived only in a comment at the predicate site.
This commit moves the defence to the actual write site so the
guard is co-located with the critical section and survives any
future architectural change.
Implementation:
- `draft_and_send_self_resolution_invitation` now returns
`Result<bool>` instead of `Result<()>`:
* `Ok(true)` — invitation committed and published.
* `Ok(false)` — duplicate detected at write time; transaction
was rolled back, no gift-wraps published.
* `Err(_)` — genuine failure (failure-counter path applies).
- The transaction now re-checks
`db::mediation_events::session_has_self_resolution_offered(&tx,
session_id)` BEFORE writing. If a row exists (another path won
the race), the helper drops the transaction without committing,
rolling back the two `mediation_messages` rows that would have
been inserted, and returns `Ok(false)`.
- The dispatch arm in `advance_session_round` interprets the
return value: on `Ok(false)` it skips the pre-flip +
`deliver_summary` (the other path already handles that side of
the dispatch) and just advances `round_count_last_evaluated` so
the same fresh inbound isn't re-classified next tick.
This is option 3 from the reviewer's suggestions ("serialize
per-session writes with a lock to ensure 'fire at most once per
session'"). The serialization comes from the existing
`Arc<AsyncMutex<Connection>>` plus the in-TX re-check, not from a
UNIQUE constraint (which would require a SQL migration that
`plan.md` forbids as a feature goal). The `escalation_dispatches`
v5 pattern referenced in earlier reviews protects a separate
side-table — not the audit log; there is no analogous side-table
for self-resolution to apply that pattern to.
Tests:
- All 295 lib + integration tests still pass; clippy clean.
Verified the finding against current code. The placeholder string was unreachable in normal flow (the `templates_present` gate in `policy::evaluate` falls the cooperative branch through to legacy Summarize whenever `by_language.is_empty()`, and the parser rejects bundles whose `fallback_language` has no matching section at load time), but emitting a diagnostic operator-message into a party's chat on any future bypass would be a worse failure mode than skipping the branch. Tightened the API: - `mediation::self_resolution::render_for` now returns `Option<String>`. It returns `None` when `entry_for(...)` is `None` — i.e. when the bundle has neither the requested language nor the configured fallback. The diagnostic-placeholder string is gone. - `mediation::follow_up::draft_and_send_self_resolution_invitation` matches on the new return shape: when either party's render returns `None`, it logs a `warn!` and returns `Ok(false)` (the duplicate-skip outcome from the previous commit), so the dispatch caller naturally skips the publishes + state walk + the cooperative-summary delivery and just advances the evaluator marker. The party-facing chat never sees the placeholder. - Updated unit tests in `self_resolution.rs` (`.expect(...)` / assert `is_none()`) and the keyword-audit test in `tests/phase3_self_resolution_template_audit.rs` to handle the new return type. All 295 lib tests + integration suite still pass; clippy clean.
Verified findings: - self_resolution.rs:134-147 (render_for): already addressed in commit 5110ee6 — `render_for` now returns `Option<String>` and the dispatch caller treats `None` as "feature inert" (logs + returns `Ok(false)` so the cooperative branch is skipped). No further change needed. - follow_up.rs:513-556 (state revert on deliver_summary failure): real bug. The cooperative dispatch arm pre-flips `awaiting_response → classified` so `deliver_summary`'s opening `classified → summary_pending` transition is legal. If `deliver_summary` then returns `Err`, the previous code called `handle_reasoning_failure` and returned without reverting. The session stays in `classified` forever — the gate at the top of `advance_session_round` only accepts `awaiting_response` or post-invitation `summary_delivered`, so subsequent ticks skip it and the session is non-retryable. Fix: - `models::mediation`: add `Classified → AwaitingResponse` as a legal recovery edge with a comment explaining the use case (a dispatch arm that pre-flipped to `classified` and then saw the summary delivery fail). Test pin in `allowed_transitions_pass`. - `mediation::follow_up::advance_session_round` (cooperative arm): on `deliver_summary` error, acquire the lock and call `db::mediation::set_session_state(..., MediationSessionState::AwaitingResponse, now)` BEFORE invoking `handle_reasoning_failure`. A clock-acquisition or set-state failure logs loudly but does NOT prevent `handle_reasoning_failure` from running, so the consecutive-failure counter still advances and can eventually escalate. Out of scope: the legacy `Summarize` arm has the same shape and the same pre-existing limitation (documented in `mediation/mod.rs:991` as a "Phase 11 limitation"). Touching it would expand this PR's scope; the same fix can land as a separate defensive commit if desired. Tests: all 295 lib + integration tests still pass; clippy clean.
Summary
When the classifier marks a session as
coordination_failure_resolvablewith confidence ≥ the configured threshold (default0.75), Serbero now sends a templated, language-matched invitation to both parties inviting them to coordinate the next step among themselves, with an explicit human-assistance opt-in. Static repo templates (prompts/phase3-self-resolution.md) — the LLM only decides whether the branch fires, never what to say.Motivated by the 2026-04-27 production transcript on dispute
096fb2e4-…where the seller confirmed receipt of fiat in Spanish, classifier producedcoordination_failure_resolvableconfidence 0.74, dispatched the existing solver-facing summary — but the parties heard nothing back. This branch closes that gap.Spec / plan / tasks live under
specs/005-cooperative-self-resolution/.Surface changes
MediationEventKind::SelfResolutionOffered— written before outbound dispatch, payload{confidence, languages}, references rationale id (FR-120 / no inlined text).EscalationTrigger::PartyRequestedHuman— fires when the classifier detects an explicit human-assistance request on a round following aself_resolution_offeredevent. Routes through the existing Phase 4 dispatcher.PolicyDecision::SuggestSelfResolutionWithSummary { confidence }— new arm; falls through to legacySummarizewhen any pre-condition fails (kill-switch off, sub-threshold, prior offered).SummaryDelivered → EscalationRecommendedso the opt-in can lift the session out of the post-summary state.[mediation]keys —self_resolution_threshold(f32, default0.75) andself_resolution_enabled(bool, defaulttrue, master kill-switch). Range-validated at load time.ClassificationResponseextended with three additiveserde(default)fields:human_requested,buyer_language,seller_language. Language fields ride through the structured response so dispatch doesn't need a separate Rust-side detector.cooperative_case_detected(on dispatch) +cooperative_case_closed_externally(ondispute_resolvedfor sessions that received the invitation) for SC-001 / SC-002 baseline measurement.Backwards compatibility
build_classification_prompt/parse_classificationfrom the OpenAI module, so the prompt + parser changes land in both.phase3-self-resolution.mdyet log a warning at startup and inert the branch (placeholder rendered string) instead of refusing to start.Translations
Initial set: English / Spanish / Portuguese. Adding a new language is a two-file change (append a
[xx]section toprompts/phase3-self-resolution.mdAND extend the banned-substring matrix intests/phase3_self_resolution_template_audit.rs); the parser rejects unknown sections at parse time.Test plan
cargo test— 287 lib tests + integration suite pass with zero regressions.cargo clippy --all-targets— clean.cargo fmt --all --check— clean.policy::evaluatecovering: cooperative branch fires, threshold-inclusive (R-007), sub-threshold falls through, kill-switch off (SC-007), one-shot guard (FR-006 / SC-006), human-requested short-circuit (FR-008), defence in depth (no prior invitation), no-lock-in (US3 acceptance).tests/phase3_self_resolution_template_audit.rsasserts the bundle parses, every language is covered by the banned-substring matrix, and rendered strings carry no fund-action keywords + always include the opt-in sentence.SelfResolutionTemplates::render_for(5 cases: known language / unknown fallback /Nonefallback / case+whitespace normalization / placeholder on missing fallback) and the prompt-bundle parser (6 cases).quickstart.md.Out of scope (per spec)
coordination_failure_resolvable.Summary by CodeRabbit
New Features
Configuration
Behavior
Documentation
Tests