feat(content-refinement): reviewer decision bands + Devil's Advocate concession guard #9

Open
Ar9av wants to merge 1 commit into main from feature/reviewer-rubric-decision-bands


Ar9av commented May 26, 2026

What

Upgrades the content-refinement-agent simulated-reviewer loop with two deterministic mechanisms, both backward-compatible with the existing score_delta.py exit-code contract:

  1. 0-100 decision bands with a principled target-met early halt.
  2. A Devil's Advocate concession guard that stops the reviewer from sycophantically caving.

This is item #4 from the study of academic-research-skills, reimplemented to fit this repo (concepts only — no files copied; that repo is CC-BY-NC).

Why

The loop already had a 6-axis 0-100 rubric, anti-inflation caps, score deltas, and a prose-only DA protocol. Two gaps:

  • The reviewer's decision was free-form and not tied to the numeric score — it could drift (e.g. "Accept" at overall 62), and the loop had no absolute quality target, only relative deltas.
  • The DA concession rules ("concede only on strong rebuttal", "no consecutive concessions") were prose only — nothing stopped the simulated reviewer from quietly relaxing them and caving on a real weakness.

How

| File | Role |
| --- | --- |
| scripts/decision_band.py | new: maps overall_score to a canonical band (Accept ≥80 / Minor 65–79 / Major 50–64 / Reject <50); importable + CLI |
| scripts/score_delta.py | extended: emits decision_band_prev/curr; adds target-met halt (exit 5) |
| scripts/concession_guard.py | new: deterministically enforces the DA concession protocol |
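
For readers who want the band mapping spelled out, here is a minimal sketch. It is not the code in scripts/decision_band.py. The function name `decision_band` and the range check are illustrative assumptions. The listed ranges are integer-valued, so the sketch also assumes a fractional score belongs to the band whose lower threshold it has reached (this matches the tested 74.6 → Minor case, but a score such as 79.5 is a guess).

```python
def decision_band(overall_score: float) -> str:
    """Map a 0-100 overall score to a decision band.

    Thresholds (80 / 65 / 50) come from the PR description; treating each
    band as "score >= lower threshold" is an assumption for fractional scores.
    """
    if not 0 <= overall_score <= 100:
        # Hypothetical validation; the real script may handle this differently.
        raise ValueError(f"overall_score out of range: {overall_score}")
    if overall_score >= 80:
        return "Accept"
    if overall_score >= 65:
        return "Minor Revision"
    if overall_score >= 50:
        return "Major Revision"
    return "Reject"


if __name__ == "__main__":
    for score in (81, 74.6, 58, 42):
        print(score, "->", decision_band(score))
```

Checking thresholds from highest to lowest keeps the mapping total over the 0-100 range, so no score can fall between two bands.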

Decision bands + target-met halt. Bands give the loop an absolute target on top of the relative accept/revert logic. When an accepted iteration reaches the Accept band (overall ≥ 80), score_delta.py returns the new exit code 5 (HALT_TARGET_MET): the loop promotes the current draft and stops, rather than risking a regression in pursuit of marginal gains. The target-met halt takes precedence over the plateau halt; --no-target-halt disables it (the band is still reported). Exit codes 0/1/2/4 are unchanged, so existing callers keep working.
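
The precedence described above can be sketched as a thin layer over the pre-existing exit-code logic. This is only an illustration under stated assumptions. `resolve_exit`, `base_exit`, and the `accepted` flag are hypothetical names, and the sketch treats whatever the older logic would have returned (including a plateau halt) as an opaque `base_exit` rather than guessing which of the existing codes means what.

```python
EXIT_HALT_TARGET_MET = 5   # new exit code named in this PR
ACCEPT_THRESHOLD = 80.0    # lower bound of the Accept band


def resolve_exit(base_exit: int, accepted: bool, curr_overall: float,
                 target_halt: bool = True) -> int:
    """Apply the target-met halt on top of an existing exit code.

    base_exit    -- code the pre-existing logic would return (assumed opaque)
    accepted     -- whether this iteration was accepted (hypothetical flag)
    curr_overall -- overall score of the current iteration
    target_halt  -- False models the --no-target-halt flag
    """
    if target_halt and accepted and curr_overall >= ACCEPT_THRESHOLD:
        # Overrides any halt the older logic chose for an accepted
        # iteration, which is how it takes precedence over the plateau halt.
        return EXIT_HALT_TARGET_MET
    return base_exit


if __name__ == "__main__":
    print(resolve_exit(0, True, 81.0))                     # reaches band
    print(resolve_exit(0, True, 74.6))                     # accepted, below band
    print(resolve_exit(0, True, 81.0, target_halt=False))  # halt disabled
    print(resolve_exit(1, False, 85.0))                    # regression, not accepted
```

Because the override only fires for accepted iterations, a reverted iteration keeps its original exit code even if its score happens to sit in the Accept band.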

DA concession guard. Reads a concession log (da_concessions.json) and re-derives which concessions are legitimate:

  • Rule 1 — evidence: a concession is valid only at rebuttal_score ≥ 4. Caving at ≤3 is rejected; the finding is restored to standing.
  • Rule 2 — no consecutive concessions (iron rule): a concession in a round immediately following another is rejected.
  • Verdicts: a standing CRITICAL (unresolved and not validly conceded) yields exit 1 (BLOCK), and the host treats the iteration as REVERT regardless of rubric scores. A rejected non-blocking concession yields exit 2 (WARN), and the DA must restate. Exit 3 signals a schema error. A clear log yields exit 0 (PROCEED).
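
The two rules and the verdict mapping above can be sketched as follows. This is a hedged illustration, not the contents of scripts/concession_guard.py. The `Finding` fields, the status and severity vocabularies, and the function name `run_guard` are all assumptions, since the real da_concessions.json schema is documented in references/da-reviewer.md rather than here. The sketch also assumes that only an upheld concession counts as the "previous" one for Rule 2, and that concessions within the same round do not block each other.

```python
from dataclasses import dataclass

EXIT_PROCEED, EXIT_BLOCK, EXIT_WARN, EXIT_SCHEMA = 0, 1, 2, 3
MIN_REBUTTAL_SCORE = 4                           # Rule 1 threshold
SEVERITIES = {"CRITICAL", "MAJOR", "MINOR"}      # assumed vocabulary
STATUSES = {"resolved", "conceded", "standing"}  # assumed vocabulary


@dataclass
class Finding:  # hypothetical shape of one concession-log entry
    id: str
    severity: str
    status: str
    round: int = 0           # round in which the concession was made
    rebuttal_score: int = 0  # strength of the author's rebuttal


def run_guard(findings: list[Finding]) -> tuple[int, list[str]]:
    """Return (exit_code, ids of rejected concessions)."""
    if any(f.severity not in SEVERITIES or f.status not in STATUSES
           for f in findings):
        return EXIT_SCHEMA, []

    rejected: list[str] = []
    last_upheld_round = None
    conceded = sorted((f for f in findings if f.status == "conceded"),
                      key=lambda f: f.round)
    for f in conceded:
        weak = f.rebuttal_score < MIN_REBUTTAL_SCORE            # Rule 1
        consecutive = (last_upheld_round is not None
                       and f.round == last_upheld_round + 1)    # Rule 2
        if weak or consecutive:
            f.status = "standing"  # restore the finding
            rejected.append(f.id)
        else:
            last_upheld_round = f.round

    if any(f.severity == "CRITICAL" and f.status == "standing"
           for f in findings):
        return EXIT_BLOCK, rejected
    if rejected:
        return EXIT_WARN, rejected
    return EXIT_PROCEED, rejected


if __name__ == "__main__":
    log = [Finding("m1", "MAJOR", "conceded", round=1, rebuttal_score=5),
           Finding("m2", "MAJOR", "conceded", round=2, rebuttal_score=5)]
    print(run_guard(log))
```

Re-deriving each concession's validity from the logged round and rebuttal score, rather than trusting the logged status, is what keeps the check deterministic: the reviewer's own claim that a concession was justified never enters the verdict.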

Testing

  • py_compile clean on all three scripts.
  • decision_band.py: 81→Accept, 74.6→Minor, 58→Major, 42→Reject; reads overall_score from a real score.json.
  • score_delta.py: accept below band → exit 0 (Major→Minor); reaches band → HALT_TARGET_MET exit 5; --no-target-halt → exit 0; regression → exit 1. Regression-tested on the real workspace iter1 and iter2 files: still ACCEPT_IMPROVED, exit 0, behavior preserved.
  • concession_guard.py: clear→0/PROCEED, caving-on-critical→1/REVERT (concession rejected, critical stands), consecutive-concessions-on-majors→2/DA_RESTATE, unresolved-critical→1/REVERT, bad severity→3.

Docs

  • references/reviewer-rubric.md — decision-band table + decision_band output field
  • references/da-reviewer.md — concession-log schema + guard verdict→action table
  • references/halt-rules.md — exit 5, decision bands, guard override of ACCEPT→REVERT
  • SKILL.md — wired into Steps 1/2/5/6, Resources, and frontmatter description

Note: branched off main, independent of #8 (which touches literature-review-agent). No overlap.

Make the simulated-reviewer halt logic more principled with two deterministic
additions, both backward compatible with the existing score_delta exit codes.

Decision bands (decision_band.py):
- Map the 0-100 overall_score to a canonical band: Accept (>=80), Minor
  Revision (65-79), Major Revision (50-64), Reject (<50). Importable + CLI.
- score_delta.py now emits decision_band_prev/curr on every comparison and adds
  a target-met halt (exit 5): when an accepted iteration reaches the Accept
  band, the loop promotes the current draft and stops rather than risk
  regressing it chasing marginal gains. Takes precedence over the plateau halt;
  disable with --no-target-halt. Existing exit codes 0/1/2/4 unchanged.

DA concession guard (concession_guard.py):
- Enforce the Devil's Advocate concession-threshold protocol deterministically
  so the simulated reviewer cannot sycophantically cave. Reads a concession log
  and rejects any concession made at rebuttal_score < 4 or in a round
  immediately following another concession, restoring the affected finding to
  'standing'. A standing CRITICAL blocks acceptance (exit 1 -> host treats the
  iteration as REVERT); a rejected non-blocking concession is a WARN (exit 2).

Docs: reviewer-rubric.md (bands + decision_band field), da-reviewer.md
(concession-log schema + guard verdict table), halt-rules.md (exit 5 + guard
override), and SKILL.md (wired into Steps 1/2/5/6, Resources, description).