openclaw · coygeek · Jun 10, 2026 · Jun 10, 2026 · Jun 10, 2026
diff --git a/.github/workflows/validate.yml b/.github/workflows/validate.yml
@@ -26,7 +26,9 @@ jobs:
       - name: Check shell scripts
         run: bash -n skills/autoreview/scripts/test-review-harness
       - name: Check Python scripts
-        run: python3 -m py_compile skills/autoreview/scripts/autoreview skills/autoreview/scripts/test-review-harness.py
+        run: python3 -m py_compile skills/autoreview/scripts/autoreview skills/autoreview/scripts/test-review-harness.py skills/autoreview/scripts/autoreview_test.py
+      - name: Run Python tests
+        run: python3 -m unittest skills/autoreview/scripts/autoreview_test.py
       - name: Check Node scripts
         run: node --check skills/agent-transcript/scripts/agent-transcript
       - name: Run Node tests

diff --git a/skills/autoreview/SKILL.md b/skills/autoreview/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: autoreview
-description: "Run a structured code review (Codex default, Claude optional) as a closeout check on a local or PR branch before commit or ship."
+description: "Run a structured code review (Codex default, optional Claude, Droid, Copilot, or Cursor) as a closeout check on a local or PR branch before commit or ship."
 ---
 
 # Auto Review
@@ -11,7 +11,7 @@ Codex review is the default when no engine is set. It usually delivers the best
 
 Use when:
 
-- user asks for Codex review / Claude review / autoreview / second-model review
+- user asks for Codex review / Claude review / Cursor review / autoreview / second-model review
 - after non-trivial code edits, before final/commit/ship
 - reviewing a local branch or PR branch after fixes
 
@@ -29,7 +29,7 @@ Use when:
 - For security-audit suppression changes, verify accepted findings remain auditable: suppressed findings stay in structured output, active output keeps an unsuppressible suppression notice, and aggregate findings cannot hide unrelated active risk.
 - Never switch or override the requested review engine/model. If the review hits model capacity, retry the same command a few times with the same engine/model.
 - Be patient with large bundles. Structured review can take up to 30 minutes while the model call is active, especially with Codex tools or web search.
-- Treat heartbeat lines like `review still running: ... elapsed=... pid=...` as healthy progress, not a hang. Let the helper continue while heartbeats are advancing. Pass `--stream-engine-output` when live engine text is useful; Codex and Claude filter tool/file chatter, other engines pass raw output through.
+- Treat heartbeat lines like `review still running: ... elapsed=... pid=...` as healthy progress, not a hang. Let the helper continue while heartbeats are advancing. Pass `--stream-engine-output` when live engine text is useful; Codex, Claude, and Cursor filter noisy tool/file chatter, other engines pass raw output through.
 - Do not kill a review just because it has been quiet for 2-5 minutes, or because it is still running under the 30-minute window. Inspect the process only after missing multiple expected heartbeats, after 30 minutes, or after an obviously failed subprocess; prefer letting the same helper command finish.
 - Tools are useful in review mode. The helper allows read-only inspection tools and web search by default so reviewers can check dependency contracts, upstream docs, and current behavior.
 - Security perspective is always included, but it should not cripple legitimate functionality. Report security findings only when the change creates a concrete, actionable risk or removes an important safety check.
@@ -144,6 +144,10 @@ Run multiple reviewers against one frozen bundle:
 "$AUTOREVIEW" --panel
 ```
 
+`--reviewers all` intentionally excludes Cursor. Cursor review can require
+additional trust flags when workspace instructions or local MCP config are
+present, so opt into Cursor explicitly.
+
 Set reviewer models and thinking/effort explicitly:
 
 ```bash
@@ -158,7 +162,41 @@ Inline syntax is also supported:
 
 Codex maps thinking to `model_reasoning_effort` and accepts `low`, `medium`,
 `high`, or `xhigh`. Claude maps thinking to `--effort` and also accepts `max`.
-Engines without a real thinking knob reject `--thinking`.
+Cursor supports `--model` but rejects `--thinking`. Engines without a real
+thinking knob reject `--thinking`.
+
+## Cursor review
+
+Run Cursor explicitly:
+
+```bash
+"$AUTOREVIEW" --engine cursor --model auto
+```
+
+Cursor review uses `cursor-agent` in `--mode ask` with `--workspace <repo>`,
+`--trust`, and `--sandbox enabled` by default. Authenticate with
+`cursor-agent login` or `CURSOR_API_KEY`.
+
+Optional Cursor-specific controls:
+
+- `CURSOR_BIN` or `--cursor-bin` to select the CLI binary
+- `AUTOREVIEW_CURSOR_SANDBOX=enabled|disabled` or `--cursor-sandbox ...`
+- `AUTOREVIEW_CURSOR_APPROVE_MCPS=1`, `--cursor-approve-mcps`, or `--no-cursor-approve-mcps` for trusted MCP-enabled environments
+- `AUTOREVIEW_CURSOR_ALLOW_WORKSPACE_INSTRUCTIONS=1`, `--cursor-allow-workspace-instructions`, or `--no-cursor-allow-workspace-instructions` when the repo contains Cursor rules, `AGENTS.md`, `CLAUDE.md`, or local MCP config
+
+Cursor caveats:
+
+- `--thinking`, `--no-tools`, and `--no-web-search` are not supported
+- Cursor CLI currently documents no `--ignore-rules` / safe-mode equivalent for
+  review runs
+- Cursor CLI also does not document a schema flag like Codex/Claude, so the
+  helper validates Cursor JSON locally and retries malformed structured output a
+  few times before failing
+- by default the helper fails closed when workspace `.cursor/rules`,
+  `AGENTS.md`, `CLAUDE.md`, or local MCP config would be loaded; opt in with
+  `--cursor-allow-workspace-instructions` only for trusted repos
+- local MCP config also fails closed unless `--cursor-approve-mcps` is set
+- keep MCP auto-approval deliberate, even in trusted repos
 
 ## Context Efficiency
 
@@ -196,15 +234,15 @@ The helper:
 - accepts `--mode uncommitted` as an alias for `--mode local`
 - otherwise uses current PR base if `gh pr view` works
 - otherwise uses `origin/main` for non-main branches
-- supports `--engine codex`, `claude`, `droid`, and `copilot`; default is `AUTOREVIEW_ENGINE` or `codex`; Codex should remain the default when nothing is set
+- supports `--engine codex`, `claude`, `droid`, `copilot`, and `cursor`; default is `AUTOREVIEW_ENGINE` or `codex`; Codex should remain the default when nothing is set
 - resolves bare `git`, `gh`, reviewer, and PowerShell shell commands from absolute `PATH` entries only, never from the reviewed checkout; explicit relative `--*-bin` paths are resolved from the reviewed repository root
 - use `--mode commit --commit <ref>` for already-committed work, especially clean `main` after landing
 - should be left in `--mode auto` or forced to `--mode branch` for PR/branch work; do not force `--mode local` after committing
 - writes only to stdout unless `--output`, `--json-output`, or live streamed engine stderr is set
 - supports `--dry-run`, `--parallel-tests`, `--parallel-tests-shell`, `--prompt`, `--prompt-file`, `--dataset`, `--no-tools`, `--no-web-search`, and commit refs
-- supports `--stream-engine-output` or `AUTOREVIEW_STREAM_ENGINE_OUTPUT=1` for live engine text while preserving structured validation; Codex and Claude hide tool/file event details, emit compact activity summaries, and report usage at turn completion
-- supports opt-in review panels with `--panel` / `--reviewers`, plus per-engine `--model` and `--thinking`
-- allows read-only tools and web search by default where the selected CLI supports them; forbids nested review in the prompt; Codex is run through `codex exec` with read-only sandbox and structured output
+- supports `--stream-engine-output` or `AUTOREVIEW_STREAM_ENGINE_OUTPUT=1` for live engine text while preserving structured validation; Codex, Claude, and Cursor hide noisy tool/status events, emit compact activity summaries, and report usage at turn completion
+- supports opt-in review panels with `--panel` / `--reviewers`, plus per-engine `--model` and `--thinking` where the selected engine supports it
+- allows read-only tools and web search by default where the selected CLI supports them; forbids nested review in the prompt; Codex is run through `codex exec` with read-only sandbox and structured output, and Cursor is run through `cursor-agent --mode ask --workspace <repo> --sandbox ...` with session/request metadata echoed for traceability and a fail-closed workspace-instruction gate
 - prints `review still running: <engine> elapsed=<seconds>s pid=<pid>` to stderr at long-running intervals while waiting for the selected review engine, unless streamed output or compact Codex activity has been visible recently
 - prints `autoreview clean: no accepted/actionable findings reported` when the selected review command exits 0
 - exits nonzero when accepted/actionable findings are present