feat: prepare pinned read-only PR workbench#380
Conversation
1fb8e3a to
c7001a1
Compare
|
Findings
|
a1a06ae to
60332b7
Compare
|
Findings
|
60332b7 to
bc11571
Compare
|
Findings
|
|
The suite is directionally good, but it still leaves a few explicit ticket guarantees unproven.
|
bc11571 to
297e2ea
Compare
monit-reviewer
left a comment
There was a problem hiding this comment.
Automated PR Review
Reviewed commit: 297e2ea
Summary
| Reviewer | Findings |
|---|---|
| harness-engineering:harness-enforcement-reviewer | 2 |
| harness-engineering:harness-knowledge-reviewer | 2 |
| harness-engineering:harness-self-documenting-code-reviewer | 2 |
| security:security-code-auditor | 1 |
harness-engineering:harness-enforcement-reviewer (2 findings)
internal/pipeline/pipeline.go:50
The package-level
workbenchAutoUnlockOnExit boolis a test-only escape hatch embedded in production code.shouldAutoUnlockWorkbenchOnExitORs it withOptions.AutoUnlockWorkbenchOnExit, creating hidden production-visible state that bypasses the options pattern used everywhere else. No structural guard prevents non-test code from setting it, and co-loading this package into any in-process test suite (fuzz harness, integration suite) silently disables the read-only workbench guarantee for all pipeline runs in that process. Remove the global and route all unlock control throughOptions.AutoUnlockWorkbenchOnExit; thetestRuntimeOptions/ harness-helper path already wires this correctly for most call sites.
💡 Suggestion - internal/cmd/noleak/noleak_test.go:1119
The
shouldSkipOwnedPathexclusion forworkbench/repopermanently removes this subtree from the owned-files leak detector. The compensatingTestDefaultRemoveAllUnlocksReadOnlyArtifactTreecovers the unlock+remove primitive in isolation but does not verify the full end-to-end cleanup lifecycle. Consider asserting after the noleak run that the workbench directory inside the artifact root does not exist, so structural cleanup verification is preserved without needing the skip.
harness-engineering:harness-knowledge-reviewer (2 findings)
💡 Suggestion - internal/pipeline/pipeline.go:2855
The
workbenchMetadataArtifactstruct introduces a durable JSON artifact schema (schema_version: 1) with fields likefingerprint_inputs,checkout_mode, andsource_repo_rootthat are not documented. Thecheckout_modeconstantartifact-cloneimplies future modes are possible, making this an extensible format. Agents consumingworkbench/metadata.jsonin future runs have no versioned reference for what these fields mean or howfingerprint_inputsis used to identify stale workbenches.
💡 Suggestion - internal/pipeline/pipeline.go:48
The workbench contract — read-only checkout at
workbench/repo, writable scratch atworkbench/scratch, pinned to head SHA, with source repo provenance validation — is expressed only in code. A shortdocs/entry describing the workbench layout and its provenance validation rules would make it discoverable to future agent sessions without requiring reverse-engineering of the pipeline.
harness-engineering:harness-self-documenting-code-reviewer (2 findings)
internal/pipeline/pipeline.go:2997
In
resolveWorkbenchBaseRemoteURL, the parsed host is silently discarded via blank identifier:_, owner, repo, _, err := parseWorkbenchRemoteURL(originURL). A reader skimming this function may assume host validation occurs here — it does not; only owner/repo are checked. Two open PR discussion threads confirm this is a real correctness gap: host validation is bypassed when both base and head commits already exist locally, meaning a checkout from a different host can be accepted as the PR workbench. Extract the host into a named variable and either assert it or explicitly comment why it is deferred.
💡 Suggestion - internal/cmd/reviewcmd/reviewcmd_test.go:2723
testRuntimeOptionsdescribes where the function is used (tests) but not what it does. The body only setsAutoUnlockWorkbenchOnExit: true; a reader must inspect it to know that. A name likewithAutoUnlockWorkbenchorruntimeOptsWithAutoUnlockwould make the transformation self-evident at every call site.
security:security-code-auditor (1 findings)
💡 Suggestion - internal/pipeline/pipeline.go:3009
branch.Ref(sourced from the GitHub API response) is passed as a positional argument togit fetch --no-tags <remote> <ref>without local validation.exec.Commandprevents shell injection, but git treats arguments beginning with-as flags. A ref value such as--upload-pack=<path>would cause git to execute an arbitrary binary. GitHub validates branch names in practice, but adding a guard that assertsbranch.Refstarts withrefs/(or at minimum does not start with-) before passing it to git provides defense in depth.
4 PR discussion threads considered.
Completed in 5m 13s | $2.79 | sonnet | daemon 0.2.132 | Glorfindel
| Field | Value |
|---|---|
| Model | sonnet |
| Reviewers | hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor |
| Engine | claude · sonnet |
| Reviewed by | pr-review-daemon · monit-pr-reviewer |
| Duration | 5m 13s wall · 5m 10s compute (Reviewers: 3m 40s · Synthesis: 1m 29s) |
| Cost | $2.79 (estimated) |
| Tokens | 455.9k in / 33.7k out |
| Turns | 12 |
Per-workstream usage
| Workstream | Model | In | Out | Cache read | Cache create | Cost |
|---|---|---|---|---|---|---|
| hybrid-synthesis | sonnet | 42.7k | 4.8k | 13.1k | 29.6k (1h) | $0.25 |
| harness-engineering:harness-architecture-reviewer | sonnet | 78.4k | 1.8k | 13.1k | 65.2k (1h) | $0.42 |
| harness-engineering:harness-enforcement-reviewer | sonnet | 80.8k | 3.9k | 13.1k | 67.7k (1h) | $0.47 |
| harness-engineering:harness-knowledge-reviewer | sonnet | 79.1k | 2.4k | 13.1k | 66.0k (1h) | $0.44 |
| harness-engineering:harness-self-documenting-code-reviewer | sonnet | 85.4k | 7.9k | 13.1k | 72.2k (1h) | $0.56 |
| security:security-code-auditor | sonnet | 89.5k | 12.9k | 13.1k | 76.3k (1h) | $0.66 |
Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.
297e2ea to
b0beef8
Compare
monit-reviewer
left a comment
There was a problem hiding this comment.
Automated PR Review
Reviewed commit: b0beef8 | Previous: 297e2ea (incremental)
Summary
No issues found.
11 PR discussion threads considered.
Completed in 3m 38s | $8.09 | sonnet | daemon 0.2.132 | Glorfindel
| Field | Value |
|---|---|
| Model | sonnet |
| Mode | Re-review · Cycle 2 · Session resumed |
| Reviewers | hybrid-synthesis, documentation:docs-reviewer, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor |
| Engine | claude · sonnet |
| Reviewed by | pr-review-daemon · monit-pr-reviewer |
| Duration | 3m 38s wall · 1m 31s compute (Reviewers: 1m 10s · Synthesis: 21s) |
| Cost | $8.09 (estimated) |
| Tokens | 1395.7k in / 46.1k out |
| Turns | 26 |
Per-workstream usage
| Workstream | Model | In | Out | Cache read | Cache create | Cost |
|---|---|---|---|---|---|---|
| hybrid-synthesis | sonnet | 101.2k | 5.6k | 26.3k | 74.9k (1h) | $0.54 |
| documentation:docs-reviewer | sonnet | 34.5k | 2.0k | 13.1k | 21.3k (1h) | $0.16 |
| harness-engineering:harness-architecture-reviewer | sonnet | 238.7k | 2.9k | 26.3k | 212.4k (1h) | $1.33 |
| harness-engineering:harness-enforcement-reviewer | sonnet | 247.1k | 6.1k | 26.3k | 220.8k (1h) | $1.42 |
| harness-engineering:harness-knowledge-reviewer | sonnet | 240.9k | 3.4k | 26.3k | 214.6k (1h) | $1.35 |
| harness-engineering:harness-self-documenting-code-reviewer | sonnet | 261.2k | 11.2k | 26.3k | 234.9k (1h) | $1.59 |
| security:security-code-auditor | sonnet | 272.2k | 14.9k | 26.3k | 245.9k (1h) | $1.71 |
Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.
b0beef8 to
f338537
Compare
monit-reviewer
left a comment
There was a problem hiding this comment.
Automated PR Review
Reviewed commit: f338537 | Previous: b0beef8 (incremental)
Approved with 1 non-blocking suggestion below. Address at your discretion.
Summary
| Reviewer | Findings |
|---|---|
| harness-engineering:harness-self-documenting-code-reviewer | 1 |
harness-engineering:harness-self-documenting-code-reviewer (1 findings)
💡 Suggestion - internal/pipeline/pipeline_test.go:44
liveForTestinternally setsopts.AutoUnlockWorkbenchOnExit = true, but every call site also setsAutoUnlockWorkbenchOnExit: trueexplicitly in theOptionsliteral, creating redundancy.dryRunForTestandselectionOnlyForTestcallers do not set the field — the wrapper handles it. The inconsistency makes a reader wonder whether the explicit field in theliveForTestliterals is load-bearing or redundant. Remove the explicitAutoUnlockWorkbenchOnExit: truefromliveForTestcall sites to match the pattern established by the other two wrappers.
11 PR discussion threads considered.
Completed in 3m 32s | $16.82 | sonnet | daemon 0.2.132 | Glorfindel
| Field | Value |
|---|---|
| Model | sonnet |
| Mode | Re-review · Cycle 3 · Session resumed |
| Reviewers | hybrid-synthesis, documentation:docs-reviewer, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor |
| Engine | claude · sonnet |
| Reviewed by | pr-review-daemon · monit-pr-reviewer |
| Duration | 3m 32s wall · 3m 00s compute (Reviewers: 2m 37s · Synthesis: 23s) |
| Cost | $16.82 (estimated) |
| Tokens | 2877.5k in / 59.9k out |
| Turns | 40 |
Per-workstream usage
| Workstream | Model | In | Out | Cache read | Cache create | Cost |
|---|---|---|---|---|---|---|
| hybrid-synthesis | sonnet | 167.9k | 6.6k | 39.4k | 128.4k (1h) | $0.88 |
| documentation:docs-reviewer | sonnet | 76.4k | 2.2k | 26.3k | 50.1k (1h) | $0.34 |
| harness-engineering:harness-architecture-reviewer | sonnet | 492.1k | 3.1k | 39.4k | 452.7k (1h) | $2.77 |
| harness-engineering:harness-enforcement-reviewer | sonnet | 509.8k | 8.1k | 39.4k | 470.3k (1h) | $2.96 |
| harness-engineering:harness-knowledge-reviewer | sonnet | 496.1k | 3.6k | 39.4k | 456.6k (1h) | $2.81 |
| harness-engineering:harness-self-documenting-code-reviewer | sonnet | 540.2k | 18.9k | 39.4k | 500.8k (1h) | $3.30 |
| security:security-code-auditor | sonnet | 549.6k | 16.4k | 39.4k | 510.1k (1h) | $3.32 |
| discussion-summarizer | — | 45.5k | 1.0k | 3.7k | 41.8k (1h) | $0.45 |
Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.
| return SelectionOnly(ctx, opts, req) | ||
| } | ||
|
|
||
| func liveForTest(ctx context.Context, opts Options, req Request, run ledger.Run) (Result, error) { |
There was a problem hiding this comment.
🔵 Low (harness-engineering:harness-self-documenting-code-reviewer): liveForTest internally sets opts.AutoUnlockWorkbenchOnExit = true, but every call site also sets AutoUnlockWorkbenchOnExit: true explicitly in the Options literal, creating redundancy. dryRunForTest and selectionOnlyForTest callers do not set the field — the wrapper handles it. The inconsistency makes a reader wonder whether the explicit field in the liveForTest literals is load-bearing or redundant. Remove the explicit AutoUnlockWorkbenchOnExit: true from liveForTest call sites to match the pattern established by the other two wrappers.
Reply to this thread when addressed.
Summary
Closes #370
Testing