feat: route orchestrator selection through dossier and workbench#381
Conversation
|
Findings
|
82581ab to
bdfcfd9
Compare
|
Findings No architectural blockers or product-adherence issues found. The updated diff removes harness-only dossier/workbench paths from the selection prompt, keeps selection prompt input semantic, fingerprints selection with both the dossier-summary dependency and artifact digests, and validates dependency-task mismatches before resume/retry. |
|
The test additions are directionally right, but they still leave two important regressions unproven.
|
bdfcfd9 to
933b8d1
Compare
| @@ -5323,7 +5607,6 @@ func assertPromptOmitsLocalAgentSourceProvenance(t *testing.T, prompt string, so | |||
| for _, forbidden := range []string{ | |||
There was a problem hiding this comment.
🔵 Low (harness-engineering:harness-enforcement-reviewer): Removes "fingerprint" from the forbidden-strings list in assertPromptOmitsLocalAgentSourceProvenance without an explanation comment. The new TestSelectionOnlyPromptPreservesRoutingContractWithoutReviewerPromptBodies guards against the more-specific "fingerprint_inputs" string, but the original broad guard against bare "fingerprint" is now gone. If the removal is intentional (e.g., the word legitimately appears in some non-harness context), a one-line comment explaining the rationale would restore confidence in the invariant.
Reply to this thread when addressed.
| AgentID string | ||
| Rationale string | ||
| Files []string | ||
| AgentID string |
There was a problem hiding this comment.
🔵 Low (harness-engineering:harness-knowledge-reviewer): AllowedFiles is a new concept in the selection output contract with semantics distinct from Files — validation enforces it must be a subset of changed files, but the behavioral distinction (what an agent is 'allowed to access' vs what it 'should review') is not captured in any versioned doc. If there are docs describing the selection schema or reviewer agent authoring guide, they should be updated to explain when AllowedFiles is appropriate and how reviewers should interpret it vs Files.
Reply to this thread when addressed.
| if err != nil { | ||
| return llm.Selection{}, sessionDraft{}, ledger.Session{}, err | ||
| } | ||
| dependencyTaskIDs := []string{dossierSummaryTaskID} |
There was a problem hiding this comment.
🔵 Low (harness-engineering:harness-self-documenting-code-reviewer): The nested append(append([]string(nil), dependencyTaskIDs...), promptDeps...) is a correct but non-obvious Go idiom for safe slice concatenation. Since slices is already imported in this file, slices.Concat(dependencyTaskIDs, promptDeps) expresses the intent directly and is immediately readable.
Reply to this thread when addressed.
monit-reviewer
left a comment
There was a problem hiding this comment.
Automated PR Review
Reviewed commit: 933b8d1
Approved with 4 non-blocking suggestions below. Address at your discretion.
Summary
| Reviewer | Findings |
|---|---|
| harness-engineering:harness-enforcement-reviewer | 1 |
| harness-engineering:harness-knowledge-reviewer | 1 |
| harness-engineering:harness-self-documenting-code-reviewer | 2 |
harness-engineering:harness-enforcement-reviewer (1 findings)
💡 Suggestion - internal/pipeline/pipeline_test.go:5607
Removes
"fingerprint"from the forbidden-strings list inassertPromptOmitsLocalAgentSourceProvenancewithout an explanation comment. The newTestSelectionOnlyPromptPreservesRoutingContractWithoutReviewerPromptBodiesguards against the more-specific"fingerprint_inputs"string, but the original broad guard against bare"fingerprint"is now gone. If the removal is intentional (e.g., the word legitimately appears in some non-harness context), a one-line comment explaining the rationale would restore confidence in the invariant.
harness-engineering:harness-knowledge-reviewer (1 findings)
💡 Suggestion - internal/llm/contracts.go:38
AllowedFiles is a new concept in the selection output contract with semantics distinct from Files — validation enforces it must be a subset of changed files, but the behavioral distinction (what an agent is 'allowed to access' vs what it 'should review') is not captured in any versioned doc. If there are docs describing the selection schema or reviewer agent authoring guide, they should be updated to explain when AllowedFiles is appropriate and how reviewers should interpret it vs Files.
harness-engineering:harness-self-documenting-code-reviewer (2 findings)
💡 Suggestion - internal/pipeline/pipeline.go:1097
The nested
append(append([]string(nil), dependencyTaskIDs...), promptDeps...)is a correct but non-obvious Go idiom for safe slice concatenation. Sinceslicesis already imported in this file,slices.Concat(dependencyTaskIDs, promptDeps)expresses the intent directly and is immediately readable.
💡 Suggestion - internal/pipeline/pipeline.go:2560
In
selectionThreadPrompts, the anchor key for the map lookup is computed inline inside theifcondition —summaryByAnchor[dossierInlineThreadAnchorKey(thread.Path, string(thread.Side), thread.Line, string(thread.SubjectType))]— while the loop that builds the map extracts the same computation into a named variablekey. Extracting athreadAnchorKeylocal variable before theifwould make the lookup consistent with the indexing pattern above it and make the condition easier to scan.
1 info-level observations excluded. Run with --verbose to include.
3 PR discussion threads considered.
Completed in 3m 29s | $2.10 | sonnet | daemon 0.2.132 | Glorfindel
| Field | Value |
|---|---|
| Model | sonnet |
| Reviewers | hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor |
| Engine | claude · sonnet |
| Reviewed by | pr-review-daemon · monit-pr-reviewer |
| Duration | 3m 29s wall · 3m 25s compute (Reviewers: 2m 22s · Synthesis: 1m 03s) |
| Cost | $2.10 (estimated) |
| Tokens | 370.0k in / 21.9k out |
| Turns | 12 |
Per-workstream usage
| Workstream | Model | In | Out | Cache read | Cache create | Cost |
|---|---|---|---|---|---|---|
| hybrid-synthesis | sonnet | 37.5k | 2.7k | 13.1k | 24.3k (1h) | $0.19 |
| harness-engineering:harness-architecture-reviewer | sonnet | 63.0k | 686 | 13.1k | 49.9k (1h) | $0.31 |
| harness-engineering:harness-enforcement-reviewer | sonnet | 67.5k | 4.8k | 13.1k | 54.3k (1h) | $0.40 |
| harness-engineering:harness-knowledge-reviewer | sonnet | 64.9k | 2.3k | 13.1k | 51.7k (1h) | $0.35 |
| harness-engineering:harness-self-documenting-code-reviewer | sonnet | 71.6k | 8.3k | 13.1k | 58.5k (1h) | $0.48 |
| security:security-code-auditor | sonnet | 65.5k | 3.1k | 13.1k | 52.4k (1h) | $0.37 |
Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.
Note: Posted 3 inline comments. 1 finding remained in the summary because GitHub could not attach them to the current diff.
Summary
allowed_filesselection support and tighten tests around prompt shape and budget behaviorCloses #371