Skip to content

feat: route orchestrator selection through dossier and workbench#381

Merged
rianjs merged 1 commit into
mainfrom
issue-371-orchestrator-dossier-workbench
Jun 24, 2026
Merged

feat: route orchestrator selection through dossier and workbench#381
rianjs merged 1 commit into
mainfrom
issue-371-orchestrator-dossier-workbench

Conversation

@rianjs

@rianjs rianjs commented Jun 24, 2026

Copy link
Copy Markdown
Contributor

Summary

  • route orchestrator selection through dossier and workbench artifacts instead of raw diff hunks
  • preserve selection-task digest invalidation and durable dependency metadata for dossier summaries
  • add allowed_files selection support and tighten tests around prompt shape and budget behavior

Closes #371

@rianjs

rianjs commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Findings

  • Major: internal/pipeline/pipeline.go:2327 / internal/pipeline/pipeline.go:2432 serializes dossier artifact paths, root dirs, and SHA bookkeeping into the selection prompt. That is harness/artifact state, not reviewer-facing context, and it violates the checkout-native contract’s separation of review semantics from run mechanics. Keep root_dir, index_path, per-artifact path, and hashes in fingerprint inputs only; project the prompt as semantic dossier sections.

  • Major: internal/pipeline/pipeline.go:1119 stores dependency_task_ids=[dossier-discussion-summary], but internal/pipeline/pipeline.go:1122 computes the selection fingerprint with promptDeps instead of the actual dependency task IDs. The durable task contract says dependency task IDs are part of input_fingerprint; artifact digest inputs should be additive, not a replacement. Include dossierSummaryTaskID in the fingerprint inputs and validate stored dependency IDs on resume.

@rianjs rianjs force-pushed the issue-371-orchestrator-dossier-workbench branch from 82581ab to bdfcfd9 Compare June 24, 2026 19:31
@rianjs

rianjs commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Findings

No architectural blockers or product-adherence issues found. The updated diff removes harness-only dossier/workbench paths from the selection prompt, keeps selection prompt input semantic, fingerprints selection with both the dossier-summary dependency and artifact digests, and validates dependency-task mismatches before resume/retry.

@rianjs

rianjs commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

The test additions are directionally right, but they still leave two important regressions unproven.

  • internal/pipeline/pipeline_test.go#L3994 only exercises selectionPromptInputFromArtifacts in isolation. That means a bug where runSelectionPhase stops feeding promptDeps into llmTaskFingerprint, or stops wiring the new dependency task IDs into the structured task spec, would still pass. This does not actually prove the ticket’s fingerprint-invalidation criterion end to end. I would add a stale-metadata rerun test that mutates dossier/workbench content and asserts the selection task is rejected or rerun fresh.

  • internal/pipeline/pipeline_test.go#L604 checks for diff --git and @@ markers, but that is not a strong proof that full diff hunks are gone. A regression could still embed raw patch text in another serialized form and this test would miss it. The assertion should compare against the actual diff payload or otherwise prove the selection prompt is built only from dossier/workbench artifacts, not raw patch content.

  • internal/pipeline/pipeline_test.go#L3960 and internal/llm/contracts_test.go#L11 only prove that allowed_files is accepted and mentioned somewhere. They do not prove the selection prompt/example guides the model to omit it for broad reviewers, which is the part most likely to regress into overusing the new field. A small prompt-shape assertion for the omission case would tighten that up.

@rianjs rianjs force-pushed the issue-371-orchestrator-dossier-workbench branch from bdfcfd9 to 933b8d1 Compare June 24, 2026 19:38
@@ -5323,7 +5607,6 @@ func assertPromptOmitsLocalAgentSourceProvenance(t *testing.T, prompt string, so
for _, forbidden := range []string{

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-enforcement-reviewer): Removes "fingerprint" from the forbidden-strings list in assertPromptOmitsLocalAgentSourceProvenance without an explanation comment. The new TestSelectionOnlyPromptPreservesRoutingContractWithoutReviewerPromptBodies guards against the more-specific "fingerprint_inputs" string, but the original broad guard against bare "fingerprint" is now gone. If the removal is intentional (e.g., the word legitimately appears in some non-harness context), a one-line comment explaining the rationale would restore confidence in the invariant.

Reply to this thread when addressed.

Comment thread internal/llm/contracts.go
AgentID string
Rationale string
Files []string
AgentID string

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-knowledge-reviewer): AllowedFiles is a new concept in the selection output contract with semantics distinct from Files — validation enforces it must be a subset of changed files, but the behavioral distinction (what an agent is 'allowed to access' vs what it 'should review') is not captured in any versioned doc. If there are docs describing the selection schema or reviewer agent authoring guide, they should be updated to explain when AllowedFiles is appropriate and how reviewers should interpret it vs Files.

Reply to this thread when addressed.

if err != nil {
return llm.Selection{}, sessionDraft{}, ledger.Session{}, err
}
dependencyTaskIDs := []string{dossierSummaryTaskID}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔵 Low (harness-engineering:harness-self-documenting-code-reviewer): The nested append(append([]string(nil), dependencyTaskIDs...), promptDeps...) is a correct but non-obvious Go idiom for safe slice concatenation. Since slices is already imported in this file, slices.Concat(dependencyTaskIDs, promptDeps) expresses the intent directly and is immediately readable.

Reply to this thread when addressed.

@monit-reviewer monit-reviewer left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR Review

Reviewed commit: 933b8d1

Approved with 4 non-blocking suggestions below. Address at your discretion.

Summary

Reviewer Findings
harness-engineering:harness-enforcement-reviewer 1
harness-engineering:harness-knowledge-reviewer 1
harness-engineering:harness-self-documenting-code-reviewer 2
harness-engineering:harness-enforcement-reviewer (1 findings)

💡 Suggestion - internal/pipeline/pipeline_test.go:5607

Removes "fingerprint" from the forbidden-strings list in assertPromptOmitsLocalAgentSourceProvenance without an explanation comment. The new TestSelectionOnlyPromptPreservesRoutingContractWithoutReviewerPromptBodies guards against the more-specific "fingerprint_inputs" string, but the original broad guard against bare "fingerprint" is now gone. If the removal is intentional (e.g., the word legitimately appears in some non-harness context), a one-line comment explaining the rationale would restore confidence in the invariant.

harness-engineering:harness-knowledge-reviewer (1 findings)

💡 Suggestion - internal/llm/contracts.go:38

AllowedFiles is a new concept in the selection output contract with semantics distinct from Files — validation enforces it must be a subset of changed files, but the behavioral distinction (what an agent is 'allowed to access' vs what it 'should review') is not captured in any versioned doc. If there are docs describing the selection schema or reviewer agent authoring guide, they should be updated to explain when AllowedFiles is appropriate and how reviewers should interpret it vs Files.

harness-engineering:harness-self-documenting-code-reviewer (2 findings)

💡 Suggestion - internal/pipeline/pipeline.go:1097

The nested append(append([]string(nil), dependencyTaskIDs...), promptDeps...) is a correct but non-obvious Go idiom for safe slice concatenation. Since slices is already imported in this file, slices.Concat(dependencyTaskIDs, promptDeps) expresses the intent directly and is immediately readable.

💡 Suggestion - internal/pipeline/pipeline.go:2560

In selectionThreadPrompts, the anchor key for the map lookup is computed inline inside the if condition — summaryByAnchor[dossierInlineThreadAnchorKey(thread.Path, string(thread.Side), thread.Line, string(thread.SubjectType))] — while the loop that builds the map extracts the same computation into a named variable key. Extracting a threadAnchorKey local variable before the if would make the lookup consistent with the indexing pattern above it and make the condition easier to scan.

1 info-level observations excluded. Run with --verbose to include.

3 PR discussion threads considered.


Completed in 3m 29s | $2.10 | sonnet | daemon 0.2.132 | Glorfindel
Field Value
Model sonnet
Reviewers hybrid-synthesis, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor
Engine claude · sonnet
Reviewed by pr-review-daemon · monit-pr-reviewer
Duration 3m 29s wall · 3m 25s compute (Reviewers: 2m 22s · Synthesis: 1m 03s)
Cost $2.10 (estimated)
Tokens 370.0k in / 21.9k out
Turns 12

Per-workstream usage

Workstream Model In Out Cache read Cache create Cost
hybrid-synthesis sonnet 37.5k 2.7k 13.1k 24.3k (1h) $0.19
harness-engineering:harness-architecture-reviewer sonnet 63.0k 686 13.1k 49.9k (1h) $0.31
harness-engineering:harness-enforcement-reviewer sonnet 67.5k 4.8k 13.1k 54.3k (1h) $0.40
harness-engineering:harness-knowledge-reviewer sonnet 64.9k 2.3k 13.1k 51.7k (1h) $0.35
harness-engineering:harness-self-documenting-code-reviewer sonnet 71.6k 8.3k 13.1k 58.5k (1h) $0.48
security:security-code-auditor sonnet 65.5k 3.1k 13.1k 52.4k (1h) $0.37

Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.

Note: Posted 3 inline comments. 1 finding remained in the summary because GitHub could not attach them to the current diff.

@rianjs rianjs merged commit 38c757a into main Jun 24, 2026
10 checks passed
@rianjs rianjs deleted the issue-371-orchestrator-dossier-workbench branch June 24, 2026 19:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Convert orchestrator selection to dossier/workbench mode

2 participants