feat: synthesize rollup findings failures and coverage#384
Conversation
|
Findings
|
Major
|
Major
Minor
|
Minor
|
|
No findings. The read-scope versus coverage-scope split now matches the product intent: all-skipped outputs decode, skipped coverage is assignment-scoped before rollup/rendering, and the added regressions cover the prior false-incomplete paths. |
|
The suite covers the new schema/summary shapes, but the main behavior is still only proven through helper-level tests.
|
monit-reviewer
left a comment
There was a problem hiding this comment.
Automated PR Review
Reviewed commit: 8395b81
Summary
| Reviewer | Findings |
|---|---|
| harness-engineering:harness-enforcement-reviewer | 1 |
| harness-engineering:harness-knowledge-reviewer | 2 |
| harness-engineering:harness-self-documenting-code-reviewer | 4 |
harness-engineering:harness-enforcement-reviewer (1 findings)
💡 Suggestion - internal/reviewplan/reviewplan.go:321
hasIncompleteReviewerCoverage relies on a strings.HasPrefix(entry.Status, "incomplete_") naming convention rather than a type-system invariant. A future contributor adding a new incomplete status (e.g. "partial_coverage") without knowing this convention would silently bypass the approval-coercion guard. Consider a defined string type with an IsIncomplete() method, or an exhaustive switch that returns an error on unrecognized status values.
harness-engineering:harness-knowledge-reviewer (2 findings)
docs/checkout-native-review-contract.md:240
The PR introduces five stable coverage status values (complete_broad, complete_constrained, incomplete_skipped, incomplete_failed, incomplete_unassigned) that appear in dry-run JSON output and rollup markdown, but the contract doc does not define them or describe what each means. Consumers of dry-run JSON or rollup markdown will encounter these values without source access; they belong in the contract documentation.
💡 Suggestion - internal/pipeline/pipeline.go:2242
The non-obvious split between reviewerReviewableScope (read access over all changed files) and reviewerCoverageScope (assignment-scoped, uses selected.Files when allowed_files is absent) is what allows two broad reviewers splitting changed files to both be complete_broad. This design decision is not captured in any versioned document; future modifications to coverage logic will need this context to avoid silently regressing the multi-reviewer split case.
harness-engineering:harness-self-documenting-code-reviewer (4 findings)
internal/pipeline/pipeline.go:2228
appendSortedStrings is a misleading name: 'append' implies accumulation or mutation of a target, but the function creates and returns a defensive sorted copy of its input. sortedCopy or copySorted would accurately communicate what callers receive without implying a side effect.
💡 Suggestion - internal/pipeline/pipeline.go:2183
reviewerReviewableScope and reviewerCoverageScope are easy to conflate; 'reviewable' vs 'coverage' does not convey the critical distinction (read access vs expected assignment coverage). A reader encountering both in buildReviewerPrompt and buildReviewerCoverage must read both bodies to understand why they diverge. reviewerReadableFiles / reviewerAssignmentScope, or an inline comment on the non-obvious invariant, would make the behavioral split explicit.
💡 Suggestion - internal/pipeline/pipeline.go:2207
In intersectSortedStrings, 'sorted' as a qualifier conventionally describes an input precondition, but here inputs can arrive in any order and it is the output that is sorted. intersectToSorted or sortedIntersect would more clearly signal the direction of the sort guarantee.
💡 Suggestion - internal/reviewplan/summary.go:278
coverageConstraintCell merges both entry.Constraints and entry.Diagnostic into its rendered output, but 'constraint' in the name implies only constraints are included. Diagnostics are a distinct concept (error or missing-file messages) from reviewer-declared scope constraints. coverageDetailsCell or coverageAnnotationCell would reflect that diagnostics are part of the output without requiring a reader to inspect the body.
6 PR discussion threads considered.
Completed in 3m 47s | $3.01 | sonnet | daemon 0.2.132 | Glorfindel
| Field | Value |
|---|---|
| Model | sonnet |
| Reviewers | hybrid-synthesis, documentation:docs-reviewer, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor |
| Engine | claude · sonnet |
| Reviewed by | pr-review-daemon · monit-pr-reviewer |
| Duration | 3m 47s wall · 3m 44s compute (Reviewers: 2m 37s · Synthesis: 1m 06s) |
| Cost | $3.01 (estimated) |
| Tokens | 519.1k in / 27.8k out |
| Turns | 14 |
Per-workstream usage
| Workstream | Model | In | Out | Cache read | Cache create | Cost |
|---|---|---|---|---|---|---|
| hybrid-synthesis | sonnet | 40.3k | 3.5k | 13.1k | 27.1k (1h) | $0.22 |
| documentation:docs-reviewer | sonnet | 34.3k | 1.7k | 13.1k | 21.2k (1h) | $0.16 |
| harness-engineering:harness-architecture-reviewer | sonnet | 85.9k | 1.5k | 13.1k | 72.7k (1h) | $0.46 |
| harness-engineering:harness-enforcement-reviewer | sonnet | 87.8k | 3.1k | 13.1k | 74.7k (1h) | $0.50 |
| harness-engineering:harness-knowledge-reviewer | sonnet | 87.7k | 3.1k | 13.1k | 74.5k (1h) | $0.50 |
| harness-engineering:harness-self-documenting-code-reviewer | sonnet | 94.2k | 9.6k | 13.1k | 81.1k (1h) | $0.63 |
| security:security-code-auditor | sonnet | 88.9k | 5.2k | 13.1k | 75.8k (1h) | $0.54 |
Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.
monit-reviewer
left a comment
There was a problem hiding this comment.
Automated PR Review
Reviewed commit: 053a666 | Previous: 8395b81 (incremental)
Summary
No issues found.
13 PR discussion threads considered.
Completed in 1m 29s | $8.04 | sonnet | daemon 0.2.132 | Glorfindel
| Field | Value |
|---|---|
| Model | sonnet |
| Mode | Re-review · Cycle 2 · Session resumed |
| Reviewers | hybrid-synthesis, documentation:docs-reviewer, harness-engineering:harness-architecture-reviewer, harness-engineering:harness-enforcement-reviewer, harness-engineering:harness-knowledge-reviewer, harness-engineering:harness-self-documenting-code-reviewer, security:security-code-auditor |
| Engine | claude · sonnet |
| Reviewed by | pr-review-daemon · monit-pr-reviewer |
| Duration | 1m 29s wall · 59s compute (Reviewers: 38s · Synthesis: 21s) |
| Cost | $8.04 (estimated) |
| Tokens | 1425.6k in / 34.5k out |
| Turns | 28 |
Per-workstream usage
| Workstream | Model | In | Out | Cache read | Cache create | Cost |
|---|---|---|---|---|---|---|
| hybrid-synthesis | sonnet | 93.5k | 4.6k | 26.3k | 67.2k (1h) | $0.48 |
| documentation:docs-reviewer | sonnet | 77.1k | 2.6k | 26.3k | 50.8k (1h) | $0.35 |
| harness-engineering:harness-architecture-reviewer | sonnet | 233.0k | 2.1k | 26.3k | 206.7k (1h) | $1.28 |
| harness-engineering:harness-enforcement-reviewer | sonnet | 239.0k | 3.9k | 26.3k | 212.7k (1h) | $1.34 |
| harness-engineering:harness-knowledge-reviewer | sonnet | 238.7k | 4.0k | 26.3k | 212.4k (1h) | $1.34 |
| harness-engineering:harness-self-documenting-code-reviewer | sonnet | 258.3k | 11.0k | 26.3k | 232.0k (1h) | $1.56 |
| security:security-code-auditor | sonnet | 241.3k | 5.4k | 26.3k | 215.0k (1h) | $1.38 |
| discussion-summarizer | — | 44.6k | 1.1k | 18.5k | 26.1k (1h) | $0.30 |
Re-reviews only run when @monit-reviewer is re-requested as a reviewer — push as many commits as you need, then re-request when ready. PRs targeting branches other than main, master are skipped, even when @monit-reviewer is re-requested.
|
No findings. The daemon edits stayed aligned with the ticket architecture: they added coverage contract documentation, dry-run mapping coverage, conservative handling for unknown coverage statuses, and regression coverage for approval coercion without changing the read-scope/assignment-scope split. |
Summary
Verification
Closes #374