feat(review-agent): add outcome labels to post-review.sh by fullsend-ai-coder[bot] · Pull Request #874 · fullsend-ai/fullsend

fullsend-ai-coder · 2026-05-13T12:29:34Z

After posting the review verdict, post-review.sh now applies outcome labels based on the review action, completing the label state machine documented in the bugfix workflow guide:

approve (no protected-path downgrade) -> ready-for-merge
approve (downgraded to comment) -> requires-manual-review
comment (split/conflicting review) -> requires-manual-review
request_changes -> no label (fix agent triggers on the event)

Labels are created if missing, matching the needs-human pattern in post-fix.sh. A DOWNGRADED flag tracks whether the protected-path check changed the action so the correct label is applied based on the final posted action rather than the original.

Adds post-review-test.sh with unit tests for the label-selection logic and registers it in the Makefile script-test target.

Note: pre-commit and make lint could not run in sandbox due to network restrictions (gitleaks hook install blocked). The post-script runs authoritative pre-commit on the runner.

Closes #552

Post-script verification

Branch is not main/master (agent/552-add-review-outcome-labels)
Secret scan passed (gitleaks — 7f3e5cce7dbb6d8aab3616b4b084ab3375fc84f8..HEAD)
Pre-commit hooks passed (authoritative run on runner)
Tests ran inside sandbox

github-actions · 2026-05-13T12:29:50Z

fullsend review is working on this — view logs

github-actions · 2026-05-13T12:30:56Z

Site preview

Preview: https://bfa15481-site.fullsend-ai.workers.dev

Commit: e903bf9014460e0a0c44e54146e08628ffb23e94

fullsend-ai-review · 2026-05-13T12:35:10Z

Review: #874

Head SHA: e903bf9
Timestamp: 2026-05-14T00:00:00Z
Outcome: approve

Summary

This PR correctly implements outcome labels in post-review.sh, completing the label state machine documented in the bugfix workflow guide. The implementation is well-scoped to issue #552, follows existing patterns (label creation from post-fix.sh), correctly tracks protected-path downgrades via a DOWNGRADED flag, removes stale labels before applying new ones for idempotency, and includes a comprehensive test file. No critical or high findings.

Findings

Critical

None.

High

None.

Medium

[Correctness / Test integrity] post-review-test.sh — The test reimplements the label-selection logic as a standalone determine_outcome_label() function rather than sourcing or extracting it from post-review.sh. If someone updates the production script's label logic but not the test mirror, the tests will pass with stale expectations. The inline comment "Label logic is mirrored in post-review-test.sh — update both" mitigates this but does not prevent drift. Consider extracting a shared function sourced by both files, or adding a structural check (e.g., a hash or marker comment) to detect divergence.
Remediation: Extract the label-selection logic into a shared shell function file sourced by both post-review.sh and post-review-test.sh, or add a sentinel comment/hash that CI can verify stays in sync.

Low

None.

Info

[Correctness] post-review.sh — The failure action is not explicitly handled in the label if/elif chain. Stale labels are still removed (the removal block runs unconditionally), so the PR ends up with no outcome label, which is reasonable behavior. Just noting for completeness.
[Style] post-review.sh — The reject case now creates the label before applying it (gh label create "rejected" + --add-label), which is an improvement over the original code that only called --add-label without ensuring the label existed. Good consistency with the other branches.

Footer

Outcome: approve
This review applies to SHA e903bf9014460e0a0c44e54146e08628ffb23e94. Any push to the PR head clears this review and requires a new evaluation.

Previous run

Review: #874

Head SHA: c1dea91
Timestamp: 2026-05-13T12:30:00Z
Outcome: approve

Summary

This PR adds outcome-label application to post-review.sh, completing the label state machine documented in the bugfix workflow guide. The implementation is correct: it introduces a DOWNGRADED flag to distinguish genuine approvals from protected-path downgrades, applies ready-for-merge or requires-manual-review labels accordingly, and creates labels if they don't exist (matching the needs-human pattern in post-fix.sh). The test file covers all five action×downgrade combinations. No critical or high findings.

Findings

Info

[correctness] post-review-test.sh — The test reimplements the label-selection logic in a standalone determine_outcome_label() function rather than sourcing it from post-review.sh. This is a reasonable pragmatic choice (the real script requires network access and env vars), but it means the test won't catch regressions if someone modifies the label logic in post-review.sh without updating the test function. Consider adding a comment in post-review.sh near the label block pointing to the test, so future editors know to update both.
[style] post-review.sh:125-133 — The pre-existing reject block doesn't create the rejected label before applying it (unlike the new code and the post-fix.sh pattern). Not introduced by this PR, but the inconsistency is worth noting for a future cleanup.
[correctness] post-review.sh — The failure action correctly falls through all label conditions with no label applied. Verified that no label is erroneously applied for agent failures.

Dimensions evaluated

Correctness — Logic is sound. ACTION captures the original action (line 55); DOWNGRADED tracks whether the protected-path check changed it. The label conditions are complete, mutually exclusive, and handle all documented outcomes. Edge cases (failure action, empty PR files) are handled by existing guards.
Intent alignment — Matches issue post-review.sh does not apply outcome labels (ready-for-merge, requires-manual-review) #552 exactly. Scope is appropriate: label application after review posting, with tests.
Platform security — No new privilege surface. Label operations use the existing REVIEW_TOKEN. Failures are swallowed with || true to avoid blocking the post-script, consistent with existing patterns.
Content security — No user-content handling changes.
Injection defense — No new injection vectors. PR body contains no suspicious directives or non-rendering Unicode. Code changes handle only internal state (ACTION, DOWNGRADED variables) with no user-controlled interpolation.
Style/conventions — Follows existing patterns: gh label create ... 2>/dev/null || true, gh pr edit ... --add-label, descriptive echo before each block. Well-commented.

Footer

Outcome: approve
This review applies to SHA c1dea91980baa6359f6cb2d6c7ae87edbbecaabf. Any push to the PR head clears this review and requires a new evaluation.

Previous run (2)

Review: #874

Head SHA: 3fa1d00
Timestamp: 2026-05-13T12:30:00Z
Outcome: approve

Summary

This PR adds outcome-label logic to post-review.sh, completing the label state machine documented in the bugfix workflow guide. After fullsend post-review posts the verdict, the script now applies ready-for-merge (clean approve), requires-manual-review (downgraded approve or comment disposition), or no label (request_changes). The implementation correctly tracks the protected-path downgrade via a DOWNGRADED flag so the label reflects the final posted action, not the original agent output. A new test file validates the label-selection logic. No critical, high, or medium findings.

Findings

Critical

None.

High

None.

Medium

None.

Low

[style] post-review.sh:127-155 — The requires-manual-review label creation + application block is duplicated verbatim for the downgraded-approve and comment branches. Consider extracting a helper function (e.g., apply_label) to reduce duplication and ensure future label changes only need one edit.
Remediation: Extract gh label create ... && gh pr edit ... --add-label into a apply_outcome_label() function called from each branch.

Info

[correctness] post-review-test.sh — The test reimplements the label-selection logic in a standalone determine_outcome_label function rather than sourcing it from post-review.sh. This is consistent with existing test patterns in the repo (post-code-test.sh, post-triage-test.sh), but the logic could diverge if the production script is updated without updating the test. Acceptable trade-off given bash sourcing constraints.
[intent-alignment] The PR body note about pre-commit/lint not running in sandbox is expected — the post-script runs authoritative pre-commit on the runner. No concern.

Footer

Outcome: approve
This review applies to SHA 3fa1d00ce0436ea1fa7614e248774b28ae9be261. Any push to the PR head clears this review and requires a new evaluation.

waynesun09 · 2026-05-13T16:11:55Z

Review Squad Report — #874

Agents dispatched: 10 (3x claude-coder, 3x claude-researcher, 2x gemini-code-review, 2x cursor-code-review)
Total findings: 10 (after dedup and verification)
False positives removed: 3

High (1)

1. Stale labels accumulate across review cycles — no label cleanup

File: internal/scaffold/fullsend-repo/scripts/post-review.sh:127-153
Finding: The script only adds labels via --add-label but never removes conflicting outcome labels from prior review runs. If round 1 applies ready-for-merge and round 2 results in comment, the PR ends up with both ready-for-merge AND requires-manual-review — contradictory state. Issue #552 says "Labels are mutually exclusive where it matters — the pipeline enforces this," but this enforcement is missing.
Suggestion: Before applying the new label, remove all other outcome labels:

for stale_label in "ready-for-merge" "requires-manual-review" "rejected"; do
  gh pr edit "${PR_NUMBER}" --repo "${REPO_FULL_NAME}" \
    --remove-label "${stale_label}" 2>/dev/null || true
done

Consensus: 8/10 agents

Medium (4)

2. Test reimplements logic instead of testing production code — drift risk

File: internal/scaffold/fullsend-repo/scripts/post-review-test.sh:24-41
Finding: determine_outcome_label() in the test file manually reimplements the label-selection logic from post-review.sh. If someone modifies the production logic but forgets to update the test, all tests still pass against the stale copy. The test also includes reject as a unified branch, while production handles it in a separate if block.
Suggestion: Add a cross-reference comment in post-review.sh near the label block: # NOTE: label logic is mirrored in post-review-test.sh — update both. Alternatively, adopt the mock-binary approach from post-triage-test.sh.
Consensus: 7/10 agents

3. `reject` block lacks `gh label create` and is structurally separate from outcome label chain

File: internal/scaffold/fullsend-repo/scripts/post-review.sh:155-162
Finding: Two issues: (a) the pre-existing reject block does gh pr edit --add-label "rejected" without first calling gh label create, so in newly onboarded repos the label may not exist and application fails silently; (b) the reject block is a separate if statement, not part of the if/elif chain, so it doesn't participate in any future mutual-exclusivity cleanup.
Suggestion: Fold reject into the elif chain and add gh label create "rejected" before the --add-label call.
Consensus: 6/10 agents

4. `2>/dev/null || true` on `gh pr edit --add-label` masks real failures

File: internal/scaffold/fullsend-repo/scripts/post-review.sh:137-138,143-145,150-152
Finding: The 2>/dev/null || true on gh label create is defensible (expected "already exists" error). However, the same pattern on gh pr edit --add-label suppresses permission errors, rate limits, and network failures. The existing reject block uses bare || true (no stderr redirect), which is better since stderr still reaches the Actions log.
Suggestion: Keep 2>/dev/null || true on gh label create; use bare || true on gh pr edit --add-label to match the existing reject block pattern.
Consensus: 5/10 agents

5. Missing test edge cases

File: internal/scaffold/fullsend-repo/scripts/post-review-test.sh:68-88
Finding: Tests cover the 5 expected actions but miss: empty string, unknown values (e.g. "failure", "APPROVE"), and comment with downgraded=true. The failure action is a real value synthesized by the no-output fallback at line 44 of post-review.sh.
Suggestion: Add boundary tests:

run_test "comment-with-downgrade-flag" "comment" "true" "requires-manual-review"
run_test "empty-action-no-label" "" "false" "none"
run_test "failure-action-no-label" "failure" "false" "none"
run_test "unknown-action-no-label" "banana" "false" "none"

Consensus: 5/10 agents

Low (3)

6. Duplicated `requires-manual-review` label blocks

File: internal/scaffold/fullsend-repo/scripts/post-review.sh:139-152
Finding: The approve+downgraded and comment branches have identical label creation and application code. If label name, color, or description changes, both must be updated in lockstep.
Suggestion: Combine: elif { [ "${ACTION}" = "approve" ] && [ "${DOWNGRADED}" = "true" ]; } || [ "${ACTION}" = "comment" ]; then
Consensus: 6/10 agents

7. `request_changes` falls through silently with no log message

File: internal/scaffold/fullsend-repo/scripts/post-review.sh:127-153
Finding: Every action path has an echo statement documenting the disposition except request_changes. Add an explicit branch: elif [ "${ACTION}" = "request_changes" ]; then echo "Request-changes disposition — no outcome label (fix agent triggers on event)"
Consensus: 3/10 agents

8. ACTION variable semantics correct but undocumented

File: internal/scaffold/fullsend-repo/scripts/post-review.sh:55
Finding: ACTION holds the original value from agent-result.json and is intentionally never reassigned after the downgrade. This invariant is undocumented — if a future editor re-reads ACTION from the modified RESULT_FILE, the downgraded-approve branch would silently break.
Suggestion: Add a comment: # ACTION holds the original action for the entire script — not re-read after protected-path downgrade.
Consensus: 4/10 agents

Info (2)

9. Unquoted `${FAILURES}` in test

File: internal/scaffold/fullsend-repo/scripts/post-review-test.sh:90 — safe but quoting is best practice.

10. `request_changes`/`ready-to-code` design question

Issue #552's table says "Unanimous rework → ready-to-code" but its "What should happen" section says "no label needed (fix agent triggers on event; see #551)". Implementation matches the latter. Not a bug — requirements ambiguity.

Agent Agreement

Finding	Agents	Strength
Stale label cleanup	8/10	Strong consensus
Test drift risk	7/10	Strong consensus
Duplicated label blocks	6/10	Strong consensus
Reject block issues	6/10	Strong consensus
Error suppression	5/10	Consensus
Missing test cases	5/10	Consensus
ACTION docs	4/10	Moderate
Silent request_changes	3/10	Moderate

False Positives Removed

Finding	Reason
ACTION variable is stale/buggy after downgrade	Confirmed correct — ACTION retains original value, DOWNGRADED compensates
`gh label create` on every run is wasteful	Pre-existing convention matching `post-fix.sh`, not introduced by this PR
Missing integration tests for gh CLI	Out of scope; unit test pattern matches existing repo convention

Summary

The core logic is correct — the DOWNGRADED flag elegantly tracks protected-path downgrades and the label selection works as designed. The main issue is stale label accumulation (HIGH): re-runs will leave contradictory labels on PRs, violating the documented mutual exclusivity. The structural issues (reject block separation, test drift, error suppression inconsistency) are maintainability concerns worth addressing since this PR establishes the label state machine pattern.

ralphbean

Clean implementation that completes the label state machine from #552. Follows established patterns from post-fix.sh, tests match the existing conventions, and the DOWNGRADED flag is a nice minimal solution for tracking the protected-path downgrade.

One minor note inline (deferred, not blocking).

ben-alkov · 2026-05-13T19:30:08Z

/fix

github-actions · 2026-05-13T19:30:33Z

fullsend fix is working on this — view logs

Combine the downgraded-approve and comment branches into a single condition since both apply the same requires-manual-review label. This addresses the review feedback about verbatim duplication that could drift if one branch is updated without the other. Addresses review feedback on #874 Signed-off-by: fullsend-fix <fullsend-code@users.noreply.github.com>

fullsend-ai-coder · 2026-05-13T19:35:50Z

🔧 Fix agent — iteration 1 (human-triggered)

Addressed 1 review finding: deduplicated the requires-manual-review label block by combining two elif branches into one condition.

Fixed (1):

duplicated requires-manual-review label block (internal/scaffold/fullsend-repo/scripts/post-review.sh): Combined the downgraded-approve branch (lines 139-145) and comment branch (lines 146-152) into a single condition using || since both apply the identical requires-manual-review label-create + label-apply block. This eliminates the risk of one branch being updated without the other.

Tests: passed

Decision points

Chose to merge conditions with || rather than extracting a helper function (alternatives: Extract apply_requires_manual_review helper function, Use a variable to hold the label name and apply after the if/elif block; rationale: The || condition is the simplest change that eliminates duplication while keeping the script's linear structure intact. A helper function would be over-engineering for a two-command block.)

_{Updated by fullsend fix agent}

github-actions · 2026-05-13T19:36:16Z

fullsend review is working on this — view logs

After posting the review verdict, post-review.sh now applies outcome labels based on the review action, completing the label state machine documented in the bugfix workflow guide: - approve (no protected-path downgrade) -> ready-for-merge - approve (downgraded to comment) -> requires-manual-review - comment (split/conflicting review) -> requires-manual-review - request_changes -> no label (fix agent triggers on the event) Labels are created if missing, matching the needs-human pattern in post-fix.sh. A DOWNGRADED flag tracks whether the protected-path check changed the action so the correct label is applied based on the final posted action rather than the original. Closes #552 Assisted-by: Claude Code (Opus 4.6) Signed-off-by: Ben Alkov <ben.alkov@redhat.com>

github-actions · 2026-05-14T12:44:56Z

fullsend review is working on this — view logs

github-actions Bot deployed to site-preview May 13, 2026 12:30 View deployment

fullsend-ai-review Bot approved these changes May 13, 2026

View reviewed changes

ben-alkov changed the title ~~feat(#552): add outcome labels to post-review.sh~~ feat(review-agent): add outcome labels to post-review.sh May 13, 2026

ben-alkov requested review from ascerra, ralphbean and waynesun09 May 13, 2026 15:23

ralphbean approved these changes May 13, 2026

View reviewed changes

Comment thread internal/scaffold/fullsend-repo/scripts/post-review.sh

github-actions Bot deployed to site-preview May 13, 2026 19:37 View deployment

fullsend-ai-review Bot approved these changes May 13, 2026

View reviewed changes

ben-alkov force-pushed the agent/552-add-review-outcome-labels branch from c1dea91 to e903bf9 Compare May 14, 2026 12:44

github-actions Bot deployed to site-preview May 14, 2026 12:46 View deployment

fullsend-ai-review Bot approved these changes May 14, 2026

View reviewed changes

ben-alkov added this pull request to the merge queue May 14, 2026

Merged via the queue into main with commit bd4aea0 May 14, 2026
25 checks passed

ben-alkov deleted the agent/552-add-review-outcome-labels branch May 14, 2026 13:24

This was referenced May 14, 2026

Review agent: add state-lifecycle completeness check to correctness dimension fullsend-ai/.fullsend#39

Open

Fix agent: require exhaustive change reporting in fix summaries fullsend-ai/.fullsend#40

Open

Conversation

fullsend-ai-coder Bot commented May 13, 2026

Post-script verification

Uh oh!

github-actions Bot commented May 13, 2026

Uh oh!

github-actions Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Site preview

Uh oh!

fullsend-ai-review Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review: #874

Summary

Findings

Critical

High

Medium

Low

Info

Footer

Review: #874

Summary

Findings

Info

Dimensions evaluated

Footer

Review: #874

Summary

Findings

Critical

High

Medium

Low

Info

Footer

Uh oh!

waynesun09 commented May 13, 2026

Review Squad Report — #874

High (1)

1. Stale labels accumulate across review cycles — no label cleanup

Medium (4)

2. Test reimplements logic instead of testing production code — drift risk

3. reject block lacks gh label create and is structurally separate from outcome label chain

4. 2>/dev/null || true on gh pr edit --add-label masks real failures

5. Missing test edge cases

Low (3)

6. Duplicated requires-manual-review label blocks

7. request_changes falls through silently with no log message

8. ACTION variable semantics correct but undocumented

Info (2)

9. Unquoted ${FAILURES} in test

10. request_changes/ready-to-code design question

Agent Agreement

False Positives Removed

Summary

Uh oh!

ralphbean left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ben-alkov commented May 13, 2026

Uh oh!

github-actions Bot commented May 13, 2026

Uh oh!

fullsend-ai-coder Bot commented May 13, 2026

🔧 Fix agent — iteration 1 (human-triggered)

Uh oh!

github-actions Bot commented May 13, 2026

Uh oh!

github-actions Bot commented May 14, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

github-actions Bot commented May 13, 2026 •

edited

Loading

fullsend-ai-review Bot commented May 13, 2026 •

edited

Loading

3. `reject` block lacks `gh label create` and is structurally separate from outcome label chain

4. `2>/dev/null || true` on `gh pr edit --add-label` masks real failures

6. Duplicated `requires-manual-review` label blocks

7. `request_changes` falls through silently with no log message

9. Unquoted `${FAILURES}` in test

10. `request_changes`/`ready-to-code` design question