Skip to content

feat(cli): migrate pre-code.sh to fullsend gate code command#616

Open
waynesun09 wants to merge 6 commits into
mainfrom
fix-pr473-review-issues
Open

feat(cli): migrate pre-code.sh to fullsend gate code command#616
waynesun09 wants to merge 6 commits into
mainfrom
fix-pr473-review-issues

Conversation

@waynesun09
Copy link
Copy Markdown
Contributor

@waynesun09 waynesun09 commented May 2, 2026

Summary

Migrates both pre-code.sh and pre-fix.sh bash scripts to Go CLI commands under fullsend gate, addressing review findings from #473 and improving testability, type safety, and maintainability.

fullsend gate code (replaces scripts/pre-code.sh)

  • Validates workflow_dispatch inputs (ISSUE_NUMBER, REPO_FULL_NAME, GITHUB_ISSUE_URL) with cross-validation
  • Uses GitHub Timeline API to detect existing human PRs linked to the issue
  • Applies pr-open label and posts idempotent comment when human PR exists
  • Writes skip=true to GITHUB_OUTPUT — downstream workflow steps conditionally skip via if: steps.gate.outputs.skip != 'true'
  • Filters bot PRs by configurable bot login (FULLSEND_BOT_LOGIN)
  • Fails open on API errors (warns and proceeds)

fullsend gate fix (replaces scripts/pre-fix.sh)

  • Validates workflow_dispatch inputs (PR_NUMBER, REPO_FULL_NAME, TRIGGER_SOURCE)
  • Enforces iteration caps: bot-triggered (default 5), human-triggered (default 10)
  • Validates human instruction length (max 10KB)
  • Distinguishes bot vs human triggers for cap selection and error messaging

Review findings addressed

  • Consolidated duplicate AddIssueComment into existing CreateIssueComment interface method
  • Gate errors from EnsureLabel/AddIssueLabels/comment posting now logged as ::warning:: instead of silently discarded

Changes

  • Added: internal/cli/gate.gofullsend gate code and fullsend gate fix commands
  • Added: internal/cli/gate_test.go — 34 test cases (19 code + 15 fix)
  • Updated: internal/forge/forge.go — Added TimelineEvent, ListIssueTimeline, EnsureLabel, AddIssueLabels to Client interface
  • Updated: internal/forge/fake.go — FakeClient implementations with data maps and recorders
  • Updated: internal/forge/github/github.go — LiveClient implementations (Timeline API pagination, label upsert)
  • Updated: .github/workflows/code.yml — Uses fullsend gate code with skip output; all downstream steps guarded
  • Updated: .github/workflows/fix.yml — Uses fullsend gate fix instead of bash scripts/pre-fix.sh
  • Updated: harness/code.yaml, harness/fix.yaml — Removed pre_script (gate runs on runner before sandbox)
  • Deleted: scripts/pre-code.sh, scripts/pre-code-test.sh, scripts/pre-fix.sh

Closes #460. Supersedes #473.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 2, 2026

Site preview

Preview: https://4bf34558-site.fullsend-ai.workers.dev

Commit: 3bd14e111d3a862078a5c0740464e0bfaf618216

@fullsend-ai-review
Copy link
Copy Markdown

fullsend-ai-review Bot commented May 2, 2026

Review: #616

Head SHA: 3bd14e1
Timestamp: 2026-05-02T20:00:00Z
Outcome: approve

Summary

This PR cleanly migrates pre-code.sh and pre-fix.sh bash scripts to Go CLI commands under fullsend gate, improving testability, type safety, and maintainability. The code is well-structured, follows existing codebase patterns, includes thorough input validation with cross-field checks, properly handles API errors with a fail-open strategy, and has strong test coverage (34 cases). No critical, high, or medium findings were identified. The scope aligns with the linked issue #460 (detecting existing human PRs to avoid duplicate agent PRs) and the broader goal of replacing shell scripts with Go.

Findings

Critical

(none)

High

(none)

Medium

(none)

Low

  • [Style/conventions] internal/forge/github/github.go:ListIssueTimeline — Uses c.do(ctx, http.MethodGet, ...) directly instead of the c.get() helper, inconsistent with other paginated GET methods in the same file (e.g., ListPullRequestReviews uses c.get). Functionally equivalent but deviates from the established pattern.
    Remediation: Replace c.do(ctx, http.MethodGet, path, nil) + manual checkStatus with c.get(ctx, path) and wrap errors as needed.

Info

  • [Correctness] internal/cli/gate.go — Behavioral change from shell: bot-triggered instructions now have a 1MB length cap (maxBotInstructionBytes) that didn't exist in pre-fix.sh. The shell script only enforced the 10KB cap for human triggers. This is a reasonable safety improvement and is documented in the code comment.

  • [Style/conventions] internal/forge/forge.go:TimelineEvent — The PRURL field is populated by ListIssueTimeline but never used in gate logic. Not harmful (available for future use) but worth noting for maintainability.

Footer

Outcome: approve
This review applies to SHA 3bd14e111d3a862078a5c0740464e0bfaf618216. Any push to the PR head clears this review and requires a new evaluation.

Previous run

Review: #616

Head SHA: bab3435
Timestamp: 2026-05-02T00:00:00Z
Outcome: approve

Summary

This PR cleanly migrates two bash pre-scripts (pre-code.sh, pre-fix.sh) to type-safe, testable Go CLI commands under fullsend gate. The code gate adds new functionality (human PR detection via the Timeline API, label application, and idempotent commenting) that directly addresses issue #460. The fix gate faithfully reproduces iteration cap enforcement with an improvement: bot-triggered instructions now get a 1MB cap instead of being unchecked. Input validation is thorough with strict regex patterns, GITHUB_OUTPUT newline injection is properly defended against, and the 34-case test suite exercises the main paths, edge cases, and error injection. Workflow integration correctly uses step outputs to guard downstream steps in code.yml. The forge interface additions (ListIssueTimeline, EnsureLabel, AddIssueLabels) follow existing patterns with proper FakeClient implementations.

Findings

Medium

  • [Correctness] internal/cli/gate.go:233-236ListIssueComments error is silently swallowed during the idempotency check. If the API call consistently fails but CreateIssueComment succeeds, duplicate comments will be posted on every gate invocation. The gate's fail-open design is correct, but this specific case should emit a ::warning:: annotation (consistent with how EnsureLabel/AddIssueLabels errors are handled on lines 215-219) so operators can diagnose repeated comment duplication.
    Remediation: Replace comments = nil with a printer.Raw(fmt.Sprintf("::warning::...")) before setting comments to nil.

Low

  • [Style] internal/cli/gate.go:359-363 — The comment "Instruction length cap — lower for humans, higher for bots" inverts the intuitive reading (humans get the lower cap at 10KB, bots get higher at 1MB). Consider rewording to "tighter for human input, larger for bot-generated review bodies" to match the security rationale: human input is attack surface, bot input is structured.

Info

  • [Style] internal/forge/forge.go:82TimelineEvent.PRURL is populated by the LiveClient but unused by the gate code (which constructs #N autolinks instead). Keeping it is reasonable for the data model, but it's worth noting as dead data if someone later wonders why the URL isn't rendered.

  • [Correctness] internal/scaffold/fullsend-repo/.github/workflows/fix.yml — The gate step has id: gate but no downstream step references steps.gate.outputs.skip (unlike code.yml). This is correct because gate fix errors out rather than writing skip output, but the unused id attribute could confuse future maintainers. Consider adding a comment or removing the id.

Footer

Outcome: approve
This review applies to SHA bab3435a4ec83b85c5934b86d520d76f0ef60c13. Any push to the PR head clears this review and requires a new evaluation.

Previous run (2)

Review: #616

Head SHA: 3b5051e
Timestamp: 2026-05-02T00:00:00Z
Outcome: comment-only

Summary

This is a well-structured migration from bash scripts to Go CLI commands. The code is clean, well-tested (34 test cases covering happy paths, edge cases, error injection, and idempotency), and the migration is behaviorally faithful to the original scripts. The "fail open" design for API errors is appropriate for a gate that shouldn't block the pipeline on transient failures. The workflow changes correctly guard all downstream steps with the skip output. A few minor findings are worth noting but none are blocking.

Findings

Medium

  • [correctness] internal/cli/gate.go:396-406writeGateOutput silently discards all errors (both OpenFile and Fprintf). If writing skip=true to GITHUB_OUTPUT fails, runGateCode returns nil (success), so the workflow step succeeds but no skip output is set — causing all downstream steps to execute despite a human PR being detected. While GITHUB_OUTPUT file failures are rare in practice, the function is the critical mechanism that prevents duplicate PRs. Consider returning an error or at minimum emitting a ::warning:: annotation.
    Remediation: Return an error from writeGateOutput and handle it in runGateCode, or log a warning via printer.Raw.

  • [correctness] internal/cli/gate.go:342-360 — Invalid (non-numeric) values for FIX_ITERATION, ITERATION_CAP, and ITERATION_CAP_HUMAN are silently ignored, falling back to defaults. If FIX_ITERATION is set to a non-numeric string, strconv.Atoi fails silently and iteration stays at 1, potentially allowing a run that should be blocked. While this is "fail open" behavior, it differs from the bash original which would fail on non-numeric comparisons.
    Remediation: Log a ::warning:: when strconv.Atoi fails for these values so operators can detect misconfiguration.

Low

  • [style] internal/cli/gate.go:113 — The botLoginRe regex ^[][a-zA-Z0-9._-]+$ uses a character class starting with [] which is valid but unusual and hard to read at a glance. A comment explaining the intent ("allows literal [ and ] for bot logins like name[bot]") would aid future readers.

  • [correctness] internal/cli/gate.go:335 — The human instruction length check uses len() which counts bytes, not characters. For multi-byte UTF-8 instructions, a 10KB limit on bytes is stricter than on characters. This matches the original bash behavior (${#var} also counts bytes in most contexts), so it is consistent, but worth documenting.

Info

  • [intent-alignment] The PR title says "migrate pre-code.sh" but the change also migrates pre-fix.sh. The PR body correctly describes both migrations, but the title underrepresents scope. Not blocking since the body and commit are accurate.

  • [correctness] The TimelineEvent.PRState field can receive "open" or "closed" from the GitHub API, but merged PRs show state: "closed" with a separate pull_request.merged_at field. The current filter (PRState == "open") correctly excludes merged PRs, so this is functionally correct — just noting for awareness if future code needs to distinguish closed-unmerged from merged.

Footer

Outcome: comment-only
This review applies to SHA 3b5051e7404ac16684c68df63b196ccd2a229b8e. Any push to the PR head clears this review and requires a new evaluation.

Previous run (3)

Review: #616

Head SHA: f729f09
Timestamp: 2026-05-02T00:00:00Z
Outcome: approve

Summary

This is a well-structured migration of pre-code.sh to a Go CLI command (fullsend gate code) that adds timeline API–based human-PR detection, input cross-validation, label/comment idempotency, and GITHUB_OUTPUT skip signaling. The code is correct, has 19 focused unit tests, follows existing forge interface patterns, and poses no security concerns. Two minor design observations are noted below but neither is blocking.

Findings

Medium / Low / Info

  • [style/conventions] internal/forge/forge.go / internal/forge/github/github.goAddIssueComment duplicates the existing CreateIssueComment interface method. Both POST to the same endpoint; the only difference is the return type (error vs (*IssueComment, error)). The gate code discards the returned comment anyway. Consider using the existing method to avoid growing the Client interface surface unnecessarily.
    Remediation: Remove AddIssueComment from the interface and implementations; use CreateIssueComment in gate.go (discarding the *IssueComment return).

  • [correctness] internal/cli/gate.go:203-204,232 — Errors from EnsureLabel, AddIssueLabels, and AddIssueComment are silently discarded (_ =). If labeling or commenting fails, the gate still writes skip=true, so the code agent is skipped but the issue has no visible indicator of why. Consider logging these errors as warnings (via ::warning::) for operator visibility while still allowing the skip to proceed.
    Remediation: Log errors with printer.Raw(fmt.Sprintf("::warning::...", err)) instead of silently discarding.

Footer

Outcome: approve
This review applies to SHA f729f093a186c1e00c2eebb7f6ad22fbc2d2cff0. Any push to the PR head clears this review and requires a new evaluation.

Previous run (4)

Review: #616

Head SHA: 62259fe
Timestamp: 2026-05-02T00:00:00Z
Outcome: approve

Summary

This PR correctly addresses all five findings from the #473 review. The timeline API replacement eliminates substring false positives, FULLSEND_BOT_LOGIN validation prevents jq injection, API failure handling distinguishes errors from empty results with appropriate fail-open behavior, the SKIP_PR_CHECK flag avoids early exit 0, and comment idempotency prevents duplicates. Test coverage is comprehensive with 17 cases covering happy paths, error paths, and edge cases. No blocking findings.

Findings

Medium / Low / Info

  • [correctness] pre-code.sh:107 — Comment idempotency check does not paginate
    The gh api .../comments --jq ... call returns only the first page (30 comments). On busy issues, a previous bot comment beyond page 1 would be missed, causing a duplicate post. Consider adding --paginate or using --per-page 100 to reduce the risk.
    Remediation: Add --paginate to the comments API call, or accept this as a known limitation (consequence is only a duplicate comment).

  • [style] pre-code.sh:82HUMAN_PR_LINES holds stderr on API failure
    When the timeline API call fails, 2>&1 captures stderr into HUMAN_PR_LINES. The variable is properly cleared on error, but the name is misleading during the error path. Info-level — the code is functionally correct.

Footer

Outcome: approve
This review applies to SHA 62259fe5f41789f6d97736c8dabc7f2f1ea929e1. Any push to the PR head clears this review and requires a new evaluation.

Copy link
Copy Markdown

@fullsend-ai-review fullsend-ai-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

fullsend-code and others added 3 commits May 2, 2026 12:15
The pre-code script now checks for existing open PRs linked to
the target issue before the sandbox is created. If a human-authored
PR is found (not from the fullsend bot), the script applies a
"pr-open" label and posts a comment linking the existing PR(s),
then exits cleanly — preventing the code agent from creating a
duplicate competing PR.

The check is best-effort: it only runs when GH_TOKEN is available
and can be overridden with CODE_FORCE=true (set when a user
comments `/code --force` on the issue). Bot-authored PRs are
filtered out so the agent can still update its own previous work.

Uses gh's built-in --jq flag for filtering, avoiding a standalone
jq dependency.

Note: go-test and go-vet could not run (Go not available in
sandbox). post-triage-test.sh has pre-existing failures (jq not
available in sandbox). pre-code-test.sh passed. Manual
verification of Go tests is required.

Closes #460
…potency

Replace gh pr list --search with the timeline API to avoid substring
false positives (e.g. issue #42 matching PRs mentioning #421). Add
BOT_LOGIN regex validation to prevent jq injection, capture API errors
instead of silent fail-open, refactor exit-on-missing-token to skip
only the check block, and add comment idempotency to avoid duplicate
bot comments.

Update test suite with 7 new test cases covering API failure handling,
bot login validation, comment idempotency, and timeline API usage.

Signed-off-by: Wayne Sun <gsun@redhat.com>
Replace the bash pre-code.sh script with a Go CLI command that runs
in the workflow before sandbox creation. This addresses the review
findings on PR #473:

- Fix skip semantics: write skip=true to GITHUB_OUTPUT and gate all
  downstream workflow steps with if: steps.gate.outputs.skip != 'true'
- Fix dead GH_TOKEN: pass push-token to the gate step env
- Fix search false positives: use timeline API instead of text search
- Add cross-validation between ISSUE_NUMBER/REPO_FULL_NAME/GITHUB_ISSUE_URL
- Add bot-login regex validation against injection

Extend forge.Client with ListIssueTimeline, AddIssueComment,
EnsureLabel, and AddIssueLabels. Implement on both LiveClient
(GitHub API) and FakeClient (test double).

Closes #460

Signed-off-by: Wayne Sun <gsun@redhat.com>
@waynesun09 waynesun09 force-pushed the fix-pr473-review-issues branch from 62259fe to f729f09 Compare May 2, 2026 17:03
@waynesun09 waynesun09 changed the title fix: harden pre-code PR detection and add input validation feat(cli): migrate pre-code.sh to fullsend gate code command May 2, 2026
@waynesun09 waynesun09 changed the base branch from agent/460-skip-duplicate-human-pr to main May 2, 2026 17:05
Copy link
Copy Markdown

@fullsend-ai-review fullsend-ai-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

- Add `fullsend gate fix` subcommand migrating pre-fix.sh to Go:
  input validation, iteration cap enforcement, instruction length check
- Address review findings: consolidate AddIssueComment into existing
  CreateIssueComment, log gate errors as ::warning:: instead of
  silently discarding
- Update fix.yml workflow to use `fullsend gate fix` instead of
  `bash scripts/pre-fix.sh`
- Remove pre_script from fix harness (gate runs on runner before sandbox)
- Delete scripts/pre-fix.sh (102 lines of bash replaced by tested Go)
- Add 15 new gate fix test cases covering validation, iteration caps,
  instruction limits, bot vs human paths

Signed-off-by: Wayne Sun <gsun@redhat.com>
Copy link
Copy Markdown

@fullsend-ai-review fullsend-ai-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

- Tighten botLoginRe to reject leading brackets (e.g. "]evil")
- Log resolveToken error instead of discarding it
- Move ::notice:: emission after input validation passes
- Add TRIGGER_SOURCE format validation via triggerSourceRe
- Reject negative and non-numeric iteration/cap values
- Extract parsePositiveIntOrDefault helper to reduce duplication
- Make writeGateOutput return error with close-error handling
- Add newline injection guard to writeGateOutput
- Add bot instruction length cap (1 MB) for defense in depth
- Add id: gate to fix.yml validate step for parity with code.yml
- Add deletion assertion tests for pre-code.sh and pre-fix.sh
- Add tests for negative iteration, non-numeric caps, newline
  rejection, leading bracket bot login, trigger source format

Signed-off-by: Wayne Sun <gsun@redhat.com>
Copy link
Copy Markdown

@fullsend-ai-review fullsend-ai-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

Copy link
Copy Markdown
Contributor

@ralphbean ralphbean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Architectural concern: agent-specific code in the CLI binary

The problem this PR solves is real and well-motivated — the bash scripts in PR #473 had critical bugs (non-functional skip, dead GH_TOKEN, substring false positives), and migrating to tested code is the right call.

However, introducing fullsend gate code and fullsend gate fix as hardcoded CLI subcommands creates a new coupling pattern that conflicts with established architecture:

  • ADR-0024 states: "Multi-agent orchestration lives at the CI layer, not in the runner. The runner's job is one harness, one agent, one sandbox."
  • No existing precedent — all current CLI commands (admin, run, scan, postreview, postcomment) are agent-agnostic. This PR introduces the first agent-specific subcommands.
  • Issue #101 established consensus that agents should be platform-agnostic with well-defined wrapper interfaces.
  • Issues #579 and #604 point toward declarative agent configuration, not hardcoded agent names.
  • The pre_script field in harness YAML was the intended abstraction for agent-specific pre-validation. This PR removes that abstraction.

The concern is extensibility: adding a new agent type would require recompiling the CLI rather than adding a YAML file + script.

Suggested alternatives

  1. Generic fullsend gate --config gate-code.yaml — reads validation rules from a declarative config
  2. Keep gates as scripts but add a Go test harness — use TestScript patterns to unit-test bash
  3. Rewrite scripts in Python — easier to test than bash while remaining decoupled from the binary
  4. Generic fullsend validate — a schema-driven validation command that's agent-agnostic

The code quality, test coverage, and correctness of the implementation itself is solid. The request for changes is purely about where this logic lives architecturally.

Comment thread internal/cli/gate.go
"github.com/fullsend-ai/fullsend/internal/ui"
)

func newGateCmd() *cobra.Command {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[important — architecture] Introducing gate code and gate fix as hardcoded CLI subcommands bakes agent names into the binary, creating a coupling pattern that conflicts with ADR-0024 ("The runner's job is one harness, one agent, one sandbox") and the existing agent-agnostic CLI design.

All current CLI commands (admin, run, scan, postreview, postcomment) are agent-agnostic. This is the first agent-specific subcommand.

The pre_script field in harness YAML was the intended abstraction for agent-specific logic. Some alternatives that preserve testability without hardcoding agent names:

  1. Generic fullsend gate --config gate-code.yaml that reads validation rules from declarative config
  2. Keep as scripts, add Go test harness (TestScript patterns)
  3. Rewrite scripts in Python for easier testability while staying decoupled
  4. Generic fullsend validate with schema-driven validation

The implementation quality is solid — the concern is purely about where this logic lives.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch on the pattern — worth discussing. A few counter-points:

Gates are CI-layer logic, not runner logic. ADR-0024 scopes "one harness, one agent, one sandbox" to the runner (fullsend run). Gates run on the GHA runner before sandbox creation — they're the CI-level orchestration that ADR-0024 explicitly says belongs outside the harness: "Multi-agent sequencing — for example, running a code agent then a review agent with a gate — belongs in the CI pipeline definition, not in the harness YAML" (ADR-0024 §Decision). The pre_script field was the pre-gate mechanism, and these commands replace it with the same semantics (run before sandbox, fail = abort).

The existing CLI isn't purely agent-agnostic. postreview and postcomment are stage-specific post-processing commands. scan is security-layer specific. The pattern is: generic runner (run), stage-specific CI helpers (gate, postreview, postcomment, scan).

Why not generic fullsend gate --config? The gate logic for code (timeline API → PR detection → label + comment → skip output) and fix (iteration cap enforcement → fail) share no validation rules — they share an infrastructure pattern (validate inputs, call forge APIs, write GITHUB_OUTPUT). A declarative config would either be so simple it's just passing env var names to regex patterns (losing the PR detection logic), or so complex it reinvents the Go code with YAML indirection. The real value is in the forge.Client integration and testability, not in the subcommand name.

Concrete motivation: #460 and #473. The bash pre-scripts had 3 critical bugs that were undetectable without running them in a live GHA environment: non-functional skip output, dead GH_TOKEN, and substring false positives in comment deduplication. The Go migration caught all 3 with unit tests using FakeClient. Moving back to scripts (even tested ones) loses the forge.Client seam.

The extensibility concern is real but bounded. If we add more stages (e.g. gate triage, gate review), each is ~50-100 lines of stage-specific logic. If we reach 5+ gates with shared patterns, refactoring to fullsend gate --config with a shared validation engine makes sense. For 2 stages, the concrete approach is clearer.

Happy to discuss further — if the preference is to defer the merge until #579 lands a generic parameterization interface, we could keep this as a draft.

Add ::warning:: annotation when ListIssueComments fails during the
idempotency check, matching how EnsureLabel/AddIssueLabels errors
are handled. Without this, repeated API failures would silently
cause duplicate comment posting.

Signed-off-by: Wayne Sun <gsun@redhat.com>
Copy link
Copy Markdown

@fullsend-ai-review fullsend-ai-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the review comment for full details.

@rh-hemartin
Copy link
Copy Markdown
Contributor

If the problem is testing I think we can first look at migrating them to Python

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Code agent creates duplicate PR when a human PR already exists for the issue

3 participants