feat(cli): migrate pre-code.sh to fullsend gate code command#616
feat(cli): migrate pre-code.sh to fullsend gate code command#616waynesun09 wants to merge 6 commits into
Conversation
Site previewPreview: https://4bf34558-site.fullsend-ai.workers.dev Commit: |
Review: #616Head SHA: 3bd14e1 SummaryThis PR cleanly migrates FindingsCritical(none) High(none) Medium(none) Low
Info
FooterOutcome: approve Previous runReview: #616Head SHA: bab3435 SummaryThis PR cleanly migrates two bash pre-scripts ( FindingsMedium
Low
Info
FooterOutcome: approve Previous run (2)Review: #616Head SHA: 3b5051e SummaryThis is a well-structured migration from bash scripts to Go CLI commands. The code is clean, well-tested (34 test cases covering happy paths, edge cases, error injection, and idempotency), and the migration is behaviorally faithful to the original scripts. The "fail open" design for API errors is appropriate for a gate that shouldn't block the pipeline on transient failures. The workflow changes correctly guard all downstream steps with the skip output. A few minor findings are worth noting but none are blocking. FindingsMedium
Low
Info
FooterOutcome: comment-only Previous run (3)Review: #616Head SHA: f729f09 SummaryThis is a well-structured migration of FindingsMedium / Low / Info
FooterOutcome: approve Previous run (4)Review: #616Head SHA: 62259fe SummaryThis PR correctly addresses all five findings from the #473 review. The timeline API replacement eliminates substring false positives, FindingsMedium / Low / Info
FooterOutcome: approve |
The pre-code script now checks for existing open PRs linked to the target issue before the sandbox is created. If a human-authored PR is found (not from the fullsend bot), the script applies a "pr-open" label and posts a comment linking the existing PR(s), then exits cleanly — preventing the code agent from creating a duplicate competing PR. The check is best-effort: it only runs when GH_TOKEN is available and can be overridden with CODE_FORCE=true (set when a user comments `/code --force` on the issue). Bot-authored PRs are filtered out so the agent can still update its own previous work. Uses gh's built-in --jq flag for filtering, avoiding a standalone jq dependency. Note: go-test and go-vet could not run (Go not available in sandbox). post-triage-test.sh has pre-existing failures (jq not available in sandbox). pre-code-test.sh passed. Manual verification of Go tests is required. Closes #460
…potency Replace gh pr list --search with the timeline API to avoid substring false positives (e.g. issue #42 matching PRs mentioning #421). Add BOT_LOGIN regex validation to prevent jq injection, capture API errors instead of silent fail-open, refactor exit-on-missing-token to skip only the check block, and add comment idempotency to avoid duplicate bot comments. Update test suite with 7 new test cases covering API failure handling, bot login validation, comment idempotency, and timeline API usage. Signed-off-by: Wayne Sun <gsun@redhat.com>
Replace the bash pre-code.sh script with a Go CLI command that runs in the workflow before sandbox creation. This addresses the review findings on PR #473: - Fix skip semantics: write skip=true to GITHUB_OUTPUT and gate all downstream workflow steps with if: steps.gate.outputs.skip != 'true' - Fix dead GH_TOKEN: pass push-token to the gate step env - Fix search false positives: use timeline API instead of text search - Add cross-validation between ISSUE_NUMBER/REPO_FULL_NAME/GITHUB_ISSUE_URL - Add bot-login regex validation against injection Extend forge.Client with ListIssueTimeline, AddIssueComment, EnsureLabel, and AddIssueLabels. Implement on both LiveClient (GitHub API) and FakeClient (test double). Closes #460 Signed-off-by: Wayne Sun <gsun@redhat.com>
62259fe to
f729f09
Compare
- Add `fullsend gate fix` subcommand migrating pre-fix.sh to Go: input validation, iteration cap enforcement, instruction length check - Address review findings: consolidate AddIssueComment into existing CreateIssueComment, log gate errors as ::warning:: instead of silently discarding - Update fix.yml workflow to use `fullsend gate fix` instead of `bash scripts/pre-fix.sh` - Remove pre_script from fix harness (gate runs on runner before sandbox) - Delete scripts/pre-fix.sh (102 lines of bash replaced by tested Go) - Add 15 new gate fix test cases covering validation, iteration caps, instruction limits, bot vs human paths Signed-off-by: Wayne Sun <gsun@redhat.com>
- Tighten botLoginRe to reject leading brackets (e.g. "]evil") - Log resolveToken error instead of discarding it - Move ::notice:: emission after input validation passes - Add TRIGGER_SOURCE format validation via triggerSourceRe - Reject negative and non-numeric iteration/cap values - Extract parsePositiveIntOrDefault helper to reduce duplication - Make writeGateOutput return error with close-error handling - Add newline injection guard to writeGateOutput - Add bot instruction length cap (1 MB) for defense in depth - Add id: gate to fix.yml validate step for parity with code.yml - Add deletion assertion tests for pre-code.sh and pre-fix.sh - Add tests for negative iteration, non-numeric caps, newline rejection, leading bracket bot login, trigger source format Signed-off-by: Wayne Sun <gsun@redhat.com>
ralphbean
left a comment
There was a problem hiding this comment.
Architectural concern: agent-specific code in the CLI binary
The problem this PR solves is real and well-motivated — the bash scripts in PR #473 had critical bugs (non-functional skip, dead GH_TOKEN, substring false positives), and migrating to tested code is the right call.
However, introducing fullsend gate code and fullsend gate fix as hardcoded CLI subcommands creates a new coupling pattern that conflicts with established architecture:
- ADR-0024 states: "Multi-agent orchestration lives at the CI layer, not in the runner. The runner's job is one harness, one agent, one sandbox."
- No existing precedent — all current CLI commands (
admin,run,scan,postreview,postcomment) are agent-agnostic. This PR introduces the first agent-specific subcommands. - Issue #101 established consensus that agents should be platform-agnostic with well-defined wrapper interfaces.
- Issues #579 and #604 point toward declarative agent configuration, not hardcoded agent names.
- The
pre_scriptfield in harness YAML was the intended abstraction for agent-specific pre-validation. This PR removes that abstraction.
The concern is extensibility: adding a new agent type would require recompiling the CLI rather than adding a YAML file + script.
Suggested alternatives
- Generic
fullsend gate --config gate-code.yaml— reads validation rules from a declarative config - Keep gates as scripts but add a Go test harness — use
TestScriptpatterns to unit-test bash - Rewrite scripts in Python — easier to test than bash while remaining decoupled from the binary
- Generic
fullsend validate— a schema-driven validation command that's agent-agnostic
The code quality, test coverage, and correctness of the implementation itself is solid. The request for changes is purely about where this logic lives architecturally.
| "github.com/fullsend-ai/fullsend/internal/ui" | ||
| ) | ||
|
|
||
| func newGateCmd() *cobra.Command { |
There was a problem hiding this comment.
[important — architecture] Introducing gate code and gate fix as hardcoded CLI subcommands bakes agent names into the binary, creating a coupling pattern that conflicts with ADR-0024 ("The runner's job is one harness, one agent, one sandbox") and the existing agent-agnostic CLI design.
All current CLI commands (admin, run, scan, postreview, postcomment) are agent-agnostic. This is the first agent-specific subcommand.
The pre_script field in harness YAML was the intended abstraction for agent-specific logic. Some alternatives that preserve testability without hardcoding agent names:
- Generic
fullsend gate --config gate-code.yamlthat reads validation rules from declarative config - Keep as scripts, add Go test harness (
TestScriptpatterns) - Rewrite scripts in Python for easier testability while staying decoupled
- Generic
fullsend validatewith schema-driven validation
The implementation quality is solid — the concern is purely about where this logic lives.
There was a problem hiding this comment.
Good catch on the pattern — worth discussing. A few counter-points:
Gates are CI-layer logic, not runner logic. ADR-0024 scopes "one harness, one agent, one sandbox" to the runner (fullsend run). Gates run on the GHA runner before sandbox creation — they're the CI-level orchestration that ADR-0024 explicitly says belongs outside the harness: "Multi-agent sequencing — for example, running a code agent then a review agent with a gate — belongs in the CI pipeline definition, not in the harness YAML" (ADR-0024 §Decision). The pre_script field was the pre-gate mechanism, and these commands replace it with the same semantics (run before sandbox, fail = abort).
The existing CLI isn't purely agent-agnostic. postreview and postcomment are stage-specific post-processing commands. scan is security-layer specific. The pattern is: generic runner (run), stage-specific CI helpers (gate, postreview, postcomment, scan).
Why not generic fullsend gate --config? The gate logic for code (timeline API → PR detection → label + comment → skip output) and fix (iteration cap enforcement → fail) share no validation rules — they share an infrastructure pattern (validate inputs, call forge APIs, write GITHUB_OUTPUT). A declarative config would either be so simple it's just passing env var names to regex patterns (losing the PR detection logic), or so complex it reinvents the Go code with YAML indirection. The real value is in the forge.Client integration and testability, not in the subcommand name.
Concrete motivation: #460 and #473. The bash pre-scripts had 3 critical bugs that were undetectable without running them in a live GHA environment: non-functional skip output, dead GH_TOKEN, and substring false positives in comment deduplication. The Go migration caught all 3 with unit tests using FakeClient. Moving back to scripts (even tested ones) loses the forge.Client seam.
The extensibility concern is real but bounded. If we add more stages (e.g. gate triage, gate review), each is ~50-100 lines of stage-specific logic. If we reach 5+ gates with shared patterns, refactoring to fullsend gate --config with a shared validation engine makes sense. For 2 stages, the concrete approach is clearer.
Happy to discuss further — if the preference is to defer the merge until #579 lands a generic parameterization interface, we could keep this as a draft.
Add ::warning:: annotation when ListIssueComments fails during the idempotency check, matching how EnsureLabel/AddIssueLabels errors are handled. Without this, repeated API failures would silently cause duplicate comment posting. Signed-off-by: Wayne Sun <gsun@redhat.com>
|
If the problem is testing I think we can first look at migrating them to Python |
Summary
Migrates both
pre-code.shandpre-fix.shbash scripts to Go CLI commands underfullsend gate, addressing review findings from #473 and improving testability, type safety, and maintainability.fullsend gate code(replacesscripts/pre-code.sh)pr-openlabel and posts idempotent comment when human PR existsskip=truetoGITHUB_OUTPUT— downstream workflow steps conditionally skip viaif: steps.gate.outputs.skip != 'true'FULLSEND_BOT_LOGIN)fullsend gate fix(replacesscripts/pre-fix.sh)Review findings addressed
AddIssueCommentinto existingCreateIssueCommentinterface methodEnsureLabel/AddIssueLabels/comment posting now logged as::warning::instead of silently discardedChanges
internal/cli/gate.go—fullsend gate codeandfullsend gate fixcommandsinternal/cli/gate_test.go— 34 test cases (19 code + 15 fix)internal/forge/forge.go— AddedTimelineEvent,ListIssueTimeline,EnsureLabel,AddIssueLabelsto Client interfaceinternal/forge/fake.go— FakeClient implementations with data maps and recordersinternal/forge/github/github.go— LiveClient implementations (Timeline API pagination, label upsert).github/workflows/code.yml— Usesfullsend gate codewith skip output; all downstream steps guarded.github/workflows/fix.yml— Usesfullsend gate fixinstead ofbash scripts/pre-fix.shharness/code.yaml,harness/fix.yaml— Removedpre_script(gate runs on runner before sandbox)scripts/pre-code.sh,scripts/pre-code-test.sh,scripts/pre-fix.shCloses #460. Supersedes #473.