Skip to content

feat(#704): surface agent errors in workflow logs#764

Open
fullsend-ai-coder[bot] wants to merge 2 commits into
mainfrom
agent/704-surface-transcript-errors
Open

feat(#704): surface agent errors in workflow logs#764
fullsend-ai-coder[bot] wants to merge 2 commits into
mainfrom
agent/704-surface-transcript-errors

Conversation

@fullsend-ai-coder
Copy link
Copy Markdown

When an agent run fails, the root cause error is buried in the transcript JSONL artifact. Operators must download and manually search the artifact to diagnose failures.

This adds two improvements:

  1. In-process error surfacing (run.go): After transcript extraction,
    parse the JSONL for result events with is_error=true and emit
    ::error:: annotations to stderr. These appear in the GitHub
    Actions job summary, making failures diagnosable without
    downloading artifacts.

  2. Standalone extraction script (extract-transcript-error.sh): A
    shell script that extracts errors from transcript files or
    directories. Usable by post-scripts, the triage agent (after
    downloading artifacts via gh run download), and operators
    debugging failed runs.

The transcript parser is intentionally lightweight: it scans only for result-type lines (fast string contains check) and parses the last one. Error messages are truncated to 2000 chars and sanitized to prevent GHA command injection.

New files:

  • internal/cli/transcript.go: JSONL parsing and error emission
  • internal/cli/transcript_test.go: 15 unit tests
  • internal/scaffold/.../extract-transcript-error.sh: standalone tool

Modified:

  • internal/cli/run.go: call extractTranscriptErrors after iteration
    loop when agent exits non-zero

Closes #704

Post-script verification

  • Branch is not main/master (agent/704-surface-transcript-errors)
  • Secret scan passed (gitleaks — a97ff3b4fbc3f8942a369347ee2f47240737314f..HEAD)
  • Pre-commit hooks passed (authoritative run on runner)
  • Tests ran inside sandbox

When an agent run fails, the root cause error is buried in the
transcript JSONL artifact. Operators must download and manually
search the artifact to diagnose failures.

This adds two improvements:

1. In-process error surfacing (run.go): After transcript extraction,
   parse the JSONL for result events with is_error=true and emit
   ::error:: annotations to stderr. These appear in the GitHub
   Actions job summary, making failures diagnosable without
   downloading artifacts.

2. Standalone extraction script (extract-transcript-error.sh): A
   shell script that extracts errors from transcript files or
   directories. Usable by post-scripts, the triage agent (after
   downloading artifacts via gh run download), and operators
   debugging failed runs.

The transcript parser is intentionally lightweight: it scans only
for result-type lines (fast string contains check) and parses the
last one. Error messages are truncated to 2000 chars and sanitized
to prevent GHA command injection.

New files:
- internal/cli/transcript.go: JSONL parsing and error emission
- internal/cli/transcript_test.go: 15 unit tests
- internal/scaffold/.../extract-transcript-error.sh: standalone tool

Modified:
- internal/cli/run.go: call extractTranscriptErrors after iteration
  loop when agent exits non-zero

Closes #704

Signed-off-by: fullsend-code <fullsend-code@users.noreply.github.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 8, 2026

fullsend review is working on this — view logs

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 8, 2026

Site preview

Preview: https://c7b29a71-site.fullsend-ai.workers.dev

Commit: 1a4e16688b0bc62a3b42978eae2e5ba415645351

@fullsend-ai-review
Copy link
Copy Markdown

fullsend-ai-review Bot commented May 8, 2026

Review: #764

Head SHA: 1a4e166
Timestamp: 2026-05-08T00:00:00Z
Outcome: approve

Summary

This PR adds well-structured transcript error surfacing for agent failures, directly addressing issue #704. The Go implementation (transcript.go) is clean, with a fast isResultLine pre-filter, proper JSON parsing, UTF-8-safe truncation, and GHA command injection prevention via the existing sanitizeOutput helper. The standalone shell script mirrors the Go logic faithfully with appropriate sanitization. The integration point in run.go is minimal and correctly scoped to non-zero exit codes on the last iteration. Test coverage is thorough with 15 unit tests covering happy path, edge cases (empty files, missing files, spaced JSON), multi-result semantics, truncation, and GHA annotation output. No critical or high findings.

Findings

Medium

None.

Low

  • [Correctness] internal/cli/transcript.go:86-105scanner.Err() is never checked after the scan loop. If a JSONL line exceeds maxTranscriptLineSize (1 MB), bufio.Scanner stops scanning entirely and sets scanner.Err() to bufio.ErrTooLong. Any result event after the oversized line would be missed. This is a best-effort diagnostic feature and the 1 MB limit is generous, so the practical risk is very low, but checking scanner.Err() and logging a warning would improve debuggability.
    Remediation: After the scan loop, check if err := scanner.Err(); err != nil and optionally log a warning.

  • [Correctness] internal/scaffold/fullsend-repo/scripts/extract-transcript-error.sh:88 — The ${basename} variable in the ::error title=... annotation is not sanitized for GHA command injection (unlike safe_msg). A transcript filename containing :: could inject GHA workflow commands. Practical risk is negligible since transcript filenames are system-generated, but for defense-in-depth it would be worth applying the same sanitization.
    Remediation: Apply ${basename//::/ :} before using it in the annotation.

Info

  • [Style] internal/cli/run.go:592 — The step label 9e-bis is a creative numbering scheme. Consider using 9e for the new step and shifting the output scan to 9f (which was already done), or just numbering sequentially without the -bis suffix for clarity.

  • [Intent alignment] The PR addresses improvement 1 from issue Agent errors buried in transcript artifacts, not surfaced in workflow logs #704 (surfacing errors in workflow logs). Improvement 2 (letting agents read workflow artifacts) is not addressed, which is appropriate — the issue describes them as separate improvements with different priorities.

Footer

Outcome: approve
This review applies to SHA 1a4e16688b0bc62a3b42978eae2e5ba415645351. Any push to the PR head clears this review and requires a new evaluation.

Previous run

Review: #764

Head SHA: b8e7569
Timestamp: 2026-05-08T00:00:00Z
Outcome: approve

Summary

Clean, well-scoped change that adds transcript error surfacing to GitHub Actions workflow logs. The Go implementation is solid — proper use of sanitizeOutput for GHA command injection prevention, efficient JSONL scanning with a fast-path string check, sensible truncation, and thorough test coverage (15 tests covering error/success/empty/missing/multi-result cases). The integration point in run.go is correct: it targets the last iteration directory using the 1-indexed runCount, only fires on non-zero exit, and is positioned before the output scan step. Three non-blocking observations below.

Findings

Medium

  • [content security] internal/scaffold/fullsend-repo/scripts/extract-transcript-error.sh:82 — The shell script sanitizes :: to prevent GHA command injection but does not strip URL-encoded newlines (%0A, %0D) that GitHub Actions also interprets as line breaks in workflow commands. An attacker-controlled transcript error message containing %0A::warning::spoofed would inject additional annotations. The Go code handles this correctly via sanitizeOutput (which strips %0A/%0D), but the shell script is inconsistent. Consider adding: safe_msg="${safe_msg//%0A/ }" and similar for %0a, %0D, %0d.
    Remediation: Add %0A/%0D (case-insensitive) replacement to the shell script's sanitization, matching the Go sanitizeOutput behavior.

Low

  • [correctness] internal/cli/transcript.go:126truncateError truncates by byte length (len(msg)) rather than rune count. If the 2000-byte boundary falls mid-way through a multi-byte UTF-8 character, the resulting string will contain an incomplete rune. This is unlikely to cause problems in practice (the message is printed to stderr and GHA annotations), but utf8.RuneCountInString would be more correct.

Info

  • [style/conventions] internal/cli/transcript_test.go:149 — Test function name has a typo: TestExtractTranscriptErrors_MultiplFiles is missing the 'e' in 'Multiple'. Consider renaming to TestExtractTranscriptErrors_MultipleFiles.

Footer

Outcome: approve
This review applies to SHA b8e756993211deff48899f4d5cf8ccf1ef3ff50e. Any push to the PR head clears this review and requires a new evaluation.

Copy link
Copy Markdown
Contributor

@ralphbean ralphbean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well-scoped, useful feature that directly addresses #704. The Go implementation is solid overall — clean separation of concerns, good test coverage, and correct integration point in run.go. Five findings below, the most important being unsanitized values interpolated into GHA workflow commands (filename and subtype) and inconsistent sanitization in the shell script.

// agent failures diagnosable without downloading artifacts.
func emitTranscriptErrors(w io.Writer, summaries []transcriptErrorSummary) {
for _, s := range summaries {
// Sanitize the error message to prevent GHA command injection.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[critical] s.Source (derived from the JSONL filename via filepath.Base) is interpolated directly into the ::error:: workflow command without sanitization. Only msg goes through sanitizeOutput. A crafted filename containing %0A or :: sequences could inject additional GHA commands.

The fallback on line 142 has the same issue with s.Subtype.

Fix:

fmt.Fprintf(w, "::error title=Agent Error (%s)::%s\n", sanitizeOutput(s.Source), msg)

And for the fallback:

msg = fmt.Sprintf("agent terminated with error (subtype: %s)", sanitizeOutput(s.Subtype))


# Emit GHA annotation if running in GitHub Actions.
if [[ "${GITHUB_ACTIONS:-}" == "true" ]]; then
# Sanitize for GHA: replace :: with : : to prevent command injection.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[important] The sanitization here only strips :: but does not handle %0A/%0D (URL-encoded newlines) that GitHub Actions also interprets as line breaks in workflow commands. The Go sanitizeOutput function handles these correctly. An attacker-controlled transcript message containing %0A::warning::spoofed would inject additional annotations.

Fix — add after the existing :: replacement:

safe_msg="${safe_msg//%0A/ }"
safe_msg="${safe_msg//%0a/ }"
safe_msg="${safe_msg//%0D/ }"
safe_msg="${safe_msg//%0d/ }"

Comment thread internal/cli/transcript.go Outdated
}

// truncateError trims an error message to maxTranscriptErrorLength.
// If truncated, appends an ellipsis indicator.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[moderate] truncateError uses byte length (len(msg)) and byte slicing (msg[:maxTranscriptErrorLength]). If the 2000-byte boundary falls in the middle of a multi-byte UTF-8 character (emoji, CJK, etc.), the result contains an incomplete rune.

Fix — walk back to a valid rune boundary, e.g.:

truncated := msg[:maxTranscriptErrorLength]
for len(truncated) > 0 && !utf8.Valid([]byte(truncated)) {
    truncated = truncated[:len(truncated)-1]
}
return truncated + "… (truncated)"


// isResultLine does a fast prefix/contains check to avoid parsing every
// JSONL line. Claude Code transcripts can be very large.
func isResultLine(line []byte) bool {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] isResultLine converts []byte to string via strings.Contains(string(line), ...), which allocates a new string (up to 1 MB per line, twice). Since this is meant to be a cheap pre-filter to avoid json.Unmarshal, use bytes.Contains instead to avoid the allocation:

func isResultLine(line []byte) bool {
    return bytes.Contains(line, []byte(`"type":"result"`)) ||
        bytes.Contains(line, []byte(`"type": "result"`))
}

// File 2: success result (should not appear in summaries).
ok2 := `{"type":"result","subtype":"success","is_error":false,"result":"all good"}`
if err := os.WriteFile(filepath.Join(dir, "agent2.jsonl"), []byte(ok2), 0o644); err != nil {
t.Fatal(err)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Typo: TestExtractTranscriptErrors_MultiplFilesTestExtractTranscriptErrors_MultipleFiles (missing 'e' in 'Multiple').

@ralphbean
Copy link
Copy Markdown
Contributor

/fix those inline comments, please

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 8, 2026

fullsend fix is working on this — view logs

- Sanitize s.Source and s.Subtype in emitTranscriptErrors to prevent
  GHA command injection via crafted filenames or subtypes [critical]
- Add %0A/%0D sanitization to shell script's GHA annotation output
  to match the Go sanitizeOutput behavior [important]
- Fix truncateError to walk back to valid UTF-8 rune boundary before
  truncating, avoiding incomplete multi-byte characters [moderate]
- Switch isResultLine from strings.Contains(string(line)) to
  bytes.Contains to avoid unnecessary allocations [minor]
- Fix typo: MultiplFiles → MultipleFiles in test name [nit]

Addresses review feedback on #764

Signed-off-by: fullsend-fix <fullsend-code@users.noreply.github.com>
@fullsend-ai-coder
Copy link
Copy Markdown
Author

🔧 Fix agent — iteration 1 (human-triggered)

Addressed all 5 inline review findings from ralphbean: sanitized Source/Subtype in GHA commands (critical), added %0A/%0D sanitization to shell script (important), fixed UTF-8 truncation (moderate), switched to bytes.Contains (minor), fixed test name typo (nit). All 14 transcript tests pass.

Fixed (5):

  1. unsanitized s.Source and s.Subtype in GHA workflow commands (internal/cli/transcript.go): Wrapped s.Source and s.Subtype with sanitizeOutput() in emitTranscriptErrors to prevent GHA command injection via crafted filenames or subtypes containing %0A or :: sequences.
  2. shell script missing %0A/%0D sanitization (internal/scaffold/fullsend-repo/scripts/extract-transcript-error.sh): Added %0A/%0a/%0D/%0d replacement to safe_msg in the GHA annotation block, matching the Go sanitizeOutput behavior.
  3. truncateError splits multi-byte UTF-8 characters (internal/cli/transcript.go): Added utf8.Valid walk-back loop so truncation never splits a multi-byte rune. Added unicode/utf8 import.
  4. isResultLine allocates via string conversion (internal/cli/transcript.go): Replaced strings.Contains(string(line), ...) with bytes.Contains(line, []byte(...)) to avoid unnecessary heap allocations on lines up to 1 MB. Added bytes import.
  5. typo in test name MultiplFiles (internal/cli/transcript_test.go): Renamed TestExtractTranscriptErrors_MultiplFiles to TestExtractTranscriptErrors_MultipleFiles.

Tests: passed

Updated by fullsend fix agent

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 8, 2026

fullsend review is working on this — view logs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Agent errors buried in transcript artifacts, not surfaced in workflow logs

1 participant