Skip to content

Codex provider: output_format: json fails on JSONL event stream #361

@pocky

Description

@pocky

Summary

When an agent step uses provider: codex and output_format: json, the step fails during output processing:

step failed step=review_plan error=step review_plan: output format processing:
invalid JSON: invalid character '{' after top-level value
(content: {"type":"thread.started","thread_id":"..."}
{"type":"turn.started"}
{"type":"item.started","item":{"id":"item_0","type":"command_execution", ...

The Codex CLI (codex exec --json) emits a JSONL event stream (one JSON object per line: thread.started, turn.started, item.started, item.completed, turn.completed), but output_format: json expects a single JSON document. The validator chokes on the second top-level object.

The agent's real answer is present — nested as a JSON string inside the item.completed / agent_message event:

{"type":"item.completed","item":{"id":"item_7","type":"agent_message","text":"{\"status\":\"REJECTED\",\"score\":74,...}"}}

Root cause

The codex provider extracts the assistant message for display only, but does not feed cleaned text into the output_format pipeline. This contrasts with the other providers, where the skill docs (references/agent-steps.md) explicitly guarantee unconditional text extraction:

  • Claude (l.66): "Text extraction from the NDJSON stream is unconditional… When output_format: json is set, result.Response is additionally populated… works correctly even when lifecycle hooks inject extra NDJSON events."
  • Gemini (l.114) and OpenCode (l.134): same explicit guarantee.
  • Codex (l.173): "Output is JSONL streaming; AWF parses assistant.message … events for display" — no output_format: json guarantee.

So output_format: json is effectively unsupported on the Codex provider, while it works on Claude / Gemini / OpenCode.

Reproduction

review:
  type: agent
  provider: codex
  prompt: "Return a JSON object with a 'status' field."
  output_format: json        # <- breaks: raw JSONL stream hits the JSON validator
  options:
    model: gpt-5-codex

Expected: AWF extracts the final agent_message.text from the JSONL stream and validates that as the JSON document, populating .JSON / .Response — same contract as Claude/Gemini/OpenCode.

Actual: the raw multi-object JSONL stream is passed to the JSON validator → invalid character '{' after top-level value → step fails.

Suggested fix

In the codex provider, make text extraction unconditional (mirror the Claude/Gemini/OpenCode behavior): pull the last item.completed event whose item.type == "agent_message", use its .text as the cleaned output, and run output_format processing on that. Equivalent extraction in jq:

jq -rs '[ .[] | select(.type=="item.completed" and .item.type=="agent_message") | .item.text ] | last // empty'

Impact / workaround

Any workflow with a codex agent step that needs structured JSON output is affected (review/analysis/extraction steps that gate on .JSON.status, etc.). Current workaround is to drop output_format: json and extract the inner JSON from the raw stream manually in a downstream step — fragile and loses the .JSON accessor in transitions/log hooks/terminal messages.

Environment

  • AWF skill reference: awf-knowledge 0.9.3
  • Codex invocation: codex exec --json <prompt> (per references/agent-steps.md l.90)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions