plan-design-review can drop generated design variants during iterate/recommended rounds #1529

@SakenW

Summary

When plan-design-review performs an iterate/recommended design round, multiple successful image generation calls can collapse into a single saved artifact such as variant-recommended.png. The earlier generated candidate is not preserved as a separate PNG and is not included in the comparison board, even though the image generation call succeeded and was billed.

What happened

In a Codex-hosted plan-design-review flow, I generated an initial 3-option board successfully:

  • variant-A.png
  • variant-B.png
  • variant-C.png
  • design-board.html

Then I asked for a recommended follow-up. The image provider logs showed two successful gpt-image-2 calls in that recommended round, but the design directory only contained one recommended output:

  • variant-recommended.png
  • design-board-recommended.html

The first recommended candidate was not saved as a unique file and was not visible in the board.

Expected behavior

Every successful image generation call should produce a durable artifact that is never overwritten, and the board should include every generated candidate for that round.

For example, a recommended round with two generated candidates should create something like:

  • variant-recommended-A.png
  • variant-recommended-B.png
  • optionally variant-recommended.png as a selected/final alias only after all candidates are preserved

And design-board-recommended.html should compare both variant-recommended-A.png and variant-recommended-B.png.
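As a rough illustration of the naming scheme above, here is a minimal Python sketch. The helper name `candidate_paths` and its parameters are hypothetical and not part of gstack; it only shows how each generation call in a round could get its own output path up front.

```python
from pathlib import Path
from string import ascii_uppercase


def candidate_paths(design_dir, round_name, count):
    """Return one distinct PNG path per candidate in a round.

    Hypothetical helper: produces names such as variant-recommended-A.png,
    so no two generation calls in the same round share an output path.
    """
    if not 1 <= count <= len(ascii_uppercase):
        raise ValueError("count must be between 1 and 26")
    return [
        Path(design_dir) / f"variant-{round_name}-{ascii_uppercase[i]}.png"
        for i in range(count)
    ]


if __name__ == "__main__":
    for path in candidate_paths("designs", "recommended", 2):
        print(path.name)
```

Allocating the paths before any image call is made means a second call cannot accidentally inherit the first call's destination.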

Actual behavior

Only the last/final recommended image was saved as variant-recommended.png, so the user saw one PNG despite two successful generation calls.

Why this matters

This is a paid-generation artifact accounting problem: the user pays for multiple image calls but may only receive one visible candidate. It also makes design review less trustworthy because the comparison board no longer reflects all generated options.

Suggested fix

Add explicit artifact accounting to the design skill/tooling:

  1. Never reuse the same output path for multiple image generation calls in a round.
  2. Use sequence-stable names for recommended/iteration candidates, e.g. variant-recommended-A.png, variant-recommended-B.png, variant-iteration-01-A.png.
  3. After generation, verify that the number of saved PNGs matches the requested/generated candidate count.
  4. Build $D compare --images ... from all newly generated candidates, not only the last output.
  5. Treat variant-recommended.png as an optional alias/copy after preserving all candidates, not as the only output path.
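The steps above could be sketched roughly as follows. This is only an illustration in Python, not the actual gstack implementation: `save_candidate`, `finalize_round`, and `selected_index` are hypothetical names, and the way the resulting list is passed to `$D compare --images ...` is left out because the issue does not show that argument format.

```python
import shutil
from pathlib import Path


def save_candidate(path, png_bytes):
    """Write one generated candidate and refuse to overwrite an existing file."""
    # Mode "xb" raises FileExistsError if the path already exists,
    # so a reused output path fails loudly instead of dropping an image.
    with open(path, "xb") as f:
        f.write(png_bytes)


def finalize_round(design_dir, round_name, expected_count, selected_index=None):
    """Verify the saved candidate count and optionally create the alias copy.

    Returns every candidate path for the round, which is the list the
    comparison board should be built from.
    """
    design_dir = Path(design_dir)
    # The trailing "-*" matches variant-<round>-A.png etc., but not the
    # alias variant-<round>.png itself.
    saved = sorted(design_dir.glob(f"variant-{round_name}-*.png"))
    if len(saved) != expected_count:
        raise RuntimeError(
            f"expected {expected_count} candidates for round "
            f"{round_name!r}, found {len(saved)}"
        )
    if selected_index is not None:
        # The alias is a copy made only after all candidates are preserved.
        shutil.copyfile(
            saved[selected_index], design_dir / f"variant-{round_name}.png"
        )
    return saved
```

With a check like this, a round that made two successful generation calls but left only one PNG on disk would raise an error rather than silently producing a one-image board.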

Environment

  • gstack version: 1.33.2.0
  • local gstack commit: dc6252d
  • host: Codex wrapper running the upstream plan-design-review workflow

I added a local project wrapper override as a workaround, but this likely belongs upstream in the design artifact workflow so future runs cannot silently drop generated candidates.
