Add 'Spot-check + global metric for bulk regeneration after a shared-path fix' to ai-workflow.md

Pattern from wuseria S147 (#1122 cohort log refresh) — cost-effective shape for bulk-artifact regeneration.

## Pattern

When a fix touches a shared code path with N downstream artifacts to regenerate, full per-artifact inspection is expensive (N visual checks) and unnecessary. Pair a small spot-check (3-5 representative cases) with a global numeric metric (reference-set calibration, full test suite, link check, accessibility score). The spot-check catches obviously broken regeneration; the global metric catches drift across the population. Together they cover the same regression class as per-artifact review at a fraction of the time cost.

The reverse failure mode — global metric alone, no spot-check — misses regeneration bugs that pass numeric thresholds while producing visibly wrong artifacts (e.g. an SVG renderer that swaps a curve identity but keeps the same number of polylines). The other reverse mode — full per-artifact glance with no global metric — discovers post-merge that the change had a population-level effect outside the sample.

## Concrete example

wuseria/me-fuji#1147 refreshed 12 TTartisan production digitization logs after the #1122 chrome-strip fix. Full glance would have meant 24 overlay PNGs to inspect (max + stopped aperture per lens). Instead:

- Spot-check 3 lenses: the fix's primary target, the most-artifact-rich case, a clean baseline. All traced cleanly under visual inspection.
- Global metric: reference-set calibration showed 583/626 paired comparisons (93.1%) within ±0.05, with the only changed chart being viltrox freq30S (NOT a TTartisan lens), confirming the change is safe across the full TTartisan cohort population.

Total verification cost: ~5 minutes. Per-artifact equivalent would have been ~45 minutes. Same regression coverage.

## When to use

- Code change affects N artifacts that share a generator/renderer
- A global numeric metric exists that aggregates across N
- The artifacts have a visual or structural dimension the metric doesn't capture (overlay PNGs, layout SVGs, rendered HTML)

## When NOT to use

- Each artifact has independent semantics the global metric can't summarize
- The "shared path" has different effects on each artifact (no population assumption)
- N is small enough (< 5) that per-artifact glance is already cheap

## Why this belongs upstream

`templates/base/workflow/ai-workflow.md` already has "Review continuously" — review after each task, not after queuing 10 up. The spot-check + global metric pattern is the operational form of that principle for bulk-artifact cases where "after each" would dominate the change's actual work.

## Proposed wording

Add to `templates/base/workflow/ai-workflow.md` Lessons Learned section (alongside the candidates in #468):

> ### Spot-check + global metric for bulk regeneration after a shared-path fix
>
> When a fix touches a shared code path with N downstream artifacts to regenerate, full per-artifact inspection is expensive and unnecessary. Pair a small spot-check (3-5 representative cases — the fix's primary target, the most artifact-rich case, a clean baseline) with a global numeric metric that aggregates across the population (test suite pass rate, calibration result, link check, accessibility score). The spot-check catches obviously broken regeneration; the metric catches drift across the population. Same regression coverage as per-artifact review at a fraction of the cost.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add 'Spot-check + global metric for bulk regeneration after a shared-path fix' to ai-workflow.md #470

Pattern

Concrete example

When to use

When NOT to use

Why this belongs upstream

Proposed wording

Spot-check + global metric for bulk regeneration after a shared-path fix

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Add 'Spot-check + global metric for bulk regeneration after a shared-path fix' to ai-workflow.md #470

Description

Pattern

Concrete example

When to use

When NOT to use

Why this belongs upstream

Proposed wording

Spot-check + global metric for bulk regeneration after a shared-path fix

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions