on-device benchmark harness (v1.1) by 2dubu · Pull Request #1 · 2dubu/PaletteKit

2dubu · 2026-04-28T16:20:38Z

Summary

Add an on-device benchmark harness in Examples/PaletteKitDemo so PaletteKit performance claims can be measured per-device, per-content. Pick a real photo or a synthesized fixture, vary size /
quantizer / downsample, and export per-stage timings as CSV.
Apply bench-first methodology — the harness gates v1.x decisions on CPU/GPU work. The first round on iPhone 15 Pro confirmed that single-shot extraction is fast enough at default options that
progressive extraction is not worth shipping; v1.1 ships the harness itself instead.
Trim README API surface to a method skeleton + DocC link to reduce drift; add a "Benchmark on your device" section.

What's included

Bench harness (`Examples/PaletteKitDemo/PaletteKitDemo/Bench/`)

BenchModels — case definitions (size × quantizer × downsample), per-sample structures, summary aggregates with mean per-stage timings, marketing-name mapping for hardware identifiers.
BenchFixture — deterministic synthesized image (gradient + 5 colored blobs + per-pixel noise), plus resizeToSquare(_:side:) for real-photo input center-cropped to each grid size.
BenchRunner (@MainActor ObservableObject) — orchestrates warmup + measured runs, captures ExtractionTimings per sample, computes p50/p95/min/max + per-stage means, surfaces failure
counts.
BenchView — Source picker (Synthesized / Photo) with PhotosPicker, Configuration card with per-row InfoButton popovers, Run/Reset/Cancel state machine, Grid-based summary table,
stacked-bar chart, raw samples disclosure.
BenchChart (Swift Charts) — horizontal stacked bars showing decode / sample / quantize means per case.
BenchExport — Raw and Summary CSV with headers including PaletteKit version, device identifier, marketing name, source descriptor (synthesized or photo WxH), and optional run note.
BenchInfo — popover content for each Configuration field; reusable InfoButton component (compact-adapted on iPhone, fixed-width with text wrap).
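
The aggregation that BenchRunner performs (p50/p95/min/max plus a mean) can be sketched as follows. This is a minimal illustration, not the harness's actual code: `LatencySummary`, `percentile`, and `summarize` are hypothetical names, and the nearest-rank percentile method is an assumption, since the PR does not say which percentile definition the runner uses.

```swift
/// Hypothetical summary type; field names are illustrative, not PaletteKit's.
struct LatencySummary {
    let p50: Double
    let p95: Double
    let min: Double
    let max: Double
    let mean: Double
}

/// Nearest-rank percentile over an ascending-sorted, non-empty array.
/// (Assumed method; the real harness may interpolate instead.)
func percentile(_ sorted: [Double], _ p: Double) -> Double {
    precondition(!sorted.isEmpty && (0...100).contains(p))
    // rank = ceil(p / 100 * n), clamped to 1...n, then converted to a 0-based index.
    let rank = Int((p / 100 * Double(sorted.count)).rounded(.up))
    return sorted[Swift.max(1, Swift.min(rank, sorted.count)) - 1]
}

/// Summarizes measured run times in milliseconds; returns nil when no sample succeeded.
func summarize(_ samplesMs: [Double]) -> LatencySummary? {
    guard !samplesMs.isEmpty else { return nil }
    let sorted = samplesMs.sorted()
    return LatencySummary(
        p50: percentile(sorted, 50),
        p95: percentile(sorted, 95),
        min: sorted[0],
        max: sorted[sorted.count - 1],
        mean: sorted.reduce(0, +) / Double(sorted.count)
    )
}
```

Warmup runs would be discarded before calling `summarize`, so only measured samples contribute to the percentiles.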

Repo

README "## API at a glance" table (~30 lines) replaced by a method skeleton and a DocC reference link. ExtractionOptions defaults now live in Options.md as the single source of truth.
README "## Benchmark on your device" section introduces the harness and the benchmark/ convention.
.gitignore adds benchmark/ so CSV exports stay local without leaking device-specific results into the repo.

Methodology notes

The harness exists because v1.0 shipped with provisional CPU/GPU thresholds (metalAutoThreshold = 500_000) and a quality.stride default that no one had measured under load. v1.1 ships the
measurement infrastructure first; v1.x feature work queues behind it.

Findings from the first round (iPhone 15 Pro, both synthesized and a 12 MP photo):

Default options (Downsample.automatic(maxPixels: 1_000_000)) flatten quantize cost to ~80–140 ms regardless of input size. Metal and CPU differ by <5 ms (within noise).
Metal becomes meaningfully faster (~5–10%) only when Downsample.disabled AND sampled pixel count ≥ ~1M — essentially the 4096²+ raw case.
The synthesized fixture underestimates real-world latency by 30–60% at small sizes: real photos populate more histogram bins, so MMCQ's median-cut priority queue (PQ) runs longer.
Decode + sample become the dominant cost in raw mode at 4096²+ inputs (>60% of total at 8K). Metal does not accelerate those stages, capping its real-world ceiling.
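
The first two findings can be read as a simple gate on sampled pixel count. The sketch below is an assumption-laden illustration: `DownsampleMode`, `sampledPixelCount`, and `metalLikelyWorthwhile` are hypothetical stand-ins rather than PaletteKit's API, downsampling is modeled as a plain cap on pixel count, and sampling stride is assumed to be 1.

```swift
/// Local stand-in for the downsample option; not PaletteKit's actual type.
enum DownsampleMode {
    case automatic(maxPixels: Int)
    case disabled
}

/// Pixels that reach the quantizer, assuming downsampling simply caps the
/// pixel count and every remaining pixel is sampled (stride of 1).
func sampledPixelCount(width: Int, height: Int, mode: DownsampleMode) -> Int {
    let total = width * height
    switch mode {
    case .automatic(let maxPixels): return min(total, maxPixels)
    case .disabled: return total
    }
}

/// Gate suggested by the first-round findings: Metal only pays off in raw
/// mode once roughly 1M pixels are sampled. The threshold is approximate.
func metalLikelyWorthwhile(width: Int, height: Int, mode: DownsampleMode,
                           threshold: Int = 1_000_000) -> Bool {
    guard case .disabled = mode else { return false }
    return sampledPixelCount(width: width, height: height, mode: mode) >= threshold
}
```

Under this model a 4096² input with the default cap reaches the quantizer as about 1M pixels, the same as any other large input, which is consistent with quantize cost flattening regardless of input size.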

Findings stay local (memory/); README / DESIGN_SPEC / threshold corrections accumulate for a single batch update around v1.2 instead of churning per-discovery.

Test plan

swift test — existing PaletteKit tests pass (no library changes in this PR).
make demo-app regenerates the Xcode project; iOS simulator build succeeds.
Bench screen renders; info popovers expand with text wrap; source picker switches Synthesized ↔ Photo; PhotosPicker loads CGImage and shows thumbnail with original dimensions.
On-device runs (iPhone 15 Pro / A17 Pro): three matrix configurations executed (auto-only, raw-only, photo+raw); CSV exports include all expected header lines and per-stage columns; chart
renders bars correctly.
Reset returns the screen to first-entry state (configuration + samples + photo selection cleared).
CSV header reflects PaletteKit version, hardware identifier, marketing name, source descriptor, optional run note.
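
A rough sketch of how such a CSV header could be assembled is shown below. The keys `palettekit_version`, `device_marketing`, and `source` come from the commit message; the `# key: value` comment-line layout, the `device_identifier` and `note` keys, and the `BenchHeader` and `headerLines` names are assumptions made for illustration, not the format BenchExport actually writes.

```swift
/// Hypothetical metadata bundle for one benchmark export.
struct BenchHeader {
    var paletteKitVersion: String
    var deviceIdentifier: String
    var deviceMarketing: String
    var source: String       // e.g. "synthesized" or a photo descriptor with dimensions
    var runNote: String?     // optional free-text note
}

/// Builds comment-style header lines to place above the CSV column row.
/// The "# key: value" layout is assumed, not taken from the harness.
func headerLines(_ header: BenchHeader) -> [String] {
    var lines = [
        "# palettekit_version: \(header.paletteKitVersion)",
        "# device_identifier: \(header.deviceIdentifier)",
        "# device_marketing: \(header.deviceMarketing)",
        "# source: \(header.source)",
    ]
    if let note = header.runNote, !note.isEmpty {
        // Collapse newlines so a multi-line note cannot break the header block.
        let flat = note.split(whereSeparator: \.isNewline).joined(separator: " ")
        lines.append("# note: \(flat)")
    }
    return lines
}
```

Omitting the note line when no note is given keeps the header stable across runs, so exports from different sessions remain easy to diff.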

Seven additions on top of the initial bench harness:
- Per-row info ⓘ popovers explaining each Configuration field.
- Swift Charts stacked-bar showing decode/sample/quantize means.
- Photo source: pick from library, center-cropped to each grid size.
- '.auto' quantizer added so the gating decision itself is measured.
- Run note field, flowed into CSV header.
- CSV header gains palettekit_version, device_marketing, source.
- Summary CSV gains decode_mean / sample_mean / quantize_mean.

Plus polish: marketing-name device labels, tighter fonts, Grid-based summary table, spinner progress, failure surfacing, Run/Reset/Cancel state separation.

…mark/

Three small repo-level changes for the v1.1 release:
- README gains a 'Benchmark on your device' section so the new bench harness is discoverable.
- 'API at a glance' replaced by a concise method skeleton plus a link to the DocC reference. ExtractionOptions defaults move to Options.md (single source of truth, no README drift).
- benchmark/ is gitignored: this is the convention for storing CSV exports locally without leaking device-specific results into the repo.

2dubu added 3 commits April 26, 2026 22:55

feat(demo): on-device benchmark harness for v1.1 measurement gate

68dc26f

2dubu self-assigned this Apr 28, 2026

2dubu merged commit 94e6d34 into main Apr 28, 2026
1 check passed

2dubu deleted the feature/v1.1-benchmark branch April 28, 2026 16:21
