Skip to content

Prototype: CI smoke with stub asm from layout manifest#19

Open
andreadellacorte wants to merge 24 commits into
ser-pounce:mainfrom
andreadellacorte:prototype/ci-asm-manifest
Open

Prototype: CI smoke with stub asm from layout manifest#19
andreadellacorte wants to merge 24 commits into
ser-pounce:mainfrom
andreadellacorte:prototype/ci-asm-manifest

Conversation

@andreadellacorte
Copy link
Copy Markdown
Contributor

@andreadellacorte andreadellacorte commented May 24, 2026

Summary

Disk-free CI smoke that links every PRG end-to-end, with a footprint that doesn't grow with decomp progress.

The decomp devex contract: a contributor decompiling funcX edits src/, runs make check locally to byte-verify, and pushes. No fixture changes. Stale entries in the manifests are filtered out at CI time.

How it works

make ci-smoke (run by GitHub Actions on ubuntu-latest + macos-latest):

  1. Render asm stubs from fixtures/ci/asm-manifest.json (~800 KB) — per-region layout (section, label, size, alignment) of every .s splat would generate. tools/ci/asm_manifest.py --render emits stubs with .zero <N> between glabel/dlabel boundaries. Stale entries for already-decompiled functions are skipped automatically by scanning src/**/*.c for active INCLUDE_ASM macros.
  2. Render PNG stubs from fixtures/ci/png-manifest.json (~16 KB) — per-PNG shape (w, h, bitdepth, clut counts). Downstream .img.dat / .img.o rules produce bytes of the expected size.
  3. Unpack yaml + undefined bundles — 51 yaml asset descriptors + 42 undefined-symbol files unpacked from two JSON bundles.
  4. Generate link.d per binary from config/<binary>/splat.yaml + the asm-manifest + src/ walk + the INCLUDE_ASM scan. 18/21 exact-match the committed seeds; 3/21 are set-equal with different prereq order (benign for make).
  5. Copy seeded splat outputs: linker scripts (link.ld), splat headers (include/*.inc), lbas.h, undefined_*.txt — files that change rarely.
  6. Write synthetic data to data/<rel> (deterministic filler from data-manifest.tsv).
  7. Build every PRG under ROOD_CI=1.

Footprint comparison

PR #18 (committed seeds) This PR
Total files ~4,500 53
Total bytes ~45 MB ~2.2 MB
Per-decomp diff seeds need refresh 0 fixture files

Breakdown of the 53 files:

  • 4 manifests (~1.2 MB): asm, png, yaml-bundle, undefined-bundle, data-manifest
  • 27 seeds (~1 MB): 21 link.ld + 4 splat-generated include/*.inc + lbas.h + SLUS-01040_LBA.txt
  • 8 tooling scripts: asm_manifest.py, png_manifest.py, yaml_bundle.py, undefined_bundle.py, link_d_generator.py, setup_fixture.py, generate_manifest.py, tools/make/ci.mk
  • 5 test files + __init__.py + conftest.py: 43 pytest unit tests
  • 4 modified build files: Makefile, tools/make/shell.mk, .gitignore, .github/workflows/ci.yml
  • 1 modified: requirements.txt (pytest pin)
  • 2 modified: tools/make/objdiff.mk, tools/make/psxiso.mk (scope to !ROOD_CI)

Verification

  • 43 pytest unit tests (make ci-test) on the CI tooling itself — byte-count math in _line_bytes, parser round-trips, INCLUDE_ASM filtering, splat.yaml subsegment derivation.
  • make ci-smoke ends-to-end links 21 PRGs from stubs + synthetic data + generated link.d.
  • Wired into the GitHub Actions workflow as two steps: ci-test (fast-fail), then ci-smoke.

What CI verifies

make exits 0 and every PRG links. No byte-verification — the linked outputs are non-retail by design (synthetic input bytes). Retail matching remains make check locally with a disk dump. This catches: broken Makefile changes, missing INCLUDE_ASM references, link errors from stale fixture entries, regressions in the CI tooling parsers, broken nix dev shell.

Test plan

  • make ci-test passes (43/43 tests)
  • make ci-smoke passes locally
  • GitHub Actions green on ubuntu-latest + macos-latest
  • Manually verify: deleting one INCLUDE_ASM macro + adding a stub C body produces a src/ diff only (no fixture changes)

Replace the ~4,500 committed splat .s files with a single asm-manifest.json
(~800 KB) describing each file's section/label/size layout. At CI time,
tools/ci/asm_manifest.py --render walks the manifest and emits minimal
stub .s files (`.zero <N>` between glabel/dlabel boundaries) that assemble
to the same byte layout as splat's output.

Sizes vs PR ser-pounce#18:
  - .s files:     0 committed (was ~28 MB / 3,606 files)
  - asm-manifest: ~800 KB / 1 file
  - other seeds (build/config, build/assets, include/*.inc): ~5 MB / 405 files
  - net: ~6 MB / ~430 files (was ~45+ MB / 4,500+ files)

ci-smoke output:
  CI smoke build (stub asm + synthetic data) — `make ci-smoke` links all 21
  PRGs from stubs + synthetic data, exit 0 = pipeline ok. No byte-verify
  (the linked outputs are non-retail by design; matching is `make check`).

Per-PR diff for a typical decomp commit:
  - source: -INCLUDE_ASM line, +C function
  - asm-manifest.json: -1 label entry (removed funcX from the .s layout)
  - No 50-line .s file removal cluttering the diff.

Maintainer workflow:
  make check          # local build with disk
  make ci-fixture-asm-manifest  # refresh layout from build/src
build/assets seeds now exclude regenerable compile artifacts (.o, .obj, .h,
.img.dat) — the build pipeline reproduces them from .img.png sources via
tools/make/img.mk. Keeps only what splat actually extracts: .png, .yaml,
.vsString, .bin. Cuts asset seeds from 316 files / 4.2 MB to 241 / 3.0 MB.

Also: setup_fixture.py now touches each copied seed so cloned source files
(splat.yaml etc., with clone-time mtime) aren't treated as newer than the
seeded link.d files — which was making CI re-run splat and fail on the
missing disk image (PR ser-pounce#19 ubuntu-latest + macos-latest).
`.bin` (29 files) and `.vsString` (30 files) under fixtures/ci/seeds/build/assets
are not consumed by the build pipeline — verified locally by dropping them
and confirming `make ci-smoke` still links all 21 PRGs. `.png` and `.yaml`
must stay: .png feeds the .img.dat/.img.o rule chain; .yaml is consumed by
the vsString assembler.

Also seed build/src/include/lbas.h (the LBA constants header). Splat
normally generates it via mkpsxiso → make_lba_import.py; under ROOD_CI=1
we skip the disk so neither runs. The ubuntu-latest CI was tripping on a
nix-environment libstdc++ issue trying to invoke mkpsxiso transitively.
Seeding the header skips that path entirely.

Seed footprint: 271 files / 3.3 MB (was 305 files / 3.9 MB).
The 131 .png files under fixtures/ci/seeds/build/assets/ become a
20 KB shape manifest (fixtures/ci/png-manifest.json) — per-PNG
width/height/bitdepth/clut-count. At CI time, tools/ci/png_manifest.py
--render writes blank PNGs of the same shape using the existing pypng
plumbing in tools/etc/psx_img.py. Downstream .img.dat / .img.o
conversion produces bytes of the expected size.

The 51 .yaml files (real content used by the build — vsString tables,
asset descriptors) get bundled into a single JSON map of {path: text}
at fixtures/ci/yaml-bundle.json, then rendered out at CI time. Same
total bytes, but one file in PR diffs instead of 51.

Also: CI_SETUP in tools/make/ci.mk had `:=` immediate assignment, so
$(VPYTHON) expanded to empty (python.mk hadn't been included yet) and
setup_fixture.py ran under brew Python without the venv's pypng.
Switched to `=` deferred assignment. Same issue would've cropped up
elsewhere — worth a glance over the rest of ci.mk eventually.

Seed footprint:
  seeds/:        90 files / 1.2 MB  (was 271 / 3.3 MB)
  asm-manifest:  804 KB
  png-manifest:   16 KB
  yaml-bundle:   356 KB
  Total:         ~93 files / ~2.4 MB
CI_SETUP runs setup_fixture.py via VPYTHON (tools/python/bin/python3),
but VPYTHON doesn't exist until the venv-bootstrap rule creates it via
tools/python/.requirements.stamp. The earlier recipe ran setup first,
which fails on a fresh CI checkout with:

  bash: tools/python/bin/python3: No such file or directory

Reorder ci-smoke so the venv bootstrap runs first.
The 42 splat-generated undefined_funcs_auto.txt / undefined_syms_auto.txt
files under fixtures/ci/seeds/build/config/ contain real symbol-address
lines the linker needs (verified: emptying them caused R_MIPS_26 reloc
overflow), so they can't be stubbed — but they can be bundled. Same
trick as yaml-bundle.json: one JSON map of {path: text} renders out at
CI time via tools/ci/undefined_bundle.py.

Also: scope `BUILDDEPS += $(OBJDIFF)` and `BUILDDEPS += $(DUMPSXISO)` to
!ROOD_CI. Under ROOD_CI=1 BINARY_DEPS pulls BUILDDEPS to materialize the
period GCCs (which nix provides as prebuilt binaries via setup-old-gcc),
but it was also dragging in objdiff (Cargo build) and mkpsxiso (cmake
build) which ci-smoke doesn't use. CI was spending several minutes on
those — now skipped.

PR file count: 104 → 64.
The ci-smoke recipe recurses with `tools/.sysdeps` and
`tools/python/.requirements.stamp` to materialize the venv before
setup_fixture runs. But that submake's MAKECMDGOALS doesn't contain
ci-smoke, so the SKIPSPLAT filter at the bottom of the Makefile didn't
suppress `include $(BINARY_DEPS)` — make then tried to build the link.d
files via splat (no disk available) and died with "Splitting SLUS_010.40".

Include the two bootstrap targets in SKIPSPLAT so the include is skipped
in those recursive calls too.
asm_manifest.render_stubs now scans src/**/*.c for active INCLUDE_ASM
macros and skips rendering any per-function stub whose glabel symbol no
longer appears in any C file. The manifest can carry stale entries for
decompiled functions without causing duplicate-symbol link errors — the
fixture stays stable as the decomp percentage climbs.

11 stubs skipped on this branch (already-decompiled funcs in asm-manifest).

Remaining decomp friction: link.d still lists the .o for the stale
nonmatching, so make tries to compile a stub that no longer renders and
hits "missing .s". The link.d generator (WIP in a separate prototype)
closes that gap.
@andreadellacorte
Copy link
Copy Markdown
Contributor Author

working through reducing pre-baked in files + avoiding the decomp process having to touch / regenerate the fixtures

29 pytest tests under tools/ci/tests/ cover the load-bearing logic in
the CI-side platform code that PR ser-pounce#19 introduced:

- asm_manifest._line_bytes: byte-count for every directive type (.word,
  .short, .byte, .zero, .ascii/asciz incl. \\x and \\nnn escapes,
  .double/.quad/.float, instructions, no-op metadata). 23 tests.
- asm_manifest._ascii_bytes, _apply_align: edge cases.
- asm_manifest.parse_s: function/data/mixed/anonymous section shapes.
- asm_manifest.render(): round-trip preserves layout.
- asm_manifest._active_include_asm_names: INCLUDE_ASM scan.
- yaml_bundle, undefined_bundle: build + render round-trip.
- png_manifest: 4bpp/8bpp/16bpp build + render preserves shape + clut
  counts.

Wired in as a new `make ci-test` target that runs before ci-smoke, both
locally and in GitHub Actions. Adds pytest==8.3.4 to requirements.txt.

The byte-counting tests are the highest-ROI: if _line_bytes drifts even
by one, every stub .s assembles to a different size, addresses shift,
the linker either fails late or silently produces a broken binary.
tools/ci/link_d_generator.py reads:
  - config/<binary>/splat.yaml (text/data/bss subsegments + image segments)
  - fixtures/ci/asm-manifest.json (per-section .o derivation)
  - src/<binary>/**.c walk (for compiled C objects)
  - INCLUDE_ASM scan in src/ (filter pending nonmatchings)

and emits 21 build/config/<binary>/link.d files at CI fixture-setup time.

Validates as 18/21 exact match against the prior committed seeds; the
remaining 3 (SLUS_010.40, BATTLE/BATTLE, ENDING/ENDING) are set-equal
with different prereq order — benign for make.

Combined with the INCLUDE_ASM filter in asm_manifest.render_stubs, this
closes the decomp devex gap: a contributor decompiling funcX edits only
src/, the asm-manifest can carry stale funcX entries harmlessly, and
the regenerated link.d drops nonmatchings/funcX.o automatically.

Drops 21 committed link.d files from fixtures/ci/seeds/build/config/.
PR file count: 64 → 51.
14 tests covering the load-bearing parser + .o-derivation logic that
emits one link.d per binary at CI time:

- _parse_list_form / _parse_dict_form: subsegment syntax (anonymous,
  named, end-marker, linker_entry=false).
- parse_subsegments: walks a multi-segment splat.yaml shape.
- objs_from_splat: c/data/img/.section-merge derivation rules.
- objs_from_csrc: src-tree walk.
- render_link_d: output shape (elf rule + per-.o targets + -include
  block).

Total tests: 43 (was 29).
@andreadellacorte
Copy link
Copy Markdown
Contributor Author

checking some more things

Verified scenario locally: dropping `INCLUDE_ASM(..., CdStatus)` from
src/SLUS_010.40/libcd/SYS.c (without adding a C body) now fails ci-smoke
with "hash mismatch: SLUS_010.40" — the per-PRG hash check catches the
silent linker fallback (--unresolved-symbols=ignore-all on .elf builds
masks the missing-symbol error otherwise).

Adds `make ci-smoke-refresh` as the maintainer flow for intentional
output changes: rebuild, write new hashes, commit alongside the src/
diff.

Note: a per-decomp commit needs a 1-line bump in fixtures/ci/expected.sha256
(the affected PRG's hash shifts because stub .zero bytes differ from
compiled-C retail bytes). True zero-fixture-touch would need per-function
hash verification — explored but scoped out: it requires pyelftools +
retail .o files via local `make check` to build the manifest, and the
prototype hit environment friction not worth resolving for the marginal
DX gain.
asm-manifest now carries a `sha256` field on each glabel/dlabel — the
SHA-256 of that symbol's bytes in the linked .elf. At ci-smoke time,
tools/ci/verify_function_hashes.py:

  1. Reads each binary's build/data/<bin>.elf.
  2. For each labeled region with a sha256: if the function is still
     INCLUDE_ASM'd in src/, skip (stub bytes are placeholders, not retail).
     Otherwise, hash the actual .elf bytes and compare.

This catches at function granularity:
  - Buggy C decomp (compiles but produces non-retail bytes).
  - Dropped INCLUDE_ASM without a C body (missing-from-.elf or zero-pad
    address shift cascades — verified locally: scenario flagged 183
    symbol-level mismatches on a single dropped INCLUDE_ASM).
  - Tampered fixtures.

Strictly stronger than the per-PRG hash check it replaces:
  - Pinpoints the affected symbols rather than just "this PRG drifted".
  - Doesn't require a hash refresh on every decomp commit — manifest
    entries for newly-decompiled funcs get their hash updated only if
    the maintainer rebuilds it (via `make ci-fixture-asm-manifest-with-hashes`).

ELF parsing via pyelftools (added to requirements.txt). Cached per-binary
to walk each .elf once per ci-smoke run.

8 new tests cover path → binary derivation, the INCLUDE_ASM consistency
contract between asm_manifest and verify_function_hashes, and the
verifier's three branches (no-hashes, stubbed-skipped, missing-.elf).

Total tests: 51 (was 43).
…comps).

Previously the manifest stored sha256 for EVERY label, including stubbed
functions whose .elf bytes are .zero N. When upstream decomped one of
those stubbed functions, the merged PR build produced real retail bytes
for it; ci-smoke compared against the stored stub-hash and fail.

Now build_manifest() skips hashing any glabel whose name is in the
current active INCLUDE_ASM set. The verifier already skipped those at
verify time, so the only effect is the manifest is smaller and doesn't
carry stale stub-hashes that go bad on upstream decomp.

Tradeoff: newly-decompiled functions aren't byte-verified by ci-smoke
(no hash to compare against). They remain covered by the contributor's
local `make check`. A maintainer refresh of the manifest brings the
new functions under byte-verification.

Manifest: 4714 → 3776 hashes (1.2M → 1.1M). Stubbed funcs no longer
recorded — only decompiled ones.

Also merges origin/main (brings in 4 upstream decomps that exposed
the stale-stub-hash bug locally: func_8009F858, F C20, 01C8, 0ABC).
`make commit-check` already cleans + rebuilds + runs byte-match (via
remake → make → check). At that point build/data/*.elf contain bytes
that are proven-retail (otherwise the byte-match would have failed).

Wire `ci-fixture-asm-manifest-with-hashes` in between remake and
objdiff so the manifest's per-function sha256 fields get refreshed
with retail hashes anchored to a passing make check.

Contributor flow:
  - Iterate: `make check` (fast, byte-verifies)
  - Pre-push: `make commit-check` (slow, refreshes manifest + objdiff)
  - Commit: src + 1 line in asm-manifest.json (the new function hash)

This closes the gap where a newly-decompiled function wasn't byte-
verified by CI. The hash gets recorded automatically on a passing
make check, and subsequent CI runs verify the recorded hash. A
buggy decomp would fail make check locally → no manifest refresh
→ reviewer notices missing manifest diff in a decomp commit.
Closes the gap from the previous commit: a newly-decompiled function
(no INCLUDE_ASM, no sha256 in manifest) used to silently skip
verification — meaning a buggy decomp could pass CI as long as the
build linked.

Now: any glabel in build/src/<binary>/nonmatchings/ that isn't in the
current active INCLUDE_ASM set MUST have a sha256 in the manifest.
Missing hash → reported with the count and the names, returns non-zero.
Tells contributor to run `make commit-check` to anchor the hash to a
passing `make check`.

splat's matchings/ tree is excluded — it's reference disassembly for
already-decompiled functions, not part of the active decomp contract.
Same goes for dlabel/jlabel (data, not function decomp).

Verified locally: dropping `INCLUDE_ASM(..., CdLastCom)` from
src/SLUS_010.40/libcd/SYS.c flags `CdLastCom` and exits 1.
The previous design rendered stubs as `.zero N` zero-bytes. When a
function transitioned from INCLUDE_ASM (stub = zeros) to decompiled
(C body = retail bytes), the .elf changed → PRG hash shifted →
ci-smoke required a hash bump on every decomp commit (and the merge
friction we hit twice this session when upstream decomps landed).

Now asm-manifest carries `bytes` per label (hex-encoded retail content
from the maintainer's `make check`-validated build). Render emits
.byte directives instead of .zero. Result:
  - Stub-assembled bytes == retail bytes
  - C-decompiled bytes == retail bytes (when correctly decompiled)
  - PRG hash invariant under decomp transitions
  - Buggy decomp still fails ci-smoke (C produces non-retail bytes →
    PRG hash diverges from expected.sha256)

Switched back to per-PRG verify_hashes.py (simpler, sufficient now
that hashes are decomp-stable). Dropped verify_function_hashes.py
and its tests. Dropped commit-check's manifest-refresh step.

Maintainer flow on intentional toolchain/output changes:
  make check                       # local, against real disk
  make ci-fixture-asm-manifest     # now sources bytes from build/src

Manifest size: 1.1M → 4.1M (carries 5,041 labels × retail bytes).
Contributor decomp flow: just edit src/, run make check, commit.
Local hashes (committed by 0d1859f) were built before merging in
main's 5 latest decomps; CI checks out the merged state, so its build
produced different bytes for BATTLE.PRG and ENDING.PRG.

Sources manifest bytes from THIS worktree's build/src (post-merge,
post-ci-smoke) instead of the stale parent worktree. Updated
expected.sha256 to the post-merge build hashes.

This is the maintainer-refresh path for the bytes-in-manifest design:
when upstream decomps invalidate the manifest, rebuild it locally
against a fresh ci-smoke run, then update expected.sha256.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant