fix(metal): revert broken TBQ3_0/TBQ4_0 attn-score kernels to verified-correct sources #31

Open
lalalune wants to merge 5 commits into main from fix/metal-tbq3-tbq4-attn-score

@lalalune

Member

Commit 412b8487b ("fix(metal): correct TBQ3_0/TBQ4_0 attention score kernels") regressed eliza-shipped/turbo3.metal and eliza-shipped/turbo4.metal: standalone metal_verify drops to 0/8 (outputs 20–170× off) and the built-fork Metal graph-dispatch smoke fails GGML_OP_ATTN_SCORE_TBQ/turbo3+turbo4. The metal-verify gate only tested the verify-harness copy (native/metal/*.metal), not the eliza-shipped/ copies the runtime embeds, so the regression shipped unguarded.

Restores both files from the verified native/metal/ copies (metal_verify 8/8). After the fix, dispatch_smoke passes 9/9 routes on Apple M4 Max and all shipped kernels pass parity. Verified on M4 Max, 2026-06-23.
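
One way to close the gap described above is a drift check that fails when a shipped kernel source no longer matches its verified counterpart. The sketch below is only an illustration, not part of this PR: the helper names `sha256_of` and `find_drift` are hypothetical, and it assumes the eliza-shipped/ copies are meant to be byte-identical to the native/metal/ copies, which the PR implies by restoring from them but does not state as a rule.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_drift(verified_dir: Path, shipped_dir: Path, names):
    """Return the names whose shipped copy is missing or differs from the verified copy.

    Hypothetical helper: assumes both directories should hold identical files.
    """
    drifted = []
    for name in names:
        verified = verified_dir / name
        shipped = shipped_dir / name
        if not verified.is_file() or not shipped.is_file():
            drifted.append(name)  # a missing file counts as drift
        elif sha256_of(verified) != sha256_of(shipped):
            drifted.append(name)  # contents diverged
    return drifted


if __name__ == "__main__":
    # Directory and file names follow the PR description; run from the repo root.
    drift = find_drift(
        Path("native/metal"),
        Path("eliza-shipped"),
        ["turbo3.metal", "turbo4.metal"],
    )
    if drift:
        raise SystemExit(f"shipped kernels differ from verified copies: {drift}")
    print("shipped kernels match verified copies")
```

A check like this does not replace running metal_verify against the shipped sources; it only catches the case where the two copies silently diverge.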

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Shaw and others added 5 commits June 21, 2026 23:42
…ly kokoro CLI

The iOS XCFramework slice (build-llama-cpp-mtp.mjs, BUILD_SHARED_LIBS=OFF) builds
`elizainference_static` by name and links libelizainference.a into the framework,
but the target was never defined (only the SHARED `elizainference`), so ninja
failed 'unknown target elizainference_static'. And tools/kokoro built its
standalone CLI unconditionally with install(TARGETS kokoro-tts RUNTIME), which on
iOS (executables are app bundles) fails configure with 'no BUNDLE DESTINATION'.

- tools/omnivoice/CMakeLists.txt: add an elizainference_static STATIC target
  mirroring the SHARED one (same FFI+core sources, includes, OMNIVOICE_BUILD /
  ELIZA_ENABLE_VISION defines, and the if(TARGET kokoro_lib) Kokoro fold →
  ELIZA_ENABLE_KOKORO), OUTPUT_NAME elizainference → libelizainference.a; no
  -reexport_library (SHARED-only).
- tools/kokoro/CMakeLists.txt: guard the host-only kokoro-tts CLI + its install
  off iOS.

Verified: ios-arm64-metal-fused AND ios-arm64-simulator-metal-fused both build
clean (320/320), libelizainference.a exports the four eliza_inference_kokoro_*
symbols + tts/asr/eot, CAPABILITIES.json abiVersion=11 kokoro=kokoro-82m; the
desktop SHARED build is unaffected (additive target).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rt (80→88)

The streaming-LLM config struct gained `context_size` (int32 @ offset 80) for
ABI v9 — `_llm_stream_open` now honors `cfg->context_size` (>0) instead of only
the ELIZA_LLM_N_CTX env default. The TS marshaller (ffi-bindings.ts) was already
emitting 88 bytes with context_size at offset 80, but the C-side ABI-guard
`static_assert(sizeof(eliza_llm_stream_config_t) == 80)` was never bumped, so the
fused `elizainference` target failed to compile. Bump it to 88 to match the real
layout (8×int32 + 5×ptr, pointer-aligned).
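
The 80→88 arithmetic can be reproduced with a small `ctypes` sketch. Only the field counts (8×int32, 5×ptr), the offset of `context_size` (80), and the two sizes come from this commit message; the other field names and their ordering below are invented purely so that natural alignment yields those numbers, and they are not the real `eliza_llm_stream_config_t` definition. The offsets in the comments assume a 64-bit target with 8-byte pointers.

```python
import ctypes

# Hypothetical field names and ordering. Interleaving pointers and int32s
# introduces the alignment padding that makes 7 ints + 5 pointers occupy 80 bytes.
BASE_FIELDS = [
    ("ptr_a", ctypes.c_void_p),  # offset 0
    ("int_a", ctypes.c_int32),   # offset 8, then 4 bytes of padding
    ("ptr_b", ctypes.c_void_p),  # offset 16
    ("int_b", ctypes.c_int32),   # offset 24, then 4 bytes of padding
    ("ptr_c", ctypes.c_void_p),  # offset 32
    ("int_c", ctypes.c_int32),   # offset 40, then 4 bytes of padding
    ("ptr_d", ctypes.c_void_p),  # offset 48
    ("int_d", ctypes.c_int32),   # offset 56
    ("int_e", ctypes.c_int32),   # offset 60
    ("ptr_e", ctypes.c_void_p),  # offset 64
    ("int_f", ctypes.c_int32),   # offset 72
    ("int_g", ctypes.c_int32),   # offset 76
]


def make_config(with_context_size: bool):
    """Build a ctypes struct with or without the trailing context_size field."""
    fields = list(BASE_FIELDS)
    if with_context_size:
        # Lands at offset 80; 4 bytes of tail padding keep the struct pointer-aligned.
        fields.append(("context_size", ctypes.c_int32))

    class StreamConfig(ctypes.Structure):
        _fields_ = fields

    return StreamConfig


OldConfig = make_config(False)  # layout the stale static_assert described
NewConfig = make_config(True)   # layout the TS marshaller already emitted

if __name__ == "__main__":
    print(ctypes.sizeof(OldConfig), ctypes.sizeof(NewConfig),
          NewConfig.context_size.offset)
```

On a 64-bit host this should report sizes of 80 and 88 with `context_size` at offset 80. The new int32 adds 8 bytes rather than 4 because the struct size is rounded up to the pointer alignment.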

Validated: a host Metal build of `libelizainference` (ABI v12) loads + generates
on the real google/gemma-4-E2B with context_size=4096 honored (83 tok/s, M4 Max).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…d-correct sources

Commit 412b848 ('fix(metal): correct TBQ3_0/TBQ4_0 attention score kernels')
regressed kernel_turbo3_dot/kernel_turbo4_dot in eliza-shipped/: standalone
metal_verify drops to 0/8 (outputs 20-170x off) and the built-fork Metal
graph-dispatch smoke fails GGML_OP_ATTN_SCORE_TBQ/turbo3+turbo4. The metal-verify
gate only tested native/metal/*.metal (the verify-harness copy), not the
eliza-shipped/ copies the runtime embeds, so the regression shipped unguarded.

Restore eliza-shipped/turbo3.metal + turbo4.metal from the verified native/metal
copies (metal_verify 8/8). After the fix: dispatch_smoke PASS 9/9 routes;
all shipped kernels pass parity.
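
For readers unfamiliar with this kind of gate, a pass count such as "0/8" or "8/8" can come from comparing each kernel output vector against a reference within a tolerance. The sketch below is a generic illustration only: `case_passes`, `parity_summary`, and the tolerance values are assumptions, and nothing here reflects how metal_verify is actually implemented.

```python
import math


def case_passes(expected, got, rel_tol=1e-3, abs_tol=1e-5):
    """True when both vectors have the same length and agree element-wise.

    The tolerances are illustrative assumptions, not metal_verify's values.
    """
    return len(expected) == len(got) and all(
        math.isclose(e, g, rel_tol=rel_tol, abs_tol=abs_tol)
        for e, g in zip(expected, got)
    )


def parity_summary(cases):
    """Summarize (expected, got) pairs as a 'passed/total' string."""
    passed = sum(1 for expected, got in cases if case_passes(expected, got))
    return f"{passed}/{len(cases)}"


if __name__ == "__main__":
    cases = [
        ([1.0, 2.0], [1.0, 2.0]),    # matches the reference
        ([1.0, 2.0], [20.0, 40.0]),  # 20x off, the scale of error reported above
    ]
    print(parity_summary(cases))
```

With outputs 20–170× off, any reasonable tolerance fails every case, which is consistent with the reported drop to 0/8.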

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 24, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.
