fix(metal): revert broken TBQ3_0/TBQ4_0 attn-score kernels to verified-correct sources #31
Open
lalalune wants to merge 5 commits into
…ly kokoro CLI

The iOS XCFramework slice (build-llama-cpp-mtp.mjs, BUILD_SHARED_LIBS=OFF) builds `elizainference_static` by name and links libelizainference.a into the framework, but the target was never defined (only the SHARED `elizainference`), so ninja failed with 'unknown target elizainference_static'. Separately, tools/kokoro built its standalone CLI unconditionally with install(TARGETS kokoro-tts RUNTIME), which on iOS (where executables are app bundles) fails configure with 'no BUNDLE DESTINATION'.

- tools/omnivoice/CMakeLists.txt: add an elizainference_static STATIC target mirroring the SHARED one (same FFI+core sources, includes, OMNIVOICE_BUILD / ELIZA_ENABLE_VISION defines, and the if(TARGET kokoro_lib) Kokoro fold → ELIZA_ENABLE_KOKORO), OUTPUT_NAME elizainference → libelizainference.a; no -reexport_library (SHARED-only).
- tools/kokoro/CMakeLists.txt: guard the host-only kokoro-tts CLI and its install so they are skipped on iOS.

Verified: ios-arm64-metal-fused AND ios-arm64-simulator-metal-fused both build clean (320/320), libelizainference.a exports the four eliza_inference_kokoro_* symbols + tts/asr/eot, CAPABILITIES.json abiVersion=11 kokoro=kokoro-82m; the desktop SHARED build is unaffected (additive target).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rt (80→88)

The streaming-LLM config struct gained `context_size` (int32 @ offset 80) for ABI v9: `_llm_stream_open` now honors `cfg->context_size` (>0) instead of only the ELIZA_LLM_N_CTX env default. The TS marshaller (ffi-bindings.ts) was already emitting 88 bytes with context_size at offset 80, but the C-side ABI guard `static_assert(sizeof(eliza_llm_stream_config_t) == 80)` was never bumped, so the fused `elizainference` target failed to compile. Bump it to 88 to match the real layout (8×int32 + 5×ptr, pointer-aligned).

Validated: a host Metal build of `libelizainference` (ABI v12) loads and generates on the real google/gemma-4-E2B with context_size=4096 honored (83 tok/s, M4 Max).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
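As a rough illustration of why the guard moves from 80 to 88 rather than 84, the sketch below models the layout with a hypothetical stand-in struct. `demo_stream_config_t`, `prior_fields`, and `demo_resolve_n_ctx` are invented names for this example and are not the real `eliza_llm_stream_config_t` fields; the pre-existing 80 bytes are modeled as an opaque 8-byte-aligned blob, and a 64-bit target is assumed.

```c
#include <assert.h>  /* static_assert (C11), assert */
#include <stddef.h>  /* offsetof, NULL */
#include <stdint.h>  /* int32_t, uint64_t */

/* HYPOTHETICAL stand-in, not the real eliza_llm_stream_config_t.
 * The first 80 bytes are an opaque placeholder with 8-byte alignment,
 * standing in for the existing int32 and pointer fields. */
typedef struct {
    uint64_t prior_fields[10]; /* placeholder for the pre-existing 80 bytes */
    int32_t  context_size;     /* appended field, lands at offset 80 */
} demo_stream_config_t;

/* The appended int32 occupies bytes 80..83. */
static_assert(offsetof(demo_stream_config_t, context_size) == 80,
              "context_size is expected at offset 80");

/* The struct is 8-byte aligned, so 4 bytes of tail padding round the
 * size up to 88 (not 84). A guard still asserting 80 fails to compile. */
static_assert(sizeof(demo_stream_config_t) == 88,
              "size is rounded up to the struct alignment");

/* Sketch of the described precedence: a positive context_size in the
 * config wins; otherwise fall back to the environment-derived default. */
int32_t demo_resolve_n_ctx(const demo_stream_config_t *cfg, int32_t env_default) {
    return (cfg != NULL && cfg->context_size > 0) ? cfg->context_size
                                                  : env_default;
}
```

Pairing the `sizeof` guard with an `offsetof` guard per field, as done here, is one way to make a mismatch with the marshaller's byte offsets surface at compile time.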
…d-correct sources

Commit 412b848 ('fix(metal): correct TBQ3_0/TBQ4_0 attention score kernels') regressed kernel_turbo3_dot/kernel_turbo4_dot in eliza-shipped/: standalone metal_verify drops to 0/8 (outputs 20-170x off) and the built-fork Metal graph-dispatch smoke fails GGML_OP_ATTN_SCORE_TBQ/turbo3+turbo4. The metal-verify gate only tested native/metal/*.metal (the verify-harness copy), not the eliza-shipped/ copies the runtime embeds, so the regression shipped unguarded.

Restore eliza-shipped/turbo3.metal + turbo4.metal from the verified native/metal copies (metal_verify 8/8). After the fix: dispatch_smoke PASS 9/9 routes; all shipped kernels pass parity.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Commit `412b8487b` ("fix(metal): correct TBQ3_0/TBQ4_0 attention score kernels") regressed `eliza-shipped/turbo3.metal` + `turbo4.metal`: standalone `metal_verify` drops to 0/8 (outputs 20–170× off) and the built-fork Metal graph-dispatch smoke fails `GGML_OP_ATTN_SCORE_TBQ`/turbo3+turbo4. The `metal-verify` gate only tested the verify-harness copy (`native/metal/*.metal`), not these `eliza-shipped/` copies the runtime embeds, so the regression shipped unguarded.

Restores both from the verified `native/metal/` copies (`metal_verify` 8/8). After: `dispatch_smoke` 9/9 routes PASS on Apple M4 Max, all shipped kernels parity-clean. Verified-here on M4 Max 2026-06-23.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
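Since the regression slipped through because only one of the two kernel copies was gated, one possible guard is a byte-for-byte comparison of the shipped copy against the verified one. The helper below is a minimal sketch of that idea, not something this PR adds; `files_identical` is a hypothetical name, and the paths passed to it would be whatever the repository actually uses for the `native/metal/` and `eliza-shipped/` copies.

```c
#include <stdio.h>

/* Hypothetical drift check: compares two files byte for byte.
 * Returns 1 if identical, 0 if they differ, -1 if either cannot be opened. */
int files_identical(const char *path_a, const char *path_b) {
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = 1;

    if (a == NULL || b == NULL) {
        result = -1;
    } else {
        int ca, cb;
        do {
            ca = fgetc(a);
            cb = fgetc(b);
            if (ca != cb) {   /* differing byte, or one file ended early */
                result = 0;
                break;
            }
        } while (ca != EOF);  /* both reached EOF together: identical */
    }

    if (a != NULL) fclose(a);
    if (b != NULL) fclose(b);
    return result;
}
```

A check like this only shows that the two copies match; it would not replace running the verification harness against the shipped copies directly.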