v0.12.0
What's Changed
- docs: clarify model.gguf placeholder in all README examples by @unamedkr in #19
- feat(wasm): Qwen3/Llama model selector + real-time streaming by @unamedkr in #20
- Fix GGUF BPE merge parsing — Qwen3/Llama3 garbage output by @unamedkr in #21
- perf: sort vocab before merge parsing + rebuild WASM with ASYNCIFY by @unamedkr in #22
- Fix Qwen3 garbage output: RMSNorm +1 for all Qwen-family models by @unamedkr in #23
- Fix Qwen RMSNorm + switch WASM demo to Qwen3.5 0.8B by @unamedkr in #24
- perf(wasm): SIMD128 + O3 + LTO for 2-4x faster browser inference by @unamedkr in #25
- ux(wasm): Thinking... indicator during prompt prefill by @unamedkr in #26
- perf(wasm): pthreads multi-threading + Service Worker COOP/COEP by @unamedkr in #27
- perf(wasm): Web Worker + no ASYNCIFY — maximum inference speed by @unamedkr in #28
- fix(wasm): OOM on low-memory devices by @unamedkr in #29
- fix(wasm): eliminate UI hang during prompt prefill by @unamedkr in #30
- ux(wasm): polished demo — progress bar, mobile, two-phase UI by @unamedkr in #31
- fix(wasm): drop pthreads — fixes UI hang by @unamedkr in #32
- fix(wasm): remove prefill sleep — restores token streaming by @unamedkr in #33
- fix(wasm): ccall({async:true}) — fixes ASYNCIFY streaming by @unamedkr in #34
- ux(wasm): clarify prefill wait + confirm streaming works by @unamedkr in #35
- feat(wasm): Llama 3.2 1B Instruct + skip Q4 reconversion by @unamedkr in #36
- feat(wasm): SmolLM2-135M fast default + Llama 1B quality option by @unamedkr in #37
- docs: address 'why not just use llama.cpp?' feedback by @unamedkr in #38
- docs(guide): 'When to use which?' table + C code in CTA by @unamedkr in #39
- i18n: complete EN/KO coverage for guide page by @unamedkr in #40
- fix(guide): Korean typo 겄건이 → 경계 ("boundary") by @unamedkr in #41
- feat(cli): ollama-parity — tq pull/list/run/serve by @unamedkr in #42
- feat(pypi): quantcpp CLI ollama-parity (pull/list/run/serve) by @unamedkr in #43
- chore: sync version fallback to 0.12.0 by @unamedkr in #44
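
For context on the BPE fixes in #21 and #22: garbage output from Qwen3/Llama3 typically means the tokenizer's merge ranks were parsed or ordered incorrectly. The sketch below is a generic illustration of greedy BPE merge application — the function and variable names (`bpe_encode`, `merge_ranks`) are hypothetical and not tinyquant's actual API — showing why a wrong rank table corrupts tokenization downstream:

```python
def bpe_encode(word, merge_ranks):
    """Greedy BPE: repeatedly merge the adjacent pair with the lowest rank.

    word: initial list of symbols (e.g. bytes or characters)
    merge_ranks: {(left, right): rank} — in GGUF this comes from the
    tokenizer's merges list, where earlier entries have lower rank.
    """
    symbols = list(word)
    while len(symbols) > 1:
        # Find the adjacent pair with the best (lowest) merge rank.
        best = min(
            range(len(symbols) - 1),
            key=lambda i: merge_ranks.get((symbols[i], symbols[i + 1]), float("inf")),
        )
        pair = (symbols[best], symbols[best + 1])
        if pair not in merge_ranks:
            break  # no applicable merge remains
        symbols[best:best + 2] = [pair[0] + pair[1]]
    return symbols

ranks = {("h", "e"): 0, ("l", "l"): 1, ("he", "ll"): 2, ("hell", "o"): 3}
print(bpe_encode("hello", ranks))  # -> ['hello']
```

If the ranks are misread (e.g. pairs split on the wrong delimiter, or the vocab unsorted so IDs mismatch), merges apply in the wrong order or not at all, and the model receives token IDs it was never trained on — hence the garbage output the fix addressed.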
Full Changelog: v0.8.0...v0.12.0