NCZ 26.5 Magnetar + MNEMOS — validated on Cix Sky1 (benchmark data inside) #23

@perlowja

Cross-posting for the model hub community.

See related issues in this tracker:


The MNEMOS Embedkit is the embedding layer that ships with NCZ 26.5 Magnetar. It gives you a production-grade semantic memory backend that runs entirely on-device, with no cloud dependency.

What it does

The Embedkit provides a unified embedding interface with automatic backend selection:

from embedkit import Engine

engine = Engine.auto()  # picks NPU on Cix Sky1, CPU ONNX elsewhere
vector = engine.embed("What does the agent remember about the user?")

On Cix Sky1, Engine.auto() detects /dev/aipu (the Zhouyi V3 NPU device) and routes to the npu-cix adapter, which runs a compiled .cix model through libnoe. On other hardware it falls back to cpu-llamacpp or cpu-fastembed (ONNX).
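The selection logic described above can be pictured with a small probe function. This is a sketch under stated assumptions, not the Embedkit implementation: select_backend, NPU_DEVICE, and the has_* flags are hypothetical names, and the fastembed-before-llama.cpp ordering is a guess. Only the adapter names and the /dev/aipu path come from this post.

```python
import os

# Hypothetical constant; the path itself is the one named in the post.
NPU_DEVICE = "/dev/aipu"


def select_backend(device_path=NPU_DEVICE, has_fastembed=True, has_llamacpp=True):
    """Return the adapter name a probe-based auto-selector might choose."""
    # NPU path: the device node is present, so prefer the npu-cix adapter.
    if os.path.exists(device_path):
        return "npu-cix"
    # CPU paths: the fallback order here is an assumption, not documented.
    if has_fastembed:
        return "cpu-fastembed"
    if has_llamacpp:
        return "cpu-llamacpp"
    raise RuntimeError("no embedding backend available")


print(select_backend())
```

Passing device_path explicitly makes the probe easy to exercise on a machine without the NPU.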

Validated performance (Cix Sky1)

Backend                          | Throughput              | Notes
NPU (libnoe + bge-small-zh.cix)  | 30+ emb/sec, sustained  | Per-call job overhead; persistent-job path coming
CPU ONNX (fastembed)             | 700+ emb/sec, batched   | Leaves NPU + GPU free for other workloads

Model: bge-small-zh-v1.5 (512-dim, 256-token context), validated over 2000-call sustained runs on MS-R1 64GB.
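For anyone who wants to reproduce a sustained-run number on their own board, a timing loop along these lines is one way to do it. This is a minimal sketch, not the harness used for the figures above: measure_throughput and fake_embed are hypothetical names, and fake_embed only stands in for engine.embed so the snippet runs without hardware.

```python
import time


def measure_throughput(embed, texts, clock=time.perf_counter):
    """Call embed() once per text and return embeddings per second."""
    start = clock()
    for text in texts:
        embed(text)
    elapsed = clock() - start
    return len(texts) / elapsed if elapsed > 0 else float("inf")


def fake_embed(text):
    """Stand-in embedder returning a 512-dim zero vector (no real model)."""
    return [0.0] * 512


# 2000 single calls, matching the run length quoted in the post.
rate = measure_throughput(fake_embed, ["query"] * 2000)
print(f"{rate:.0f} emb/sec over 2000 calls")
```

This measures the one-call-at-a-time path; a batched number such as the CPU ONNX figure would need the backend's batch interface instead.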

Architecture

Agent query
    ↓
MNEMOS semantic memory API
    ↓
Embedkit Engine.auto()
    ├── NPU adapter (libnoe + .cix model) — Cix Sky1
    ├── CPU ONNX adapter (fastembed) — any arm64/x86
    └── llama.cpp adapter (CPU/Vulkan) — fallback
    ↓
Vector embedding → MNEMOS vector store → retrieval results
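The last hop in the diagram (embedding, then vector store, then retrieval) can be illustrated with a toy in-memory store. Everything here is a hypothetical stand-in: toy_embed is a character-hashing placeholder rather than a real model, and MemoryStore is not the MNEMOS API. The sketch only shows the embed, store, and cosine-rank flow.

```python
import math


def toy_embed(text, dim=8):
    """Placeholder embedder: hashes characters into a small unit vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(text.lower()):
        vec[(ord(ch) + i) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class MemoryStore:
    """Tiny in-memory vector store ranked by cosine similarity."""

    def __init__(self, embed):
        self.embed = embed
        self.items = []  # list of (text, unit vector)

    def add(self, text):
        self.items.append((text, self.embed(text)))

    def search(self, query, k=1):
        q = self.embed(query)
        # Vectors are unit length, so the dot product is the cosine score.
        scored = [(sum(a * b for a, b in zip(q, vec)), text)
                  for text, vec in self.items]
        scored.sort(reverse=True)
        return [text for _, text in scored[:k]]


store = MemoryStore(toy_embed)
store.add("user prefers dark mode")
store.add("meeting at noon")
print(store.search("user prefers dark mode", k=1))
```

In a real deployment the embed callable would be engine.embed, and the store would be whatever MNEMOS uses internally.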

The NPU runs concurrently with the Mali-G720 GPU, which handles LLM decode via llama.cpp Vulkan. There is no resource contention between the two, so both paths stay warm.

Status

  • npu-cix adapter: validated on Cix Sky1 / MS-R1
  • cpu-fastembed adapter: validated on arm64 and x86
  • 🔜 Bundled in ISO: currently installs post-boot via ncz install mnemos; will be baked into the next Magnetar release
  • 🔜 70–80 emb/sec on NPU: pending upstream libnoe persistent-job fix (tracked with cixtech)
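To put the NPU numbers in per-call terms, the quoted rates convert to latencies as below. This is back-of-envelope arithmetic on the figures in this post, not a measurement. The closing estimate also assumes the whole gap between today's rate and the target comes from per-call job setup, which the post implies but does not state.

```python
def per_call_ms(rate_per_sec):
    """Convert a throughput in embeddings/sec to milliseconds per call."""
    return 1000.0 / rate_per_sec


current = per_call_ms(30)       # today's sustained NPU rate (30+ emb/sec)
target_low = per_call_ms(70)    # lower end of the 70-80 emb/sec target
target_high = per_call_ms(80)   # upper end of the target

print(f"current:  {current:.1f} ms per call")
print(f"target:   {target_high:.1f} to {target_low:.1f} ms per call")
# Assumption: the difference is entirely per-call job overhead.
print(f"implied overhead: {current - target_low:.1f} to "
      f"{current - target_high:.1f} ms per call")
```

Since the current figure is quoted as "30+", the per-call time and the implied overhead are upper bounds.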

Install (current)

On a running Magnetar system:

ncz install mnemos   # pulls MNEMOS server + Embedkit

The Embedkit source will be published to the MNEMOS organization shortly.


NCZ 26.5 Magnetar: https://github.com/nclawzero/distro/releases/tag/v26.5-magnetar
