alex-spacemit edited this page Jun 5, 2026 · 2 revisions

llama.cpp Upstream Support Status

Welcome to the llama.cpp upstream support status page for SpacemiT SoCs!

This page summarizes the current llama.cpp support status, validation progress, and upstream patch work on SpacemiT platforms.

Overview

llama.cpp is a lightweight inference framework for large language models. This section collects the main documentation, development branches, and platform information referenced throughout this page.

Module Support Status

GGML Quantization Support For Matrix

Quantization Type X60/X100 A100
Q2_K ✔️
Q3_K ✔️
Q4_0 ✔️ ✔️
Q4_1 ✔️ ✔️
Q4_K ✔️ ✔️
Q5_0 ✔️
Q5_1 ✔️
Q5_K ✔️
Q6_K ✔️
Q8_0 ✔️

Upstream Component Status

The patches listed below are being upstreamed incrementally.

| Platform | Component | Submitted time | Status | Link | Owner | Comments |
| K1 | Add SpacemiT backend | Sep 29, 2025 | Merged | 15288 | alex-spacemit | |
| K3 | Refactor SpacemiT backend and add K3 support | May 14, 2026 | Merged | 22863 | alex-spacemit | |
| K3 | Optimize FlashAttention with IME2 | | WIP | | co-seven | |

Local Component Status

These components are maintained as local extensions and will continue to be updated as needed.

| Component | Submitted time | Status | Link | Owner | Comments |
| Extend compatibility to a broader range of multimodal vision models | | WIP | | co-seven | |

Monthly Update Log

| Month | Summary | Updated by |
| 2026-05 | Initial wiki template created | alex-spacemit, yutingnie |