forked from ggml-org/llama.cpp
Home
alex-spacemit edited this page Jun 5, 2026 · 2 revisions
Welcome to the llama.cpp upstream support status page for SpacemiT SoCs!
This page summarizes the current llama.cpp support status, validation progress, and upstream patch work on SpacemiT platforms.
llama.cpp is a lightweight inference framework for large language models. This section collects the main documentation, development branches, and platform information referenced throughout this page.
- Doc:
- Rolling patches file:
- Supported SoCs
| Quantization Type | X60/X100 | A100 |
|---|---|---|
| Q2_K | ✔️ | |
| Q3_K | ✔️ | |
| Q4_0 | ✔️ | ✔️ |
| Q4_1 | ✔️ | ✔️ |
| Q4_K | ✔️ | ✔️ |
| Q5_0 | ✔️ | |
| Q5_1 | ✔️ | |
| Q5_K | ✔️ | |
| Q6_K | ✔️ | |
| Q8_0 | ✔️ | |
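For scripting against the matrix above (for example, to pick a quantization before converting a model), the table can be transcribed into a small lookup. This is only an illustrative sketch: the names `SUPPORT` and `is_supported` are hypothetical and not part of llama.cpp, and an empty cell is assumed to mean "not listed as supported" rather than "known not to work".

```python
# Hypothetical helper: transcription of the quantization support table above.
# An empty table cell is treated as "not listed as supported" (assumption).
SUPPORT = {
    "Q2_K": {"X60/X100"},
    "Q3_K": {"X60/X100"},
    "Q4_0": {"X60/X100", "A100"},
    "Q4_1": {"X60/X100", "A100"},
    "Q4_K": {"X60/X100", "A100"},
    "Q5_0": {"X60/X100"},
    "Q5_1": {"X60/X100"},
    "Q5_K": {"X60/X100"},
    "Q6_K": {"X60/X100"},
    "Q8_0": {"X60/X100"},
}


def is_supported(quant: str, core: str) -> bool:
    """Return True if the table lists `quant` as supported on `core`."""
    return core in SUPPORT.get(quant, set())


def supported_quants(core: str) -> list[str]:
    """Return the quantization types listed for `core`, in table order."""
    return [q for q, cores in SUPPORT.items() if core in cores]
```

Calling `supported_quants("A100")` returns the three types marked in the A100 column; the lookup should be updated whenever the table changes.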
All these patches will be upstreamed incrementally.
| Platform | Component | Submitted time | Status | Link | Owner | Comments |
|---|---|---|---|---|---|---|
| K1 | Add SpacemiT backend | Sep 29, 2025 | Merged | 15288 | alex-spacemit | |
| K3 | Refactor SpacemiT backend and add K3 support | May 14, 2026 | Merged | 22863 | alex-spacemit | |
| K3 | Optimize FlashAttention with IME2 | | WIP | | co-seven | |
These components are maintained as local extensions and will continue to be updated as needed.
| Component | Submitted time | Status | Link | Owner | Comments |
|---|---|---|---|---|---|
| Extend compatibility to a broader range of multimodal vision models | | WIP | | co-seven | |
| Month | Summary | Updated by |
|---|---|---|
| 2026-05 | Initial wiki template created | alex-spacemit, yutingnie |