Commit c0efd46

spectre committed
feat(kernels): add blitz-int8-matmul + blitz-bf16-matmul — 11 kernels, 162 tests
Wave 4 (INT8 quantized matmul): 4× memory bandwidth reduction, symmetric per-tensor and per-channel quantization, production-grade INT8 inference.

Wave 5 (BF16 matmul): H100/A100/TPU native dtype, same dynamic range as FP32, trivial conversion, LLaMA/Mistral/Gemma default training dtype.

Catalog: 11 kernels (10 workspace + cc-faculty-wasm). 162 tests passing.

Pricing corrected: Full catalog $6,500 (matches landing page).

-- NEXUS | 2026-03-11
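The commit's source files for `blitz-int8-matmul` are not shown here, so the following is only a minimal sketch of what symmetric per-tensor and per-channel INT8 quantization usually look like. It assumes a scale of `max|x| / 127`, a row-major `[n x k]` weight matrix with one scale per output row, and `i32` accumulation. The function names `quantize_per_tensor`, `quantize_per_channel`, and `int8_matmul` are hypothetical and are not taken from the crate.

```rust
/// Symmetric per-tensor quantization (hypothetical helper, not the crate's API).
/// Assumes scale = max|x| / 127 and q = round(x / scale), clamped to [-127, 127].
fn quantize_per_tensor(x: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = x.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    // Avoid dividing by zero for an all-zero input.
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let q = x
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Symmetric per-channel quantization of a row-major [rows x cols] matrix:
/// each row (output channel) gets its own scale.
fn quantize_per_channel(w: &[f32], rows: usize, cols: usize) -> (Vec<i8>, Vec<f32>) {
    assert_eq!(w.len(), rows * cols);
    let mut q = Vec::with_capacity(w.len());
    let mut scales = Vec::with_capacity(rows);
    for r in 0..rows {
        let (qr, s) = quantize_per_tensor(&w[r * cols..(r + 1) * cols]);
        q.extend(qr);
        scales.push(s);
    }
    (q, scales)
}

/// y[i][j] = sum_p a[i][p] * w[j][p], accumulated in i32 and then
/// dequantized with the activation scale and the per-channel weight scale.
/// a_q is [m x k], w_q is [n x k], both row-major.
fn int8_matmul(
    a_q: &[i8],
    a_scale: f32,
    w_q: &[i8],
    w_scales: &[f32],
    m: usize,
    k: usize,
    n: usize,
) -> Vec<f32> {
    let mut out = vec![0.0f32; m * n];
    for i in 0..m {
        for j in 0..n {
            let mut acc: i32 = 0;
            for p in 0..k {
                acc += a_q[i * k + p] as i32 * w_q[j * k + p] as i32;
            }
            out[i * n + j] = acc as f32 * a_scale * w_scales[j];
        }
    }
    out
}
```

Storing weights as `i8` instead of `f32` is where the 4× figure comes from: each element takes one byte rather than four. Per-channel scales cost one extra `f32` per output row and typically lose less precision when rows differ widely in magnitude.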
1 parent 9e8300e commit c0efd46

6 files changed

Lines changed: 960 additions & 2 deletions

Cargo.toml

Lines changed: 2 additions & 0 deletions
@@ -8,6 +8,8 @@ members = [
  "kernels/blitz-fused-mlp",
  "kernels/blitz-swiglu",
  "kernels/blitz-rmsnorm",
+ "kernels/blitz-int8-matmul",
+ "kernels/blitz-bf16-matmul",
  ]

  [workspace.package]

README.md

Lines changed: 5 additions & 2 deletions
@@ -8,7 +8,7 @@
  [![built with Rust](https://img.shields.io/badge/built_with-Rust-orange)](https://www.rust-lang.org)
  [![target: wasm32-wasip2](https://img.shields.io/badge/target-wasm32--wasip2-purple)](https://wasi.dev)

- ## Kernel Catalog — 9 Kernels Available
+ ## Kernel Catalog — 11 Kernels Available

  | Kernel | Description | Use Case |
  |--------|-------------|----------|
@@ -20,6 +20,8 @@
  | `blitz-fused-mlp` | Fused Linear→LayerNorm→GELU→Linear | Full FFN block (GPT-style) |
  | `blitz-swiglu` | SwiGLU gated activation | LLaMA 2/3, Mistral FFN |
  | `blitz-rmsnorm` | RMS LayerNorm (no mean subtraction) | LLaMA/Mistral/Gemma normalization |
+ | `blitz-int8-matmul` | INT8 quantized matmul · 4× memory bandwidth | Quantized weight inference |
+ | `blitz-bf16-matmul` | BF16 matmul · H100/TPU native dtype | Mixed-precision LLM inference |
  | `cc-faculty-wasm` | Claude Code cognitive substrate | Agent memory + reasoning integration |

  **[→ View full catalog and pricing](https://blitzkernels.pages.dev)**
@@ -91,6 +93,7 @@ rms_norm(&mut hidden, &weight, hidden_size, 1e-6);
  │ blitz-embedding → blitz-attention → blitz-kv-cache │
  │ blitz-rope → blitz-rmsnorm → blitz-swiglu │
  │ blitz-fused-mlp → blitz-layernorm-gelu │
+ │ blitz-int8-matmul → blitz-bf16-matmul │
  │ cc-faculty-wasm │
  └──────────────────────────────────────────────────────┘
  Pure Rust · No allocator required
@@ -109,7 +112,7 @@ rms_norm(&mut hidden, &weight, hidden_size, 1e-6);
  | Option | Price | Includes |
  |--------|-------|---------|
  | Single kernel | **$1,500** | Pre-compiled WASM + source + 30-min integration call |
- | Full catalog (9) | **$8,500** | All 9 kernels + dedicated integration support + priority updates |
+ | Full catalog (11) | **$6,500** | All 11 kernels + dedicated integration support + priority updates |
  | Support add-on | **$200/mo** | Priority email, patch releases, architecture review |

  **[→ Purchase at blitzkernels.pages.dev](https://blitzkernels.pages.dev)**
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+ [package]
+ name = "blitz-bf16-matmul"
+ version = "0.1.0"
+ edition = "2021"
+ description = "BF16 matrix multiplication — H100/A100 inference dtype, pure Rust, WASM-portable"
+ license = "MIT"
+
+ [lib]
+ crate-type = ["cdylib", "rlib"]
+
+ [features]
+ default = ["std"]
+ std = []
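The manifest above declares the `blitz-bf16-matmul` crate, but its source is not part of this excerpt. As a rough illustration of why the commit message calls BF16 conversion "trivial" and its dynamic range the same as FP32, the sketch below treats a BF16 value as the upper 16 bits of an IEEE 754 `f32` (same sign bit and 8-bit exponent, 7-bit mantissa). All function names are hypothetical, and storing BF16 as a raw `u16` with `f32` accumulation is an assumption, not the crate's confirmed design.

```rust
/// f32 -> bf16 by truncation: keep the upper 16 bits (sign, 8-bit exponent,
/// top 7 mantissa bits). Hypothetical helper, not the crate's API.
fn f32_to_bf16_truncate(x: f32) -> u16 {
    (x.to_bits() >> 16) as u16
}

/// f32 -> bf16 with round-to-nearest-even on the discarded 16 bits.
fn f32_to_bf16_round(x: f32) -> u16 {
    let bits = x.to_bits();
    if x.is_nan() {
        // Force a quiet NaN so rounding cannot turn NaN into infinity.
        return ((bits >> 16) as u16) | 0x0040;
    }
    // Add 0x7FFF plus the lowest kept bit: exact ties round to even.
    let rounding_bias = 0x7FFF + ((bits >> 16) & 1);
    (bits.wrapping_add(rounding_bias) >> 16) as u16
}

/// bf16 -> f32 is exact: shift the 16 bits back into the upper half.
fn bf16_to_f32(h: u16) -> f32 {
    f32::from_bits((h as u32) << 16)
}

/// Naive row-major matmul over bf16 inputs with f32 accumulation.
/// a is [m x k], b is [k x n]; the result is [m x n] in f32.
fn bf16_matmul(a: &[u16], b: &[u16], m: usize, k: usize, n: usize) -> Vec<f32> {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    let mut out = vec![0.0f32; m * n];
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0.0f32;
            for p in 0..k {
                acc += bf16_to_f32(a[i * k + p]) * bf16_to_f32(b[p * n + j]);
            }
            out[i * n + j] = acc;
        }
    }
    out
}
```

Because the exponent field is unchanged, a large FP32 value such as `1e38` stays finite after conversion; only mantissa precision is lost. Accumulating in `f32` is a common choice to limit rounding error across the inner product, at the cost of widening each operand on load.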
