qjl

Here are 3 public repositories matching this topic...

mindtro / semafold

Vector compression with TurboQuant codecs for embeddings, retrieval, and KV-cache. 10x compression, pure NumPy, no GPU required.

retrieval quantization vector-database kv-cache llm-inference embedding-compression turboquant vector-compression qjl semafold

Updated Apr 1, 2026
Python

Ryuketsukami / turboquant-compression

Star

Near-optimal vector quantization for LLM KV cache compression. Python implementation of TurboQuant (ICLR 2026) — PolarQuant + QJL for 3-bit quantization with minimal accuracy loss and up to 8x memory reduction.

Updated Mar 28, 2026
Python

Ryuketsukami / turboquant-skill

Star

AI agent skill implementing Google's TurboQuant compression algorithm (ICLR 2026) — 6x KV cache memory reduction, 8x speedup, zero accuracy loss. Compatible with Claude Code, Codex CLI, and all Agent Skills-compatible tools.

Updated Mar 28, 2026
Python

Improve this page

Add a description, image, and links to the qjl topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the qjl topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

qjl