Tensorbit Models

The Official Model Zoo for Tensorbit Labs.

This repository serves as a centralized library for pre-optimized neural network, large language model, and vision transformer binaries. Each model in this collection has been processed through the full Tensorbit P-D-Q pipeline (Pruning, Distillation, and Quantization) to ensure maximum performance on edge hardware without sacrificing reasoning capabilities.

Why Tensorbit Models?

Standard open-source models are often too "heavy" for on-device deployment. Tensorbit Labs converts them into lightweight versions by combining pruning, distillation, and quantization, which reduces size and latency while maintaining high accuracy.

  • Memory Efficiency: Reduced VRAM footprint.
  • Inference Speed: Optimized for tensorbit-run execution on NPU/ARM architectures.
  • Verified Benchmarks: Every binary is benchmarked via tensorbit-bench to ensure accuracy parity with the original models.

Performance Comparison

See performance_comparison.csv for a comparison of raw PyTorch and Tensorbit statistics for every model in the zoo.
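
Since the column layout of performance_comparison.csv is not described here, a quick way to inspect it is to list its header fields. The helper below is a minimal sketch: the function name csv_columns is hypothetical, and it assumes the first row is a header whose fields contain no embedded commas.

```shell
#!/bin/sh
# Hypothetical helper: print the column names of a CSV file, one per line.
# Assumes the first row is a header and no header field contains a comma.
csv_columns() {
    # Take the header row, drop any Windows carriage return, split on commas.
    head -n 1 "$1" | tr -d '\r' | tr ',' '\n'
}

# Usage (run from the repository root):
# csv_columns performance_comparison.csv
```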

Model Catalog

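| Model Name      | Base Architecture | Sparsity | Precision | Target Hardware          |
|-----------------|-------------------|----------|-----------|--------------------------|
| tb-llama-4-8b   | Llama 4           | 45%      | INT4      | Apple M4 / Snapdragon G3 |
| tb-mistral-next | Mistral           | 30%      | INT4      | ARM Cortex-A78           |
| tb-vit-large    | ViT               | 50%      | INT8      | Industrial NPU           |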
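
To get a feel for what the Precision column means for storage, the sketch below computes the size of densely stored weights as parameters × bits ÷ 8. It is a back-of-the-envelope estimate only: the parameter count of 8 billion is an assumption read from the "8b" in the model name, the 16-bit figure is shown purely for comparison, and sparsity, file metadata, and any packing used by the actual .tb format are ignored. The function name weight_bytes is hypothetical.

```shell
#!/bin/sh
# Rough size in bytes of densely stored weights: params * bits / 8.
# Ignores sparsity, metadata, and any format-specific packing.
weight_bytes() {
    params=$1  # number of parameters (assumed, not taken from the catalog)
    bits=$2    # bits per weight, e.g. 4 for INT4
    echo $(( params * bits / 8 ))
}

# Assumed 8 billion parameters:
weight_bytes 8000000000 4     # INT4   -> 4000000000  (about 4 GB)
weight_bytes 8000000000 16    # 16-bit -> 16000000000 (about 16 GB)
```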

Usage

These models are stored as .tb binaries designed to be loaded directly into the tensorbit-run engine.

# Example: Running a Tensorbit model locally
./tensorbit-run --model ./models/tb-llama-4-8b.tb --prompt "Explain quantum gravity."
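
When scripting the command above, it can help to check the paths before launching the engine. The wrapper below is a minimal sketch rather than part of Tensorbit: run_tb_model is a hypothetical name, and it passes only the --model and --prompt flags shown in the example.

```shell
#!/bin/sh
# Hypothetical wrapper: validate paths, then invoke tensorbit-run.
# Uses only the --model and --prompt flags from the example above.
run_tb_model() {
    engine=$1   # path to the tensorbit-run binary
    model=$2    # path to a .tb model file
    prompt=$3   # prompt text

    if [ ! -x "$engine" ]; then
        echo "error: engine not found or not executable: $engine" >&2
        return 1
    fi
    if [ ! -f "$model" ]; then
        echo "error: model file not found: $model" >&2
        return 1
    fi
    case "$model" in
        *.tb) ;;  # expected extension
        *) echo "error: expected a .tb file: $model" >&2; return 1 ;;
    esac

    "$engine" --model "$model" --prompt "$prompt"
}

# Usage:
# run_tb_model ./tensorbit-run ./models/tb-llama-4-8b.tb "Explain quantum gravity."
```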

Contribution & Requests

We focus on optimizing high-impact, open-weights models. If you would like to request a specific model optimization or contribute a "Tensorbit-ified" version of your own architecture, please open an issue or pull request with the label model-request.

License

The optimized weights (.tb model files) are provided under the Apache License 2.0. Please refer to the original model creators (e.g., Meta, Mistral AI) for the licenses of the underlying architectures.