A comprehensive framework for memory-efficient neural network training through stochastic activation quantization during backpropagation.
Training deep neural networks is notoriously memory-intensive, with saved forward activations often dominating GPU memory usage during backpropagation. While quantization for inference and forward-path quantization-aware training (QAT) are well established, activation quantization specifically for backpropagation remains largely unexplored.
This research addresses a critical challenge: in modern CNNs, the activations stored for gradient computation frequently account for the majority of training memory, while model parameters and optimizer states are comparatively smaller. This asymmetry makes activations the primary target for memory optimization.
Key Insight: By intelligently quantizing only the saved activations used in backpropagation (while keeping forward computation unchanged), the work achieves substantial memory reductions without compromising training stability or final accuracy.
- Comprehensive implementation using Brevitas on EfficientNetV2
- Substantial memory reductions: ~8× reduction at 4-bit with competitive accuracy
- Stability optimization: Advanced techniques to prevent training spikes and ensure convergence
- Layer-wise analysis: Systematic study of quantization sensitivity across network layers
- Unbiased stochastic rounding to eliminate systematic quantization bias
- Per-bucket scaling with configurable bucket sizes for optimal dynamic range adaptation
- Vectorized implementation for computational efficiency
- Mixing of two arbitrary bit-widths: Flexible probability-controlled precision selection (e.g., 2-bit/4-bit)
- Superior performance: 2-4 bit stochastic quantization often exceeds 32-bit baseline accuracy
- Multi-dataset evaluation: CIFAR-10, CIFAR-100, and Tiny ImageNet (ImageNet-200)
- Comprehensive bit-width analysis: 1-32 bits with focus on ultra-low precision (2-4 bits)
- Practical memory savings: Up to ~16× reduction with 2-bit stochastic quantization
- Robust generalization: Consistent improvements across datasets and architectures
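The probability-controlled mixing of two bit-widths listed above can be illustrated with a minimal sketch. It assumes the bit-width is drawn independently per bucket, which is only one possible granularity and is not confirmed by this README. The names `pick_bit_widths`, `expected_bits`, and `p_low` are hypothetical.

```python
import random

def pick_bit_widths(num_buckets, low_bits=2, high_bits=4, p_low=0.5, seed=0):
    """Draw a bit-width per bucket: low_bits with probability p_low, else high_bits.

    Per-bucket selection is an assumption made for illustration.
    """
    rng = random.Random(seed)
    return [low_bits if rng.random() < p_low else high_bits
            for _ in range(num_buckets)]

def expected_bits(low_bits, high_bits, p_low):
    """Average number of bits per stored value under the mixing probability."""
    return p_low * low_bits + (1.0 - p_low) * high_bits

widths = pick_bit_widths(num_buckets=10_000, p_low=0.25)
print("empirical average bits:", sum(widths) / len(widths))
print("expected average bits: ", expected_bits(2, 4, 0.25))
print("nominal reduction vs. 32-bit:", 32 / expected_bits(2, 4, 0.25))
```

Under this reading, the mixing probability acts as a single knob that interpolates the average storage cost between the two chosen precisions.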
This thesis applies stochastic quantization to the activations saved for the backward pass, a largely unexplored direction. In the reported experiments, low-bit (2-4 bit) stochastically quantized models often match or exceed the accuracy of the unquantized 32-bit baseline, which underscores the research potential of the approach.
Impact Areas:
- 🎯 Edge Computing: Enable training on memory-constrained devices
- 🔋 Energy Efficiency: Reduce computational overhead and power consumption
- 🏗️ Scalability: Train larger models or use bigger batch sizes on existing hardware
- 📱 Democratization: Make advanced AI training accessible on consumer hardware
master_thesis/
├── thesis_template/msc-thesis-template-main/classic/
│ ├── ClassicThesis.tex # Main thesis document
│ ├── Chapters/ # Thesis chapters
│ │ ├── 01-introduction.tex # Introduction and motivation
│ │ ├── 02-background.tex # Background and related work
│ │ ├── 03-sota_and_rw.tex # State-of-the-art review
│ │ ├── 04-contribution1.tex # Deterministic quantization framework
│ │ ├── 05-contribution2.tex # Stochastic quantization approach
│ │ ├── 06-contribution3.tex # Generalization experiments
│ │ └── 07-discussion.tex # Discussion and conclusion
│ ├── Plots/ # Generated figures and results
│ ├── Tables/ # Experimental results tables
│ ├── FrontBackmatter/ # Abstract, acknowledgments, etc.
│ └── Bibliography.bib # References
├── plots/ # Additional visualization materials
└── README.md # This file
The thesis presents theoretical foundations and experimental validation of mix-precision stochastic quantization techniques. Key implementation concepts include:
- Per-bucket scaling with configurable bucket sizes
- Affine uniform quantization with min-max normalization
- Integration with Brevitas for deterministic nearest rounding
- Unbiased stochastic rounding to eliminate systematic bias
- Vectorized implementation for computational efficiency
- Mixed-precision capabilities with probability-controlled bit selection
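As a rough illustration of the first three concepts, the sketch below applies affine min-max quantization with deterministic nearest rounding to each bucket of a flat list. It is written in plain Python rather than with Brevitas, so it does not reflect the thesis code. The function names and the choice to store a scale and minimum per bucket are assumptions.

```python
def quantize_bucket(values, bits):
    """Affine min-max quantization of one bucket with nearest rounding."""
    q_max = (1 << bits) - 1                      # largest integer code
    lo, hi = min(values), max(values)
    scale = (hi - lo) / q_max if hi > lo else 1.0  # avoid division by zero
    # (v - lo) / scale is non-negative, so int(t + 0.5) rounds to nearest.
    codes = [min(q_max, max(0, int((v - lo) / scale + 0.5))) for v in values]
    return codes, scale, lo

def dequantize_bucket(codes, scale, lo):
    """Map integer codes back to approximate real values."""
    return [c * scale + lo for c in codes]

def quantize_per_bucket(flat, bits, bucket_size):
    """Quantize and dequantize a flat list, one bucket at a time."""
    out = []
    for start in range(0, len(flat), bucket_size):
        bucket = flat[start:start + bucket_size]
        codes, scale, lo = quantize_bucket(bucket, bits)
        out.extend(dequantize_bucket(codes, scale, lo))
    return out

def max_abs_error(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

# Two value ranges that differ by orders of magnitude.
flat = [0.0, 0.1, 0.2, 0.3, 10.0, 20.0, 30.0, 40.0]
print("one bucket  :", max_abs_error(flat, quantize_per_bucket(flat, 2, 8)))
print("two buckets :", max_abs_error(flat, quantize_per_bucket(flat, 2, 4)))
```

With a single bucket, the large values dictate the step size and the small values collapse onto one code. With smaller buckets, each range gets its own scale, which is the motivation for configurable bucket sizes.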
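The stochastic quantization follows the principle:
E[Q(x)] = x (unbiased property)
q_i = clip(k_i + Bernoulli(α_i), q_min, q_max)
Here k_i is the integer part (floor) of the scaled activation and α_i is its fractional part, so rounding up with probability α_i preserves the expected value for inputs inside the clipping range. Per-bucket scaling keeps the quantization step small, which limits the variance that stochastic rounding introduces.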
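The rounding rule can be checked empirically with a small sketch. It assumes the usual reading of the formula, in which k is the floor of the scaled input and α is the fractional remainder. The helper name `stochastic_round_code` and the grid parameters are hypothetical.

```python
import math
import random

def stochastic_round_code(x, lo, scale, q_min, q_max, rng):
    """Return q = clip(k + Bernoulli(alpha), q_min, q_max).

    Assumes k = floor(t) and alpha = t - k, where t = (x - lo) / scale.
    """
    t = (x - lo) / scale
    k = math.floor(t)
    alpha = t - k                                # fractional part in [0, 1)
    q = k + (1 if rng.random() < alpha else 0)   # round up with prob. alpha
    return min(q_max, max(q_min, q))

rng = random.Random(0)
lo, scale, q_min, q_max = 0.0, 1.0, 0, 3         # a 2-bit grid {0, 1, 2, 3}
x = 1.3
samples = [stochastic_round_code(x, lo, scale, q_min, q_max, rng) * scale + lo
           for _ in range(100_000)]
print("input:", x, "mean of dequantized samples:", sum(samples) / len(samples))
```

Each individual sample lands on a neighboring grid point, yet the average converges to the input. The property holds only for inputs inside the clipping range, since clipping itself introduces bias.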
| Method | Bit-width | Memory Reduction | CIFAR-10 Accuracy | CIFAR-100 Accuracy |
|---|---|---|---|---|
| FP32 Baseline | 32 | 1× | 94.2% | 78.5% |
| Deterministic | 4 | ~8× | 93.8% | 77.9% |
| Stochastic | 2 | ~16× | 94.6% ⬆️ | 79.1% ⬆️ |
| Stochastic | 4 | ~8× | 94.8% ⬆️ | 79.3% ⬆️ |
🎉 Remarkable Finding: Stochastic quantization at 2-4 bits often exceeds the unquantized 32-bit baseline while providing massive memory savings!
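The nominal factors in the table follow from the bit-width ratio (32/4 = 8 and 32/2 = 16). One plausible reason they are reported as approximate is per-bucket metadata. The sketch below assumes each bucket stores two 32-bit values (a scale and an offset). That storage layout and the bucket sizes used are illustrative assumptions, not figures from the thesis.

```python
def effective_reduction(bits, bucket_size, meta_bits_per_bucket=64,
                        baseline_bits=32):
    """Compression ratio vs. the baseline, counting per-bucket metadata.

    meta_bits_per_bucket=64 assumes one FP32 scale and one FP32 offset
    per bucket (an assumption made for illustration).
    """
    bits_per_value = bits + meta_bits_per_bucket / bucket_size
    return baseline_bits / bits_per_value

for bits in (2, 4):
    for bucket_size in (64, 1024):
        ratio = effective_reduction(bits, bucket_size)
        print(f"{bits}-bit, bucket size {bucket_size}: {ratio:.2f}x")
```

Under these assumptions, larger buckets amortize the metadata and approach the nominal ratio, while smaller buckets trade some of the saving for a tighter dynamic range.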
The complete thesis document is available in the repository:
- PDF: thesis_template/msc-thesis-template-main/classic/ClassicThesis.pdf
- LaTeX Source: thesis_template/msc-thesis-template-main/classic/
@mastersthesis{li2025mixprecision,
title={Mix-Precision Stochastic Quantization of Activations during Backpropagation},
author={Li, Jiufeng},
year={2025},
school={Heidelberg University},
type={Master's Thesis},
url={https://github.com/CoreSheep/Stochastic-Activations-Quantization-Backward}
}
For a rapid understanding of the work, refer to the presentation materials in presentations/ (coming soon).
This repository contains the thesis documentation for academic research on mix-precision stochastic quantization. For academic discussions and questions about the research, please feel free to open issues or contact the author.
This work is licensed under the GNU General Public License v2.0. See LICENSE for details.
- Brevitas Team for the excellent quantization framework
- PyTorch Community for the robust deep learning foundation
- Hardware and Artificial Intelligence (HAWAII) Lab for research support and resources
🌟 Star this repository if you find it helpful for your research! 🌟
Advancing the frontier of memory-efficient deep learning, one quantized activation at a time.