Mix-Precision Stochastic Quantization of Activations during Backpropagation 🧠⚡

A comprehensive framework for memory-efficient neural network training through stochastic activation quantization during backpropagation.

🎯 Motivation

Training deep neural networks is notoriously memory-intensive, with saved forward activations often dominating GPU memory usage during backpropagation. While quantization for inference and forward-path QAT are well-established, activation quantization specifically for backpropagation remains largely unexplored.

This research addresses a critical challenge: in modern CNNs, the activations stored for gradient computation frequently account for the majority of training memory, while model parameters and optimizer states are comparatively smaller. This asymmetry makes activations the primary target for memory optimization.

Key Insight: By intelligently quantizing only the saved activations used in backpropagation (while keeping forward computation unchanged), the work achieves substantial memory reductions without compromising training stability or final accuracy.

🚀 Key Contributions

1. Deterministic Activation Quantization Framework 📊

Comprehensive implementation using Brevitas on EfficientNetV2
Substantial memory reductions: ~8× reduction at 4-bit with competitive accuracy
Stability optimization: Advanced techniques to prevent training spikes and ensure convergence
Layer-wise analysis: Systematic study of quantization sensitivity across network layers

2. Stochastic Activation Quantization with Mixed-Precision 🎲

Unbiased stochastic rounding to eliminate systematic quantization bias
Per-bucket scaling with configurable bucket sizes for optimal dynamic range adaptation
Vectorized implementation for computational efficiency
Arbitrary two-bit mixing: Flexible probability-controlled precision selection (e.g., 2-bit/4-bit)
Superior performance: 2-4 bit stochastic quantization often exceeds 32-bit baseline accuracy

3. Extensive Experimental Validation and Generalization 📈

Multi-dataset evaluation: CIFAR-10, CIFAR-100, and Tiny ImageNet (ImageNet-200)
Comprehensive bit-width analysis: 1-32 bits with focus on ultra-low precision (2-4 bits)
Practical memory savings: Up to ~16× reduction with 2-bit stochastic quantization
Robust generalization: Consistent improvements across datasets and architectures

🔬 Research Significance

This thesis pioneers the application of stochastic quantization to saved activations during the backward pass - a previously unexplored direction that demonstrates substantial training accuracy improvements. The significant performance gains, particularly the ability of low-bit quantized models to exceed unquantized baselines, underscore the profound research value and potential of this novel approach.

Impact Areas:

🎯 Edge Computing: Enable training on memory-constrained devices
🔋 Energy Efficiency: Reduce computational overhead and power consumption
🏗️ Scalability: Train larger models or use bigger batch sizes on existing hardware
📱 Democratization: Make advanced AI training accessible on consumer hardware

📁 Thesis Structure

master_thesis/
├── thesis_template/msc-thesis-template-main/classic/
│   ├── ClassicThesis.tex                     # Main thesis document
│   ├── Chapters/                             # Thesis chapters
│   │   ├── 01-introduction.tex               # Introduction and motivation
│   │   ├── 02-background.tex                 # Background and related work
│   │   ├── 03-sota_and_rw.tex               # State-of-the-art review
│   │   ├── 04-contribution1.tex              # Deterministic quantization framework
│   │   ├── 05-contribution2.tex              # Stochastic quantization approach
│   │   ├── 06-contribution3.tex              # Generalization experiments
│   │   └── 07-discussion.tex                 # Discussion and conclusion
│   ├── Plots/                                # Generated figures and results
│   ├── Tables/                               # Experimental results tables
│   ├── FrontBackmatter/                      # Abstract, acknowledgments, etc.
│   └── Bibliography.bib                      # References
├── plots/                                    # Additional visualization materials
└── README.md                                 # This file

🛠️ Implementation Overview

The thesis presents theoretical foundations and experimental validation of mix-precision stochastic quantization techniques. Key implementation concepts include:

Deterministic Quantization Framework

Per-bucket scaling with configurable bucket sizes
Affine uniform quantization with min-max normalization
Integration with Brevitas for deterministic nearest rounding

Stochastic Quantization Approach

Unbiased stochastic rounding to eliminate systematic bias
Vectorized implementation for computational efficiency
Mixed-precision capabilities with probability-controlled bit selection

Mathematical Formulation

The stochastic quantization follows the principle:

E[Q(x)] = x  (unbiased property)
q_i = clip(k_i + Bernoulli(α_i), q_min, q_max)

Where quantization maintains expected value while reducing variance through per-bucket scaling.

📊 Key Results

Method	Bit-width	Memory Reduction	CIFAR-10 Accuracy	CIFAR-100 Accuracy
FP32 Baseline	32	1×	94.2%	78.5%
Deterministic	4	~8×	93.8%	77.9%
Stochastic	2	~16×	94.6% ⬆️	79.1% ⬆️
Stochastic	4	~8×	94.8% ⬆️	79.3% ⬆️

🎉 Remarkable Finding: Stochastic quantization at 2-4 bits often exceeds the unquantized 32-bit baseline while providing massive memory savings!

📚 Full Thesis & Citation

📖 Access Full Thesis

The complete thesis document is available in the repository:

PDF: thesis_template/msc-thesis-template-main/classic/ClassicThesis.pdf
LaTeX Source: thesis_template/msc-thesis-template-main/classic/

📝 Citation Format

@mastersthesis{li2025mixprecision,
  title={Mix-Precision Stochastic Quantization of Activations during Backpropagation},
  author={Li, Jiufeng},
  year={2025},
  school={Heidelberg University},
  type={Master's Thesis},
  url={https://github.com/CoreSheep/Stochastic-Activations-Quantization-Backward}
}

🎤 Quick Overview Presentation

For a rapid understanding of the work, refer to the presentation materials in presentations/ (coming soon).

🤝 Contributing

This repository contains the thesis documentation for academic research on mix-precision stochastic quantization. For academic discussions and questions about the research, please feel free to open issues or contact the author.

📄 License

This work is licensed under the GNU General Public License v2.0. See LICENSE for details.

🙏 Acknowledgments

Brevitas Team for the excellent quantization framework
PyTorch Community for the robust deep learning foundation
Hardware and Arti�cial Intelligence (HAWAII) Lab for research support and resources

🌟 Star this repository if you find it helpful for your research! 🌟

Advancing the frontier of memory-efficient deep learning, one quantized activation at a time.

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
Mix-Precision Stochastic Quantization of Activations during Backpropagation.pdf		Mix-Precision Stochastic Quantization of Activations during Backpropagation.pdf
README.md		README.md
master_thesis_defense.pptx		master_thesis_defense.pptx

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Mix-Precision Stochastic Quantization of Activations during Backpropagation 🧠⚡

🎯 Motivation

🚀 Key Contributions

1. Deterministic Activation Quantization Framework 📊

2. Stochastic Activation Quantization with Mixed-Precision 🎲

3. Extensive Experimental Validation and Generalization 📈

🔬 Research Significance

📁 Thesis Structure

🛠️ Implementation Overview

Deterministic Quantization Framework

Stochastic Quantization Approach

Mathematical Formulation

📊 Key Results

📚 Full Thesis & Citation

📖 Access Full Thesis

📝 Citation Format

🎤 Quick Overview Presentation

🤝 Contributing

📄 License

🙏 Acknowledgments

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

Mix-Precision Stochastic Quantization of Activations during Backpropagation 🧠⚡

🎯 Motivation

🚀 Key Contributions

1. Deterministic Activation Quantization Framework 📊

2. Stochastic Activation Quantization with Mixed-Precision 🎲

3. Extensive Experimental Validation and Generalization 📈

🔬 Research Significance

📁 Thesis Structure

🛠️ Implementation Overview

Deterministic Quantization Framework

Stochastic Quantization Approach

Mathematical Formulation

📊 Key Results

📚 Full Thesis & Citation

📖 Access Full Thesis

📝 Citation Format

🎤 Quick Overview Presentation

🤝 Contributing

📄 License

🙏 Acknowledgments

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages