
A survey of overlapping work #69

@devlux76

Description

Before beginning this journey we asked a lot of questions about different numbering and quantization schemes. What we forgot to ask is what work has already been done on using quaternary numbers in ML, whether for quantization or otherwise.

There may be valuable insights and findings that we should take into account.
A brief Google search returned the results below; I'll bet there are many more. Please integrate, address, compare, and distinguish at least the following where relevant, and consider their findings carefully.
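For shared context before the survey itself, here is a minimal sketch of what quaternary (four-level, 2-bit) weight quantization means in practice. The uniform level placement and per-tensor scale are illustrative assumptions of mine, not the scheme used by any of the works cited below:

```python
import torch

def quantize_quaternary(w: torch.Tensor):
    """Affine quantization of a tensor to 4 levels {0, 1, 2, 3} (2 bits).

    Illustrative sketch: real schemes are usually per-channel or per-group
    and pick levels to minimize reconstruction error, not just min/max.
    """
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / 3                      # 4 levels span 3 steps
    q = torch.clamp(torch.round((w - w_min) / scale), 0, 3).to(torch.uint8)
    return q, scale, w_min                           # codes + dequant params

def dequantize_quaternary(q, scale, zero):
    return q.float() * scale + zero

w = torch.randn(4, 8)
q, scale, zero = quantize_quaternary(w)
w_hat = dequantize_quaternary(q, scale, zero)
print("mean |error|:", (w - w_hat).abs().mean().item())
```

Four such 2-bit codes pack into one byte, which is where the memory savings discussed below come from.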

1. Key Research & Frameworks

* Binary Quadratic Quantization (BQQ): Recent research presented at [NeurIPS 2025](https://neurips.cc/virtual/2025/poster/119877) introduces BQQ, which achieves a superior trade-off between memory efficiency and reconstruction error. It has shown strong results in post-training quantization (PTQ), outperforming prior state-of-the-art methods by up to 2.2% on datasets such as ImageNet.
* Quaternary Neural Belief Propagation (BP4): Researchers have extended neural belief propagation decoders to the quaternary (BP4) level to improve decoding strategies for quantum Low-Density Parity-Check (QLDPC) codes, aiming for low-latency and high-performance decoding.
* QUAD: A specialized framework implemented using PyTorch and Hugging Face Transformers for the quantization and parameter-efficient tuning of Large Language Models (LLMs). [2, 3, 4] 

2. Domain-Specific Applications

* Edge & Embedded Hardware: Quaternary quantization is frequently used to deploy complex models on resource-constrained hardware.
* Industrial Monitoring: Quaternary quantization in CNN hardware accelerators has demonstrated an 89% reduction in memory demand while maintaining 96.37% accuracy for real-time bearing fault diagnosis (a back-of-envelope check of this figure follows this list).
* Recurrent Neural Networks (RNNs): Early notable work explored quaternary schemes specifically for RNNs in sentiment analysis tasks.
* Image Steganography: The OPMS-QQGE method uses a quaternary quantized Gaussian embedding model to strengthen security in image steganography, surpassing previous state-of-the-art methods in resisting CNN-based steganalyzers. [5, 6, 7, 8]
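The promised back-of-envelope check on the 89% memory figure. This is my own arithmetic with an assumed group size and metadata cost, not numbers from the cited paper:

```python
# Back-of-envelope: memory reduction from 32-bit floats to 2-bit codes.
# Group size and metadata cost are illustrative assumptions.
fp32_bits = 32
code_bits = 2
group_size = 128                      # weights sharing one scale/zero pair
meta_bits = 2 * 16                    # assume fp16 scale + fp16 zero per group

bits_per_weight = code_bits + meta_bits / group_size
reduction = 1 - bits_per_weight / fp32_bits
print(f"{bits_per_weight:.2f} bits/weight -> {reduction:.1%} reduction")
# ~2.25 bits/weight -> ~93.0% reduction; layers left unquantized (first/last,
# biases, BN) would push the whole-model figure down toward the reported 89%.
```

So the 89% figure is plausible for a mostly-quantized CNN accelerator.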

3. Current Performance & Trade-offs

* Accuracy vs. Efficiency: Quaternary quantization drastically reduces VRAM usage, enabling larger models to run on consumer GPUs (see the footprint sketch after this list), but it has historically suffered greater accuracy loss than 4-bit formats such as NVFP4.
* Fine-Tuning: New techniques like QuES (Quantized Expert Scaling) are being developed to improve arithmetic reasoning in 2-bit quantized models, making fine-tuning more accessible in low-resource environments. [3, 9, 10, 11, 12] 
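The promised footprint sketch: rough weight-only memory for a 7B-parameter model at different widths. This is my own arithmetic and ignores activations, KV cache, and quantization metadata:

```python
# Rough weight-only memory footprint for a 7B-parameter model.
params = 7e9
for name, bits in [("fp16", 16), ("NVFP4 / 4-bit", 4), ("quaternary / 2-bit", 2)]:
    gib = params * bits / 8 / 2**30
    print(f"{name:>18}: {gib:5.1f} GiB")
# fp16 ~13.0 GiB, 4-bit ~3.3 GiB, 2-bit ~1.6 GiB: a 2-bit 7B model fits
# comfortably on a consumer GPU, which is the efficiency side of the trade-off.
```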

It may also be worth exploring how quaternary quantization compares with the newer 1.58-bit (ternary) "BitNet" architectures.

[1] https://www.intechopen.com/chapters/38750#:~:text=For%20the%20quaternary%20data%20processing%20in%20optics%2C,polarized%20state%20of%20light%20as%20mentioned%20below:
[2] https://arxiv.org/pdf/2308.08208
[3] https://neurips.cc/virtual/2025/poster/119877#:~:text=Experimental%20results%20demonstrate%20that%20BQQ%20consistently%20achieves,respectively%2C%20with%20quantization%20equivalent%20to%202%20bits.
[4] https://arxiv.org/html/2503.19353v1#:~:text=QUAD%20is%20implemented%20using%20PyTorch%20%2847%29%20and%20the%20Hugging%20Face%20Transformers%20%2848%29%20library.
[5] https://arxiv.org/html/2402.12263v2
[6] https://www.mdpi.com/1424-8220/23/13/5897
[7] https://ieeexplore.ieee.org/iel7/10206/9970396/10214134.pdf
[8] https://www.meegle.com/en_us/topics/quantization/quantization-for-real-time-processing#:~:text=This%20makes%20it%20%28%20Quantization%20Error%20%29,like%20smartphones%2C%20IoT%20devices%2C%20and%20embedded%20systems.
[9] https://arxiv.org/html/2602.03120v1
[10] https://arxiv.org/html/2507.17417v1
[11] https://arxiv.org/html/2509.13514v1#:~:text=Overall%2C%20no%20single%20configuration%20simultaneously%20optimizes%20accuracy%2C,efficiency%2C%20it%20typically%20degrades%20accuracy%20and%20robustness.
[12] https://www.newline.co/@zaoyang/ultimate-guide-to-gptq-quantization--e1a7bf92#:~:text=One%20of%20the%20standout%20advantages%20of%20GPTQ,GPUs%2C%20edge%20devices%2C%20or%20even%20mobile%20platforms.
