Commit c39bf72

Reuven and ruvnet committed
feat(ruvector-cnn): implement ADR-091 INT8 CNN quantization
Complete implementation of INT8 quantization for ruvector-cnn:

Phase 1 - Core Infrastructure:
- QuantizationParams, QuantizationScheme, QuantizationMode
- QuantizedTensor<i8> with quantize/dequantize methods
- CalibrationMethod (MinMax, Percentile, MSE, Entropy)
- 34 unit tests passing

Phase 2 - INT8 Kernels:
- Scalar reference: conv2d, depthwise_conv2d, matmul, requantize
- AVX2 SIMD: _mm256_maddubs_epi16 for 2-4x speedup
- ARM NEON: vmull_s8, vpadalq_s16 for 2-3x speedup
- WASM SIMD128: i8x16 operations for 1.5-2x speedup

Phase 3 - Graph Rewrite Passes:
- GR-1: BatchNorm fusion into Conv weights
- GR-2: Zero-point correction pre-computation
- GR-3: Q/DQ node insertion at FP32/INT8 boundaries
- GR-4: ReLU/HardSwish fusion with LUT

Phase 4 - Quantized Layers:
- QuantizedConv2d with per-channel quantization
- QuantizedDepthwiseConv2d for MobileNet
- QuantizedLinear for FC layers
- QuantizedMaxPool2d/AvgPool2d
- QuantizedResidualAdd with scale alignment

Phase 6 - Tests & Benchmarks:
- quality_validation.rs: cosine similarity ≥0.995
- acceptance_gates.rs: 7 ADR-091 gates
- kernel_equivalence.rs: SIMD vs scalar validation
- int8_bench.rs: Criterion benchmarks

Performance targets:
- 2.5x latency improvement (MobileNetV3)
- 4x memory reduction
- <1% accuracy degradation

Co-Authored-By: claude-flow <ruv@ruv.net>
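
For readers unfamiliar with the quantize/dequantize round trip and the cosine-similarity check named in the commit message, the following is a minimal sketch of per-tensor affine INT8 quantization with MinMax calibration. It is an illustration only: `AffineParams`, `calibrate_min_max`, `quantize`, `dequantize`, and `cosine_similarity` are hypothetical names chosen here and are not taken from the ruvector-cnn sources, whose actual `QuantizationParams` and `QuantizedTensor<i8>` APIs may differ.

```rust
/// Hypothetical per-tensor affine parameters (an assumption for this sketch,
/// not the crate's actual `QuantizationParams`).
#[derive(Debug, Clone, Copy)]
struct AffineParams {
    scale: f32,
    zero_point: i32,
}

/// MinMax calibration: map the observed range, widened to include 0.0,
/// onto the signed 8-bit range [-128, 127].
fn calibrate_min_max(data: &[f32]) -> AffineParams {
    let min = data.iter().copied().fold(0.0_f32, f32::min);
    let max = data.iter().copied().fold(0.0_f32, f32::max);
    // Guard against a zero-width range (e.g. all-zero input).
    let range = (max - min).max(f32::EPSILON);
    let scale = range / 255.0;
    let zero_point = (-128.0 - min / scale).round() as i32;
    AffineParams {
        scale,
        zero_point: zero_point.clamp(-128, 127),
    }
}

/// q = clamp(round(x / scale) + zero_point, -128, 127)
fn quantize(data: &[f32], p: AffineParams) -> Vec<i8> {
    data.iter()
        .map(|&x| {
            let q = (x / p.scale).round() as i32 + p.zero_point;
            q.clamp(-128, 127) as i8
        })
        .collect()
}

/// x' = (q - zero_point) * scale
fn dequantize(q: &[i8], p: AffineParams) -> Vec<f32> {
    q.iter()
        .map(|&v| (v as i32 - p.zero_point) as f32 * p.scale)
        .collect()
}

/// Cosine similarity between two equal-length vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}
```

A quality check in the spirit of the ≥0.995 gate would calibrate on a tensor, quantize and dequantize it, and compare the result with the original via `cosine_similarity`. Widening the calibrated range to include 0.0 keeps zero exactly representable, which matters for zero padding in convolutions.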
1 parent d33fc67 commit c39bf72

37 files changed

Lines changed: 9919 additions & 0 deletions

Cargo.lock

Lines changed: 1 addition & 0 deletions

crates/ruvector-cnn/Cargo.toml

Lines changed: 5 additions & 0 deletions
@@ -41,7 +41,12 @@ image = { version = "0.25", default-features = false, features = ["png", "jpeg"]
 
 [dev-dependencies]
 criterion = { workspace = true }
+fastrand = "2.0"
 
 [[bench]]
 name = "cnn_benchmarks"
 harness = false
+
+[[bench]]
+name = "int8_bench"
+harness = false