Fix correctness bugs and bump to v0.4.1 #34

Merged
gvonness-apolitical merged 2 commits into main from
fix/v0.4.1-correctness-fixes
Mar 14, 2026
@gvonness-apolitical

Summary

Fixes three deferred correctness bugs from the codebase review (all produced silently wrong results) plus pre-existing review fixes:

  • Powi f32 exponent encoding: powi(n) on f32 tapes silently returned wrong values/gradients for negative exponents (n <= -2) due to precision loss in the u32 → f32 → u32 round-trip. All 5 dispatch sites (forward, reverse, tangent fwd/rev, cross-country) now decode the exponent directly from raw u32 via powi_exp_decode_raw.
  • taylor_powi negative base: Taylor::powi produced NaN for negative base (e.g. (-2)^3) because it used exp(n * ln(a)). Added taylor_powi_squaring using binary exponentiation for negative base or |n| <= 8.
  • Checkpoint position lookup: grad_checkpointed, grad_checkpointed_disk, grad_checkpointed_with_hints used Vec::contains() (O(n) per step). Converted to HashSet for O(1).
  • Pre-existing review fixes: Rem derivative corrections for Dual/Reverse/BReverse, hypot zero-division guard, powi overflow guard, Round nonsmooth kink detection at half-integers, CSE custom-op handling, optimize() dedup removal, grad() constant-output fast path, sparsity Powi/Custom handling.
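
The f32 failure mode in the first bullet can be reproduced in isolation. The sketch below assumes the exponent is stored as the two's-complement bit pattern of an `i32` in a `u32` slot. That encoding and the names `encode_exp`, `decode_via_f32`, and `decode_raw_sketch` are illustrative assumptions, not the crate's code; the real helper is `powi_exp_decode_raw`, whose signature is not shown in this PR description.

```rust
/// Hypothetical encoding (an assumption for this sketch): reinterpret the
/// i32 exponent as a u32 bit pattern.
fn encode_exp(n: i32) -> u32 {
    n as u32
}

/// Lossy path: the raw u32 is converted to f32 and back before decoding.
/// f32 has a 24-bit significand, so values just below 2^32 are rounded,
/// and the float-to-int cast saturates at u32::MAX.
fn decode_via_f32(raw: u32) -> i32 {
    let as_float = raw as f32;
    (as_float as u32) as i32
}

/// Direct path: reinterpret the raw u32 bits as i32, with no float detour.
fn decode_raw_sketch(raw: u32) -> i32 {
    raw as i32
}
```

Under these assumptions, `decode_via_f32(encode_exp(-2))` yields `-1`: `2^32 - 2` is not representable in f32, rounds up to `2^32`, and the cast back to `u32` saturates at `u32::MAX`. An exponent of `-1` survives only by coincidence, which is consistent with the bug affecting `n <= -2`, while small non-negative exponents round-trip exactly.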

Bumps version to 0.4.1 with updated CHANGELOG and SECURITY.

Test plan

  • cargo test --features bytecode,taylor,stde,parallel,diffop,laurent — all 600+ tests pass
  • cargo clippy --features bytecode,taylor,stde,parallel,diffop,laurent -- -D warnings — clean
  • New tests: f32 tape powi(-2) and powi(-10) value+gradient, powi_exp_decode_raw round-trip, Taylor powi with negative base (squared, cubed, 4th power, negative exponent), positive base regression
  • Nonsmooth Round test updated for half-integer kink detection
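
The negative-base Taylor tests listed above exercise the binary exponentiation path. As a rough illustration of why that path avoids the NaN, here is a minimal sketch on plain `f64` coefficient vectors. The functions `taylor_mul`, `powi_squaring_sketch`, and `powi_via_exp_ln` are hypothetical stand-ins, not the crate's `Taylor` type or its `taylor_powi_squaring`, and the sketch handles non-negative exponents only (a negative exponent would additionally need a series reciprocal).

```rust
/// Truncated Cauchy product of two Taylor coefficient slices.
/// Both slices are assumed to have the same length (the truncation order).
fn taylor_mul(a: &[f64], b: &[f64]) -> Vec<f64> {
    let n = a.len();
    let mut out = vec![0.0; n];
    for i in 0..n {
        for j in 0..(n - i) {
            out[i + j] += a[i] * b[j];
        }
    }
    out
}

/// Integer power by binary exponentiation (square-and-multiply).
/// Uses only multiplication, so a negative leading coefficient is fine.
fn powi_squaring_sketch(a: &[f64], mut n: u32) -> Vec<f64> {
    let mut result = vec![0.0; a.len()];
    if result.is_empty() {
        return result;
    }
    result[0] = 1.0; // the constant series 1
    let mut base = a.to_vec();
    while n > 0 {
        if n & 1 == 1 {
            result = taylor_mul(&result, &base);
        }
        n >>= 1;
        if n > 0 {
            base = taylor_mul(&base, &base);
        }
    }
    result
}

/// The exp(n * ln(a)) formulation, applied to the leading coefficient only.
/// ln of a negative number is NaN, and the NaN propagates through exp.
fn powi_via_exp_ln(a0: f64, n: i32) -> f64 {
    (n as f64 * a0.ln()).exp()
}
```

For the series `-2 + t` truncated to three coefficients, `powi_squaring_sketch(&[-2.0, 1.0, 0.0], 3)` gives the coefficients of `(-2 + t)^3` up to `t^2`, whereas `powi_via_exp_ln(-2.0, 3)` is NaN.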

…and codebase review fixes (v0.4.1)

Fix three deferred correctness bugs from the codebase review plus
pre-existing fixes from the review session:

- powi on f32 bytecode tapes silently produced wrong values/gradients for
  negative exponents due to precision loss in the u32→f32→u32 round-trip.
  All 5 dispatch sites now decode directly from raw u32.

- Taylor::powi produced NaN for negative base values (e.g. (-2)^3) because
  it used exp(n*ln(a)). Added binary exponentiation path for negative base
  or small exponents.

- Checkpoint position lookups converted from Vec::contains (O(n)) to
  HashSet (O(1)) in all three checkpointing variants.

- Pre-existing review fixes: Rem derivative corrections for Dual/Reverse/
  BReverse, hypot zero-division guard, powi overflow guard, Round kink
  detection at half-integers, CSE custom-op handling, optimize() dedup,
  grad() constant-output fast path, sparsity Powi/Custom handling.
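
The checkpoint lookup change above amounts to swapping a linear scan for a hash lookup inside the per-step loop. A minimal sketch of the two shapes follows; the function names and the `usize` step positions are assumptions for illustration and do not reflect the actual `grad_checkpointed` internals.

```rust
use std::collections::HashSet;

/// Linear-scan shape: every step pays O(n) in the number of checkpoints.
fn count_hits_vec(checkpoints: &[usize], num_steps: usize) -> usize {
    (0..num_steps)
        .filter(|step| checkpoints.contains(step))
        .count()
}

/// Hash-set shape: build the set once, then each step is an expected O(1) lookup.
fn count_hits_set(checkpoints: &[usize], num_steps: usize) -> usize {
    let positions: HashSet<usize> = checkpoints.iter().copied().collect();
    (0..num_steps)
        .filter(|step| positions.contains(step))
        .count()
}
```

Both functions return the same count; only the per-step cost differs, which matters when the number of checkpoint positions grows with the tape length.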

Bump version to 0.4.1. Update CHANGELOG and SECURITY.
@gvonness-apolitical gvonness-apolitical merged commit 5f2d84d into main Mar 14, 2026
6 checks passed
@gvonness-apolitical gvonness-apolitical deleted the fix/v0.4.1-correctness-fixes branch March 14, 2026 19:13