IAA03_fast_math is a single-header math kernel (atan2 only for now) designed to eliminate the "Trigonometry Tax" in high-throughput systems (physics engines, audio DSP, and ML preprocessing). Branchless, built for instruction-level parallelism (ILP), and SIMD-accelerated (AVX2/SSE4.1), it achieves up to ~186x per-element throughput speedup over std::atan2 while remaining IEEE 754 compliant.
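To illustrate what a select-based atan2 can look like, here is a minimal scalar sketch. It is not the IAA03_fast_math implementation: the name `approx_atan2` is hypothetical, and the polynomial is a commonly circulated low-accuracy fit (error roughly on the order of 1e-4 rad), so it makes no claim about IEEE 754 compliance or the speedup quoted above. The quadrant handling is written as selects rather than nested branches, though whether a given compiler emits branch-free code is not guaranteed.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper for illustration only; NOT part of IAA03_fast_math.
// Approximates atan2(y, x) using range reduction to [0, 1] plus a short
// odd polynomial, with quadrant fix-ups expressed as selects.
inline float approx_atan2(float y, float x) {
    const float pi      = 3.14159265358979f;
    const float half_pi = 1.57079632679490f;

    const float ax = std::fabs(x);
    const float ay = std::fabs(y);
    const float mx = std::max(ax, ay);
    const float mn = std::min(ax, ay);

    // Ratio in [0, 1]; the tiny bias avoids 0/0 when x == y == 0.
    const float a = mn / (mx + 1e-30f);
    const float s = a * a;

    // Polynomial approximation of atan(a) on [0, 1] (assumed coefficients).
    float r = ((-0.0464964749f * s + 0.15931422f) * s - 0.327622764f) * s * a + a;

    // Undo the range reduction: octant, then left half-plane, then sign of y.
    r = (ay > ax)  ? half_pi - r : r;
    r = (x < 0.0f) ? pi - r      : r;
    return std::copysign(r, y);
}
```

Because every input follows the same arithmetic path, the same structure maps naturally onto SIMD lanes, where the two selects become blend instructions.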
Safety-hardened GEMM (general matrix multiply) implementation achieving 169.8 GFLOPS on an Intel i9-14900. Built for embedded systems and safety-critical applications where reliability matters as much as speed. 162× faster than a naive implementation, zero undefined behavior (UB), fully validated.
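For context on the "naive" comparison, a baseline GEMM is typically a plain triple loop. The sketch below is an assumption about what such a baseline might look like, not the project's code: `naive_gemm` and its row-major `std::vector<float>` layout are hypothetical. It validates dimensions up front and returns `false` instead of reading out of bounds, which is one small example of the kind of check a safety-oriented implementation needs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical baseline for illustration; NOT the project's implementation.
// Computes C = A * B for row-major matrices: A is m x k, B is k x n.
// Returns false (leaving C untouched) if any size check fails.
inline bool naive_gemm(const std::vector<float>& A,
                       const std::vector<float>& B,
                       std::vector<float>& C,
                       std::size_t m, std::size_t k, std::size_t n) {
    // Reject products that would overflow size_t before computing them.
    auto mul_overflows = [](std::size_t a, std::size_t b) {
        return b != 0 && a > SIZE_MAX / b;
    };
    if (mul_overflows(m, k) || mul_overflows(k, n) || mul_overflows(m, n)) {
        return false;
    }
    // Reject buffers whose sizes do not match the stated dimensions.
    if (A.size() != m * k || B.size() != k * n) {
        return false;
    }

    C.assign(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t p = 0; p < k; ++p) {
            const float a = A[i * k + p];
            for (std::size_t j = 0; j < n; ++j) {
                C[i * n + j] += a * B[p * n + j];
            }
        }
    }
    return true;
}
```

A multiply of this shape performs 2·m·n·k floating-point operations, which is the usual basis for a GFLOPS figure.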