The main implementation of ieee::sig::div (the "arbitrary-precision division" function) could be moved into a separate function with a const-generic parameter, const SPECIALIZE_FOR_KNOWN_DIVISOR: u128, which it would use like this:
// The parameter being `0` is like `None` - could use `Option<NonZeroU128>` in the future.
if SPECIALIZE_FOR_KNOWN_DIVISOR != 0 {
    assert_eq!(divisor[0], SPECIALIZE_FOR_KNOWN_DIVISOR);
    assert!(is_all_zeros(&divisor[1..]));
}
(Hopefully this is enough for LLVM to specialize the rest of the body, but the specialization can be forced further if necessary.)
Then ieee::sig::div would become a "dispatch" fn that chooses between N+1 different instantiations of the const-generic implementation: one for each of N "commonly used divisors" (10 comes to mind, though IIRC there may be a whole sequence of powers of 5 used in the conversion from decimal strings), plus one 0 instantiation, which isn't specialized at all. Because the code still does the same division, we're only relying on the optimizer to actually turn the divisions into multiplications.
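To make the shape of this concrete, here is a rough sketch of the dispatch pattern. It is not the real ieee::sig::div: the names div_impl and div, the u32 limbs (with a u32 const parameter instead of u128), the single-limb-only short division, and the choice of 5 and 10 as specialized divisors are all assumptions made to keep the example small. It also binds the divisor to the constant directly, which is one possible way of "forcing" the specialization rather than hoping the asserts alone are enough.

```rust
// Hypothetical limb type for this sketch; the proposal above uses a u128 parameter.
type Limb = u32;

fn is_all_zeros(limbs: &[Limb]) -> bool {
    limbs.iter().all(|&l| l == 0)
}

/// Hypothetical stand-in for the const-generic implementation.
/// Divides the little-endian `dividend` in place by `divisor` and returns
/// the remainder. Simplification: only single-limb divisors are supported.
#[inline(always)]
fn div_impl<const SPECIALIZE_FOR_KNOWN_DIVISOR: u32>(
    dividend: &mut [Limb],
    divisor: &[Limb],
) -> Limb {
    assert!(!divisor.is_empty());
    assert!(is_all_zeros(&divisor[1..]), "sketch handles single-limb divisors only");

    // The parameter being `0` is like `None`: no specialization.
    let d = if SPECIALIZE_FOR_KNOWN_DIVISOR != 0 {
        assert_eq!(divisor[0], SPECIALIZE_FOR_KNOWN_DIVISOR);
        // Use the constant itself so every division below has a
        // compile-time-known divisor in this instantiation.
        SPECIALIZE_FOR_KNOWN_DIVISOR
    } else {
        divisor[0]
    };
    assert_ne!(d, 0, "division by zero");

    // Schoolbook short division, most significant limb first.
    let d = u64::from(d);
    let mut rem: u64 = 0;
    for limb in dividend.iter_mut().rev() {
        let cur = (rem << 32) | u64::from(*limb);
        *limb = (cur / d) as Limb; // fits: rem < d implies cur / d < 2^32
        rem = cur % d;
    }
    rem as Limb
}

/// The "dispatch" fn: picks one of N+1 instantiations (here N = 2).
pub fn div(dividend: &mut [Limb], divisor: &[Limb]) -> Limb {
    // Only a divisor with all-zero upper limbs can match a known constant.
    let single_limb = match divisor {
        [first, rest @ ..] if is_all_zeros(rest) => Some(*first),
        _ => None,
    };
    match single_limb {
        Some(5) => div_impl::<5>(dividend, divisor),
        Some(10) => div_impl::<10>(dividend, divisor),
        // Unspecialized fallback for every other divisor.
        _ => div_impl::<0>(dividend, divisor),
    }
}
```

All instantiations run the same source code, so a caller of div sees identical results whichever arm is taken; any speedup depends entirely on what the optimizer does with the known constant, which is something the generated code or a benchmark would have to confirm.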
Whatever we do to the division algorithm, we shouldn't forget to add benchmarks first (unless the "from decimal" benchmark would cover enough interesting cases).