Skip to content
Merged

B16 #108

Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ ifeq ($(AGGRESSIVENESS), 0)
COVER_FLOAT_FLAGS += --partial-output
endif

MODELS := B1 B2 B3 B6 B7 B8 B9 B10 B11 B12 B13 B14 B15 B20 B21 B25 B26 B27 B29
MODELS := B1 B2 B3 B6 B7 B8 B9 B10 B11 B12 B13 B14 B15 B16 B20 B21 B25 B26 B27 B29

.PHONY: build clean sim all $(MODELS)

Expand Down
127 changes: 127 additions & 0 deletions docs/B16.adoc
Original file line number Diff line number Diff line change
@@ -0,0 +1,127 @@
= Floating Point Model Documentation
:toc:
:toclevels: 3
:sectnums:

== B12 Add: Cancellation

Aharoni et al.

=== Description

B16. Multiply-Add: Cancellation
This model tests every possible value for cancellation.
For the difference between the exponent of the intermediate result and the
maximum between the exponents of the addend and the multiplication result,
test all values in the range:
[-(2 * p + 1), 1].

*Number of tests generated:* 1732 (1E3)

*Precisions Supported:* `BF_16`, `FP_16`, `FP_32`, `FP_64`, `FP_128`

*Operations Supported:* fmadd, fmsub, fnmadd, fnmsub

== Definitions (TODO: CHANGE UP DEFINITIONS)

`a`:: Operand 1
`b`:: Operand 2
`m`:: Mantissa. Number of bits in mantissa, excluding the hidden 1
`p`:: Precision. Number of significant bits, including the hidden 1. p = m + 1
`d`:: The difference between max(a_exp, b_exp) and exponent of the intermediate result. This is the interest of this model, where we test all d's in [-p, +1]
`k`:: Magnitude of d. Defined as -d in the code since we deal with d = 1 separately
`a_m`:: Mantissa of `a`
`b_m`:: Mantissa of `b`
`a_exp`:: Exponent of operand `a`, generally in `[min_exp, max_exp]`
`b_exp`:: Exponent of operand `b`, generally in `[min_exp, max_exp]`
`exp`:: Exponent of the result, generally in `[min_exp, max_exp]`
`max_exp`:: Maximum exponent value based on precision
`min_exp`:: Minimum exponent value based on precision

== Background (TODO: NEED NEW BACKGROUND FOR FMAS)

Cancellation occurs when the most significant digits of two operands subtract to zero, triggering a massive normalization effort as the hardware shifts out the many leading zeros. In high-performing floating point processors, a Leading Zero Anticipator (LZA) is used to predict the location of the most significant bit of the result to perform normalization. Generating operands that force specific cancellation depths thus tests the accuracy of the LZA, especially for the most difficult edge cases where the depth of cancellation is close to the precision.

=== Design Choices (TODO: UPDATE DESIGN CHOICES)

. If a_exp and b_exp are both negative, their sum must not exceed the lower exponent bound. This means that a_exp + b_exp > min_exp
. If a_exp and b_exp are both positive, their sum must not exceed the upper exponent bound. This means that a_exp + b_exp < max_exp
. Even though theoretically whether the product exponent is greater or the addend operand is greater should be random, we have the addend exponent to be greater for most cases to make controlling cancellation easier. Since we're only interested in the difference between the exponent of the operands and the intermediate result, this difference should not be of main concern, but could be a future improvement
. For extremely deep cancellation cases the only possible result is Zero

=== Notes (TODO: CHANGE NOTES)

All explanation in this document will be done with one instead of both operations because the only difference will be sign assignments. All exponents used in the examples are unbiased exponents.

== General Procedure
We need to ensure that the product exponent `a_raw` and `b_raw` is high enough that even at the most extreme cancellation `d = -(2p+1)`, the result stays above the subnormal floor `min_raw`. Conversely, it must be low enough that at `d=1`, we don't overflow `max_raw`.

Two situations: ab_exp is max, or c_exp is max. When we multiply a, b, the intermediate product is 2p bits wide

product of a and b cannot be too different from desired, otherwise c would dominate and cancellation would be 0

Unlike B2, we cannot use softfloat to generate desired result because ab has 2p precision and softfloat will remove that and round it to p.

For single precision, the `c` only has `m` bits of precision (23 bits for f32). It can only cancel the top `m` bits of the product `a*b`. If you want a cancellation depth of 40, bits 24 through 39 of `a*b` must be exactly zero. If we use purely random mantissas for `a`` and `b`, their cross-multiplication will fill that gap with random noise, `c` won't be able to reach deep enough to cancel it, and the cancellation depth will permanently collapse back to around -24.

Case 1: d < -2m, then we have to generate subnormal numbers, so a_raw and b_raw both need to be the smallest exponent possible
Case 2: d < -m, meaning the cancellation is deeper than the precision. Set all operands to be power of 2 is an easy way to solve this
Case 3: shallow cancellation d= -6, -5, -4, -3, -2, -1. (values are experimentally determined) Because shallow bins are very sensitive to
1. whether c_exp becomes larger than ab_exp
2. whether the final rounded result carries up one exponent
3. whether cancellation becomes slightly deeper than intended
4. whether product rounding changes ab_exp
Case 4: Other d, random generation
Case 5: d = 0. We make the exponent of c larger than ab_exp, so the result can be the same exp as c
Case 6: d = 1. Again we make the exponent of c larger than ab_exp, and force c to be the largest mantissa so the result definitely carries

== Specific Test Procedure
Case 3: for d = -6, -5, -4, -3, -2, -1, the process looks like
1. Generate a and b
2. Compute the real rounded product exponent ab_exp.
3. Choose a desired result with exponent ab_exp + d
4. Solve for c
5. Reject unless c_exp == ab_exp
6. Reject unless actual_b16_d(...) == d

== Test Count Breakdown

Since each precision has tests going from -(2*p+1) to 1, the amount of test case per precision per operation is (1- -(2*p+1))/1+1 = 2*p+3

[cols="1,1,1",options="header"]
|===
| Precision | p | Total (2*p+3)

| BF_16
| 8
| 19

| FP_16
| 11
| 25

| FP_32
| 24
| 51

| FP_64
| 53
| 109

| FP_128
| 113
| 229
|===

=== Overall Test Count

[cols="1,1",options="header"]
|===
| Description | Value

| Total cancellation tests
| 433

| Accounting for FMADD, FMSUB, FNMADD, and FNMSUB (×4)
| 1732
|===
234 changes: 234 additions & 0 deletions src/cover_float/testgen/B16.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,234 @@
"""
Angela Zheng (angela20061015@gmail.com)

Created: 4/28/2026
Last Modified: 4/28/2026
"""

import logging
import random
from dataclasses import dataclass
from pathlib import Path
from random import seed
from typing import TextIO, cast

import cover_float.common.log as log
from cover_float.common.constants import (
BIAS,
EXPONENT_BITS,
FLOAT_FMTS,
MANTISSA_BITS,
OP_FMADD,
OP_FMSUB,
OP_FNMADD,
OP_FNMSUB,
OP_MUL,
UNBIASED_EXP,
)
from cover_float.common.util import (
decimal_components_to_hex,
generate_test_vector,
get_result_from_ref,
reproducible_hash,
)
from cover_float.reference import run_and_store_test_vector
from cover_float.testgen.model import register_model

logger: log.ModelLogger = cast(log.ModelLogger, logging.getLogger("B16"))

OPS = [OP_FMADD, OP_FMSUB, OP_FNMADD, OP_FNMSUB]
SOLVER_OPS = {
OP_FMADD: OP_FNMSUB,
OP_FMSUB: OP_FMSUB,
OP_FNMADD: OP_FNMADD,
OP_FNMSUB: OP_FMADD,
}


@dataclass(frozen=True)
class FloatFormat:
name: str
m_bits: int
e_bits: int
bias: int
min_exp: int
max_exp: int

@property
def p(self) -> int:
return self.m_bits + 1

@classmethod
def from_name(cls, name: str) -> "FloatFormat":
min_e, max_e = UNBIASED_EXP[name]
return cls(name, MANTISSA_BITS[name], EXPONENT_BITS[name], BIAS[name], min_e, max_e)

def to_hex(self, sign: int, exp: int, mant: int) -> str:
return decimal_components_to_hex(self.name, sign, exp + self.bias, mant)

def get_exp(self, fp_hex: str) -> int:
bits = int(fp_hex, 16)
return ((bits >> self.m_bits) & ((1 << self.e_bits) - 1)) - self.bias


class B16Generator:
def __init__(self, fmt: str, test_f: TextIO, cover_f: TextIO) -> None:
self.f = FloatFormat.from_name(fmt)
self.test_f, self.cover_f = test_f, cover_f

def get_op_details(self, op: str, a: str, b: str, c: str) -> tuple[int, int]:
"""Helper to get actual product exp and final result exp."""
p_hex = get_result_from_ref(OP_MUL, a, b, "0", self.f.name)
r_hex = get_result_from_ref(op, a, b, c, self.f.name)
return self.f.get_exp(p_hex), self.f.get_exp(r_hex)

def store(self, op: str, a: str, b: str, c: str) -> None:
v = generate_test_vector(op, int(a, 16), int(b, 16), int(c, 16), self.f.name, self.f.name)
run_and_store_test_vector(v, self.test_f, self.cover_f)

def get_random_split(self, target_exp: int) -> tuple[int, int]:
"""Splits a target product exponent into two valid operand exponents."""
lo, hi = self.f.min_exp, self.f.max_exp
a_min, a_max = max(lo, target_exp - hi), min(hi, target_exp - lo)
a = random.randint(a_min, a_max)
return a, target_exp - a

def generate_same_exp(self, d: int, op: str) -> bool:
f, m = self.f, self.f.m_bits
a_s, b_s = random.randint(0, 1), random.randint(0, 1)
c_s = (a_s ^ b_s) if op in [OP_FMADD, OP_FNMADD] else (a_s ^ b_s) ^ 1
target_p_exp = random.randint(0, f.max_exp)
a_r = random.randint(0, target_p_exp)
b_r = target_p_exp - a_r
c_r = a_r + b_r + 3
a_m, b_m, c_m = random.getrandbits(m), random.getrandbits(m), random.getrandbits(m)
a_h = f.to_hex(a_s, a_r, a_m)
b_h = f.to_hex(b_s, b_r, b_m)
c_h = f.to_hex(c_s, c_r, c_m)
r_hex = get_result_from_ref(op, a_h, b_h, c_h, f.name)
r_exp = f.get_exp(r_hex)
if r_exp != c_r:
return False
self.store(op, a_h, b_h, c_h)
return True

def generate_shallow_cancel(self, d: int, op: str) -> bool:
f, m = self.f, self.f.m_bits
a_s, b_s = random.randint(0, 1), random.randint(0, 1)

# pick a safe product exponent away from underflow/overflow
target_p_exp = random.randint(f.min_exp + 10, f.max_exp - 10)
a_min = max(f.min_exp + 5, target_p_exp - (f.max_exp - 5))
a_max = min(f.max_exp - 5, target_p_exp - (f.min_exp + 5))
a_r = random.randint(a_min, a_max)
b_r = target_p_exp - a_r

# use non-extreme mantissas to avoid accidental exponent carry in a*b
a_m = random.randint(1 << (m - 2), (1 << m) - 1)
b_m = random.randint(1 << (m - 2), (1 << m) - 1)
a_h = f.to_hex(a_s, a_r, a_m)
b_h = f.to_hex(b_s, b_r, b_m)
p_exp = self.f.get_exp(get_result_from_ref(OP_MUL, a_h, b_h, "0", f.name))
res_raw = p_exp + d

# pick a mid-range result mantissa so rounding is less likely to change exponent
r_s = random.randint(0, 1)
res_m = random.randint(1 << (m - 2), (1 << (m - 1)) - 1)
res_h = f.to_hex(r_s, res_raw, res_m)
c_h = get_result_from_ref(SOLVER_OPS[op], a_h, b_h, res_h, f.name)
c_exp = f.get_exp(c_h)

# for shallow cancellation, c should be aligned with the product
if c_exp != p_exp:
return False
# Final validation
p_exp2, r_exp = self.get_op_details(op, a_h, b_h, c_h)
if (r_exp - max(p_exp2, c_exp)) == d:
self.store(op, a_h, b_h, c_h)
return True
return False

def generate_deep_cancel(self, d: int, op: str) -> bool:
f = self.f
a_s, b_s, r_s = random.randint(0, 1), random.randint(0, 1), random.randint(0, 1)
res_raw, res_m, a_m, b_m = f.min_exp - 1, 0, 0, 0
target_p_exp = res_raw - d
split = self.get_random_split(target_p_exp)
a_r, b_r = split
a_h, b_h = f.to_hex(a_s, a_r, a_m), f.to_hex(b_s, b_r, b_m)
p_exp = self.f.get_exp(get_result_from_ref(OP_MUL, a_h, b_h, "0", f.name))
res_h = f.to_hex(r_s, p_exp + d, res_m)
c_h = get_result_from_ref(SOLVER_OPS[op], a_h, b_h, res_h, f.name)
p_exp, r_exp = self.get_op_details(op, a_h, b_h, c_h)
c_exp = f.get_exp(c_h)
if (r_exp - max(p_exp, c_exp)) == d:
self.store(op, a_h, b_h, c_h)
return True
return False

def generate(self, d: int, op: str) -> bool:
f, m = self.f, self.f.m_bits
a_s, b_s, r_s = random.randint(0, 1), random.randint(0, 1), random.randint(0, 1)

if d <= -(2 * f.p - 1):
return self.generate_deep_cancel(d, op)
elif d in [-6, -5, -4, -3, -2, -1]:
return self.generate_shallow_cancel(d, op)
elif d == 0:
return self.generate_same_exp(d, op)
elif d == 1: # need result > operands
a_raw, b_raw = random.randint(0, f.max_exp // 2), random.randint(0, f.max_exp // 2)
c_s = 0 if op in [OP_FMADD, OP_FNMADD] else 1
a_h = f.to_hex(a_s, a_raw, random.getrandbits(m))
b_h = f.to_hex(a_s, b_raw, random.getrandbits(m))
c_h = f.to_hex(c_s, a_raw + b_raw + 1, (1 << m) - 1)
self.store(op, a_h, b_h, c_h)
return True
else:
valid_lo = max(f.min_exp, (f.min_exp - 1) - d)
valid_hi = min(f.max_exp, f.max_exp - d)
target_p_exp = random.randint(valid_lo, valid_hi)

# generate a and b
split = self.get_random_split(target_p_exp)
a_r, b_r = split

# special mantissas for deep cancellation
if d < -m:
target_depth = abs(d)
k = max(0, 2 * m - target_depth) // 2
a_m, b_m, res_m = 1 << k, 1 << (max(0, 2 * m - target_depth) - k), 0
else:
a_m, b_m, res_m = [random.getrandbits(m) for _ in range(3)]

a_h, b_h = f.to_hex(a_s, a_r, a_m), f.to_hex(b_s, b_r, b_m)

# solve for c
p_exp = self.f.get_exp(get_result_from_ref(OP_MUL, a_h, b_h, "0", f.name))
res_h = f.to_hex(r_s, p_exp + d, res_m)
c_h = get_result_from_ref(SOLVER_OPS[op], a_h, b_h, res_h, f.name)

p_exp, r_exp = self.get_op_details(op, a_h, b_h, c_h)
c_exp = f.get_exp(c_h)

if (r_exp - max(p_exp, c_exp)) == d:
self.store(op, a_h, b_h, c_h)
return True
return False


@register_model("B16")
def main(test_f: TextIO, cover_f: TextIO) -> None:
with (
Path("./tests/testvectors/B16_tv.txt").open("w") as tf,
Path("./tests/covervectors/B16_cv.txt").open("w") as cf,
):
for fmt_name in FLOAT_FMTS:
gen = B16Generator(fmt_name, tf, cf)
for d in range(-(2 * gen.f.p + 1), 2):
retries = 15
for op in OPS:
seed(reproducible_hash(f"{fmt_name}_b16_{d}_{op}"))
for _ in range(retries):
if gen.generate(d, op):
break
2 changes: 2 additions & 0 deletions src/cover_float/testgen/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
import cover_float.testgen.B13 as B13
import cover_float.testgen.B14 as B14
import cover_float.testgen.B15 as B15
import cover_float.testgen.B16 as B16
import cover_float.testgen.B20 as B20
import cover_float.testgen.B21 as B21
import cover_float.testgen.B25 as B25
Expand All @@ -48,6 +49,7 @@
"B13",
"B14",
"B15",
"B16",
"B20",
"B21",
"B25",
Expand Down