Fix dtype_byte_size for FP8 fnuz / e8m0fnu dtypes #4063

Open
lollinng wants to merge 1 commit into huggingface:main from lollinng:accelerate-fp8-dtype-byte-size

lollinng commented Jun 4, 2026

Problem

dtype_byte_size only special-cases torch.float8_e4m3fn and torch.float8_e5m2. The other 1-byte FP8 dtypes — float8_e4m3fnuz, float8_e5m2fnuz, and float8_e8m0fnu — fall through to:

bit_search = re.search(r"[^\d](\d+)$", str(dtype))

Their string forms end in ...fnuz / ...fnu (no trailing digit), so bit_search is None and the function raises ValueError: dtype is not a valid dtype. This breaks anything that sizes such tensors — compute_module_sizes, infer_auto_device_map, estimate-memory, etc. — for models stored in those dtypes.
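The failure can be reproduced without torch by running the same pattern over the dtype names. This is a minimal sketch, not the library code: `regex_byte_size` is a hypothetical helper that mimics only the regex fallback, returns `None` where the real function raises, and assumes the matched bit count is converted to bytes by integer division by 8.

```python
import re

# Same pattern as quoted above: one non-digit followed by trailing digits.
BIT_PATTERN = re.compile(r"[^\d](\d+)$")


def regex_byte_size(dtype_name: str):
    """Hypothetical stand-in for the regex fallback only.

    Returns None where the real function would raise ValueError.
    Assumes bits are converted to bytes with integer division by 8.
    """
    match = BIT_PATTERN.search(dtype_name)
    if match is None:
        return None
    return int(match.group(1)) // 8


# Strings as produced by str(dtype) for the corresponding torch dtypes.
names = [
    "torch.float32",
    "torch.bfloat16",
    "torch.float8_e4m3fnuz",
    "torch.float8_e5m2fnuz",
    "torch.float8_e8m0fnu",
]

for name in names:
    print(f"{name:24s} -> {regex_byte_size(name)}")
```

Under the same assumption, `torch.float8_e5m2` does end in a digit, so the fallback alone would read it as 2 bits and return 0 rather than raise, which is another reason the FP8 dtypes need explicit handling.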

Fix

Handle all available 1-byte FP8 dtypes, guarded with hasattr so it stays correct on older torch where some variants don't exist.
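The `hasattr` guard can be sketched as follows. This is an illustration of the idea rather than the actual diff: `FP8_DTYPE_NAMES`, `available_fp8_dtypes`, and `is_one_byte_fp8` are hypothetical names, and a `SimpleNamespace` stands in for an older torch build so the example runs without torch installed.

```python
from types import SimpleNamespace

# Dtype names taken from the PR description; each is a 1-byte FP8 format.
FP8_DTYPE_NAMES = (
    "float8_e4m3fn",
    "float8_e5m2",
    "float8_e4m3fnuz",
    "float8_e5m2fnuz",
    "float8_e8m0fnu",
)


def available_fp8_dtypes(torch_module):
    """Collect only the FP8 dtypes that this torch build defines."""
    return {
        getattr(torch_module, name)
        for name in FP8_DTYPE_NAMES
        if hasattr(torch_module, name)
    }


def is_one_byte_fp8(dtype, torch_module):
    """True if dtype is one of the FP8 dtypes present in torch_module."""
    return dtype in available_fp8_dtypes(torch_module)


# Stand-in for an older torch that lacks the fnuz / e8m0fnu variants.
old_torch = SimpleNamespace(float8_e4m3fn="e4m3fn", float8_e5m2="e5m2")

print(sorted(available_fp8_dtypes(old_torch)))
print(is_one_byte_fp8("e4m3fn", old_torch))
```

With a real installation, passing the imported `torch` module in place of `old_torch` would pick up whichever variants that version exposes, and a missing attribute is skipped instead of raising `AttributeError`.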

Testing done (CPU)

Added tests/test_modeling_utils.py::ModelingUtilsTester::test_dtype_byte_size. With torch 2.12:

float8_e4m3fn      after-fix=1   before(regex)=RAISED
float8_e5m2        after-fix=1
float8_e4m3fnuz    after-fix=1   before(regex)=RAISED
float8_e5m2fnuz    after-fix=1   before(regex)=RAISED
float8_e8m0fnu     after-fix=1   before(regex)=RAISED

Standard dtypes unchanged (float32=4, float16/bfloat16=2, int8=1, float64=8, bool=1/8). ruff check / ruff format --check clean.

tests/test_modeling_utils.py::ModelingUtilsTester::test_dtype_byte_size PASSED

dtype_byte_size only special-cased torch.float8_e4m3fn and float8_e5m2.
The other 1-byte FP8 variants (float8_e4m3fnuz, float8_e5m2fnuz,
float8_e8m0fnu) fell through to the regex r'[^\d](\d+)$', which finds no
trailing digit and raised ValueError. This broke compute_module_sizes /
device-map inference / estimate-memory for models using those dtypes.

Handle all available 1-byte FP8 dtypes (guarded with hasattr for older
torch). Adds a regression test.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>