fix: add EXIT_CODE.INTERRUPTED to resolve AttributeError on SIGTERM (fixes #392, #393) #400

FileSystemGuy merged 1 commit into main from fix/exit-code-interrupted on Jun 2, 2026

@russfellows

Summary

When dlio_benchmark finishes (successfully or with an error), OpenMPI sends
SIGTERM to the parent process group as part of its normal cleanup. The
mlpstorage signal handler catches this and calls sys.exit(EXIT_CODE.INTERRUPTED).
That call crashed with:

AttributeError: type object 'EXIT_CODE' has no attribute 'INTERRUPTED'

because the INTERRUPTED member was missing from the EXIT_CODE enum in
mlpstorage_py/config.py. The enum had SUCCESS through TIMEOUT (0–7) and a
# Add more as needed comment where INTERRUPTED should have been.

Root Cause

main.py references EXIT_CODE.INTERRUPTED in two places (lines 63 and 319, in the
signal handler), but the enum member was never defined. Any benchmark invocation
that reaches the MPI cleanup phase follows the SIGTERM → signal handler →
AttributeError crash path, so the run is reported as a general failure rather
than a clean interrupted exit.
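
The crash path can be reproduced in isolation. The following is a minimal sketch, not the project's code: the enum is a stand-in that keeps only the SUCCESS and TIMEOUT members named above (the members between them are omitted because they are not listed in this description), IntEnum as the base class is an assumption, and handle_signal and reproduce_crash are hypothetical names. The handler is called directly rather than through a real signal delivery so the outcome is easy to observe.

```python
import enum
import signal
import sys


class EXIT_CODE(enum.IntEnum):
    """Stand-in for the pre-fix enum; members 1-6 are omitted in this sketch."""
    SUCCESS = 0
    TIMEOUT = 7
    # Add more as needed   (INTERRUPTED was never defined)


def handle_signal(signum, frame):
    """Hypothetical handler: meant to exit cleanly when the process is interrupted."""
    # The attribute lookup fails before sys.exit() is ever called.
    sys.exit(EXIT_CODE.INTERRUPTED)


def reproduce_crash() -> str:
    """Call the handler as a SIGTERM delivery would and name the exception raised."""
    try:
        handle_signal(signal.SIGTERM, None)
    except AttributeError:
        return "AttributeError"
    except SystemExit:
        return "SystemExit"
    return "no exception"


if __name__ == "__main__":
    print(reproduce_crash())
```

Because the lookup of the missing member raises before sys.exit() runs, the process never gets the chance to exit with an interrupted status.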

Fix

mlpstorage_py/config.py — add INTERRUPTED = 8 to the EXIT_CODE enum:

-    # Add more as needed
+    INTERRUPTED = 8
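
With the member defined, the same handler leaves through SystemExit instead of crashing. This is a sketch under the same assumptions rather than the project's code: IntEnum as the base class, only the SUCCESS and TIMEOUT members reproduced alongside the new one, and handle_signal and exit_status_on_sigterm as hypothetical names.

```python
import enum
import signal
import sys


class EXIT_CODE(enum.IntEnum):
    """Stand-in for the post-fix enum; members 1-6 are omitted in this sketch."""
    SUCCESS = 0
    TIMEOUT = 7
    INTERRUPTED = 8  # the member added by this change


def handle_signal(signum, frame):
    """Hypothetical handler: exit with the interrupted status."""
    sys.exit(EXIT_CODE.INTERRUPTED)


def exit_status_on_sigterm() -> int:
    """Call the handler as a SIGTERM delivery would and return the exit status."""
    try:
        handle_signal(signal.SIGTERM, None)
    except SystemExit as exc:
        # An IntEnum member is an int, so it converts to the process exit status.
        return int(exc.code)
    raise RuntimeError("handler returned without exiting")


if __name__ == "__main__":
    # In a real program the handler would be registered once at startup:
    signal.signal(signal.SIGTERM, handle_signal)
    print(exit_status_on_sigterm())
```

If the real enum derives from plain Enum rather than IntEnum, sys.exit() would not translate the member into a numeric status on its own, so the base class is worth checking in config.py.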

Issues Fixed

  • #392
  • #393

Testing

  • 1027 unit tests pass (uv run python -m pytest tests/unit/ -q)
  • Manually verified: EXIT_CODE.INTERRUPTED == 8 and str(EXIT_CODE.INTERRUPTED) == "INTERRUPTED (8)"
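
The manual check could also be captured as an automated guard. The sketch below is a suggestion, not part of this change: the __str__ implementation is an assumption chosen to match the string form reported above, the enum is a reduced stand-in, and undefined_exit_codes is a hypothetical helper that scans source text for EXIT_CODE references with no matching member.

```python
import enum
import re


class EXIT_CODE(enum.IntEnum):
    """Reduced stand-in for the enum in mlpstorage_py/config.py."""
    SUCCESS = 0
    TIMEOUT = 7
    INTERRUPTED = 8

    def __str__(self) -> str:
        # Assumed formatting, chosen to match "INTERRUPTED (8)".
        return f"{self.name} ({self.value})"


def undefined_exit_codes(source: str) -> set:
    """Return EXIT_CODE member names referenced in `source` but not defined."""
    referenced = set(re.findall(r"\bEXIT_CODE\.([A-Z_][A-Z0-9_]*)\b", source))
    return referenced - set(EXIT_CODE.__members__)


if __name__ == "__main__":
    # Hypothetical snippet standing in for the contents of main.py.
    snippet = "sys.exit(EXIT_CODE.INTERRUPTED)"
    print(undefined_exit_codes(snippet))
```

Run against the real main.py text, a check of this kind would have flagged the missing member before any signal was delivered.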

Files Changed

File | Change
mlpstorage_py/config.py | Added INTERRUPTED = 8 to EXIT_CODE enum

When dlio_benchmark exits, OpenMPI sends SIGTERM to the parent process group.
The mlpstorage signal handler calls sys.exit(EXIT_CODE.INTERRUPTED), which
crashed with AttributeError because INTERRUPTED was missing from the EXIT_CODE
enum in config.py.

Add INTERRUPTED = 8 to the enum.

Fixes #392
Fixes #393
@russfellows russfellows requested a review from a team May 31, 2026 23:49
@github-actions

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@russfellows

russfellows commented Jun 1, 2026 via email

@FileSystemGuy FileSystemGuy merged commit 5c743ff into main Jun 2, 2026
2 checks passed
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 2, 2026
@russfellows

russfellows commented Jun 2, 2026 via email

@russfellows

russfellows commented Jun 2, 2026 via email

@russfellows

russfellows commented Jun 2, 2026 via email

Development

Successfully merging this pull request may close these issues.

  • Datagen to S3 fails using the MinIO SDK
  • Datagen to S3 fails using s3dlio