[FIX] Patch libdevice stubs for interpreter/sanitizer mode#305
Open
Libdevice stub functions (e.g. `tanh`) return None in interpreter mode because Triton's _patch_lang does not patch the libdevice module. This causes TypeError when the result is passed to tl.store. Replace libdevice stubs with numpy-backed implementations that route through interpreter_builder.unary_op, and register tanh as a symbolic unary operation so the sanitizer can track it.
Resolve conflicts: keep libdevice patching (HEAD) + client_manager param (main), and keep both libdevice and fake tensor test sections.
Sanitizer Performance Benchmark
Iterations: 1 warmup + 20 measured
…ore, and asin/acos support
Replace flat dict + monkey-patched restore with LibdeviceSpec registry, arity-dispatching factory, and single tagged stack in _LangPatchScope. Unsupported ops now raise NotImplementedError immediately instead of silently returning None. Extend symbolic engine with asin/acos ops and add UnarySymbolicExpr.concretize().
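A rough sketch of what a spec registry with an arity-dispatching factory could look like. The names here (`OpSpec`, `REGISTRY`, `make_libdevice_fn`) are hypothetical and are not the PR's `LibdeviceSpec` API; `math` functions on lists stand in for numpy, and the binary `pow` entry is invented purely to exercise the second dispatch branch.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class OpSpec:
    """Hypothetical spec: one record per supported op."""
    name: str
    arity: int
    fn: Callable[..., float]


REGISTRY = {
    spec.name: spec
    for spec in (
        OpSpec("tanh", 1, math.tanh),
        OpSpec("asin", 1, math.asin),
        OpSpec("acos", 1, math.acos),
        OpSpec("pow", 2, math.pow),  # invented binary op, for illustration only
    )
}


def make_libdevice_fn(name):
    """Build an element-wise wrapper, dispatching on the spec's arity."""
    try:
        spec = REGISTRY[name]
    except KeyError:
        # Fail loudly instead of returning None from a stub.
        raise NotImplementedError(f"libdevice op {name!r} is not supported") from None

    if spec.arity == 1:
        def unary(x):
            return [spec.fn(v) for v in x]
        return unary
    if spec.arity == 2:
        def binary(x, y):
            return [spec.fn(a, b) for a, b in zip(x, y)]
        return binary
    raise NotImplementedError(f"arity {spec.arity} is not supported")


asin_fn = make_libdevice_fn("asin")
pow_fn = make_libdevice_fn("pow")
```

The point of the design is that an unknown op fails at lookup time with a clear message, rather than producing a `None` that surfaces later as an unrelated error.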
…c for builder ops, add rsqrt
Address PR review feedback: _LIBDEVICE_REGISTRY is now the true single source of truth. UNARY_OPS, _NUMPY_OPS, and _UNARY_NUMPY_TO_SYM_OP in symbolic_engine.py are derived from the registry instead of being maintained as parallel hardcoded mappings. LibdeviceSpec gains an optional builder_method field so ops like rsqrt that use interpreter_builder methods directly (rather than numpy ufuncs) can be expressed in the registry. rsqrt is added as the first builder-backed op.
Tests: restore verification for libdevice patching, rsqrt E2E (sanitizer + numerical correctness), updated consistency and concretize tests.
Filter all registry-derived structures by spec.arity == 1 to prevent future arity>1 specs from being misclassified as unary ops. Replace NotImplementedError in _to_z3_impl() with a fresh opaque Z3 Int symbol so transcendental ops don't crash the sanitizer when their results flow into Z3-analyzed paths. Add tests for Z3 fallback, arity-conditional consistency, alias restore, and module-style unsupported op (libdevice.erf).
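The idea of deriving every lookup table from one registry, filtered by `arity == 1`, might look roughly like this. It is a sketch under assumptions: `OpSpec`, `HOST_FN_TO_SYM_OP`, and `BUILDER_BACKED` are hypothetical names, `math` functions stand in for numpy ufuncs, the `"create_rsqrt"` string is an assumed builder method name, and the binary `pow` entry is invented to show what the filter excludes.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class OpSpec:
    """Hypothetical spec record; exactly one backend field is expected to be set."""
    sym_op: str
    arity: int
    host_fn: Optional[Callable] = None     # numpy ufunc in the real code; math.* here
    builder_method: Optional[str] = None   # assumed name of a builder method


REGISTRY = (
    OpSpec("tanh", 1, host_fn=math.tanh),
    OpSpec("asin", 1, host_fn=math.asin),
    OpSpec("rsqrt", 1, builder_method="create_rsqrt"),
    OpSpec("pow", 2, host_fn=math.pow),    # invented binary op
)

# Everything below is derived, so adding an op means touching REGISTRY only.
UNARY_SPECS = tuple(s for s in REGISTRY if s.arity == 1)
UNARY_OPS = tuple(s.sym_op for s in UNARY_SPECS)
HOST_FN_TO_SYM_OP = {
    s.host_fn: s.sym_op for s in UNARY_SPECS if s.host_fn is not None
}
BUILDER_BACKED = {
    s.sym_op: s.builder_method for s in UNARY_SPECS if s.builder_method is not None
}
```

Without the arity filter, the invented `pow` entry would leak into `UNARY_OPS` and be treated as a unary op.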
… caches
- Extract LibdeviceSpec and _LIBDEVICE_REGISTRY to core/libdevice_registry.py to break the reverse dependency (symbolic_engine.py → patch.py).
- Wrap triton_patch_lang + _patch_libdevice in try/except so a failure in _patch_libdevice rolls back all Triton lang state via scope.restore().
- Add _Z3_RANGE_BOUNDS to UnarySymbolicExpr: tanh/sin/cos bounded to [-1,1], exp/sqrt/rsqrt bounded to >=0, etc. Prevents unconstrained opaque symbols from degrading sanitizer analysis when transcendental ops flow into pointer arithmetic.
- Pre-build _INTERPRETER_FNS and _REGISTERED_SPECS at module level, cache unsupported-fn wrappers with @cache, skip _patch_libdevice entirely when the kernel doesn't reference libdevice.
- Add tests: patch rollback regression, Z3 range constraint solver checks, addptr-through-bounded-unary, multiple-alias restore, unsupported alias.
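The rollback behaviour in the second bullet can be illustrated with a small stack-based scope. This is a sketch, not the PR's `_LangPatchScope`: `PatchScope`, `apply_patches`, and the string "patches" are hypothetical, and `types.SimpleNamespace` objects stand in for the Triton modules.

```python
import types


class PatchScope:
    """Hypothetical scope: every patch is pushed on one tagged stack."""

    def __init__(self):
        self._stack = []  # entries: (tag, obj, attr_name, original_value)

    def set_attr(self, tag, obj, name, value):
        self._stack.append((tag, obj, name, getattr(obj, name)))
        setattr(obj, name, value)

    def restore(self):
        # Undo in reverse order so stacked patches on one attribute unwind cleanly.
        while self._stack:
            _tag, obj, name, original = self._stack.pop()
            setattr(obj, name, original)


def apply_patches(scope, lang, libdevice, fail=False):
    """Patch lang first, then libdevice; roll everything back on any failure."""
    try:
        scope.set_attr("lang", lang, "load", "patched-load")
        if fail:
            raise RuntimeError("simulated failure while patching libdevice")
        scope.set_attr("libdevice", libdevice, "tanh", "patched-tanh")
    except Exception:
        scope.restore()
        raise


lang = types.SimpleNamespace(load="original-load")
libdevice = types.SimpleNamespace(tanh="original-tanh")
scope = PatchScope()

try:
    apply_patches(scope, lang, libdevice, fail=True)
except RuntimeError:
    pass
state_after_failure = (lang.load, libdevice.tanh)  # nothing left half-patched
```

Because both patch steps share one stack, a failure in the second step also unwinds the first, which is the property the rollback regression test is meant to protect.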
…ests
- Add erf to registry as numpy-backed op (dtype-preserving vectorized math.erf), since interpreter_builder.create_erf exists upstream.
- Remove blanket replacement of unregistered libdevice stubs with NotImplementedError: only patch registered ops. Unregistered stubs are left as-is, narrowing the patch surface to what we actually support.
- Remove _make_unsupported_libdevice_fn and the else branch in _patch_libdevice_aliases that replaced unregistered aliases.
- Add erf to e2e sanitizer op test, numerical correctness test, and unit concretize test.
- Add tests: unregistered alias left unchanged, same-module helper alias patched, default-arg capture NOT patched (documented limitation).
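The documented limitation in the last bullet follows from how Python binds names. The sketch below uses hypothetical names (`fake_libdevice`, `patch_aliases`, `kernel_global`, `kernel_default`) and scalar `math.erf` in place of the vectorized numpy version: an alias held in a module's globals can be found and rebound, while a default argument captured the stub when the function was defined.

```python
import math
import types

fake_libdevice = types.ModuleType("fake_libdevice")


def _make_stub():
    def erf(x):
        """Stub with no body: returns None."""
    return erf


fake_libdevice.erf = _make_stub()

# Alias in this module's globals, as `from <module> import erf` would create.
erf = fake_libdevice.erf


def kernel_global(x):
    return erf(x)          # looked up in globals at call time


def kernel_default(x, fn=erf):
    return fn(x)           # fn was bound to the stub at definition time


def patch_aliases(namespace, original, replacement):
    """Rebind every name in namespace that currently refers to original."""
    patched = []
    for name, value in list(namespace.items()):
        if value is original:
            namespace[name] = replacement
            patched.append(name)
    return patched


before_global = kernel_global(0.5)                      # None
patched_names = patch_aliases(
    kernel_global.__globals__, fake_libdevice.erf, math.erf
)
fake_libdevice.erf = math.erf

after_global = kernel_global(0.5)     # uses the rebound alias
after_default = kernel_default(0.5)   # still calls the captured stub
```

Reaching the captured default would mean rewriting `kernel_default.__defaults__`, which is a larger patch surface than rebinding module-level names.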
Summary
- Libdevice stub functions (e.g. `tanh`) return `None` in interpreter/sanitizer mode because Triton's `_patch_lang` does not patch the `triton.language.extra.libdevice` module, causing `TypeError: cannot convert None of type <class 'NoneType'> to tensor` when the result is passed to `tl.store`.
- Replace libdevice stubs with numpy-backed implementations that route through `interpreter_builder.unary_op`, handling both module attribute access (`libdevice.tanh(x)`) and direct imports (`from ... import tanh`).
- Register `tanh` as a symbolic unary operation (`UNARY_OPS` + `_UNARY_NUMPY_TO_SYM_OP`) so the sanitizer can track it.

Test plan
- `test_libdevice_tanh` end-to-end test that exercises the exact reproducer pattern (load → tanh → store)