feat(model): tree/tabular backend + SHAP TreeExplainer end-to-end (#246) #298
Merged
stanlrt merged 23 commits on Jun 9, 2026
Summary
Closes #246. Adds a tree/tabular model backend so SHAP `TreeExplainer` runs end-to-end. XGBoost is the MVP runtime; the architecture is a shared `TabularTreeBackend` base plus per-runtime subclasses, so sklearn (#295) and LightGBM (#296) land as ~15-line subclasses with no seam change. Additive only: new backend, new `tree` extra, a defaulted `ForwardOutput.output_kind` field, and a keyword-only `prediction_summaries` arg. No breaking changes.
Acceptance criteria → where
- `TreeExplainer` runs end-to-end on a real tree model via a documented backend → `test_tree_explainer_end_to_end_xgboost`; docs in bd3e90e
- `requires` reflects true capability (`TREE_MODEL`, not `AUTOGRAD` placeholder)
- `BackendIncompatibilityError` (not SHAP-internal) → `test_tree_explainer_gates_out_on_torch_backend`; error hint in 4e95f22
- `OutputKind` + softmax skip (c0050c2, 5d2adb3)
How it works
- `EstimatorProvider` protocol + `TREE_MODEL` shape-capability binding hands the raw fitted estimator to TreeSHAP through the backend-agnostic contract from #209, with no `nn.Module` facade (f2c1f4d).
- `TabularTreeBackend` (e07bfde) owns the torch→numpy→`predict_proba`→torch bridge, `fitted_estimator()`, and the CPU/classification defaults.
- `XGBoostBackend` (586c594) is a concrete subclass loading `.ubj`, registered with `provides={TREE_MODEL, PREDICT_PROBA}`.
- `ForwardOutput.output_kind` is stamped from `PREDICT_PROBA` (c0050c2); classification `prediction_summaries` skips softmax for probabilities (5d2adb3). This affects the confidence display only: metrics take the argmax regardless, so metric numbers are unchanged.
- `predict_callable` (inherited) exposes the probability bridge, so model-agnostic SHAP explainers (e.g. `KernelExplainer`) work on tree backends for free.
- `xgboost` extra (`xgboost>=2.0`), kept out of the `transparency` umbrella. CI test sync installs `--extra xgboost` so the acceptance tests run rather than silently skip.
Follow-ups (filed)
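For reviewers, the softmax skip described under "How it works" can be illustrated with a minimal, stdlib-only sketch. It is not the code from c0050c2 or 5d2adb3: the `OutputKind` member names, the `prediction_summary` helper and its return shape are assumptions made for illustration. Only the idea that outputs already tagged as probabilities bypass softmax comes from this PR.

```python
from __future__ import annotations

import math
from enum import Enum


class OutputKind(str, Enum):
    # Member names are hypothetical; the PR only says the field exists and is defaulted.
    LOGITS = "logits"
    PROBABILITIES = "probabilities"


def softmax(row: list[float]) -> list[float]:
    """Numerically stable softmax over one row of scores."""
    shift = max(row)
    exps = [math.exp(x - shift) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_summary(
    row: list[float], *, output_kind: OutputKind = OutputKind.LOGITS
) -> dict[str, float | int]:
    """Hypothetical per-sample summary: predicted label plus displayed confidence."""
    # Probabilities (e.g. from predict_proba) are used as-is; logits are normalised.
    if output_kind is OutputKind.PROBABILITIES:
        probs = list(row)
    else:
        probs = softmax(row)
    label = max(range(len(probs)), key=probs.__getitem__)
    return {"label": label, "confidence": probs[label]}


proba_row = [0.9, 0.1]  # what a tree backend's predict_proba might return
correct = prediction_summary(proba_row, output_kind=OutputKind.PROBABILITIES)
squashed = prediction_summary(proba_row, output_kind=OutputKind.LOGITS)
# Same label either way; only the displayed confidence differs (0.9 vs about 0.69).
```

Because softmax is monotonic, the argmax is the same in both calls, which is consistent with the statement that metric numbers are unchanged and only the confidence display is affected.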
Checklist
`!` after the type or scope (e.g. `feat!:` or `feat(transparency)!:`). I understand that this change will have a direct, disruptive impact on package users. I ensured that this change is absolutely necessary and cannot be delayed.
Optional
🤖 Generated with Claude Code
Follow-on: registry-driven deps inference (rode along)
Fixing the `.ubj` → `xgboost` deps mapping showed that the extension→extra logic in `deps/inference.py` was a hardcoded `if`/`elif` chain duplicating the backend registry. Refactored so backends declare `extra=`/`supported_hardware=` on `@register`, and `deps` auto-derives the install hint via an import-free AST scan (mirrors `scan_adapter_extras`). Also centralised the resolved-hardware vocabulary into a `ResolvedHardware` `StrEnum` (`raitap/types.py`) holding all three axes (resolved name ↔ pyproject extra suffix ↔ config bucket), deleting the scattered `_HARDWARE_SUFFIX` dict and the duplicated `Hardware = Literal[...]`. Separable commit (`refactor(deps): …`) if the maintainer prefers it split out.
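As a rough illustration of the import-free AST scan, the sketch below reads `@register(...)` keyword arguments from source text without importing the backend module. It is not the implementation in `deps/inference.py`: the `extensions=` keyword, the `.pt`/`torch` entry, and the `scan_register_extras` and `infer_extra` helpers are hypothetical. Only `extra=`, `supported_hardware=` and `@register` are named in this PR.

```python
from __future__ import annotations

import ast
import os

# Hypothetical backend source. It is parsed, never executed, so the heavy
# runtime (xgboost, torch) does not need to be installed for the scan.
BACKEND_SOURCE = '''
@register(extensions=(".ubj",), extra="xgboost", supported_hardware=("cpu",))
class XGBoostBackend(TabularTreeBackend):
    pass

@register(extensions=(".pt",), extra="torch", supported_hardware=("cpu", "cuda"))
class TorchBackend:
    pass

class Helper:
    pass
'''


def scan_register_extras(source: str) -> dict[str, dict]:
    """Map file extension -> metadata declared on @register, via the AST only."""
    table: dict[str, dict] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for deco in node.decorator_list:
            # Only the call form `@register(...)` with a bare name is handled.
            is_register = (
                isinstance(deco, ast.Call)
                and isinstance(deco.func, ast.Name)
                and deco.func.id == "register"
            )
            if not is_register:
                continue
            try:
                # literal_eval accepts AST nodes and rejects non-literal values.
                kwargs = {
                    kw.arg: ast.literal_eval(kw.value)
                    for kw in deco.keywords
                    if kw.arg is not None
                }
            except ValueError:
                continue  # skip decorators whose arguments are computed at runtime
            for ext in kwargs.get("extensions", ()):
                table[ext] = {
                    "backend": node.name,
                    "extra": kwargs.get("extra"),
                    "supported_hardware": kwargs.get("supported_hardware", ()),
                }
    return table


def infer_extra(model_path: str, source: str = BACKEND_SOURCE) -> str | None:
    """Return the extra declared for the file's extension, or None if unknown."""
    ext = os.path.splitext(model_path)[1]
    entry = scan_register_extras(source).get(ext)
    return entry["extra"] if entry else None
```

With this shape, adding a backend means adding one decorated class; the extension→extra table follows from the declarations instead of from a separate `if`/`elif` chain.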