feat(model): tree/tabular backend + SHAP TreeExplainer end-to-end (#246)#298

Merged
stanlrt merged 23 commits into
mainfrom
246-add-treetabular-model-backend-enable-shap-treeexplainer-end-to-end
Jun 9, 2026

@stanlrt stanlrt commented Jun 8, 2026

Collaborator

Summary

Closes #246. Adds a tree/tabular model backend so that SHAP TreeExplainer runs end-to-end. XGBoost is the MVP runtime. The architecture is a shared TabularTreeBackend base plus per-runtime subclasses, so sklearn (#295) and LightGBM (#296) can land as ~15-line subclasses with no seam change. The change is additive only: a new backend, a new xgboost extra, a defaulted ForwardOutput.output_kind field, and a keyword-only prediction_summaries arg. No breaking changes.

Acceptance criteria → where

| Criterion | Where |
| --- | --- |
| TreeExplainer runs end-to-end on a real tree model via a documented backend | test_tree_explainer_end_to_end_xgboost; docs in bd3e90e |
| requires reflects true capability (TREE_MODEL, not AUTOGRAD placeholder) | 4e95f22 |
| Non-tree backend → clear BackendIncompatibilityError (not SHAP-internal) | test_tree_explainer_gates_out_on_torch_backend; error hint in 4e95f22 |
| Predictions/metrics behave sensibly for tree outputs | OutputKind + softmax skip (c0050c2, 5d2adb3) |
| Docs reflect supported methods + new backend | bd3e90e |

How it works

  • EstimatorProvider protocol + TREE_MODEL shape-capability binding hands the raw fitted estimator to TreeSHAP through the backend-agnostic contract from #209 ("Make the model-module explanation contract backend-agnostic (drop the nn.Module bridge)"), with no nn.Module facade (f2c1f4d).
  • TabularTreeBackend (e07bfde) owns the torch→numpy→predict_proba→torch bridge, fitted_estimator(), and the CPU/classification defaults. XGBoostBackend (586c594) is a concrete subclass that loads .ubj files and is registered with provides={TREE_MODEL, PREDICT_PROBA}.
  • Tree backends emit probabilities, not logits. ForwardOutput.output_kind is stamped from PREDICT_PROBA (c0050c2); classification prediction_summaries skips softmax for probabilities (5d2adb3). This affects the confidence display only: metrics take the argmax either way, so metric numbers are unchanged.
  • predict_callable (inherited) exposes the probability bridge, so model-agnostic SHAP explainers (e.g. KernelExplainer) work on tree backends for free.
  • New xgboost extra (xgboost>=2.0), kept out of the transparency umbrella. CI test sync installs --extra xgboost so the acceptance tests run rather than silently skip.
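
The softmax skip described above can be illustrated with a minimal, stdlib-only sketch. It is not the code from c0050c2 or 5d2adb3: the OutputKind member names, the simplified ForwardOutput, and the prediction_summaries signature below are assumptions made for illustration, and the real implementation works on tensors rather than nested lists.

```python
import math
from dataclasses import dataclass
from enum import Enum


class OutputKind(Enum):
    # Hypothetical member names; the PR only names the `output_kind` field.
    LOGITS = "logits"
    PROBABILITIES = "probabilities"


@dataclass
class ForwardOutput:
    # Simplified stand-in: one row of class scores per sample.
    values: list[list[float]]
    # Defaulted, so callers that never set it keep the logits behaviour.
    output_kind: OutputKind = OutputKind.LOGITS


def softmax(row: list[float]) -> list[float]:
    shift = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def prediction_summaries(output: ForwardOutput) -> list[tuple[int, float]]:
    """Return (predicted_class, confidence) for each sample."""
    summaries = []
    for row in output.values:
        # Probabilities are already normalised; a second softmax would
        # flatten them and understate the displayed confidence.
        if output.output_kind is OutputKind.PROBABILITIES:
            probs = row
        else:
            probs = softmax(row)
        predicted = max(range(len(probs)), key=probs.__getitem__)
        summaries.append((predicted, probs[predicted]))
    return summaries


tree_out = ForwardOutput([[0.9, 0.1]], OutputKind.PROBABILITIES)
print(prediction_summaries(tree_out))  # [(0, 0.9)]
```

In this sketch, treating the same row as logits would still predict class 0, but the reported confidence would drop to roughly 0.69. That matches the note above: only the confidence display changes, while argmax-based metrics stay the same.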

Follow-ups (filed)

Checklist

  • CI — Required checks are green
  • Breaking changes — If this PR breaks compatibility (API, configs, file formats, etc.), the title uses ! after the type or scope (e.g. feat!: or feat(transparency)!:). I understand that this change will have a direct, disruptive impact on package users. I ensured that this change is absolutely necessary and cannot be delayed.
  • Contributor guide — I’ve read Pull requests and commit messages before requesting review.

Optional

  • Issue (optional) — A GitHub issue is linked to this PR ("Development" section in the right sidebar).
  • Docs (optional) — User or contributor docs updated where needed.
  • Tests (optional) — New or updated tests cover the change.

🤖 Generated with Claude Code


Follow-on: registry-driven deps inference (rode along)

While fixing the .ubj → xgboost deps mapping, I found that the extension→extra logic in deps/inference.py was a hardcoded if/elif chain duplicating the backend registry. It is refactored so that backends declare extra=/supported_hardware= on @register, and deps auto-derives the install hint via an import-free AST scan (mirrors scan_adapter_extras). The resolved-hardware vocabulary is also centralised into a ResolvedHardware StrEnum (raitap/types.py) holding all three axes (resolved name ↔ pyproject extra suffix ↔ config bucket), which deletes the scattered _HARDWARE_SUFFIX dict and the duplicated Hardware = Literal[...]. It is a separable commit (refactor(deps): …) if the maintainer prefers it split out.
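
As a rough illustration of the import-free scan, the sketch below parses a backend module's source with the standard ast module and reads the keyword arguments of an @register decorator without executing any backend code. Only extra= and supported_hardware= come from the description above; the extensions= keyword, the sample source, and the scan_backend_extras name are hypothetical and may not match deps/inference.py.

```python
import ast

# Hypothetical backend source. `extensions=` is an assumed keyword;
# `extra=` and `supported_hardware=` follow the description above.
BACKEND_SOURCE = '''
@register(extensions=(".ubj",), extra="xgboost", supported_hardware=("cpu",))
class XGBoostBackend:
    pass
'''


def scan_backend_extras(source: str) -> dict[str, str]:
    """Map file extension -> pyproject extra without importing the module."""
    mapping: dict[str, str] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for deco in node.decorator_list:
            # Only consider decorators of the form @register(...).
            is_register_call = (
                isinstance(deco, ast.Call)
                and isinstance(deco.func, ast.Name)
                and deco.func.id == "register"
            )
            if not is_register_call:
                continue
            # literal_eval accepts literals only, so nothing is executed
            # and heavy runtimes such as xgboost are never imported.
            kwargs = {
                kw.arg: ast.literal_eval(kw.value)
                for kw in deco.keywords
                if kw.arg is not None
            }
            extra = kwargs.get("extra")
            if extra is None:
                continue
            for ext in kwargs.get("extensions", ()):
                mapping[ext] = extra
    return mapping


print(scan_backend_extras(BACKEND_SOURCE))  # {'.ubj': 'xgboost'}
```

Reading the declaration from the syntax tree keeps the install hint available even when the optional runtime is missing, which is exactly the case where the hint is needed.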

@stanlrt stanlrt linked an issue Jun 8, 2026 that may be closed by this pull request
Comment thread src/raitap/models/access.py
Comment thread src/raitap/models/model.py Fixed
Comment thread src/raitap/models/model.py Fixed
Comment thread src/raitap/models/model.py
@stanlrt stanlrt changed the title feat(model): tree/tabular backend + SHAP TreeExplainer end-to-end (closes #246) feat(model): tree/tabular backend + SHAP TreeExplainer end-to-end (#246) Jun 9, 2026
@stanlrt stanlrt merged commit 5081914 into main Jun 9, 2026
18 checks passed
@stanlrt stanlrt deleted the 246-add-treetabular-model-backend-enable-shap-treeexplainer-end-to-end branch June 9, 2026 19:58
@github-actions github-actions Bot mentioned this pull request Jun 9, 2026

Development

Successfully merging this pull request may close these issues.

Add tree/tabular model backend + enable SHAP TreeExplainer end-to-end
