Improve feature_selection: richer return type + verbose progress by adrian-prior · Pull Request #293 · PriorLabs/tabpfn-extensions

adrian-prior · 2026-05-13T15:14:50Z

This PR improves the usability of the feature selection extensions by:

- changing the return interface
- adding verbosity so users can see what's going on (in particular, we already computed CV scores before and after selection, but never logged or saved them)
- adding notes that feature selection is often only marginally useful for TabPFN
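To make the "before and after" CV scores concrete, here is a minimal sketch using plain scikit-learn. It is not the code from this PR: `LogisticRegression` and the iris dataset stand in for TabPFN and real data, and the variable names are illustrative assumptions. The only point it shows is that the baseline and post-selection scores are computed with the same `cv` and `scoring` that the selector uses.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Stand-in data and estimator (assumptions; the PR targets TabPFN models).
data = load_iris()
X, y = data.data, data.target
feature_names = list(data.feature_names)
estimator = LogisticRegression(max_iter=1000)

# Use one cv / scoring pair everywhere so the scores are comparable.
cv, scoring = 3, "accuracy"

# Baseline: CV score with all features, computed before selection runs.
baseline_score = cross_val_score(estimator, X, y, cv=cv, scoring=scoring).mean()
print(f"baseline CV score ({scoring}, cv={cv}): {baseline_score:.3f}")

# Forward sequential feature selection down to 2 features.
sfs = SequentialFeatureSelector(
    estimator,
    n_features_to_select=2,
    direction="forward",
    scoring=scoring,
    cv=cv,
)
sfs.fit(X, y)

# Map the boolean support mask back to feature names.
support_mask = sfs.get_support()
selected_names = [name for name, keep in zip(feature_names, support_mask) if keep]

# Post-selection: CV score on the selected subset only.
selected_score = cross_val_score(
    estimator, X[:, support_mask], y, cv=cv, scoring=scoring
).mean()
print(f"selected {selected_names}, CV score: {selected_score:.3f}")
```

Logging both numbers side by side is what lets a user judge whether dropping features cost any accuracy.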

…bose + KV-cache caveat

- Rewrite `feature_selection` around a `FeatureSelectionResult` dataclass. The wrapper now returns the fitted SFS plus the support mask, selected indices/names, and the pre/post CV scores it already had to compute. This collapses the typical caller-side mask -> names dance into one attribute access. Backward-incompatible: callers using `sfs.get_support()` directly need to switch to `result.support_mask` (or `result.selector.get_support()`).
- Make `n_features_to_select` a required positional argument; there's no sensible default.
- Expose the SFS knobs we were swallowing: `cv`, `scoring`, `direction`, `n_jobs`, `tol`. All keyword-only after the `*`. `cv=5` is the pre-existing hardcoded value, just configurable now.
- Always compute baseline (all-features) and selected (subset) CV scores using the same `cv` / `scoring` as SFS, and surface them on the result.
- Add `verbose: bool = True`. When set:
  - print a config header (direction, cv, scoring, k)
  - print the baseline CV score before SFS runs
  - print per-round picks ("round i/k: picked feature 'x', cv = ...") via a `_VerboseSFS` subclass that overrides the private `_get_best_new_feature_score` method (sklearn doesn't expose a `verbose` parameter or callback hook on SFS itself, so this is the cleanest workaround; documented in a class docstring with the private-API dependency caveat)
  - print the selected names + final CV score

  `verbose=False` keeps everything silent; the scores are still available on the returned `FeatureSelectionResult`.
- Add a docstring note: TabPFN is very robust to noisy features in its in-context-learning regime, so the accuracy gain from running SFS is often marginal. The value is more in interpretability / parsimony / faster predict-time. Verified by a quick noise-trajectory benchmark (n_features 3 -> 13, CV score stays in 0.92–0.94 throughout). Mention SHAP as the alternative interpretability route since it can use the KV cache and is generally much faster.
- Re-export `FeatureSelectionResult` from `tabpfn_extensions.interpretability` so callers can type-annotate without reaching into the submodule.
- Update `examples/interpretability/feature_selection.py` to use the new return shape, and align params with the public TabPFN demo notebook (`n_estimators=1`, `n_features_to_select=4`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
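As a rough illustration of the result shape described in the commit message, the sketch below shows how such a dataclass could bundle the mask, indices, names, and scores. Only `selector` and `support_mask` are attribute names taken from the message; the other field names and the `make_result` helper are hypothetical, and the actual definition in the PR may differ.

```python
from dataclasses import dataclass
from typing import Any, List, Sequence


@dataclass(frozen=True)
class FeatureSelectionResult:
    """Illustrative container only; not the definition from the PR."""

    selector: Any                # the fitted selector (name from the commit message)
    support_mask: List[bool]     # True where a feature was kept (name from the commit message)
    selected_indices: List[int]  # hypothetical field name
    selected_names: List[str]    # hypothetical field name
    baseline_score: float        # hypothetical: CV score with all features
    selected_score: float        # hypothetical: CV score with the selected subset


def make_result(
    selector: Any,
    support_mask: Sequence[bool],
    feature_names: Sequence[str],
    baseline_score: float,
    selected_score: float,
) -> FeatureSelectionResult:
    """Hypothetical helper: derive indices and names from the mask once."""
    mask = [bool(keep) for keep in support_mask]
    indices = [i for i, keep in enumerate(mask) if keep]
    names = [feature_names[i] for i in indices]
    return FeatureSelectionResult(
        selector=selector,
        support_mask=mask,
        selected_indices=indices,
        selected_names=names,
        baseline_score=baseline_score,
        selected_score=selected_score,
    )
```

With a shape like this, a caller reads the selected names from a single attribute instead of mapping the mask to names by hand, which is the convenience the commit message describes.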


gemini-code-assist

Code Review

This pull request enhances the feature selection module by introducing a "FeatureSelectionResult" dataclass and a verbose mode for tracking selection progress. The "feature_selection" utility was refactored to return detailed results, including baseline and post-selection cross-validation scores, and now supports additional parameters such as "cv", "scoring", and "n_jobs". The example script was updated to reflect these changes. I have no feedback to provide.

adrian-prior and others added 2 commits May 13, 2026 17:14

Merge remote-tracking branch 'origin/main' into adrian/improve-featur…

b21c17c

…e-selection # Conflicts: # src/tabpfn_extensions/interpretability/__init__.py

adrian-prior marked this pull request as ready for review May 13, 2026 15:19

adrian-prior requested a review from a team as a code owner May 13, 2026 15:19

adrian-prior requested review from priorjulien and removed request for a team May 13, 2026 15:19

gemini-code-assist Bot reviewed May 13, 2026


adrian-prior changed the title ~~Improve feature_selection: richer return type + verbose progress + KV-cache caveat~~ Improve feature_selection: richer return type + verbose progress May 13, 2026

adrian-prior wants to merge 2 commits into main from adrian/improve-feature-selection
