Release 0.4.1: fix sklearn 1.8 ImportError #286

Merged
adrian-prior merged 3 commits into main from adrian/release-0.4.1
May 11, 2026

@adrian-prior
Collaborator

Hotfix on top of #285 (v0.4.0). The bug was caught by a TestPyPI smoke install before 0.4.0 was published to real PyPI. v0.4.0 was never published there, so this release goes straight to v0.4.1.

What broke

scikit-learn 1.8.0 (recently released, and the version a fresh install currently resolves to) renamed sklearn.utils.validation._is_pandas_df to is_pandas_df (no leading underscore). src/tabpfn_extensions/misc/sklearn_compat.py imports the old name at module top level, so any from tabpfn_extensions import ... on sklearn ≥ 1.8 immediately raises ImportError. That breaks the entire public API for anyone resolving sklearn 1.8, which is the default on a clean pip install tabpfn-extensions today.

Fix

sklearn_compat.py now tries _is_pandas_df, falls back to is_pandas_df, and finally to the same pure-Python implementation already used in the older-sklearn branch. Three-line defensive shim.

Verification

Reproduced the original bug with a clean venv: pip install -i https://test.pypi.org/simple/ tabpfn-extensions==0.4.0 resolves sklearn 1.8 and crashes on import. After this patch (installed locally) the same import chain succeeds on sklearn 1.8.

Smoke results post-patch:

| Check | Result |
| --- | --- |
| from tabpfn_extensions import * on sklearn 1.8 | |
| warn_if_no_kv_cache silent on non-TabPFN estimator | |
| SurvivalTabPFN import | ❌ expected (GPL opt-in, scikit-survival not installed) |
| TabEBM import | ❌ expected (pre-existing RES-1541) |
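
An import smoke check of this kind could be scripted along the following lines. This is only a sketch: smoke_import is a hypothetical helper, not part of tabpfn-extensions, and standard-library module names stand in for the real import targets.

```python
import importlib


def smoke_import(module_names):
    """Try importing each module; map name -> (ok, detail)."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # record any failure, not just ImportError
            results[name] = (False, f"{type(exc).__name__}: {exc}")
        else:
            results[name] = (True, "")
    return results


if __name__ == "__main__":
    # Placeholder targets; a real run would list the package's modules.
    report = smoke_import(["json", "no_such_module_xyz"])
    for name, (ok, detail) in report.items():
        print(f"{name}: {'ok' if ok else 'FAILED ' + detail}")
```

Catching the broad Exception is deliberate here, since a top-level import can fail for reasons other than a missing module.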

Release flow after merge

  1. git tag v0.4.1 <merge-commit> && git push origin v0.4.1
  2. TestPyPI dry-run: python -m build && twine upload --repository testpypi dist/* → install in clean venv to re-verify.
  3. Real PyPI: twine upload dist/*.
  4. GitHub Release for the tag — body = the new [0.4.1] section of CHANGELOG.md.
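
For step 4, the release body could be pulled out of CHANGELOG.md with a small helper like the one below. It is a sketch that assumes Keep a Changelog style headings of the form "## [0.4.1]"; the function name changelog_section is hypothetical and the actual CHANGELOG.md layout is not shown in this PR.

```python
import re


def changelog_section(changelog_text, version):
    """Return the body under the '## [<version>]' heading, or None."""
    heading = re.compile(
        r"^## \[" + re.escape(version) + r"\].*$", re.MULTILINE
    )
    match = heading.search(changelog_text)
    if match is None:
        return None
    rest = changelog_text[match.end():]
    # The section ends at the next second-level heading, if there is one.
    next_heading = re.search(r"^## ", rest, re.MULTILINE)
    body = rest[: next_heading.start()] if next_heading else rest
    return body.strip()
```

Third-level headings such as "### Fixed" stay inside the extracted section because only lines starting with "## " followed by a space end it.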

🤖 Generated with Claude Code

scikit-learn 1.8 renamed sklearn.utils.validation._is_pandas_df to
is_pandas_df (dropping the leading underscore). misc/sklearn_compat.py
imports the old name at module top-level, so on a fresh `pip install
tabpfn-extensions` resolving sklearn>=1.8, every `from tabpfn_extensions
import ...` immediately ImportErrors.

Make the import resilient: try _is_pandas_df, fall back to is_pandas_df,
fall back to the same pure-Python implementation already used in the
older-sklearn branch.

Caught by a TestPyPI smoke install of 0.4.0 before publishing to real
PyPI — 0.4.0 was never published. Bumping straight to 0.4.1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the project version to 0.4.1 and addresses an import failure caused by the renaming of _is_pandas_df in scikit-learn 1.8+. The fix includes a compatibility shim with a pure-Python fallback. Feedback highlights a missing sys import in the fallback function, a version mismatch in the uv.lock file, and the accidental removal of platform-specific wheels, suggesting a regeneration of the lock file for all architectures.

I am having trouble creating individual review comments. Click here to see my feedback.

src/tabpfn_extensions/misc/sklearn_compat.py (273-279)

high

The fallback implementation of _is_pandas_df uses sys.modules, but sys is not imported in this scope. This will cause a NameError if the fallback path is ever executed. Adding a local import ensures the fallback works correctly.

            def _is_pandas_df(X):
                """Return True if X is a pandas DataFrame."""
                import sys
                try:
                    pd = sys.modules["pandas"]
                except KeyError:
                    return False
                return isinstance(X, pd.DataFrame)

uv.lock (3356)

high

There is a version mismatch in the lock file. pyproject.toml is being updated to 0.4.1, but uv.lock is only being updated to 0.4.0. Please run uv lock to ensure the lock file correctly reflects the project version.

uv.lock (299-300)

high

Multiple platform-specific wheels (e.g., ppc64le, s390x) are being removed from the lock file. This often happens when the lock file is updated in an environment that doesn't resolve all supported architectures. If these platforms are still supported, please regenerate the lock file using uv lock --all-platforms.

Replaces the targeted three-line shim from the prior commit with the
upstream sklearn-compat 0.1.5 file. The previous vendored copy was at
0.1.3 (March 2025) and predated scikit-learn 1.8, which renamed
sklearn.utils.validation._is_pandas_df -> is_pandas_df. Upstream 0.1.5
ships a proper "# Upgrading for scikit-learn 1.8" block plus general
cleanups (typing, docstrings, etc).

Only validate_data is consumed downstream — by many_class and hpo —
and its signature hasn't changed across these point releases. Verified
the full public API imports cleanly against sklearn 1.8 in a clean
TestPyPI-style smoke venv.

Also: add `# ruff: noqa` to the file. It's vendored upstream, so we
don't want our linter to push us toward modifying it (we'd just have
to redo the same hand-edits on every re-vendor). `# mypy: ignore-errors`
was already there for the same reason.

Source:
https://github.com/sklearn-compat/sklearn-compat/blob/0.1.5/src/sklearn_compat/_sklearn_compat.py

(NB: the docstring version string in upstream 0.1.5 still says
"Version: 0.1.4" — they don't always bump the docstring with the tag.
Set to 0.1.5 in our copy to match the tag we pulled from.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@adrian-prior adrian-prior marked this pull request as ready for review May 11, 2026 11:21
@adrian-prior adrian-prior requested a review from a team as a code owner May 11, 2026 11:21
@adrian-prior adrian-prior requested review from bejaeger and removed request for a team May 11, 2026 11:21
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@adrian-prior adrian-prior requested a review from LeoGrin May 11, 2026 11:23
Comment thread src/tabpfn_extensions/misc/sklearn_compat.py
Collaborator

@LeoGrin LeoGrin left a comment

LGTM! You updated sklearn_compat to the latest version, right?

@adrian-prior
Collaborator Author

@LeoGrin yes, indeed

Ben caught that my previous re-vendor commit wasn't actually 1:1 with
upstream — pre-commit ruff had auto-fixed 48 things during one of the
intermediate commit attempts (stripping # noqa: F401 markers, merging
adjacent imports, adding ":" to "Returns" docstring headers, etc.).

Re-fetch from
https://github.com/sklearn-compat/sklearn-compat/blob/0.1.5/src/sklearn_compat/_sklearn_compat.py
and replace the file. Diff against upstream now reduces to one line:
the docstring `Version: 0.1.4` -> `Version: 0.1.5` (upstream forgot to
bump the docstring with the 0.1.5 tag), plus a trailing-newline at EOF
added by pre-commit's end-of-file-fixer.
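
A 1:1 check against upstream like the one described above could be done with difflib. The sketch below uses a hypothetical helper, changed_lines, and expects the two file contents to be passed in as strings; fetching the upstream file is left out.

```python
import difflib
from itertools import islice


def changed_lines(upstream_text, vendored_text):
    """Return only the added/removed lines of a unified diff."""
    diff = difflib.unified_diff(
        upstream_text.splitlines(keepends=True),
        vendored_text.splitlines(keepends=True),
        fromfile="upstream",
        tofile="vendored",
        n=0,  # no context lines, only the changes themselves
    )
    # Skip the two file-header lines, then drop the "@@" hunk markers.
    body = islice(diff, 2, None)
    return [line for line in body if line.startswith(("+", "-"))]
```

An empty result means the vendored copy matches upstream byte for byte at the line level, including the final newline.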

To keep it clean going forward, add the file to `[tool.ruff]
extend-exclude` in pyproject.toml so both ruff check and ruff format
skip it. (`# ruff: noqa` only covers lint, not the formatter.) The
ruff-pre-commit hooks already pass `--force-exclude` internally, so
the exclude is honored even when pre-commit passes the file as a
positional argument — no .pre-commit-config.yaml change needed.

Verified the full public-API import chain and ManyClassClassifier
fit/predict still work against sklearn 1.8.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@adrian-prior adrian-prior merged commit 98f1586 into main May 11, 2026
8 checks passed