
feat: thinking-mode knobs (rename from enhanced_*)#270

Merged
ggprior merged 6 commits into main from georg/enhanced-fit-to-effort on May 10, 2026

Conversation

@ggprior
Contributor

@ggprior ggprior commented May 6, 2026

Summary

  • Renames the user-facing fit-time knobs from `enhanced_` to `thinking_`: `enhanced_fit_mode` → `thinking_mode`, `enhanced_effort` → `thinking_effort`, `enhanced_timeout_s` → `thinking_timeout_s`, `enhanced_effort_metric` → `thinking_effort_metric` on `TabPFNClassifier` / `TabPFNRegressor`.
  • Internal validator + constants renamed to match: `validate_enhanced_fit_mode` → `validate_thinking_mode`, `ENHANCED_TIMEOUT_MAX_S` → `THINKING_TIMEOUT_MAX_S`, `EnhancedEffort` type alias → `ThinkingEffort`, `_VALID_ENHANCED_EFFORT_LEVELS` → `_VALID_THINKING_EFFORT_LEVELS`.
  • Server-side wire-protocol fields stay as `effort` / `effort_timeout_s` / `effort_metric` — only the client-side public API is renamed. The client.py translation strips `thinking_*` keys before forwarding to the server.
  • Companion server PR: PriorLabs/tabpfn-server#852.
  • Breaking API change: callers using `enhanced_fit_mode=...` / `enhanced_effort=...` must migrate to the `thinking_*` names.
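
For callers migrating, the rename above amounts to a key-for-key mapping. A minimal sketch, assuming the estimator keyword arguments are held in a plain dict; `migrate_kwargs` and `RENAMES` are hypothetical names, not part of tabpfn-client, and the mapping simply mirrors the names listed in this summary:

```python
# Hypothetical migration helper; the mapping is copied from the summary above.
RENAMES = {
    "enhanced_fit_mode": "thinking_mode",
    "enhanced_effort": "thinking_effort",
    "enhanced_timeout_s": "thinking_timeout_s",
    "enhanced_effort_metric": "thinking_effort_metric",
}


def migrate_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs with enhanced_* keys renamed to thinking_*."""
    migrated = {}
    for key, value in kwargs.items():
        new_key = RENAMES.get(key, key)
        if new_key in migrated:
            # Old and new spelling were both supplied; refuse to guess.
            raise ValueError(f"both old and new spelling given for {new_key!r}")
        migrated[new_key] = value
    return migrated
```

Keys that are not in the mapping pass through unchanged, so the helper can be applied to a full set of constructor arguments.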

Test plan

  • `pytest tests/unit` passes
  • Smoke test against staging once server PR is deployed: `thinking_mode=True, thinking_effort="medium"` runs an end-to-end classification + regression fit
  • Validation guards still trigger for invalid effort levels and timeout > 2400s

…etric

TabPFNClassifier and TabPFNRegressor now take:
- effort: Literal["medium", "high"] | None (None disables the autogluon-
  wrapped fit; medium/high pick the portfolio count server-side)
- effort_timeout_s: float | None (user budget; server forwards 0.9x to AG)
- effort_metric: str | None (eval metric for the sweep)

Replaces enhanced_fit_mode (bool) plus the old enhanced_fit_mode_metric
and enhanced_fit_mode_time_limit_s. validate_effort() enforces the literal
range, the 2400 s cap, and rejects timeout/metric set without effort.
client.py lifts the three fields to top-level FitRequest siblings and
keeps tabpfn_systems consistent ("enhanced" present iff effort is set).
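
The three rules that `validate_effort()` is described as enforcing can be sketched as follows. The function name and the 2400 s cap come from the commit message; the signature, the constant names, and the error wording are assumptions, and this is not the code from the PR:

```python
from typing import Optional

EFFORT_TIMEOUT_MAX_S = 2400.0            # cap stated in the commit message
VALID_EFFORT_LEVELS = ("medium", "high")  # literal range stated above


def validate_effort(
    effort: Optional[str],
    effort_timeout_s: Optional[float] = None,
    effort_metric: Optional[str] = None,
) -> None:
    """Raise ValueError if the combination of knobs is inconsistent."""
    if effort is None:
        # Timeout and metric only make sense when effort is enabled.
        if effort_timeout_s is not None or effort_metric is not None:
            raise ValueError("effort_timeout_s/effort_metric require effort to be set")
        return
    if effort not in VALID_EFFORT_LEVELS:
        raise ValueError(f"effort must be one of {VALID_EFFORT_LEVELS}, got {effort!r}")
    if effort_timeout_s is not None and effort_timeout_s > EFFORT_TIMEOUT_MAX_S:
        raise ValueError(f"effort_timeout_s must not exceed {EFFORT_TIMEOUT_MAX_S} s")
```
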
@ggprior ggprior requested a review from a team as a code owner May 6, 2026 18:18
@ggprior ggprior requested review from noahho and removed request for a team May 6, 2026 18:18
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request replaces the "enhanced fit mode" parameters with a new "effort" system across the API models, client, and estimators. The fields enhanced_fit_mode_metric and enhanced_fit_mode_time_limit_s have been superseded by effort, effort_timeout_s, and effort_metric. The update includes revised validation logic, expanded docstrings detailing supported metrics for classification and regression, and updated unit tests. A review comment suggests improving the validation of effort_timeout_s to ensure it is a positive value.

I am having trouble creating individual review comments. Click here to see my feedback.

src/tabpfn_client/estimator.py (752)

medium

The validation only checks for the upper bound of effort_timeout_s. A non-positive timeout (zero or negative) is also invalid for the fitting process and should be caught here to provide a clear error message before reaching the server.

    if effort_timeout_s is not None and (effort_timeout_s <= 0 or effort_timeout_s > EFFORT_TIMEOUT_MAX_S):
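
Read as a predicate, the suggested condition accepts a timeout only in the half-open range (0, 2400]. A small sketch of that check in isolation; `timeout_is_valid` is a hypothetical helper, and the value of `EFFORT_TIMEOUT_MAX_S` is taken from the 2400 s cap mentioned in this PR:

```python
from typing import Optional

EFFORT_TIMEOUT_MAX_S = 2400.0  # cap mentioned in this PR


def timeout_is_valid(effort_timeout_s: Optional[float]) -> bool:
    """True if the timeout is unset or lies in (0, EFFORT_TIMEOUT_MAX_S]."""
    if effort_timeout_s is None:
        return True
    return 0 < effort_timeout_s <= EFFORT_TIMEOUT_MAX_S
```

This is the negation of the condition in the suggestion: a caller would raise when `timeout_is_valid(...)` returns `False`.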

@ggprior ggprior requested review from safaricd and removed request for noahho May 7, 2026 08:54
Collaborator

@brendan-priorlabs brendan-priorlabs left a comment


LGTM! Let's be sure to push this out to the beta users as soon as we ship the tabpfn-server change.

Comment thread src/tabpfn_client/api_models.py Outdated
ggprior added 2 commits May 8, 2026 14:24
The estimator-side knobs are renamed to make their relationship explicit:
- effort -> enhanced_effort, defaulting to "medium" (only consulted when
  enhanced_fit_mode=True)
- effort_timeout_s -> enhanced_timeout_s
- effort_metric -> enhanced_effort_metric

Activation moves from "effort is not None" back to a dedicated boolean
(enhanced_fit_mode, default False). enhanced_effort is a Literal
["medium", "high"] with a documented default of "medium" so users can flip
on enhanced fit mode without picking an effort level.

Wire format is unchanged: client.py still sends effort, effort_timeout_s
and effort_metric on FitRequest, so no server companion change is needed.
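
The split between estimator-side names and wire names described in this commit could look roughly like the sketch below. `to_wire_fields` is a hypothetical function, and sending `None` for all three fields when enhanced fit mode is off is an assumption rather than something the commit message states:

```python
from typing import Any, Dict, Optional


def to_wire_fields(
    enhanced_fit_mode: bool = False,
    enhanced_effort: str = "medium",
    enhanced_timeout_s: Optional[float] = None,
    enhanced_effort_metric: Optional[str] = None,
) -> Dict[str, Any]:
    """Map estimator knobs to the effort/effort_timeout_s/effort_metric wire fields."""
    if not enhanced_fit_mode:
        # Assumption: with the boolean off, enhanced_effort is not consulted
        # and nothing effort-related is sent.
        return {"effort": None, "effort_timeout_s": None, "effort_metric": None}
    return {
        "effort": enhanced_effort,
        "effort_timeout_s": enhanced_timeout_s,
        "effort_metric": enhanced_effort_metric,
    }
```
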
@ggprior ggprior changed the title from "feat: replace enhanced_fit_mode with effort/effort_timeout_s/effort_metric" to "feat: thinking-mode knobs (rename from enhanced_*)" on May 9, 2026
Collaborator

@brendan-priorlabs brendan-priorlabs left a comment


Would be good to land that small change IMO

Comment thread src/tabpfn_client/estimator.py Outdated
ggprior added 3 commits May 10, 2026 10:58
Server-side wire fields renamed effort/effort_timeout_s/effort_metric ->
thinking_effort/thinking_timeout_s/thinking_effort_metric. Drop the client-side
translation layer and forward the user-facing knobs directly.
Drops the "medium" default on thinking_effort (now Optional, default None) so
the constructor can tell whether the user set it. Either signal turns thinking
on:
  - thinking_mode=True alone -> defaults thinking_effort to "medium"
  - thinking_effort="medium" or "high" alone -> implies thinking_mode=True

Updates validate_thinking_mode and the FitRequest builder in client.py to
honour the unified rule. Adds test_thinking_validation.py pinning the contract.
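
The unified rule above can be written as a small resolution step. This is a sketch under assumptions: `resolve_thinking` is a hypothetical name, and the actual logic lives in `validate_thinking_mode` and the FitRequest builder, whose code is not shown here:

```python
from typing import Optional, Tuple

VALID_THINKING_EFFORT_LEVELS = ("medium", "high")


def resolve_thinking(
    thinking_mode: bool = False,
    thinking_effort: Optional[str] = None,
) -> Tuple[bool, Optional[str]]:
    """Return (thinking_enabled, effective_effort) under the unified rule."""
    if thinking_effort is not None:
        if thinking_effort not in VALID_THINKING_EFFORT_LEVELS:
            raise ValueError(f"invalid thinking_effort: {thinking_effort!r}")
        # An explicit effort level implies thinking mode is on.
        return True, thinking_effort
    if thinking_mode:
        # Mode switched on without a level: fall back to "medium".
        return True, "medium"
    return False, None
```

Because `thinking_effort` defaults to `None`, the function can tell "not set" apart from an explicit level, which is the reason the commit gives for dropping the "medium" default.
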
@ggprior ggprior added this pull request to the merge queue May 10, 2026
Merged via the queue into main with commit 5bea5b9 May 10, 2026
6 checks passed
@ggprior ggprior deleted the georg/enhanced-fit-to-effort branch May 10, 2026 10:45
ggprior added a commit that referenced this pull request May 10, 2026
The merged PRs renamed in different directions:
- tabpfn-server (#852): server's FitRequest field is `thinking_effort_metric`
- tabpfn-client (#270): client's user-facing kwarg is `thinking_metric`

Result: any fit from `tabpfn-client@main` against `api.priorlabs.ai`
422s with `extra_forbidden` on `thinking_metric` — including plain
v3/v2.5 fits, because the client always serialises the field as `null`
even when thinking is off.

Keeps the public API as `thinking_metric` (TabPFNClassifier/Regressor
kwarg, `validate_thinking_mode`, the instance attribute), and only
renames the FitRequest wire field + the call site that builds the
request to `thinking_effort_metric` so the body matches the server's
schema. Existing user code does not need to change.
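
To illustrate the mismatch and the fix, the sketch below maps the public kwarg `thinking_metric` onto the wire field `thinking_effort_metric` and mimics a server that forbids unknown fields. Both functions and `SERVER_FIT_FIELDS` are hypothetical stand-ins; the real FitRequest has other fields that are omitted here:

```python
from typing import Any, Dict, Optional

# Assumed subset of the server's FitRequest schema, using the names in this thread.
SERVER_FIT_FIELDS = {"thinking_effort", "thinking_timeout_s", "thinking_effort_metric"}


def build_fit_body(
    thinking_effort: Optional[str] = None,
    thinking_timeout_s: Optional[float] = None,
    thinking_metric: Optional[str] = None,  # public kwarg name stays as-is
) -> Dict[str, Any]:
    """Build the thinking-related part of the request body with wire names."""
    return {
        "thinking_effort": thinking_effort,
        "thinking_timeout_s": thinking_timeout_s,
        "thinking_effort_metric": thinking_metric,  # renamed only on the wire
    }


def reject_extra_fields(body: Dict[str, Any]) -> None:
    """Stand-in for a strict schema: unknown keys fail even when the value is None."""
    extra = sorted(set(body) - SERVER_FIT_FIELDS)
    if extra:
        raise ValueError(f"extra_forbidden: {extra}")
```

The second function shows why plain fits were affected too: a key that is present with a `None` value still counts as an extra field.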