feat: thinking-mode knobs (rename from enhanced_*)#270
Conversation
…etric
TabPFNClassifier and TabPFNRegressor now take:
- effort: Literal["medium", "high"] | None (None disables the autogluon-
wrapped fit; medium/high pick the portfolio count server-side)
- effort_timeout_s: float | None (user budget; server forwards 0.9x to AG)
- effort_metric: str | None (eval metric for the sweep)
Replaces enhanced_fit_mode (bool) plus the old enhanced_fit_mode_metric
and enhanced_fit_mode_time_limit_s. validate_effort() enforces the literal
range, the 2400 s cap, and rejects timeout/metric set without effort.
client.py lifts the three fields to top-level FitRequest siblings and
keeps tabpfn_systems consistent ("enhanced" present iff effort is set).
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
There was a problem hiding this comment.
Code Review
This pull request replaces the "enhanced fit mode" parameters with a new "effort" system across the API models, client, and estimators. The fields enhanced_fit_mode_metric and enhanced_fit_mode_time_limit_s have been superseded by effort, effort_timeout_s, and effort_metric. The update includes revised validation logic, expanded docstrings detailing supported metrics for classification and regression, and updated unit tests. A review comment suggests improving the validation of effort_timeout_s to ensure it is a positive value.
I am having trouble creating individual review comments. Click here to see my feedback.
src/tabpfn_client/estimator.py (752)
The validation only checks for the upper bound of effort_timeout_s. A non-positive timeout (zero or negative) is also invalid for the fitting process and should be caught here to provide a clear error message before reaching the server.
if effort_timeout_s is not None and (effort_timeout_s <= 0 or effort_timeout_s > EFFORT_TIMEOUT_MAX_S):
brendan-priorlabs
left a comment
There was a problem hiding this comment.
LGTM! Let's be sure to push this out to the beta users as soon as we do our tabpfn-server
The estimator-side knobs are renamed to make their relationship explicit: - effort -> enhanced_effort, defaulting to "medium" (only consulted when enhanced_fit_mode=True) - effort_timeout_s -> enhanced_timeout_s - effort_metric -> enhanced_effort_metric Activation moves from "effort is not None" back to a dedicated boolean (enhanced_fit_mode, default False). enhanced_effort is a Literal ["medium", "high"] with a documented default of "medium" so users can flip on enhanced fit mode without picking an effort level. Wire format is unchanged: client.py still sends effort, effort_timeout_s and effort_metric on FitRequest, so no server companion change is needed.
brendan-priorlabs
left a comment
There was a problem hiding this comment.
Would be good to land that small change IMO
Server-side wire fields renamed effort/effort_timeout_s/effort_metric -> thinking_effort/thinking_timeout_s/thinking_effort_metric. Drop the client-side translation layer and forward the user-facing knobs directly.
Drops the "medium" default on thinking_effort (now Optional, default None) so the constructor can tell whether the user set it. Either signal turns thinking on: - thinking_mode=True alone -> defaults thinking_effort to "medium" - thinking_effort="medium" or "high" alone -> implies thinking_mode=True Updates validate_thinking_mode and the FitRequest builder in client.py to honour the unified rule. Adds test_thinking_validation.py pinning the contract.
The merged PRs renamed in different directions: - tabpfn-server (#852): server's FitRequest field is `thinking_effort_metric` - tabpfn-client (#270): client's user-facing kwarg is `thinking_metric` Result: any fit from `tabpfn-client@main` against `api.priorlabs.ai` 422s with `extra_forbidden` on `thinking_metric` — including plain v3/v2.5 fits, because the client always serialises the field as `null` even when thinking is off. Keeps the public API as `thinking_metric` (TabPFNClassifier/Regressor kwarg, `validate_thinking_mode`, the instance attribute), and only renames the FitRequest wire field + the call site that builds the request to `thinking_effort_metric` so the body matches the server's schema. Existing user code does not need to change.
Summary
Test plan