
#920 with renames #943

Open
brendan-priorlabs wants to merge 16 commits into main from brendan/v3-wiring

Conversation

@brendan-priorlabs
Contributor

This is just @bejaeger's #920, but with renames to be compatible with the client package

bejaeger and others added 16 commits May 6, 2026 11:12
Switch ModelSource.get_{classifier,regressor}_v3() default_filename
and filenames from the dated form (tabpfn-v3-{role}-20260506.ckpt)
to the canonical wire-format name used everywhere else in the stack
(tabpfn-v3-{role}-v3_default.ckpt). This is the form produced by
tabpfn-client's `_model_name_to_path("v3_default")` and pinned by
tabpfn-server's STARTUP_MODELS, so a single source of truth flows
end-to-end with no rename layer in between.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
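The naming convention in the commit message can be illustrated with a small sketch. This is not the actual tabpfn-client implementation of `_model_name_to_path`; the function below is a hypothetical re-creation whose name and signature are assumptions, shown only to demonstrate the canonical wire-format pattern described above.

```python
# Hypothetical sketch of the canonical wire-format filename convention.
# Not the real tabpfn-client `_model_name_to_path`; the helper name and
# signature here are illustrative assumptions.

def model_name_to_filename(model_name: str, role: str) -> str:
    """Build the canonical checkpoint name, e.g. tabpfn-v3-classifier-v3_default.ckpt."""
    return f"tabpfn-v3-{role}-{model_name}.ckpt"

print(model_name_to_filename("v3_default", "classifier"))
print(model_name_to_filename("v3_default", "regressor"))
```

With a single pattern like this, the client, server, and ModelSource defaults can all derive the same filename from the same model name, which is the "single source of truth" the commit aims for.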
@brendan-priorlabs brendan-priorlabs requested a review from a team as a code owner May 10, 2026 12:43
@brendan-priorlabs brendan-priorlabs requested review from oscarkey and removed request for a team May 10, 2026 12:43
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. Credits are required to enable repository-wide code reviews.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the default model version to V3, integrates V3 support across architectures and tests, and refactors architecture name resolution. Feedback identifies a potential TypeError in save_tabpfn_model from redundant configuration handling and suggests a more precise prefix-based filter for state dict keys in load_model to avoid stripping valid parameters.

Comment on lines 1105 to +1106
"architecture_name": architecture_name,
"inference_config": asdict(model.inference_config_),


high

This added line is redundant and potentially problematic. The inference_config is already retrieved using getattr(model, "inference_config_", None) at line 1083 and added to the checkpoint dict conditionally at lines 1108-1109. Including it here unconditionally will cause a TypeError if model.inference_config_ is None (as asdict(None) is invalid). It is better to rely on the existing conditional block below.

Suggested change
-    "architecture_name": architecture_name,
-    "inference_config": asdict(model.inference_config_),
+    "architecture_name": architecture_name,

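To see why the unconditional call is unsafe, here is a minimal sketch. The dataclass, variable names, and checkpoint layout are illustrative stand-ins, not the actual save_tabpfn_model internals; the point is only that `dataclasses.asdict` raises TypeError on None, so the checkpoint entry must stay behind the existing guard.

```python
from dataclasses import dataclass, asdict


@dataclass
class InferenceConfig:  # stand-in for the real inference config dataclass
    temperature: float = 1.0


# asdict(None) raises TypeError -- the failure mode the reviewer flags.
try:
    asdict(None)
except TypeError:
    print("asdict(None) raises TypeError")

# Safe pattern mirroring the existing conditional block in the PR:
inference_config = None  # getattr(model, "inference_config_", None) may be None
checkpoint = {"architecture_name": "tabpfn-v3"}
if inference_config is not None:
    checkpoint["inference_config"] = asdict(inference_config)
```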
model.load_state_dict(full_state)
# The model computes the loss internally. Strip criterion keys that
# save_tabpfn_model may have written so load_state_dict doesn't reject them.
model_state = {k: v for k, v in full_state.items() if "criterion." not in k}


medium

The filtering condition "criterion." not in k is too broad and might accidentally strip valid model parameters whose names merely contain the substring "criterion." (e.g., head.aux_criterion.weight). Since save_tabpfn_model specifically prefixes criterion keys with criterion., it is safer and more precise to use not k.startswith("criterion.").

Suggested change
-model_state = {k: v for k, v in full_state.items() if "criterion." not in k}
+model_state = {k: v for k, v in full_state.items() if not k.startswith("criterion.")}
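The difference between the substring test and the prefix test can be shown with a tiny state dict. The key names here are hypothetical, chosen only to expose the edge case, not taken from the actual model.

```python
full_state = {
    "encoder.weight": 1,
    "criterion.weight": 2,           # prefixed by save_tabpfn_model; should be dropped
    "head.aux_criterion.weight": 3,  # hypothetical valid parameter containing "criterion."
}

broad = {k: v for k, v in full_state.items() if "criterion." not in k}
precise = {k: v for k, v in full_state.items() if not k.startswith("criterion.")}

print(sorted(broad))    # the substring filter also drops the valid parameter
print(sorted(precise))  # the prefix filter keeps it
```

The substring filter silently discards `head.aux_criterion.weight`, which `load_state_dict` would then report as a missing key; the prefix filter removes only the keys that save_tabpfn_model actually prepends.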

Contributor

@oscarkey oscarkey left a comment


approving to unblock, but should these changes be merged into #920, as that isn't merged yet?

@oscarkey
Contributor

ah just saw your comment on the other pr and realised this is probably an auto-assign!

@brendan-priorlabs
Contributor Author

Sorry about that! Was indeed an auto-assign. We won't merge this as-is; keeping it open for now until we've decided how to untangle our pins (not strictly necessary -- just a convenient bookmark).
