chore: float64 -> float32 in process_text_na_dataframe by ggprior · Pull Request #926 · PriorLabs/TabPFN

ggprior · 2026-05-07T07:12:29Z

No description provided.

gemini-code-assist

Code Review

This pull request modifies the process_text_na_dataframe function in src/tabpfn/preprocessing/clean.py to cast the output to float32 instead of float64. Feedback indicates that this change introduces numerical stability risks, contradicts the function's docstring, and creates inconsistencies with the rest of the preprocessing pipeline, which defaults to float64.

gemini-code-assist · 2026-05-07T07:13:52Z

        X_encoded[:, string_cols_ix],
    )
-    return typing.cast("np.ndarray", X_encoded.astype(np.float64))
+    return typing.cast("np.ndarray", X_encoded.astype(np.float32))


Changing the return type to float32 raises several concerns:

Numerical Stability: src/tabpfn/constants.py (lines 50-51) defines DEFAULT_NUMPY_PREPROCESSING_DTYPE as np.float64 specifically to avoid overflows during transformations like Yeo-Johnson. Hardcoding float32 here may lead to issues in subsequent preprocessing steps.

Docstring Inconsistency: The docstring for process_text_na_dataframe (lines 142 and 145) still explicitly mentions conversion to float64.

Pipeline Inconsistency: fix_dtypes (line 69) defaults to float64. Since clean_data calls both, the numeric_dtype setting in fix_dtypes is now effectively overridden by this hardcoded float32 cast.

If the intention is to move the pipeline to float32, consider updating the global constant or making the dtype a parameter to maintain consistency.
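As a rough illustration of the last suggestion, the sketch below makes the dtype a parameter that defaults to a shared constant, and shows the general kind of overflow that a float32 cast can introduce. It is not the TabPFN implementation: `cast_encoded` is a hypothetical helper, and the constant is redefined locally on the assumption that it matches the one the review cites in src/tabpfn/constants.py.

```python
import numpy as np

# Assumption: stands in for the constant the review says is defined in
# src/tabpfn/constants.py; redefined here so the sketch is self-contained.
DEFAULT_NUMPY_PREPROCESSING_DTYPE = np.float64


def cast_encoded(X_encoded, dtype=DEFAULT_NUMPY_PREPROCESSING_DTYPE):
    """Hypothetical helper: cast an encoded array to a caller-chosen float dtype."""
    return np.asarray(X_encoded).astype(dtype)


# 1e39 is representable in float64 but exceeds the float32 maximum (about 3.4e38).
X = np.array([[1.0, 1e39], [2.0, 3.0]])

# Suppress the overflow warning NumPy may emit for the narrowing cast.
with np.errstate(over="ignore"):
    as32 = cast_encoded(X, dtype=np.float32)  # explicit opt-in to float32
as64 = cast_encoded(X)                        # default follows the constant

print(as32.dtype, np.isinf(as32[0, 1]))   # float32 True
print(as64.dtype, np.isfinite(as64).all())  # float64 True
```

With a parameter like this, a caller that wants float32 asks for it explicitly, while every other step keeps following the one shared default.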

chore: float64 -> float32 in process_text_na_dataframe

7eb72e1

gemini-code-assist Bot reviewed May 7, 2026

ggprior wants to merge 1 commit into main from georg/tabpfn-dtype-float32
