
Implement TransformerRegressor and update documentation#11

Merged
jrosenfeld13 merged 1 commit into main from feat/keras-transformer-estimator on Feb 10, 2026
Conversation


@jrosenfeld13 jrosenfeld13 commented Feb 10, 2026

  • Added TransformerRegressor class for sequence modeling with transformer architecture, supporting multiple attention modes and pooling strategies.
  • Updated documentation to include TransformerRegressor details and usage examples.

Note

Medium Risk
Adds a new Keras transformer-based estimator with multiple attention/pooling modes; primary risk is correctness and training stability across backends rather than security or data handling.

Overview
Introduces a new Keras TransformerRegressor sequence estimator, including learned positional embeddings, optional dual-axis (CrossAttention) vs temporal/feature attention modes, and attention/average pooling before an MLP head.
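The learned positional embeddings mentioned above can be illustrated with a minimal numpy sketch (the actual estimator uses trainable Keras weights; the table, shapes, and scale here are illustrative assumptions, not the PR's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 6, 4

# Learned positional embeddings: one vector per position (trainable in
# practice), added to the per-timestep feature projections before the
# encoder blocks see the sequence.
pos_table = rng.normal(scale=0.02, size=(seq_len, d_model))
x = rng.normal(size=(seq_len, d_model))   # stand-in for projected inputs
h = x + pos_table[np.arange(seq_len)]     # lookup by position index, then add
print(h.shape)  # (6, 4)
```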

Exports the estimator (and supporting layers) via the package/estimator __init__ lazy-import surfaces, adds unit tests covering fit/predict and all attention modes, and updates docs to list and demonstrate the new Transformer model; bumps the editable package version to 0.3.2 in uv.lock.
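The two pooling strategies the estimator supports (attention pooling vs. average pooling over timesteps, applied before the MLP head) can be sketched in numpy. This is a simplified stand-in, not the PR's Keras layers; the score vector `w` is a hypothetical learned parameter:

```python
import numpy as np

def average_pool(h):
    # h: (seq_len, d_model) encoder output; plain mean over the time axis.
    return h.mean(axis=0)

def attention_pool(h, w):
    # Attention pooling: score each timestep with a learned vector w,
    # softmax the scores, and take the weighted sum of timesteps.
    scores = h @ w                         # (seq_len,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ h                     # (d_model,)

rng = np.random.default_rng(0)
h = rng.normal(size=(10, 4))
w = rng.normal(size=4)
print(average_pool(h).shape, attention_pool(h, w).shape)  # (4,) (4,)
```

With a zero score vector the softmax weights are uniform, so attention pooling degenerates to average pooling — a useful sanity check for either implementation.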

Written by Cursor Bugbot for commit 403698d.


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


ffn = layers.Dropout(self.dropout_rate)(ffn)
ffn = layers.Dense(self.d_model)(ffn)
ffn = layers.Dropout(self.dropout_rate)(ffn)
return x + ffn

Post-norm mode applies no LayerNorm at all

High Severity

When use_pre_norm is False, _encoder_block applies zero LayerNormalization operations. The docstring says False means "apply LayerNorm after attention/FFN" (post-norm), but the conditional only adds normalization when use_pre_norm is True and omits it entirely otherwise. A post-norm encoder block needs LayerNormalization after each residual connection (inputs + attention_out and x + ffn). Without any normalization, training will be numerically unstable and produce poor results.

Additional Locations (1)

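The ordering the reviewer describes can be shown with a minimal numpy sketch of an encoder block. The sublayers here are stand-in callables (not the PR's attention/FFN layers), but the normalization placement is the point: post-norm applies LayerNorm after each residual connection, which the reviewed `_encoder_block` omits entirely when `use_pre_norm` is False:

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize over the feature axis, as keras.layers.LayerNormalization does.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def encoder_block(x, sublayer_attn, sublayer_ffn, use_pre_norm):
    if use_pre_norm:
        # Pre-norm: normalize the INPUT of each sublayer; residual stays raw.
        x = x + sublayer_attn(layer_norm(x))
        x = x + sublayer_ffn(layer_norm(x))
    else:
        # Post-norm: normalize AFTER each residual connection --
        # the step missing from the reviewed code when use_pre_norm is False.
        x = layer_norm(x + sublayer_attn(x))
        x = layer_norm(x + sublayer_ffn(x))
    return x

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8))
identity = lambda h: h  # dummy sublayer standing in for attention / FFN
out = encoder_block(x, identity, identity, use_pre_norm=False)
print(out.shape)  # (3, 8)
```

Because post-norm ends each block with a LayerNorm, every output row is normalized (mean ≈ 0, variance ≈ 1 over features), which is what keeps deep post-norm stacks numerically stable.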

@jrosenfeld13 merged commit 130a850 into main on Feb 10, 2026
4 checks passed
@jrosenfeld13 deleted the feat/keras-transformer-estimator branch on February 10, 2026 at 16:01