Additive Terms #158
Questions I'm encountering during development
Remaining To-Do
Pull request overview
This PR introduces a “terms + feature transformations” architecture to support structured additive distributional regression (including time-series/autoregressive terms), expands estimation method capabilities (bounds/constraints, extra kwargs), and adds/updates several distributions and supporting utilities.
Changes:
- Added a new `ondil.terms` module (linear / regularized linear terms, time-series terms, feature transformations, and a scikit-learn estimator wrapper).
- Added a new estimator, `OnlineStructuredAdditiveDistributionRegressor`, implementing an additive IRLS-style fit/update loop.
- Extended estimation methods to accept additional kwargs (bounds, constraints, regularization weights), introduced constrained coordinate descent utilities, and added logging infrastructure (Loguru).
Reviewed changes
Copilot reviewed 43 out of 44 changed files in this pull request and generated 19 comments.
Summary per file:
| File | Description |
|---|---|
| tests/distributions/test_dist.py | Adds debug output and richer assert messages for distribution function parity vs R |
| tests/distributions/test_derivatives.py | Adds debug output for derivative parity tests |
| src/ondil/terms/time_series.py | New time-series term implementations (regularized + online updates) |
| src/ondil/terms/special.py | Adds a Term wrapper around scikit-learn estimators |
| src/ondil/terms/linear.py | Introduces linear / regularized linear terms and shared fit/update logic |
| src/ondil/terms/features.py | Adds feature transformation classes, including lagged time-series features |
| src/ondil/terms/__init__.py | Exposes terms/features in the public ondil.terms API |
| src/ondil/methods/ridge.py | Extends ridge/CD handling for bounds/weights and adds constrained CD implementation |
| src/ondil/methods/recursive_least_squares.py | Adjusts method signatures to accept kwargs and adds logging |
| src/ondil/methods/linear_constrained.py | Adds constrained coordinate descent + constrained elastic net path method |
| src/ondil/methods/factory.py | Adds "ocd" method option and tightens typing for method strings |
| src/ondil/methods/elasticnet.py | Refactors regularization weights handling and adds logging/kwargs plumbing |
| src/ondil/methods/__init__.py | Re-exports newly added methods (constraint + coordinate descent variants) |
| src/ondil/logging.py | Adds package-level logging configuration (Loguru-based) |
| src/ondil/links/identitylinks.py | Adds LowerTruncatedIdentity link |
| src/ondil/links/__init__.py | Exposes LowerTruncatedIdentity in link public API |
| src/ondil/incremental_statistics.py | Adds online mean/variance utilities (used for weighted regularization) |
| src/ondil/estimators/online_struct_add_distreg.py | Adds structured additive distribution regressor estimator |
| src/ondil/estimators/online_linear_model.py | Updates fit call to pass named args to fit_beta |
| src/ondil/estimators/__init__.py | Exports the new structured additive estimator |
| src/ondil/distributions/studentt.py | Adds start-value mixing/optimism and constant initial values |
| src/ondil/distributions/skew_t.py | Adds start-value mixing and constant/dynamic initial values logic |
| src/ondil/distributions/poisson.py | Adds start-value mixing and constant initial values |
| src/ondil/distributions/normal.py | Adds start-value mixing and constant initial values for Normal(+MeanVariance) |
| src/ondil/distributions/negative_binomial.py | Adds a new Negative Binomial distribution |
| src/ondil/distributions/johnsonsu.py | Adds start-value mixing and splits constant vs dynamic initial values |
| src/ondil/distributions/__init__.py | Exposes NegativeBinomial in public API |
| src/ondil/design_matrix.py | Adds make_lags() and reworks some utilities |
| src/ondil/coordinate_descent/utils.py | Extracts shared CD helpers (soft_threshold, get_start_beta) |
| src/ondil/coordinate_descent/cd_linear_constrained.py | Adds constrained coordinate descent kernels (numba) |
| src/ondil/coordinate_descent/cd_base.py | Refactors base CD to use shared helpers |
| src/ondil/coordinate_descent/__init__.py | Exposes coordinate descent functions in a proper package |
| src/ondil/base/terms.py | Introduces Term and FeatureTransformation base classes and helper logic |
| src/ondil/base/estimation_method.py | Refactors method interfaces to accept **kwargs |
| src/ondil/base/distribution.py | Adds _svm helper and distribution variance helper |
| src/ondil/base/__init__.py | Exposes Term from ondil.base |
| src/ondil/__init__.py | Imports new submodules and configures logging at import time |
| pyproject.toml | Adds loguru dependency |
| mkdocs.yml | Adds a new docs page for terms and features |
| docs/terms_and_features.md | Documents the new estimator→equation→terms→features hierarchy |
| docs/coordinate_descent.md | Fixes doc reference to soft_threshold after refactor |
| design/terms.md | Design notes for terms/feature transformations |
| design/additive_models.md | Design proposal for additive models in ondil |
| .gitignore | Normalizes formatting (adds line number alignment) |
```python
if self.is_regularized:
    is_regularized = self.is_regularized
else:
    n_features = X_mat.shape[1]
    is_regularized = np.repeat(True, n_features)
    if self.fit_intercept and not self.regularize_intercept:
        is_regularized[0] = False
```
Same issue in the path-based model-selection code: `if self.is_regularized:` is ambiguous when `is_regularized` is an ndarray mask. This will prevent users from providing per-coefficient regularization masks. Use `if self.is_regularized is not None:` (and validate that the length matches the design matrix).
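A minimal sketch of the suggested fix (the helper name and argument list are hypothetical; only the mask logic mirrors the diff above):

```python
import numpy as np

def resolve_regularization_mask(
    is_regularized,
    n_features,
    fit_intercept,
    regularize_intercept,
):
    # Hypothetical helper: an explicit None-check avoids the ambiguous
    # truth value of a multi-element ndarray mask.
    if is_regularized is not None:
        mask = np.asarray(is_regularized, dtype=bool)
        if mask.shape[0] != n_features:
            raise ValueError(
                f"is_regularized has length {mask.shape[0]}, "
                f"but the design matrix has {n_features} columns."
            )
        return mask
    mask = np.repeat(True, n_features)
    if fit_intercept and not regularize_intercept:
        mask[0] = False
    return mask
```

With this shape check, a user-provided mask of the wrong length fails loudly at fit time instead of silently misaligning coefficients.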
```diff
 def fit_beta(self, x_gram, y_gram, **kwargs):
     pass

-def update_beta(self, x_gram, y_gram, is_regularized):
+def update_beta(self, x_gram, y_gram, **kwargs):
     pass
```
`ElasticNetPath.fit_beta()` / `update_beta()` are currently `pass`, so if they are ever called they will silently return `None` instead of failing fast. Since this is a path-based method, these should raise `NotImplementedError` (or delegate to an appropriate single-lambda implementation) to avoid hard-to-debug downstream `None` errors.
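A sketch of the fail-fast behavior (the class is an illustrative stand-in, not the real `ElasticNetPath`; the path-based entry-point names in the messages are assumed):

```python
class PathBasedMethodSketch:
    """Illustrative stand-in for a path-based estimation method: the
    single-lambda entry points raise instead of silently returning None."""

    def fit_beta(self, x_gram, y_gram, **kwargs):
        raise NotImplementedError(
            "This is a path-based method; call the path-based fit instead."
        )

    def update_beta(self, x_gram, y_gram, **kwargs):
        raise NotImplementedError(
            "This is a path-based method; call the path-based update instead."
        )
```

Any accidental call site then surfaces immediately with a descriptive error rather than propagating `None` into downstream matrix algebra.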
```python
self._method = get_estimation_method(self.method)
if not self._method._path_based_method:
    raise ValueError("Non-Path-based methods are not supported for LinearTerm.")
```
Same issue here: the exception says "...for LinearTerm" inside `RegularizedTimeSeriesTerm._prepare_term()`, but this class is `RegularizedTimeSeriesTerm`. Adjust the message so it points to the correct term type.
```python
class OnlineStructuredAdditiveDistributionRegressor(
    OndilEstimatorMixin,
    RegressorMixin,
    BaseEstimator,
):
    _parameter_constraints = {
        "distribution": [Distribution],
        "estimation_method": [EstimationMethod],
        "terms": [dict, type(None)],
        "scaler": [OnlineScaler, None],
    }

    def __init__(
        self,
        distribution: Distribution = Normal(),
        terms: Optional[Dict[str, Any]] = None,
        scale_inputs: bool = True,
        learning_rate: float = 0.0,
        max_outer_iterations: int = 10,
        max_inner_iterations: int = 10,
        rel_tol_outer: float = 1e-3,
        rel_tol_inner: float = 1e-3,
        abs_tol_outer: float = 1e-3,
        abs_tol_inner: float = 1e-3,
        step_size: float = 1.0,
    ):
        self.distribution = distribution
        self.terms = terms
        self.scale_inputs = scale_inputs
        self.learning_rate = learning_rate
        self.max_outer_iterations = max_outer_iterations
        self.max_inner_iterations = max_inner_iterations
        self.rel_tol_outer = rel_tol_outer
        self.rel_tol_inner = rel_tol_inner
        self.abs_tol_outer = abs_tol_outer
        self.abs_tol_inner = abs_tol_inner
        self.step_size = step_size
```
`OnlineStructuredAdditiveDistributionRegressor` introduces a large new estimator API (fit/update/predict, term orchestration, convergence logic), but there are no unit tests covering basic fit/update behavior or failure modes (e.g. step-size reduction, term state handling). Since the repo already has estimator-level tests, adding a minimal smoke test (fit + update on a small synthetic dataset) would help prevent regressions.
```python
if isinstance(lags, int):
    logger.trace(f"Got integer lags, converting to list. Lags: {lags}")
    lags = np.linspace(1, lags, lags, dtype=int)
    logger.trace(f"Converted lags: {lags}")
n_lags = len(lags)

if n_lags == 0:
    logger.warning("No lags specified. Expect a crash.")
if np.any(lags < 1):
    logger.warning("We have lags smaller than 1. Are you sure?")

X = np.hstack([np.roll(y[:, None], i) for i in lags])
X[np.triu_indices(n_lags, k=0)] = np.mean(y)
```
`make_lags()` masks the rolled values using `X[np.triu_indices(n_lags, k=0)] = np.mean(y)`, which only works when lags are exactly `[1, 2, ..., n_lags]` in order. For arbitrary lag sets (e.g. `[1, 4]`) this leaves look-ahead leakage from the `np.roll` wrap-around. Mask the first `lag` rows per lag column (e.g. loop over `(col, lag)` and set `X[:lag, col] = np.mean(y)`), or build the lag matrix without wrap-around in the first place.
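One way to build the lag matrix without wrap-around, as a self-contained sketch (the function name is hypothetical and this is not the actual `make_lags()` implementation, just the masking idea from the comment above):

```python
import numpy as np

def make_lags_no_wraparound(y, lags):
    # Sketch: one column per lag, pre-filled with the series mean where no
    # history exists, so arbitrary lag sets like [1, 4] never leak future
    # values the way np.roll's wrap-around does.
    lags = np.atleast_1d(np.asarray(lags, dtype=int))
    n = y.shape[0]
    X = np.full((n, lags.size), np.mean(y))
    for col, lag in enumerate(lags):
        # Rows 0..lag-1 have no observation `lag` steps back; keep the fill.
        X[lag:, col] = y[: n - lag]
    return X
```

For `y = [0, 1, 2, 3, 4]` and `lags = [1, 4]`, the lag-4 column correctly keeps the mean fill in its first four rows instead of wrapped-around future values.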
```python
self._method = get_estimation_method(self.method)
if self._method._path_based_method:
    raise ValueError("Path-based methods are not supported for LinearTerm.")
```
The error messages in `_prepare_term()` refer to `LinearTerm`, but this is `TimeSeriesTerm` / `RegularizedTimeSeriesTerm`. This makes debugging confusing when users select an unsupported method. Update the message strings to reference the correct class names.
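A small sketch of one way to keep the message correct in every subclass: derive it from `type(self).__name__` instead of hard-coding a class name (the classes here are illustrative, not the real ondil term hierarchy):

```python
class TermSketch:
    """Illustrative base class; the real check would live in _prepare_term()."""

    def _prepare_term(self, path_based_method: bool) -> None:
        if path_based_method:
            # type(self).__name__ resolves to the concrete subclass, so
            # TimeSeriesTermSketch reports itself, not its parent.
            raise ValueError(
                f"Path-based methods are not supported for {type(self).__name__}."
            )

class TimeSeriesTermSketch(TermSketch):
    pass
```

This removes the copy-paste hazard entirely: new term subclasses get a correct message for free.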
```python
class LinearConstrainedCoordinateDescent(EstimationMethod):
    r"""Linear Constrained (unconstrained) Estimation.

    We use `numba` to speed up the coordinate descent algorithm.
    """
```
`LinearConstrainedCoordinateDescent` is defined here in `ridge.py` and also in `methods/linear_constrained.py` (same class name/purpose). Keeping two implementations increases the maintenance burden and makes it unclear which one is the supported public API. Consider removing one (or re-exporting a single implementation) to avoid divergence.
```python
from . import (
    base,
    diagnostics,
    distributions,
    error,
    estimators,
    incremental_statistics,
    information_criteria,
    links,
    logging,
    methods,
    scaler,
    terms,
    utils,
    warnings,
)

logging.set_log_level("INFO")
```
Calling `logging.set_log_level("INFO")` at import time introduces global side effects (it installs a stderr sink and emits a SUCCESS log). This can be surprising for library users and can interfere with applications/tests that configure logging themselves. Prefer leaving logging unconfigured by default and letting users opt in (or gate this behind an env var / explicit config call).
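One opt-in pattern, sketched with the stdlib `logging` module as an analogue for the Loguru setup (the env-var name `ONDIL_LOG_LEVEL` and the helper function are hypothetical):

```python
import logging as pylogging
import os

def configure_ondil_logging() -> None:
    """Hypothetical opt-in setup: the library stays silent by default and
    only attaches a real handler when the user sets ONDIL_LOG_LEVEL."""
    logger = pylogging.getLogger("ondil")
    level = os.environ.get("ONDIL_LOG_LEVEL")  # assumed variable name
    if level is None:
        # Library default per the stdlib's guidance: attach a NullHandler
        # and emit nothing unless the application configures logging.
        logger.addHandler(pylogging.NullHandler())
        return
    handler = pylogging.StreamHandler()
    handler.setFormatter(
        pylogging.Formatter("%(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.setLevel(level.upper())
```

The same shape works with Loguru: skip the sink installation entirely unless the env var (or an explicit `set_log_level()` call) asks for it.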
```python
forget_weight = init_forget_vector(forget, X.shape[0])
effective_weight = sample_weight * forget_weight

mean = np.average(X, weights=sample_weight * effective_weight, axis=0)
```
`calculate_statistics()` applies `sample_weight` twice when computing the mean (`weights=sample_weight * effective_weight`, where `effective_weight` already includes `sample_weight`). This will bias the mean/variance whenever non-uniform weights are provided. Use `effective_weight` as the weights (or keep `forget_weight` separate) so weights are applied exactly once.
Suggested change:

```diff
-mean = np.average(X, weights=sample_weight * effective_weight, axis=0)
+mean = np.average(X, weights=effective_weight, axis=0)
```
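A quick self-contained check makes the bias concrete (variable names follow the snippet; `init_forget_vector` is replaced by an all-ones decay for clarity):

```python
import numpy as np

X = np.array([[0.0], [10.0]])
sample_weight = np.array([1.0, 3.0])
forget_weight = np.array([1.0, 1.0])  # no forgetting, to isolate the bug
effective_weight = sample_weight * forget_weight

# Buggy: sample_weight enters twice, so the weights become [1, 9].
buggy = np.average(X, weights=sample_weight * effective_weight, axis=0)

# Fixed: each observation is weighted exactly once (weights are [1, 3]).
fixed = np.average(X, weights=effective_weight, axis=0)
```

Here the buggy mean is 9.0 while the correctly weighted mean is 7.5, so the double weighting quietly skews results toward heavily weighted observations.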
```python
constraint_matrix: np.ndarray | None,
constraint_bounds: np.ndarray | None,
relaxation_method: Literal["alm", "dpga"] = "alm",
which_start_value: Literal[
    "previous_lambda", "previous_fit", "average"
] = "previous_lambda",
selection: Literal["cyclic", "random"] = "cyclic",
```
`online_linear_constrained_coordinate_descent_path()` declares `relaxation_method: Literal["alm", "dpga"]`, which is a different spelling from the earlier `"pdga"` and from the method-level API (`"pgda"`). This mismatch makes it very easy to pass an option that passes type checking but is never handled. Use a single spelling (e.g. `"pgda"`) consistently throughout signatures and conditionals.
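One way to enforce a single spelling is a shared `Literal` alias plus a runtime validator, sketched here (all names are hypothetical, and `"pgda"` is picked arbitrarily as the canonical spelling):

```python
from typing import Literal, get_args

# Hypothetical single source of truth for the option's spelling; every
# signature would import and reuse this alias instead of re-typing it.
RelaxationMethod = Literal["alm", "pgda"]

def validate_relaxation_method(method: str) -> str:
    """Reject misspellings at runtime, since Literal alone only helps
    callers who run a type checker."""
    allowed = get_args(RelaxationMethod)
    if method not in allowed:
        raise ValueError(
            f"Unknown relaxation_method {method!r}; expected one of {allowed}."
        )
    return method
```

With the alias in one place, a typo like `"dpga"` can no longer slip through one signature while being silently ignored by the conditionals.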