Add `remainder` vector to `glex` output

If `max_interaction` is limited below the corresponding limit in the model, our sums of `m`/`shap` + `intercept` no longer equal the global model prediction.
This applies to both `max_interaction` in `rpf` and `max_depth` in `xgboost`, since both `glex` methods have a `max_interaction` argument. (`rpf` technically even allows limiting interactions to certain predictors.)

To avoid silently dropping that information, we considered adding a `remainder` vector to the return value of `glex`. It could also be used to gauge whether `max_interaction` in `glex` was set to a reasonable value, or whether too much of the prediction was lost.
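
A minimal sketch of what such a `remainder` could look like, using made-up numbers rather than actual `glex` output. The names `pred`, `remainder` and `lost_share` are assumptions for illustration only; `m` and `intercept` merely mimic the components mentioned above, and the diagnostic at the end is just one possible way to judge how much was lost.

```r
# Toy stand-ins for a truncated decomposition; all values are made up.
# `pred` plays the role of the full model prediction for 4 observations.
pred <- c(2.0, 3.5, 1.0, 4.2)

# Terms that survive a low `max_interaction` (hypothetical values).
m <- cbind(
  x1      = c(0.5, 1.0, -0.5, 1.2),
  x2      = c(0.3, 0.9, -0.2, 1.0),
  "x1:x2" = c(0.1, 0.4,  0.0, 0.6)
)
intercept <- 1.0

# The remainder is whatever the kept terms plus intercept fail to reproduce.
remainder <- as.numeric(pred - (rowSums(m) + intercept))

# With the remainder added back, the sums match the prediction again.
stopifnot(isTRUE(all.equal(rowSums(m) + intercept + remainder, pred)))

# One possible diagnostic (an assumption, not an existing glex feature):
# share of the absolute deviation from the intercept that was lost.
lost_share <- sum(abs(remainder)) / sum(abs(pred - intercept))
```

A `lost_share` close to 0 would suggest the chosen `max_interaction` captures most of the prediction, while a large value would hint that it was set too low.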

Generally, adding a `remainder` (or similarly named) vector to `m` would complicate downstream handling in plot functions, `glex_vi`, `glex_explain` etc.  
Then again, these functions already allow limiting the output to terms below a given degree of interaction and/or terms with negligible contribution to the prediction, aggregating them under "Remaining terms" (as of now).  

This might lead to confusion, because we would then have two stages of remainder-ness: the terms left out by `glex`, and the terms left out by plot functions.
Maybe plot functions should just always include a "Remaining terms" element that, at least, contains the `glex`-remainder?
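
As a rough sketch of that idea, again with made-up numbers and not tied to any existing plot function: the terms a downstream function drops and a hypothetical `glex_remainder` vector could be folded into a single "Remaining terms" column, so the displayed terms still sum to the same total.

```r
# Hypothetical decomposition for 3 observations (values are made up).
m <- cbind(
  x1      = c(0.5, 1.0, -0.5),
  x2      = c(0.3, 0.9, -0.2),
  "x1:x2" = c(0.1, 0.4,  0.0)
)
# Stage 1: what a truncated decomposition left out (hypothetical vector).
glex_remainder <- c(0.1, 0.2, 0.7)

# Stage 2: a downstream function keeps only main effects (degree 1).
# The degree is inferred from the number of ":"-separated names.
degree <- lengths(strsplit(colnames(m), ":", fixed = TRUE))
keep <- degree <= 1

# Fold both kinds of left-out contributions into one element.
remaining_terms <- rowSums(m[, !keep, drop = FALSE]) + glex_remainder
shown <- cbind(m[, keep, drop = FALSE], "Remaining terms" = remaining_terms)

# The displayed terms still add up to the same total as before.
stopifnot(isTRUE(all.equal(rowSums(shown), rowSums(m) + glex_remainder)))
```

Keeping a single combined element avoids showing two differently defined remainders side by side, at the cost of hiding which stage a contribution was dropped in.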

