predictive templates: preview-launch reframe + bug-attribution cleanup #74
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -21,9 +21,6 @@ tags: | |
|
|
||
| Retail planners forecast unit sales at fine granularity — per store, per item, per day — to drive replenishment, promotions, and labour planning. Classical demand-forecasting models score one (store, item) series in isolation; they miss the fact that store A's sales of bread move with bakery sales across the chain, and that bakery sells through similarly to dairy. This template wires those hierarchies into a **Predictive** reasoner: a regression GNN trained over a heterogeneous Sale → Store, Sale → Item, Item → ItemFamily graph, so the model propagates signal through the store and product hierarchies while predicting per-Sale unit sales. | ||
|
|
||
| > [!IMPORTANT] | ||
| > The RelationalAI **predictive reasoner (GNN)** used in this template is in early access. The API surface (`GNN`, `PropertyTransformer`, task relationships) may still change between releases; check the `rai-predictive-modeling` and `rai-predictive-training` skills for current guidance before adapting to production data. | ||
|
|
||
| ## Who this is for | ||
|
|
||
| - Retail data scientists building per-(store, item, day) demand-forecasting pipelines who want to add hierarchical signal (item family, store cluster) without manually engineering features | ||
|
|
@@ -67,7 +64,7 @@ GRANT ALL PRIVILEGES ON SCHEMA FAVORITA_MINI.EXPERIMENTS TO APPLICATION RELATION | |
| ### Tools | ||
|
|
||
| - Python >= 3.10 | ||
| - RelationalAI Python SDK (`relationalai`) | ||
| - RelationalAI Python SDK with the predictive extra (`relationalai[gnn] == 1.4.2`) | ||
|
|
||
| ## Quickstart | ||
|
|
||
|
|
@@ -140,7 +137,7 @@ Test-set RMSE (per (city, family, week)): 150.8997 | |
| ``` | ||
|
|
||
| > [!NOTE] | ||
| > The GNN learns base-level demand and weekday/weekend seasonality cleanly. The December holiday spike is partially captured but under-shot — that's because the SDK's `has_time_column=True` temporal indexing is currently disabled in this template (see [Customize this template](#customize-this-template) and the troubleshooting note below). The pandas-level temporal split is preserved (we still train on the past and evaluate on the future), but the GNN itself sees the date as a flat datetime feature rather than a temporal index. When the SDK's time-aware mode is stable for this dataset shape, re-enabling it should improve the December-spike capture. | ||
| > The GNN learns base-level demand and weekday/weekend seasonality cleanly. The December holiday spike is partially captured but under-shot — Sale.date is exposed as a flat datetime feature, not a temporal index, so the GNN doesn't aggregate over time windows. The pandas-level temporal split is preserved (we still train on the past and evaluate on the future). To trade simplicity for tighter spike capture, see the "Use temporal indexing" variant in [Customize this template](#customize-this-template). | ||
|
|
||
| ## Template structure | ||
|
|
||
|
|
@@ -291,7 +288,7 @@ results_df = ( | |
|
|
||
| ## Customize this template | ||
|
|
||
| - **Re-enable temporal indexing** when the SDK ships a stable fix — set `has_time_column=True`, restore `time_col=[Sale.date]` in the PropertyTransformer, restore the date arg in the Train/Val/Test relationships (`f"{Sale} at {Any:date} has {Any:value}"`), and add `temporal_strategy="last"` to the `GNN(...)` constructor. The December holiday spike should predict better. | ||
| - **Use temporal indexing instead** — for tighter holiday/seasonal spike capture, set `has_time_column=True`, restore `time_col=[Sale.date]` in the PropertyTransformer, restore the date arg in the Train/Val/Test relationships (`f"{Sale} at {Any:date} has {Any:value}"`), and add `temporal_strategy="last"` to the `GNN(...)` constructor. Trades simplicity for the GNN aggregating over time windows. | ||
|
Contributor
One last place with the same issue.
||
| - **Forecast different granularity** — change `TEST_DAYS` / `VAL_DAYS` at the top of the script. Default is a 60-day test window after a 60-day val window. | ||
| - **Add weather, promotions calendar, holiday flags** — extend `Sale` with extra columns and add them to `PropertyTransformer.category` or `.continuous` as appropriate. The same hierarchical-graph + GNN scaffold absorbs new features without restructuring. | ||
| - **Bring more hierarchy in** — the bundled data has Item → ItemFamily. Real Favorita data has Item → Class → Family → Department. Define a `Class` and `Department` concept the same way `ItemFamily` is defined, add `Class → Family` and `Family → Department` edges, and the GNN propagates through deeper product hierarchies. | ||
|
|
@@ -353,14 +350,6 @@ model = Model("demand_forecasting_local_v2") # bump on each re-run if needed | |
| ``` | ||
| </details> | ||
|
|
||
| <details> | ||
| <summary>Train job failures with date columns at scale (<code>has_time_column=True</code>)</summary> | ||
|
|
||
| PyRel 1.0.x has a server-side `DateTime/VString` signature mismatch when `has_time_column=True` is paired with a date column at non-trivial dataset sizes. Symptoms include train jobs that hang at "Step 2/4: Preparing model for prediction" with no JOBS row, or fail with a SQL signature error. | ||
|
|
||
| Workaround (used as the default in this template): keep the date as a plain `datetime` feature in `PropertyTransformer`, but set `has_time_column=False` and drop `time_col` / `temporal_strategy`. Preserve the temporal split in pandas before building task tables. See [Customize this template](#customize-this-template) for the instructions to re-enable temporal indexing once the SDK fix lands. | ||
| </details> | ||
|
|
||
| ## Related templates | ||
|
|
||
| - **`subscriber_retention`** — sibling Predictive template using a regression GNN on a homogeneous call graph (no time column); useful as a comparison for the simpler-graph case | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -13,9 +13,8 @@ | |
| graph so the GNN can propagate signal through the store and item | ||
| hierarchies. | ||
| 3. Train a regression GNN predicting Sale.unit_sales. Sale.date is fed | ||
| as a plain datetime feature, not as a temporal index -- see the | ||
| `has_time_column=False` NOTE inline for the SDK workaround and the | ||
| README "Customize this template" for re-enabling temporal indexing. | ||
| as a plain datetime feature, not as a temporal index; the temporal | ||
| split is done in pandas before the task tables are built (see step 4). | ||
| 4. Generate per-Sale predictions on a forward-looking 60-day test window | ||
| (temporal split done in pandas before the task tables are built) and | ||
| aggregate to weekly per-(store, family) forecasts. | ||
|
|
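The temporal split mentioned in step 4 of the docstring can be sketched as follows. This is a minimal, dependency-free illustration and not the template's code: the script does the split in pandas, and `temporal_split`, `rows`, and the `date` key are hypothetical names. `VAL_DAYS` and `TEST_DAYS` mirror the README constants with their stated 60-day defaults; counting the last observed day as part of the test window is an assumption.

```python
from datetime import date, timedelta

VAL_DAYS = 60   # README default: 60-day validation window
TEST_DAYS = 60  # README default: 60-day test window after the validation window


def temporal_split(rows, val_days=VAL_DAYS, test_days=TEST_DAYS):
    """Split rows (dicts with a 'date' key) into train/val/test by date cutoffs.

    Assumption: the test window ends on, and includes, the last observed day.
    """
    last_day = max(r["date"] for r in rows)
    test_start = last_day - timedelta(days=test_days - 1)
    val_start = test_start - timedelta(days=val_days)
    train = [r for r in rows if r["date"] < val_start]
    val = [r for r in rows if val_start <= r["date"] < test_start]
    test = [r for r in rows if r["date"] >= test_start]
    return train, val, test


# Hypothetical toy data: one sale per day for 200 consecutive days.
start = date(2017, 1, 1)
rows = [
    {"date": start + timedelta(days=i), "unit_sales": float(i % 7)}
    for i in range(200)
]
train, val, test = temporal_split(rows)
```

Every training row precedes every validation row, and every validation row precedes every test row, so the model is fit on the past and evaluated on the future. The split only controls which rows carry labels in each task table; it says nothing about which neighbors the GNN sees during message passing.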
@@ -131,13 +130,11 @@ | |
| continuous=[Store.cluster], | ||
| integer=[Item.item_class], | ||
| datetime=[Sale.date], | ||
| # NOTE: time_col disabled. The PyRel 1.0.x predictive backend has a known | ||
| # DateTime/VString signature mismatch when has_time_column=True is used | ||
| # with a date column at scale; the workaround is to keep the date as a | ||
| # plain datetime feature (above) and disable temporal indexing. The | ||
| # split is still temporal — see the train_mask / val_mask / test_mask | ||
| # assignments below — so we still train on the past and evaluate on | ||
| # the future, just without temporal-strategy aggregation in the GNN. | ||
| # Sale.date is exposed as a plain datetime feature above; we don't set | ||
|
Contributor
I haven't read the whole script end to end, but this statement seems a bit confusing to me. Even though the train/val/test split is temporal, if we do not provide a time column, the neighbor subgraphs that the GNN engine builds for each training node will include neighbors from the future. For example, in product recommendation, the subgraph for a customer node on 1/1/2026 will include links to future transactions (after 1/1) if the task is not temporal, which means information leakage. So I think either this sentence should change or the code should use the time column, depending on this specific case.
||
| # time_col here so the GNN treats it as a regular feature rather than a | ||
| # temporal index. The split is still temporal — see the | ||
| # train_mask / val_mask / test_mask assignments below — so we still | ||
| # train on the past and evaluate on the future. | ||
| ) | ||
|
|
||
| # -------------------------------------------------- | ||
|
|
||
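The leakage concern raised in the review comment on this hunk can be illustrated with a toy sketch. This is a conceptual example only and makes no claim about how the RelationalAI GNN engine samples neighbors internally: the `sales` records, the `neighbors_via_item` helper, and the `respect_time` flag are all hypothetical.

```python
from datetime import date

# Hypothetical toy graph: Sale -> Item edges, each sale stamped with a date.
sales = [
    {"id": "s1", "item": "bread", "date": date(2017, 11, 1)},
    {"id": "s2", "item": "bread", "date": date(2017, 12, 1)},
    {"id": "s3", "item": "bread", "date": date(2017, 12, 24)},
    {"id": "s4", "item": "milk", "date": date(2017, 12, 2)},
]


def neighbors_via_item(seed, all_sales, respect_time):
    """Return ids of other sales reachable from `seed` through its shared Item node."""
    out = []
    for s in all_sales:
        if s["id"] == seed["id"] or s["item"] != seed["item"]:
            continue
        # Time-aware variant: keep only neighbors dated strictly before the seed.
        if respect_time and s["date"] >= seed["date"]:
            continue
        out.append(s["id"])
    return out


seed = sales[1]  # a training node dated 2017-12-01
static_neighbors = neighbors_via_item(seed, sales, respect_time=False)
temporal_neighbors = neighbors_via_item(seed, sales, respect_time=True)
print(static_neighbors)    # includes s3, which is dated after the seed
print(temporal_neighbors)  # only the earlier sale s1
```

Without a time filter, the December 1 training node can aggregate information from the December 24 sale even though that sale lies in its future. A temporal split of the labels alone does not prevent this.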
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -31,13 +31,6 @@ Fraud and risk teams face four interconnected problems: discovering suspicious s | |
|
|
||
| **For the older rule-based-only take** (no ML), see `fraud_detection_rules.ipynb` -- a standalone notebook using Weakly Connected Components on shared-identifier edges to flag suspicious users. | ||
|
|
||
| > [!IMPORTANT] | ||
| > The RelationalAI **predictive reasoner (GNN)** used in this template is in | ||
| > early access. The API surface (`GNN`, `PropertyTransformer`, task | ||
| > relationships) may still change between releases; check the | ||
| > `rai-predictive-modeling` and `rai-predictive-training` skills for the | ||
| > current guidance before adapting to production data. | ||
|
|
||
| ## Who this is for | ||
|
|
||
| - Data scientists building end-to-end ML-to-optimization pipelines on transaction graphs | ||
|
|
@@ -89,7 +82,7 @@ you'll additionally need: | |
| ### Tools | ||
|
|
||
| - Python >= 3.10 | ||
| - RelationalAI Python SDK (`relationalai`) `==1.0.14` | ||
| - RelationalAI Python SDK with the predictive extra (`relationalai[gnn] == 1.4.2`) | ||
|
Collaborator
@pkouki is it true that you need to install using
||
| - For the rule-based notebook only: `jupyter` | ||
|
|
||
| ## Quickstart | ||
|
|
@@ -136,10 +129,8 @@ Snowflake dataset (accounts + transactions + train/val/test task tables): | |
| SCHEMA = "YOUR_SCHEMA" # schema with ACCOUNTS, TRANSACTIONS, TRAIN, VAL, TEST | ||
| ``` | ||
| 2. Adjust the `PropertyTransformer` to match your columns -- drop your PKs/FKs | ||
| explicitly, annotate categoricals and continuous fields, and -- if your | ||
| data is small enough that the GNN's datetime pipeline doesn't choke on it | ||
| -- set `time_col` on your timestamp column. (See the "has_time_column" | ||
| troubleshooting note below for the workaround at scale.) | ||
| explicitly, annotate categoricals and continuous fields, and set `time_col` | ||
| on your timestamp column. | ||
| 3. If your task tables use different column names, update the `Relationship` | ||
| templates (and any `TrainTable.<column>` accesses) to match. | ||
| 4. Run against a GPU-enabled RAI engine: | ||
|
|
@@ -342,10 +333,7 @@ alongside the raw transaction fields. | |
|
|
||
| Task relationships encode the `isFraud` label on train/val and omit it on | ||
| test. Both the local and Snowflake reference scripts use temporal | ||
| Relationships (`at {Any:step_ts}`) and `has_time_column=True`. At | ||
| multi-million-row scale the GNN's datetime pipeline can hit a server-side | ||
| `ValidationError` -- if you encounter that adapting to your own data, see | ||
| the troubleshooting block below for the workaround (drop temporal handling). | ||
| Relationships (`at {Any:step_ts}`) and `has_time_column=True`. | ||
|
|
||
| ```python | ||
| Train = Relationship(f"{Transaction} at {Any:step_ts} has {Any:label}") | ||
|
|
@@ -455,17 +443,6 @@ the tradeoff is visible. | |
| - Degenerate (selects 0 transactions): no transactions have an alert_score. Confirm `Transaction.predictions` was populated (test split present + GNN fit succeeded). | ||
| </details> | ||
|
|
||
| <details> | ||
| <summary><code>has_time_column=True</code> fails validation (two known triggers)</summary> | ||
|
|
||
| Known limitation in the predictive reasoner — the GNN's datetime feature pipeline can fail in two distinct cases: | ||
|
|
||
| 1. **Edge-intermediary case** (small-data trigger, documented in `rai-predictive-training`): when the concept carrying `time_col` is used only as an edge intermediary (not a node), validation fails with *"no time column defined in data tables"*. | ||
| 2. **Large-data trigger** (encountered while scaling this template's full Snowflake path): with a Snowflake `VARCHAR` ISO-8601 timestamp column loaded via `Table().to_schema()`, training fails server-side with *"ValidationError: Error processing datetime column 'step_ts'"* — even when the column is a node property, format is correct, and there are no NULLs. The bundled local CSV path (which uses `model.data(df).to_schema()` after `parse_dates=...`) does not hit this. | ||
|
|
||
| **Workaround for both:** set `has_time_column=False` in the `GNN(...)` constructor, drop `temporal_strategy=...`, strip the `at {Any:step_ts}` clauses from your Train/Val/Test relationship templates, and comment out `datetime=` and `time_col=` from your `PropertyTransformer`. Build the train/val/test split tables by `step` cutoff in SQL (the temporal split is preserved in the data even if the GNN can't use the timestamp as a feature). | ||
| </details> | ||
|
|
||
| <details> | ||
| <summary>Spinner floods the log when running in CI / non-TTY</summary> | ||
|
|
||
|
|
||
Same as before, I think.