predictive templates: preview-launch reframe + bug-attribution cleanup #74
Predictive is in preview; templates lead with predictive as the default rather than framing it as early-access or wrapping the GNN choice in workaround language.

- Remove early-access / private-preview IMPORTANT callouts from demand_forecasting, subscriber_retention, retail_planning, fraud-detection, and smoker_status_prediction READMEs.
- Strip "SDK iteration-mutation bug" / "dictionary changed size" framing from telco_network_recovery README, runbook, and code comments. Property-equality FK-column edges are preserved as a positive design pattern.
- Strip "PyRel 1.0.x DateTime/VString signature mismatch" framing from demand_forecasting docstring + inline comments; recast as a deliberate design choice (datetime as plain feature, pandas-side temporal split).
- Drop the has_time_column=True troubleshoot blocks from fraud-detection and demand_forecasting READMEs, and clean up dangling pointers to them.

Verified template API surface still matches latest pyrel main + the relationalai_gnns.TrainerConfig surface (task_type strings, eval_metric values, GNN(...) constructor params all resolve).
The docs preview for this pull request has been deployed to Vercel!
Three templates (demand_forecasting, subscriber_retention, fraud-detection) were pinned to relationalai==1.0.14, which predates PR #1030, the PR that introduced the predictive API surface these templates call (task_type, eval_metric, temporal_strategy, has_time_column). Those pins could not have installed cleanly against the shipped code.

Five of six templates were also missing the [gnn] extra. pyrel's predictive module imports from relationalai_gnns, which is only pulled in by [gnn]; without it, users hit ModuleNotFoundError at gnn.fit().

Aligns all six predictive templates to relationalai[gnn]==1.4.2 in pyproject.toml and the Tools sections of their READMEs. Matches the repo's exact-pin convention.
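For anyone reproducing the failure mode described above, a preflight check along these lines may help. It is only a sketch: the module name `relationalai_gnns`, the distribution name `relationalai`, and the `1.4.2` pin come from this PR, while the helper names (`module_available`, `installed_version`, `preflight`) are hypothetical and not part of any template.

```python
from importlib import metadata, util
from typing import List, Optional


def module_available(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return util.find_spec(module_name) is not None


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def preflight(
    module_name: str = "relationalai_gnns",  # module named in this PR
    dist_name: str = "relationalai",         # distribution named in this PR
    expected: str = "1.4.2",                 # pin named in this PR
) -> List[str]:
    """Collect human-readable problems; an empty list means the check passed."""
    problems = []
    if not module_available(module_name):
        problems.append(
            f"{module_name} is not importable; install {dist_name}[gnn]"
        )
    found = installed_version(dist_name)
    if found != expected:
        problems.append(f"{dist_name} version is {found!r}, expected {expected!r}")
    return problems
```

Calling `preflight()` before training would surface a missing extra as a readable message instead of a `ModuleNotFoundError` partway through the script.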
…gn choice

dev-templates-review caught three spots in demand_forecasting/README.md that still implied an SDK bug after the troubleshoot block was removed:

- NOTE about under-shot December spike: drop "the SDK's `has_time_column=True` temporal indexing is currently disabled" / "when the SDK's time-aware mode is stable", and reframe as the GNN seeing date as a flat feature, with a pointer to the temporal-indexing variant in Customize this template.
- Customize bullet: "Re-enable temporal indexing when the SDK ships a stable fix" -> "Use temporal indexing instead — set has_time_column=True, ...". This presents the two configurations as alternatives, not as workaround vs fix.
dafnianagno
left a comment
Everything looks good, except for the use of time in demand forecasting.
| # split is still temporal — see the train_mask / val_mask / test_mask | ||
| # assignments below — so we still train on the past and evaluate on | ||
| # the future, just without temporal-strategy aggregation in the GNN. | ||
| # Sale.date is exposed as a plain datetime feature above; we don't set |
I haven't seen the whole script end to end, but this statement seems a bit confusing to me. Even though the train/val/test split is temporal, if we do not provide a time column, the neighbor subgraphs that the GNN engine builds for every training node will include neighbors from the future. For example, in product recommendation, the subgraph for a customer node on 1/1/2026 will include links to future transactions (after 1/1) if the task is not temporal, which means information leakage. So I think either this sentence should change or we should use the time column in the code for this specific case.
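To make the leakage concern concrete, here is a toy sketch in plain Python. It does not reproduce how the GNN engine actually samples neighbors; the edge data, the `neighbors` helper, and the "at or before the seed time" cutoff rule are all assumptions chosen for illustration.

```python
from datetime import date

# Hypothetical (customer, transaction_date) edges; illustrative data only.
EDGES = [
    ("c1", date(2025, 12, 20)),
    ("c1", date(2025, 12, 28)),
    ("c1", date(2026, 1, 5)),
    ("c1", date(2026, 1, 9)),
    ("c2", date(2026, 1, 3)),
]


def neighbors(node, edges, seed_time=None):
    """Return the edge dates in a node's 1-hop neighborhood.

    With seed_time=None every edge is returned (non-temporal view).
    Otherwise only edges dated at or before seed_time are kept.
    """
    return [
        d for n, d in edges
        if n == node and (seed_time is None or d <= seed_time)
    ]


seed = date(2026, 1, 1)

# Non-temporal view: the subgraph for c1 contains every transaction.
static_view = neighbors("c1", EDGES)

# Time-aware view: only transactions known at the seed time.
temporal_view = neighbors("c1", EDGES, seed_time=seed)

# Edges that the non-temporal view exposes from after the seed time.
leaked = [d for d in static_view if d > seed]
```

In this toy setup the non-temporal view hands the model two January transactions for a node whose label is evaluated on 1/1/2026, even though the row-level train/val/test split is itself temporal.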
| > [!NOTE] | ||
| > The GNN learns base-level demand and weekday/weekend seasonality cleanly. The December holiday spike is partially captured but under-shot — that's because the SDK's `has_time_column=True` temporal indexing is currently disabled in this template (see [Customize this template](#customize-this-template) and the troubleshooting note below). The pandas-level temporal split is preserved (we still train on the past and evaluate on the future), but the GNN itself sees the date as a flat datetime feature rather than a temporal index. When the SDK's time-aware mode is stable for this dataset shape, re-enabling it should improve the December-spike capture. | ||
| > The GNN learns base-level demand and weekday/weekend seasonality cleanly. The December holiday spike is partially captured but under-shot — Sale.date is exposed as a flat datetime feature, not a temporal index, so the GNN doesn't aggregate over time windows. The pandas-level temporal split is preserved (we still train on the past and evaluate on the future). To trade simplicity for tighter spike capture, see the "Use temporal indexing" variant in [Customize this template](#customize-this-template). |
Same as before I think.
| ## Customize this template | ||
| - **Re-enable temporal indexing** when the SDK ships a stable fix — set `has_time_column=True`, restore `time_col=[Sale.date]` in the PropertyTransformer, restore the date arg in the Train/Val/Test relationships (`f"{Sale} at {Any:date} has {Any:value}"`), and add `temporal_strategy="last"` to the `GNN(...)` constructor. The December holiday spike should predict better. | ||
| - **Use temporal indexing instead** — for tighter holiday/seasonal spike capture, set `has_time_column=True`, restore `time_col=[Sale.date]` in the PropertyTransformer, restore the date arg in the Train/Val/Test relationships (`f"{Sale} at {Any:date} has {Any:value}"`), and add `temporal_strategy="last"` to the `GNN(...)` constructor. Trades simplicity for the GNN aggregating over time windows. |
One last place with the same issue.
| 4. Configure: | ||
| ```bash | ||
| rai init | ||
| ``` |
Right after this, we need to add the below note (the same note is in the retail_planning example):
After rai init generates the config file, add the following to your raiconfig.yaml:
data:
  ensure_change_tracking: true
The above is true for all examples that include predictive.
| GRANT USAGE ON DATABASE FAVORITA_MINI TO APPLICATION RELATIONALAI; | ||
| GRANT ALL PRIVILEGES ON SCHEMA FAVORITA_MINI.EXPERIMENTS TO APPLICATION RELATIONALAI; | ||
| ``` | ||
I would add a note:
> [!NOTE]
> Replace RELATIONALAI with the rai_app_name you set in raiconfig.yaml if it differs.

We have the same in the retail_planning example. Again, this applies to all templates.
My understanding is that the feature for changing the app name is for internal use only. External instances always use RELATIONALAI, which is why I do not usually include this kind of note in the docs.
somacdivad
left a comment
Approving so we can move fast on this, but I am confused about the pyproject.tomls and the note in the readme about requiring the gnn extra (pip install relationalai[gnn]). That's not a pattern we've ever used before for public features.
| - Python >= 3.10 | ||
| - RelationalAI Python SDK (`relationalai`) `==1.0.14` | ||
| - RelationalAI Python SDK with the predictive extra (`relationalai[gnn] == 1.4.2`) |
@pkouki is it true that you need to install using pip install relationalai[gnn]? Since this is public preview now, shouldn't it just come with the standard pip install relationalai?
Summary
Predictive is in preview. This PR brings the six predictive templates (`demand_forecasting`, `subscriber_retention`, `retail_planning`, `fraud-detection`, `telco_network_recovery`, `smoker_status_prediction`) into line with the launch.

1. Preview-launch reframe: remove early-access / private-preview `IMPORTANT` callouts from the five READMEs that carried them. Predictive is the default reasoner now, not an experimental one.
2. Bug-attribution cleanup: strip "SDK iteration-mutation bug" / "dictionary changed size during iteration" framing from `telco_network_recovery` (README, runbook, code comments) and "PyRel 1.0.x DateTime/VString signature mismatch" framing from `demand_forecasting` (module docstring, inline comment, README NOTE, Customize bullet). The underlying patterns (FK property-equality edges on GNN-graph concepts; date as a flat datetime feature with pandas-side temporal split) are preserved as positive design choices. Also drops the `has_time_column=True` troubleshoot `<details>` blocks from `fraud-detection` and `demand_forecasting` READMEs and cleans up the dangling "see troubleshooting block below" pointers that referenced them.
3. SDK pin harmonization: all six templates now pin `relationalai[gnn]==1.4.2` in `pyproject.toml` and the Tools section of their README. Previously three (`demand_forecasting`, `subscriber_retention`, `fraud-detection`) pinned `relationalai==1.0.14`, which predates the predictive interface PR and the API surface those templates actually call. Five of six were also missing the `[gnn]` extra. pyrel's predictive module imports from `relationalai_gnns`, which is only pulled in by `[gnn]`, so a clean install would hit `ModuleNotFoundError` at `gnn.fit()`.

Validation
Cross-checked every predictive API call in the six templates against pyrel main and `relationalai_gnns.TrainerConfig`:

- `task_type` strings used (`regression`, `binary_classification`, `repeated_link_prediction`) are all in the valid set exposed by pyrel's predictive estimator.
- `eval_metric` values (`rmse`, `roc_auc`, `link_prediction_map@12`) all resolve, including the `@k` form.
- Each `GNN(...)` constructor kwarg used (`has_time_column`, `device`, `n_epochs`, `lr`, `train_batch_size`, `head_layers`, `temporal_strategy`, `stream_logs`, `seed`, `max_iters`) exists in the current signature.

The bugs whose framing was removed are still real in pyrel main; this PR keeps the workaround patterns and drops the bug-attribution language. The skill (`rai-predictive-training/references/known-limitations.md`, in https://github.com/RelationalAI/rai-agent-evals/pull/96) documents the patterns for users who want depth.

Both review passes (`/dev-templates-review` over the diff and a final residue grep across all six templates for `SDK.*(bug|workaround|fix)`, `signature mismatch`, `iteration-mutation`, `stable fix`, `fix lands`, `early access`, `private preview`) come back clean.

Test plan
- `demand_forecasting.py` and `telco_network_recovery.py` end-to-end against a RAI-app-enabled Snowflake account with a fresh `relationalai[gnn]==1.4.2` install
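
The residue grep mentioned under Validation could be approximated with a small Python scan like the one below. This is a sketch under assumptions: the patterns are copied verbatim from the description, but the case-insensitive matching, the line-by-line scan, and the `find_residue` helper name are not taken from the actual review tooling.

```python
import re

# Patterns copied verbatim from the PR description.
RESIDUE_PATTERNS = [
    r"SDK.*(bug|workaround|fix)",
    r"signature mismatch",
    r"iteration-mutation",
    r"stable fix",
    r"fix lands",
    r"early access",
    r"private preview",
]

# Assumption: matching is case-insensitive; the original grep flags are unknown.
_RESIDUE_RE = re.compile(
    "|".join(f"(?:{p})" for p in RESIDUE_PATTERNS),
    re.IGNORECASE,
)


def find_residue(text):
    """Return (line_number, line) pairs for lines matching any residue pattern."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if _RESIDUE_RE.search(line)
    ]


sample = (
    "Re-enable when the SDK ships a stable fix\n"
    "Predictive is in preview.\n"
    "This is Early Access only."
)
hits = find_residue(sample)
```

Note that the literal patterns match "early access" and "private preview" with a space, so hyphenated spellings such as "early-access" would need their own patterns if they are meant to be caught.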