predictive templates: preview-launch reframe + bug-attribution cleanup #74

Merged
pkouki merged 3 commits into main from predictive-preview-launch-reframe
May 27, 2026

@cafzal cafzal commented May 26, 2026

Summary

Predictive is in preview. This PR brings the six predictive templates (demand_forecasting, subscriber_retention, retail_planning, fraud-detection, telco_network_recovery, smoker_status_prediction) into line with the launch.

1. Preview-launch reframe — remove early-access / private-preview IMPORTANT callouts from the five READMEs that carried them. Predictive is the default reasoner now, not an experimental one.

2. Bug-attribution cleanup — strip "SDK iteration-mutation bug" / "dictionary changed size during iteration" framing from telco_network_recovery (README, runbook, code comments) and "PyRel 1.0.x DateTime/VString signature mismatch" framing from demand_forecasting (module docstring, inline comment, README NOTE, Customize bullet). The underlying patterns — FK property-equality edges on GNN-graph concepts; date as a flat datetime feature with pandas-side temporal split — are preserved as positive design choices. Also drops the has_time_column=True troubleshoot <details> blocks from fraud-detection and demand_forecasting READMEs and cleans up the dangling "see troubleshooting block below" pointers that referenced them.

3. SDK pin harmonization — all six templates now pin relationalai[gnn]==1.4.2 in pyproject.toml and the Tools section of their README. Previously three (demand_forecasting, subscriber_retention, fraud-detection) pinned relationalai==1.0.14, which predates the predictive interface PR and the API surface those templates actually call. Five of six were also missing the [gnn] extra — pyrel's predictive module imports from relationalai_gnns, which is only pulled in by [gnn] — so a clean install would hit ModuleNotFoundError at gnn.fit().
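
The missing-extra failure described in item 3 can be caught before training starts. The following is a minimal pre-flight sketch, not part of the templates: the function name `find_install_problems` is hypothetical, and the only facts taken from this PR are the `relationalai` distribution name, the `relationalai_gnns` module name, and the `1.4.2` pin. It uses only the standard library.

```python
from importlib import metadata, util

EXPECTED_VERSION = "1.4.2"  # pin taken from this PR


def find_install_problems(
    dist_name: str = "relationalai",
    gnn_module: str = "relationalai_gnns",
    expected_version: str = EXPECTED_VERSION,
) -> list[str]:
    """Return human-readable problems; an empty list means the install looks fine.

    Hypothetical helper for illustration; it is not shipped with the SDK.
    """
    problems = []

    # 1. Is the SDK installed, and at the pinned version?
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        problems.append(f"{dist_name} is not installed")
    else:
        if installed != expected_version:
            problems.append(
                f"{dist_name}=={installed} found, expected =={expected_version}"
            )

    # 2. Is the module that the [gnn] extra pulls in importable?
    #    find_spec returns None for a missing top-level module.
    if util.find_spec(gnn_module) is None:
        problems.append(
            f"{gnn_module} is missing; install with: "
            f'pip install "{dist_name}[gnn]=={expected_version}"'
        )
    return problems
```

Calling `find_install_problems()` at the top of a template script and aborting on a non-empty result would surface the problem as a readable message rather than a `ModuleNotFoundError` partway through the run.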

Validation

Cross-checked every predictive API call in the six templates against pyrel main and relationalai_gnns.TrainerConfig:

  • task_type strings used (regression, binary_classification, repeated_link_prediction) are all in the valid set exposed by pyrel's predictive estimator.
  • eval_metric values (rmse, roc_auc, link_prediction_map@12) all resolve, including the @k form.
  • Every GNN(...) constructor kwarg used (has_time_column, device, n_epochs, lr, train_batch_size, head_layers, temporal_strategy, stream_logs, seed, max_iters) exists in the current signature.
  • Template-set hyperparameter values are deliberate per-template overrides; no stale defaults.
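
A kwarg cross-check like the one described above can be automated with `inspect.signature`. This is a sketch under stated assumptions: `StandInGNN` is a hypothetical class written for this example and does not reproduce the real `GNN(...)` signature, and `unknown_kwargs` is an illustrative helper, not a pyrel function.

```python
import inspect


def unknown_kwargs(func, kwargs):
    """Return the sorted names in `kwargs` that `func` cannot accept by keyword."""
    params = inspect.signature(func).parameters
    # A **kwargs catch-all accepts any name, so nothing can be flagged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    accepted = {
        name
        for name, p in params.items()
        if p.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    }
    return sorted(name for name in kwargs if name not in accepted)


# Hypothetical stand-in, NOT the real relationalai GNN signature.
class StandInGNN:
    def __init__(self, task_type, eval_metric, has_time_column=False,
                 n_epochs=10, lr=1e-3, seed=0):
        self.task_type = task_type
        self.eval_metric = eval_metric


# "n_epoch" is a deliberate typo for "n_epochs".
used = {"task_type": "regression", "eval_metric": "rmse", "n_epoch": 5}
print(unknown_kwargs(StandInGNN, used))  # ['n_epoch']
```

Passing the real constructor in place of `StandInGNN` would run the same check against whichever SDK version is installed. A signature check only confirms that the parameter names exist; it says nothing about whether the values (for example a `task_type` string) are valid.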

The bugs whose framing was removed are still real in pyrel main: this PR keeps the workaround patterns and drops only the bug-attribution language. The skill (rai-predictive-training/references/known-limitations.md, in https://github.com/RelationalAI/rai-agent-evals/pull/96) documents the patterns for users who want depth.

Both review passes (/dev-templates-review over the diff and a final residue grep across all six templates for SDK.*(bug|workaround|fix), signature mismatch, iteration-mutation, stable fix, fix lands, early access, private preview) come back clean.
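
For reference, the residue grep can be approximated in a few lines of Python. This is a sketch, not the command that was actually run: the patterns are copied from the list above, while the case-insensitive matching, the file suffixes, and the helper names `find_residue` and `scan_tree` are assumptions made for this example.

```python
import re
from pathlib import Path

# Patterns copied from the PR description.
RESIDUE_PATTERNS = [
    r"SDK.*(bug|workaround|fix)",
    r"signature mismatch",
    r"iteration-mutation",
    r"stable fix",
    r"fix lands",
    r"early access",
    r"private preview",
]
# Case-insensitive matching is an assumption of this sketch.
RESIDUE_RE = re.compile(
    "|".join(f"(?:{p})" for p in RESIDUE_PATTERNS), re.IGNORECASE
)


def find_residue(text):
    """Return (line_number, stripped_line) pairs that still contain residue wording."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if RESIDUE_RE.search(line)
    ]


def scan_tree(root, suffixes=(".md", ".py", ".toml")):
    """Map each matching file under `root` to its residue hits (suffixes are assumed)."""
    hits = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            found = find_residue(path.read_text(encoding="utf-8", errors="replace"))
            if found:
                hits[str(path)] = found
    return hits
```

An empty dictionary from `scan_tree` on a template directory corresponds to the "clean" result reported above.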

Test plan

  • Render the six README sections to confirm no orphan headings or broken anchors
  • Spot-run demand_forecasting.py and telco_network_recovery.py end-to-end against a RAI-app-enabled Snowflake account with a fresh relationalai[gnn]==1.4.2 install

Predictive is in preview; templates lead with predictive as the default rather
than framing it as early-access or wrapping the GNN choice in workaround
language.

- Remove early-access / private-preview IMPORTANT callouts from
  demand_forecasting, subscriber_retention, retail_planning, fraud-detection,
  and smoker_status_prediction READMEs.
- Strip "SDK iteration-mutation bug" / "dictionary changed size" framing from
  telco_network_recovery README, runbook, and code comments. Property-equality
  FK-column edges are preserved as a positive design pattern.
- Strip "PyRel 1.0.x DateTime/VString signature mismatch" framing from
  demand_forecasting docstring + inline comments; recast as a deliberate
  design choice (datetime as plain feature, pandas-side temporal split).
- Drop the has_time_column=True troubleshoot blocks from fraud-detection and
  demand_forecasting READMEs, and clean up dangling pointers to them.

Verified template API surface still matches latest pyrel main + the
relationalai_gnns.TrainerConfig surface (task_type strings, eval_metric
values, GNN(...) constructor params all resolve).

github-actions Bot commented May 26, 2026

The docs preview for this pull request has been deployed to Vercel!

✅ Preview: https://relationalai-docs-qve9ikcem-relationalai.vercel.app/build/templates
🔍 Inspect: https://vercel.com/relationalai/relationalai-docs/GDAPiD4mwbzFziECrkkj4py5DoED

Three templates (demand_forecasting, subscriber_retention, fraud-detection)
were pinned to relationalai==1.0.14, which predates PR #1030, the PR that
introduced the predictive API surface these templates call (task_type,
eval_metric, temporal_strategy, has_time_column). Those pins could not have
installed cleanly against the shipped code.

Five of six templates were also missing the [gnn] extra. pyrel's predictive
module imports from relationalai_gnns, which is only pulled in by [gnn] —
without it, users hit ModuleNotFoundError at gnn.fit().

Aligns all six predictive templates to relationalai[gnn]==1.4.2 in
pyproject.toml and the Tools sections of their READMEs. Matches the repo's
exact-pin convention.
@cafzal cafzal requested a review from pkouki May 26, 2026 22:32
…gn choice

dev-templates-review caught three spots in demand_forecasting/README.md that
still implied an SDK bug after the troubleshoot block was removed:

- NOTE about under-shot December spike: drop "the SDK's `has_time_column=True`
  temporal indexing is currently disabled" / "when the SDK's time-aware mode
  is stable" — reframe as the GNN seeing date as a flat feature, with a
  pointer to the temporal-indexing variant in Customize this template.
- Customize bullet: "Re-enable temporal indexing when the SDK ships a stable
  fix" -> "Use temporal indexing instead — set has_time_column=True, ..." —
  presents the two configurations as alternatives, not as workaround vs fix.

@dafnianagno dafnianagno left a comment

Everything looks good, except for the use of time in demand forecasting.

# split is still temporal — see the train_mask / val_mask / test_mask
# assignments below — so we still train on the past and evaluate on
# the future, just without temporal-strategy aggregation in the GNN.
# Sale.date is exposed as a plain datetime feature above; we don't set

Contributor comment:

I haven't read the whole script end to end, but this statement seems a bit confusing to me. Even though the train/val/test split is temporal, if we do not provide a time column, the neighbor subgraphs that the GNN engine builds for each training node will include neighbors from the future. For example, in product recommendation, the subgraph for a customer node on 1/1/2026 will include links to future transactions (after 1/1) if the task is not temporal, which is information leakage. So I think either this sentence should change or the code should use the time column for this specific case.
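
The leakage concern can be illustrated with a toy example. This sketch is not how the GNN engine samples neighbors. The edge list, the `neighbors` helper, and the "on or before the seed date" cutoff are all assumptions chosen to mirror the 1/1/2026 scenario in the comment.

```python
from datetime import date

# Toy customer -> transaction edges: (customer, transaction_id, transaction_date).
EDGES = [
    ("alice", "t1", date(2025, 12, 20)),
    ("alice", "t2", date(2025, 12, 31)),
    ("alice", "t3", date(2026, 1, 5)),  # dated after the seed date used below
    ("bob", "t4", date(2026, 1, 2)),
]


def neighbors(customer, seed_date=None):
    """Transactions linked to `customer`.

    With `seed_date` set, keep only transactions dated on or before it
    (time-aware sampling). With `seed_date=None`, keep everything.
    """
    return [
        tx
        for who, tx, when in EDGES
        if who == customer and (seed_date is None or when <= seed_date)
    ]


seed = date(2026, 1, 1)
print(neighbors("alice"))        # ['t1', 't2', 't3']  (t3 is from the future)
print(neighbors("alice", seed))  # ['t1', 't2']
```

Even if the training label for the 1/1/2026 node sits on the correct side of a temporal split, the unfiltered call still hands the model `t3`, which is the leakage described above. A time column gives the sampler what it needs to apply a cutoff like the second call.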


> [!NOTE]
> The GNN learns base-level demand and weekday/weekend seasonality cleanly. The December holiday spike is partially captured but under-shot — that's because the SDK's `has_time_column=True` temporal indexing is currently disabled in this template (see [Customize this template](#customize-this-template) and the troubleshooting note below). The pandas-level temporal split is preserved (we still train on the past and evaluate on the future), but the GNN itself sees the date as a flat datetime feature rather than a temporal index. When the SDK's time-aware mode is stable for this dataset shape, re-enabling it should improve the December-spike capture.
> The GNN learns base-level demand and weekday/weekend seasonality cleanly. The December holiday spike is partially captured but under-shot — Sale.date is exposed as a flat datetime feature, not a temporal index, so the GNN doesn't aggregate over time windows. The pandas-level temporal split is preserved (we still train on the past and evaluate on the future). To trade simplicity for tighter spike capture, see the "Use temporal indexing" variant in [Customize this template](#customize-this-template).

Contributor comment:

Same as before, I think.

## Customize this template

- **Re-enable temporal indexing** when the SDK ships a stable fix — set `has_time_column=True`, restore `time_col=[Sale.date]` in the PropertyTransformer, restore the date arg in the Train/Val/Test relationships (`f"{Sale} at {Any:date} has {Any:value}"`), and add `temporal_strategy="last"` to the `GNN(...)` constructor. The December holiday spike should predict better.
- **Use temporal indexing instead** — for tighter holiday/seasonal spike capture, set `has_time_column=True`, restore `time_col=[Sale.date]` in the PropertyTransformer, restore the date arg in the Train/Val/Test relationships (`f"{Sale} at {Any:date} has {Any:value}"`), and add `temporal_strategy="last"` to the `GNN(...)` constructor. Trades simplicity for the GNN aggregating over time windows.

Contributor comment:

One last place with the same issue.

4. Configure:
```bash
rai init
```

Contributor comment:

Right after this, we need to add the note below (the same note is in the retail_planning example):

After `rai init` generates the config file, add the following to your `raiconfig.yaml`:

data:
  ensure_change_tracking: true

The above applies to all examples that include predictive.

```sql
GRANT USAGE ON DATABASE FAVORITA_MINI TO APPLICATION RELATIONALAI;
GRANT ALL PRIVILEGES ON SCHEMA FAVORITA_MINI.EXPERIMENTS TO APPLICATION RELATIONALAI;
```

Contributor comment:

I would add a note:

[!NOTE] Replace RELATIONALAI with the rai_app_name you set in raiconfig.yaml if it differs.

We have the same note in the retail_planning example. Again, this applies to all templates.

Collaborator comment:

My understanding is that the feature for changing the app name is for internal use only. External instances always use RELATIONALAI, which is why I do not usually include this kind of note in the docs.

@somacdivad somacdivad left a comment

Approving so we can move fast on this, but I am confused about the pyproject.toml files and the note in the README about requiring the gnn extra (pip install relationalai[gnn]). That's not a pattern we've ever used before for public features.


- Python >= 3.10
- RelationalAI Python SDK (`relationalai`) `==1.0.14`
- RelationalAI Python SDK with the predictive extra (`relationalai[gnn] == 1.4.2`)

Collaborator comment:

@pkouki is it true that you need to install using pip install relationalai[gnn]? Since this is in public preview now, shouldn't it just come with the standard pip install relationalai?

@pkouki pkouki merged commit e5ca24c into main May 27, 2026
3 checks passed
@pkouki pkouki deleted the predictive-preview-launch-reframe branch May 27, 2026 14:23