From fdaf567aa37eb6ddb438a057a62f1239883950fc Mon Sep 17 00:00:00 2001
From: cafzal
Date: Tue, 26 May 2026 15:09:12 -0700
Subject: [PATCH 1/3] predictive templates: preview-launch reframe + bug-attribution cleanup

Predictive is in preview; templates lead with predictive as the default
rather than framing it as early-access or wrapping the GNN choice in
workaround language.

- Remove early-access / private-preview IMPORTANT callouts from
  demand_forecasting, subscriber_retention, retail_planning,
  fraud-detection, and smoker_status_prediction READMEs.
- Strip "SDK iteration-mutation bug" / "dictionary changed size" framing
  from telco_network_recovery README, runbook, and code comments.
  Property-equality FK-column edges are preserved as a positive design
  pattern.
- Strip "PyRel 1.0.x DateTime/VString signature mismatch" framing from
  demand_forecasting docstring + inline comments; recast as a deliberate
  design choice (datetime as plain feature, pandas-side temporal split).
- Drop the has_time_column=True troubleshoot blocks from fraud-detection
  and demand_forecasting READMEs, and clean up dangling pointers to them.

Verified template API surface still matches latest pyrel main + the
relationalai_gnns.TrainerConfig surface (task_type strings, eval_metric
values, GNN(...) constructor params all resolve).
--- v1/demand_forecasting/README.md | 11 ------- v1/demand_forecasting/demand_forecasting.py | 17 +++++------ v1/fraud-detection/README.md | 29 ++----------------- v1/retail_planning/README.md | 7 ----- v1/smoker_status_prediction/README.md | 3 -- v1/subscriber_retention/README.md | 3 -- v1/telco_network_recovery/README.md | 9 +----- v1/telco_network_recovery/runbook.md | 2 +- .../telco_network_recovery.py | 11 +++---- 9 files changed, 16 insertions(+), 76 deletions(-) diff --git a/v1/demand_forecasting/README.md b/v1/demand_forecasting/README.md index 0c09a3f..7d8ef6a 100644 --- a/v1/demand_forecasting/README.md +++ b/v1/demand_forecasting/README.md @@ -21,9 +21,6 @@ tags: Retail planners forecast unit sales at fine granularity — per store, per item, per day — to drive replenishment, promotions, and labour planning. Classical demand-forecasting models score one (store, item) series in isolation; they miss the fact that store A's sales of bread move with bakery sales across the chain, and that bakery sells through similarly to dairy. This template wires those hierarchies into a **Predictive** reasoner: a regression GNN trained over a heterogeneous Sale → Store, Sale → Item, Item → ItemFamily graph, so the model propagates signal through the store and product hierarchies while predicting per-Sale unit sales. -> [!IMPORTANT] -> The RelationalAI **predictive reasoner (GNN)** used in this template is in early access. The API surface (`GNN`, `PropertyTransformer`, task relationships) may still change between releases; check the `rai-predictive-modeling` and `rai-predictive-training` skills for current guidance before adapting to production data. - ## Who this is for - Retail data scientists building per-(store, item, day) demand-forecasting pipelines who want to add hierarchical signal (item family, store cluster) without manually engineering features @@ -353,14 +350,6 @@ model = Model("demand_forecasting_local_v2") # bump on each re-run if needed ``` -
-Train job failures with date columns at scale (has_time_column=True)
-
-PyRel 1.0.x has a server-side `DateTime/VString` signature mismatch when `has_time_column=True` is paired with a date column at non-trivial dataset sizes. Symptoms include train jobs that hang at "Step 2/4: Preparing model for prediction" with no JOBS row, or fail with a SQL signature error.
-
-Workaround (used as the default in this template): keep the date as a plain `datetime` feature in `PropertyTransformer`, but set `has_time_column=False` and drop `time_col` / `temporal_strategy`. Preserve the temporal split in pandas before building task tables. See [Customize this template](#customize-this-template) for the instructions to re-enable temporal indexing once the SDK fix lands.
- ## Related templates - **`subscriber_retention`** — sibling Predictive template using a regression GNN on a homogeneous call graph (no time column); useful as a comparison for the simpler-graph case diff --git a/v1/demand_forecasting/demand_forecasting.py b/v1/demand_forecasting/demand_forecasting.py index 5fe397c..70621fa 100644 --- a/v1/demand_forecasting/demand_forecasting.py +++ b/v1/demand_forecasting/demand_forecasting.py @@ -13,9 +13,8 @@ graph so the GNN can propagate signal through the store and item hierarchies. 3. Train a regression GNN predicting Sale.unit_sales. Sale.date is fed - as a plain datetime feature, not as a temporal index -- see the - `has_time_column=False` NOTE inline for the SDK workaround and the - README "Customize this template" for re-enabling temporal indexing. + as a plain datetime feature, not as a temporal index; the temporal + split is done in pandas before the task tables are built (see step 4). 4. Generate per-Sale predictions on a forward-looking 60-day test window (temporal split done in pandas before the task tables are built) and aggregate to weekly per-(store, family) forecasts. @@ -131,13 +130,11 @@ continuous=[Store.cluster], integer=[Item.item_class], datetime=[Sale.date], - # NOTE: time_col disabled. The PyRel 1.0.x predictive backend has a known - # DateTime/VString signature mismatch when has_time_column=True is used - # with a date column at scale; the workaround is to keep the date as a - # plain datetime feature (above) and disable temporal indexing. The - # split is still temporal — see the train_mask / val_mask / test_mask - # assignments below — so we still train on the past and evaluate on - # the future, just without temporal-strategy aggregation in the GNN. + # Sale.date is exposed as a plain datetime feature above; we don't set + # time_col here so the GNN treats it as a regular feature rather than a + # temporal index. 
The split is still temporal — see the + # train_mask / val_mask / test_mask assignments below — so we still + # train on the past and evaluate on the future. ) # -------------------------------------------------- diff --git a/v1/fraud-detection/README.md b/v1/fraud-detection/README.md index 4f57be7..b350714 100644 --- a/v1/fraud-detection/README.md +++ b/v1/fraud-detection/README.md @@ -31,13 +31,6 @@ Fraud and risk teams face four interconnected problems: discovering suspicious s **For the older rule-based-only take** (no ML), see `fraud_detection_rules.ipynb` -- a standalone notebook using Weakly Connected Components on shared-identifier edges to flag suspicious users. -> [!IMPORTANT] -> The RelationalAI **predictive reasoner (GNN)** used in this template is in -> early access. The API surface (`GNN`, `PropertyTransformer`, task -> relationships) may still change between releases; check the -> `rai-predictive-modeling` and `rai-predictive-training` skills for the -> current guidance before adapting to production data. - ## Who this is for - Data scientists building end-to-end ML-to-optimization pipelines on transaction graphs @@ -136,10 +129,8 @@ Snowflake dataset (accounts + transactions + train/val/test task tables): SCHEMA = "YOUR_SCHEMA" # schema with ACCOUNTS, TRANSACTIONS, TRAIN, VAL, TEST ``` 2. Adjust the `PropertyTransformer` to match your columns -- drop your PKs/FKs - explicitly, annotate categoricals and continuous fields, and -- if your - data is small enough that the GNN's datetime pipeline doesn't choke on it - -- set `time_col` on your timestamp column. (See the "has_time_column" - troubleshooting note below for the workaround at scale.) + explicitly, annotate categoricals and continuous fields, and set `time_col` + on your timestamp column. 3. If your task tables use different column names, update the `Relationship` templates (and any `TrainTable.` accesses) to match. 4. 
Run against a GPU-enabled RAI engine: @@ -342,10 +333,7 @@ alongside the raw transaction fields. Task relationships encode the `isFraud` label on train/val and omit it on test. Both the local and Snowflake reference scripts use temporal -Relationships (`at {Any:step_ts}`) and `has_time_column=True`. At -multi-million-row scale the GNN's datetime pipeline can hit a server-side -`ValidationError` -- if you encounter that adapting to your own data, see -the troubleshooting block below for the workaround (drop temporal handling). +Relationships (`at {Any:step_ts}`) and `has_time_column=True`. ```python Train = Relationship(f"{Transaction} at {Any:step_ts} has {Any:label}") @@ -455,17 +443,6 @@ the tradeoff is visible. - Degenerate (selects 0 transactions): no transactions have an alert_score. Confirm `Transaction.predictions` was populated (test split present + GNN fit succeeded). -
-has_time_column=True fails validation (two known triggers)
-
-Known limitation in the predictive reasoner — the GNN's datetime feature pipeline can fail in two distinct cases:
-
-1. **Edge-intermediary case** (small-data trigger, documented in `rai-predictive-training`): when the concept carrying `time_col` is used only as an edge intermediary (not a node), validation fails with *"no time column defined in data tables"*.
-2. **Large-data trigger** (encountered while scaling this template's full Snowflake path): with a Snowflake `VARCHAR` ISO-8601 timestamp column loaded via `Table().to_schema()`, training fails server-side with *"ValidationError: Error processing datetime column 'step_ts'"* — even when the column is a node property, format is correct, and there are no NULLs. The bundled local CSV path (which uses `model.data(df).to_schema()` after `parse_dates=...`) does not hit this.
-
-**Workaround for both:** set `has_time_column=False` in the `GNN(...)` constructor, drop `temporal_strategy=...`, strip the `at {Any:step_ts}` clauses from your Train/Val/Test relationship templates, and comment out `datetime=` and `time_col=` from your `PropertyTransformer`. Build the train/val/test split tables by `step` cutoff in SQL (the temporal split is preserved in the data even if the GNN can't use the timestamp as a feature).
-
Spinner floods the log when running in CI / non-TTY diff --git a/v1/retail_planning/README.md b/v1/retail_planning/README.md index 18bd603..5e1de0d 100644 --- a/v1/retail_planning/README.md +++ b/v1/retail_planning/README.md @@ -25,13 +25,6 @@ Retailers face interconnected decisions: which items will sell, which customers **Then adapt the pattern to your own Snowflake data** using `retail_planning.py` as a reference. It trains three GNNs (sales regression, customer-churn classification, user-article link prediction) against the full Kaggle H&M dataset in Snowflake, aggregates all three signals into an adjusted demand estimate, and feeds that into the same two optimizers. The H&M pipeline is the worked example -- the structure (graph concepts → GNN tasks → aggregation bridge → prescriptive constraints) is what carries over to your own retail, pricing, or demand-planning data. -> [!IMPORTANT] -> The RelationalAI **predictive reasoner (GNN)** used in this template is in -> private preview. The API surface (`GNN`, `PropertyTransformer`, task -> relationships) may still change between releases; check the -> `rai-predictive-modeling` and `rai-predictive-training` skills for the -> current guidance before adapting to production data. - ## Who this is for - Data scientists building end-to-end ML-to-optimization pipelines diff --git a/v1/smoker_status_prediction/README.md b/v1/smoker_status_prediction/README.md index ca43d12..5ed9b15 100644 --- a/v1/smoker_status_prediction/README.md +++ b/v1/smoker_status_prediction/README.md @@ -18,9 +18,6 @@ tags: Predicting health-related behaviors like smoking status from medical and demographic data is a common tabular machine learning task. In practice, though, these behaviors are also shaped by social context: friends, family, and peers often influence one another. 
This template demonstrates how to model both individual attributes and social relationships with a Graph Neural Network (GNN), using the RelationalAI **Predictive** reasoner to train a single end-to-end model. -> [!IMPORTANT] -> The RelationalAI **predictive reasoner (GNN)** used in this template is in early access. The API surface (`GNN`, `PropertyTransformer`, task relationships) may still change between releases; check the `rai-predictive-modeling` and `rai-predictive-training` skills for current guidance before adapting to production data. - ## Who this is for - Data scientists who want to leverage the relational structure of data stored across connected tables diff --git a/v1/subscriber_retention/README.md b/v1/subscriber_retention/README.md index 1be7614..edde226 100644 --- a/v1/subscriber_retention/README.md +++ b/v1/subscriber_retention/README.md @@ -23,9 +23,6 @@ tags: Telco retention teams need to score every active subscriber for churn risk so they can target proactive offers at the right people before contracts roll over. Traditional churn models lean on plan attributes (rate, term, auto-renew) and demographics; they ignore the network around each subscriber. This template wires a call-graph signal into the model: who you call, who calls you, and how central you sit in the call network all become features, and the **Predictive** reasoner trains a GNN regression head over them. The graph features come from the **Graph** reasoner (PageRank on the Subscriber→Subscriber call graph); aggregate-derived `outgoing_calls` / `incoming_calls` properties round out the per-subscriber feature row. -> [!IMPORTANT] -> The RelationalAI **predictive reasoner (GNN)** used in this template is in early access. The API surface (`GNN`, `PropertyTransformer`, task relationships) may still change between releases; check the `rai-predictive-modeling` and `rai-predictive-training` skills for current guidance before adapting to production data. 
- ## Who this is for - Telco data scientists building churn-risk scoring pipelines that combine static plan attributes with relational/network signal diff --git a/v1/telco_network_recovery/README.md b/v1/telco_network_recovery/README.md index 4be5cec..6e6b5a9 100644 --- a/v1/telco_network_recovery/README.md +++ b/v1/telco_network_recovery/README.md @@ -47,7 +47,7 @@ Each stage writes derived properties back to the same ontology that downstream s - **Accretive ontology enrichment** — each stage writes derived properties that downstream stages consume as first-class attributes. No glue code, no DataFrame round-trips between stages (except where the GNN's prediction shape needs a one-step pandas aggregation before binding back). - **Heterogeneous-graph GNN** — three FK / shared-MODEL edges (`EquipmentHealth → NetworkEquipment`, `NetworkEquipment → CellTower`, `ModelAdvisory → NetworkEquipment`) so advisory severity propagates to every fleet sibling AND reaches tower-mate equipment via 2-hop paths. -- **Property-equality edges** — the GNN graph defines edges via `==` between FK columns instead of `model.Relationship` traversal. This pattern sidesteps an SDK iteration-mutation bug and is the recommended shape for any concept that participates in a GNN graph and has cross-pointing relationships. +- **Property-equality edges** — the GNN graph defines edges via `==` between FK columns instead of `model.Relationship` traversal. FK properties on `NetworkEquipment` and `EquipmentHealth` carry the join keys explicitly so heterogeneous edges read as property-level equality conditions. - **Bridge concept** — per-equipment predictions are aggregated in pandas (`sum`) and loaded back as a `CellTower.failure_intensity` property via a small `TowerFailureScore` concept. Same pattern as in `retail_planning`. - **Three-branch rule** — `CellTower.is_critical_restore` is defined three times (OR semantics). 
A tower is critical if any branch fires; the third branch lets the GNN broaden scope beyond WEST. - **Three-factor MIP objective** — `capacity_increase × weighted_impact × failure_intensity`. Each factor comes from a different reasoner upstream. @@ -309,13 +309,6 @@ Verify with `SHOW GRANTS ON SCHEMA .EXPERIMENTS` — you should see `OWNERSH
-
-GNN training raises RuntimeError: dictionary changed size during iteration
-
-This is a known SDK issue when a concept that participates in the GNN graph also carries a `model.Relationship` (the iteration over `concept._relationships` mutates mid-loop). The template works around it by using **property-equality edges** — FK columns (`tower_id_fk`, `equipment_id_fk`) joined via `==` in edge definitions instead of relationship traversal. If you add new edges, keep this pattern.
-
-
Stage 4 returns an infeasible status diff --git a/v1/telco_network_recovery/runbook.md b/v1/telco_network_recovery/runbook.md index 7488d1e..e704dec 100644 --- a/v1/telco_network_recovery/runbook.md +++ b/v1/telco_network_recovery/runbook.md @@ -49,7 +49,7 @@ figures shift run to run; the structural outcome holds.) **Response** -Concepts: `CellTower`, `NetworkEquipment` (with `tower_id_fk` FK property), `EquipmentHealth` (with `equipment_id_fk` FK property), `NetworkPerformance`, `Subscriber`, `CallDetailRecord` (edge concept: caller → callee, routed_through tower), `TowerUpgradeOption` (composite key tower_id+tier), `ModelAdvisory` (PK: MODEL) — all bound to the bundled CSVs. The FK properties on NetworkEquipment and EquipmentHealth carry the join keys explicitly so Stage 2's GNN can define heterogeneous edges via property equality (the workaround for an SDK iteration-mutation bug when GNN-node concepts also carry `model.Relationship` cross-pointers). +Concepts: `CellTower`, `NetworkEquipment` (with `tower_id_fk` FK property), `EquipmentHealth` (with `equipment_id_fk` FK property), `NetworkPerformance`, `Subscriber`, `CallDetailRecord` (edge concept: caller → callee, routed_through tower), `TowerUpgradeOption` (composite key tower_id+tier), `ModelAdvisory` (PK: MODEL) — all bound to the bundled CSVs. The FK properties on NetworkEquipment and EquipmentHealth carry the join keys explicitly so Stage 2's GNN can define heterogeneous edges via property equality. ### 2. Examine ontology diff --git a/v1/telco_network_recovery/telco_network_recovery.py b/v1/telco_network_recovery/telco_network_recovery.py index 904d68c..ccbc768 100644 --- a/v1/telco_network_recovery/telco_network_recovery.py +++ b/v1/telco_network_recovery/telco_network_recovery.py @@ -199,11 +199,9 @@ # NetworkEquipment concept: equipment items (radios, antennas, BBUs, # amplifiers, ...) installed on cell towers; the GNN's prediction # target. 
tower_id_fk is an explicit FK property used by the GNN -# graph via property equality; we avoid `model.Relationship` on -# concepts in the GNN graph because the SDK's _collect_node_columns -# iterates concept._relationships during fit() and lazy-registration -# during iteration trips RuntimeError: dictionary changed size during -# iteration. +# graph via property equality, so heterogeneous edges are expressed as +# property-level equality conditions rather than `model.Relationship` +# traversal. NetworkEquipment = model.Concept("NetworkEquipment", identify_by={"id": String}) NetworkEquipment.equipment_type = model.Property(f"{NetworkEquipment} has {String:equipment_type}") NetworkEquipment.manufacturer = model.Property(f"{NetworkEquipment} has {String:manufacturer}") @@ -277,8 +275,7 @@ # TowerUpgradeOption are loaded just before their respective stages # below. Loading them up front keeps gnn.fit()'s transaction payload # small enough to avoid Snowflake's CREATE_TRANSACTION_V2 row-size -# limit, and avoids triggering the SDK's _collect_node_columns -# iteration-mutation bug on CellTower's relationship set. +# limit. # -------------------------------------------------- # Stage 1: Predictive -- equipment-failure binary GNN From 5b070f5d7fd18a52d42e933c8629aa01884d8824 Mon Sep 17 00:00:00 2001 From: cafzal Date: Tue, 26 May 2026 15:21:23 -0700 Subject: [PATCH 2/3] predictive templates: harmonize SDK pin to relationalai[gnn]==1.4.2 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three templates (demand_forecasting, subscriber_retention, fraud-detection) were pinned to relationalai==1.0.14 — predates PR #1030 which introduced the predictive API surface these templates call (task_type, eval_metric, temporal_strategy, has_time_column). Those pins could not have installed cleanly against the shipped code. Five of six templates were also missing the [gnn] extra. 
pyrel's predictive module imports from relationalai_gnns, which is only pulled in by [gnn] — without it, users hit ModuleNotFoundError at gnn.fit(). Aligns all six predictive templates to relationalai[gnn]==1.4.2 in pyproject.toml and the Tools sections of their READMEs. Matches the repo's exact-pin convention. --- v1/demand_forecasting/README.md | 2 +- v1/demand_forecasting/pyproject.toml | 2 +- v1/fraud-detection/README.md | 2 +- v1/fraud-detection/pyproject.toml | 2 +- v1/retail_planning/README.md | 2 +- v1/retail_planning/pyproject.toml | 2 +- v1/smoker_status_prediction/README.md | 2 +- v1/smoker_status_prediction/pyproject.toml | 2 +- v1/subscriber_retention/README.md | 2 +- v1/subscriber_retention/pyproject.toml | 2 +- v1/telco_network_recovery/README.md | 2 +- v1/telco_network_recovery/pyproject.toml | 2 +- 12 files changed, 12 insertions(+), 12 deletions(-) diff --git a/v1/demand_forecasting/README.md b/v1/demand_forecasting/README.md index 7d8ef6a..42d0cb0 100644 --- a/v1/demand_forecasting/README.md +++ b/v1/demand_forecasting/README.md @@ -64,7 +64,7 @@ GRANT ALL PRIVILEGES ON SCHEMA FAVORITA_MINI.EXPERIMENTS TO APPLICATION RELATION ### Tools - Python >= 3.10 -- RelationalAI Python SDK (`relationalai`) +- RelationalAI Python SDK with the predictive extra (`relationalai[gnn] == 1.4.2`) ## Quickstart diff --git a/v1/demand_forecasting/pyproject.toml b/v1/demand_forecasting/pyproject.toml index 931e802..7e1e73f 100644 --- a/v1/demand_forecasting/pyproject.toml +++ b/v1/demand_forecasting/pyproject.toml @@ -9,7 +9,7 @@ description = "RelationalAI template: demand_forecasting (PyRel v1)" readme = "README.md" requires-python = ">=3.10" dependencies = [ - "relationalai==1.0.14", + "relationalai[gnn]==1.4.2", "pandas", "numpy", ] diff --git a/v1/fraud-detection/README.md b/v1/fraud-detection/README.md index b350714..b8472e0 100644 --- a/v1/fraud-detection/README.md +++ b/v1/fraud-detection/README.md @@ -82,7 +82,7 @@ you'll additionally need: ### Tools - Python 
>= 3.10 -- RelationalAI Python SDK (`relationalai`) `==1.0.14` +- RelationalAI Python SDK with the predictive extra (`relationalai[gnn] == 1.4.2`) - For the rule-based notebook only: `jupyter` ## Quickstart diff --git a/v1/fraud-detection/pyproject.toml b/v1/fraud-detection/pyproject.toml index c73e88d..b4f528d 100644 --- a/v1/fraud-detection/pyproject.toml +++ b/v1/fraud-detection/pyproject.toml @@ -9,7 +9,7 @@ description = "RelationalAI template: fraud_detection (PyRel v1)" readme = "README.md" requires-python = ">=3.10" dependencies = [ - "relationalai==1.0.14", + "relationalai[gnn]==1.4.2", "pandas>=2.0", "numpy", "jupyter", diff --git a/v1/retail_planning/README.md b/v1/retail_planning/README.md index 5e1de0d..d988c87 100644 --- a/v1/retail_planning/README.md +++ b/v1/retail_planning/README.md @@ -76,7 +76,7 @@ you'll additionally need: ### Tools - Python >= 3.10 -- RelationalAI Python SDK (`relationalai`) == 1.4.1 +- RelationalAI Python SDK with the predictive extra (`relationalai[gnn] == 1.4.2`) ## Quickstart diff --git a/v1/retail_planning/pyproject.toml b/v1/retail_planning/pyproject.toml index 2d33c10..b48e7f4 100644 --- a/v1/retail_planning/pyproject.toml +++ b/v1/retail_planning/pyproject.toml @@ -9,7 +9,7 @@ description = "RelationalAI template: retail_planning (PyRel v1)" readme = "README.md" requires-python = ">=3.10" dependencies = [ - "relationalai[gnn]==1.4.1", + "relationalai[gnn]==1.4.2", "pandas>=2.0", ] diff --git a/v1/smoker_status_prediction/README.md b/v1/smoker_status_prediction/README.md index 5ed9b15..d3c98e9 100644 --- a/v1/smoker_status_prediction/README.md +++ b/v1/smoker_status_prediction/README.md @@ -54,7 +54,7 @@ Assumes familiarity with Python and basic ML concepts (binary classification, tr ### Tools - Python >= 3.10 -- RelationalAI Python SDK (`relationalai`) >= 1.4.2 +- RelationalAI Python SDK with the predictive extra (`relationalai[gnn] == 1.4.2`) ## Quickstart diff --git a/v1/smoker_status_prediction/pyproject.toml 
b/v1/smoker_status_prediction/pyproject.toml index a2fac6b..c59c23e 100644 --- a/v1/smoker_status_prediction/pyproject.toml +++ b/v1/smoker_status_prediction/pyproject.toml @@ -9,7 +9,7 @@ description = "RelationalAI template: smoker_status_prediction (PyRel v1)" readme = "README.md" requires-python = ">=3.10" dependencies = [ - "relationalai==1.4.2", + "relationalai[gnn]==1.4.2", "pandas>=2.0", ] diff --git a/v1/subscriber_retention/README.md b/v1/subscriber_retention/README.md index edde226..f200bf7 100644 --- a/v1/subscriber_retention/README.md +++ b/v1/subscriber_retention/README.md @@ -66,7 +66,7 @@ GRANT ALL PRIVILEGES ON SCHEMA TELCO_ENRICHMENT.EXPERIMENTS TO APPLICATION RELAT ### Tools - Python >= 3.10 -- RelationalAI Python SDK (`relationalai`) +- RelationalAI Python SDK with the predictive extra (`relationalai[gnn] == 1.4.2`) ## Quickstart diff --git a/v1/subscriber_retention/pyproject.toml b/v1/subscriber_retention/pyproject.toml index 61a9d4c..ba2deda 100644 --- a/v1/subscriber_retention/pyproject.toml +++ b/v1/subscriber_retention/pyproject.toml @@ -9,7 +9,7 @@ description = "RelationalAI template: subscriber_retention (PyRel v1)" readme = "README.md" requires-python = ">=3.10" dependencies = [ - "relationalai==1.0.14", + "relationalai[gnn]==1.4.2", "pandas", "numpy", ] diff --git a/v1/telco_network_recovery/README.md b/v1/telco_network_recovery/README.md index 6e6b5a9..1ef1d99 100644 --- a/v1/telco_network_recovery/README.md +++ b/v1/telco_network_recovery/README.md @@ -90,7 +90,7 @@ Each stage writes derived properties back to the same ontology that downstream s ### Tools - Python ≥ 3.10. -- RelationalAI Python SDK with the predictive submodule (`relationalai.semantics.reasoners.predictive`). +- RelationalAI Python SDK with the predictive extra (`relationalai[gnn] == 1.4.2`). 
### One-time Snowflake setup for GNN experiment artifacts diff --git a/v1/telco_network_recovery/pyproject.toml b/v1/telco_network_recovery/pyproject.toml index ef0c5c1..bbaf912 100644 --- a/v1/telco_network_recovery/pyproject.toml +++ b/v1/telco_network_recovery/pyproject.toml @@ -9,7 +9,7 @@ description = "RelationalAI template: telco_network_recovery (PyRel v1)" readme = "README.md" requires-python = ">=3.10" dependencies = [ - "relationalai==1.4.2", + "relationalai[gnn]==1.4.2", "pandas>=2.0", ] From 0973d1448414d2423307d3a73b6fc4931d96fb87 Mon Sep 17 00:00:00 2001 From: cafzal Date: Tue, 26 May 2026 15:42:13 -0700 Subject: [PATCH 3/3] demand_forecasting: reframe remaining has_time_column residue as design choice MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit dev-templates-review caught three spots in demand_forecasting/README.md that still implied an SDK bug after the troubleshoot block was removed: - NOTE about under-shot December spike: drop "the SDK's `has_time_column=True` temporal indexing is currently disabled" / "when the SDK's time-aware mode is stable" — reframe as the GNN seeing date as a flat feature, with a pointer to the temporal-indexing variant in Customize this template. - Customize bullet: "Re-enable temporal indexing when the SDK ships a stable fix" -> "Use temporal indexing instead — set has_time_column=True, ..." — presents the two configurations as alternatives, not as workaround vs fix. --- v1/demand_forecasting/README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/v1/demand_forecasting/README.md b/v1/demand_forecasting/README.md index 42d0cb0..beda46a 100644 --- a/v1/demand_forecasting/README.md +++ b/v1/demand_forecasting/README.md @@ -137,7 +137,7 @@ Test-set RMSE (per (city, family, week)): 150.8997 ``` > [!NOTE] -> The GNN learns base-level demand and weekday/weekend seasonality cleanly. 
The December holiday spike is partially captured but under-shot — that's because the SDK's `has_time_column=True` temporal indexing is currently disabled in this template (see [Customize this template](#customize-this-template) and the troubleshooting note below). The pandas-level temporal split is preserved (we still train on the past and evaluate on the future), but the GNN itself sees the date as a flat datetime feature rather than a temporal index. When the SDK's time-aware mode is stable for this dataset shape, re-enabling it should improve the December-spike capture. +> The GNN learns base-level demand and weekday/weekend seasonality cleanly. The December holiday spike is partially captured but under-shot — Sale.date is exposed as a flat datetime feature, not a temporal index, so the GNN doesn't aggregate over time windows. The pandas-level temporal split is preserved (we still train on the past and evaluate on the future). To trade simplicity for tighter spike capture, see the "Use temporal indexing" variant in [Customize this template](#customize-this-template). ## Template structure @@ -288,7 +288,7 @@ results_df = ( ## Customize this template -- **Re-enable temporal indexing** when the SDK ships a stable fix — set `has_time_column=True`, restore `time_col=[Sale.date]` in the PropertyTransformer, restore the date arg in the Train/Val/Test relationships (`f"{Sale} at {Any:date} has {Any:value}"`), and add `temporal_strategy="last"` to the `GNN(...)` constructor. The December holiday spike should predict better. +- **Use temporal indexing instead** — for tighter holiday/seasonal spike capture, set `has_time_column=True`, restore `time_col=[Sale.date]` in the PropertyTransformer, restore the date arg in the Train/Val/Test relationships (`f"{Sale} at {Any:date} has {Any:value}"`), and add `temporal_strategy="last"` to the `GNN(...)` constructor. Trades simplicity for the GNN aggregating over time windows. 
- **Forecast different granularity** — change `TEST_DAYS` / `VAL_DAYS` at the top of the script. Default is a 60-day test window after a 60-day val window. - **Add weather, promotions calendar, holiday flags** — extend `Sale` with extra columns and add them to `PropertyTransformer.category` or `.continuous` as appropriate. The same hierarchical-graph + GNN scaffold absorbs new features without restructuring. - **Bring more hierarchy in** — the bundled data has Item → ItemFamily. Real Favorita data has Item → Class → Family → Department. Define a `Class` and `Department` concept the same way `ItemFamily` is defined, add `Class → Family` and `Family → Department` edges, and the GNN propagates through deeper product hierarchies.
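
The date-cutoff split that the demand_forecasting changes rely on (train on the past, validate on the next window, test on the most recent window) can be illustrated with a small standalone sketch. This is not the template's code: the template performs the split in pandas, while the version below uses only the standard library so it runs anywhere. The row layout, the `temporal_split` helper, and the toy data are hypothetical; only the `TEST_DAYS` / `VAL_DAYS` names and their 60-day defaults come from the README text.

```python
from datetime import date, timedelta

# Window lengths mirror the defaults named in the README
# (a 60-day test window after a 60-day val window).
VAL_DAYS = 60
TEST_DAYS = 60


def temporal_split(rows, val_days=VAL_DAYS, test_days=TEST_DAYS):
    """Split rows into (train, val, test) by date cutoff.

    `rows` is a list of dicts with a `date` key. This layout is an
    assumption made for illustration, not the template's schema.
    """
    last_day = max(r["date"] for r in rows)
    # The test window covers the final `test_days` calendar days, inclusive.
    test_start = last_day - timedelta(days=test_days - 1)
    # The validation window is the `val_days` days immediately before it.
    val_start = test_start - timedelta(days=val_days)

    train = [r for r in rows if r["date"] < val_start]
    val = [r for r in rows if val_start <= r["date"] < test_start]
    test = [r for r in rows if r["date"] >= test_start]
    return train, val, test


# Toy data: one sale per day for 200 consecutive days.
start = date(2017, 1, 1)
rows = [
    {"date": start + timedelta(days=i), "unit_sales": float(i % 7)}
    for i in range(200)
]
train, val, test = temporal_split(rows)
```

Because every training row is strictly older than every validation row, and every validation row is strictly older than every test row, no future information leaks into training even when the date is also passed to the model as an ordinary feature.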