Skip to content

feat(wren-core): RLAC subqueries on MDL models + to-many calc on non-PK column#2335

Open
goldmedal wants to merge 2 commits into
Canner:mainfrom
goldmedal:feat/rlac-subquery-and-to-many-calc
Open

feat(wren-core): RLAC subqueries on MDL models + to-many calc on non-PK column#2335
goldmedal wants to merge 2 commits into
Canner:mainfrom
goldmedal:feat/rlac-subquery-and-to-many-calc

Conversation

@goldmedal
Copy link
Copy Markdown
Collaborator

@goldmedal goldmedal commented May 29, 2026

Summary

Two related improvements to the wren-core access-control / calculated-field machinery, backported as two commits:

1. fix(wren-core): to-many calculated field joined on a non-PK column

  • Materialize the source model's primary key when building a TO-MANY calculated field, so the calculated subquery can be joined back even when the relationship is keyed on a non-PK column (e.g. customer.mock_id = whitelist.mock_id).
  • Use physical columns (instead of visible columns) in Lineage and the WrenMDL symbol table so hidden calculated columns participate in lineage / qualified-name resolution — enabling hidden RLAC helper fields that are usable inside RLAC expressions yet non-selectable by users.

2. feat(wren-core): RLAC conditions may use subqueries on MDL models

  • RLAC conditions can now reference other Wren models via subqueries, e.g. c_custkey IN (SELECT allowed_id FROM allowed WHERE allowed_user = @session_user).
  • New RlacContextProvider resolves table references to MDL models during expression parsing. The default SessionState::create_logical_expr uses an empty table map and never consults the catalog, so such references previously failed with table not found.
  • RLAC parsing moves from ModelGenerationRule to ModelAnalyzeRule. The pre-parsed filter is carried on ModelPlanNode::rlac_filter and exposed via expressions(), so the analyzer's subquery traversal rewrites inner TableScans into ModelPlanNodes — referenced models' own RLAC/CLAC and remote-table mapping apply transitively.
  • Cycle detection via a shared building_models stack: A↔B and self-referential RLAC now report a planning error instead of looping.
  • collect_condition skips bare identifiers inside subqueries (different scope) while still collecting @property references everywhere.
  • extract_models (wren-core-py) keeps a model referenced solely by another model's RLAC subquery condition, recursively (multi-hop chains).

Test plan

  • cargo test --lib --tests --bins under core/wren-core — 140 passed; sqllogictest (view/tpch/type/model) green
  • cargo test --no-default-features under core/wren-core-py — 31 passed, incl. new 3-hop RLAC-chain extraction cases
  • cargo fmt --all + cargo clippy --all-targets --all-features -- -D warnings clean (both crates)
  • New wren-core tests: test_to_many_calculate_join_with_normal_column, test_rlac_on_to_many_calculated_field, test_rlac_on_to_many_hidden_calculated_field, test_rlac_with_cross_model_subquery, test_rlac_subquery_applies_inner_rlac, test_rlac_subquery_cycle_detected, test_rlac_self_reference_is_cycle

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Row-level access control now supports subqueries referencing other models, enabling cross-model filtering conditions and multi-hop access control chains.
    • Added detection of circular dependencies in row-level access control subquery references to prevent infinite loops.
  • Improvements

    • Enhanced lineage analysis to use physical column definitions for more accurate source column tracking.

Review Change Stack

goldmedal and others added 2 commits May 29, 2026 17:51
Materialize the source model's primary key when building a TO-MANY
calculated field, so the calculated subquery can be joined back even when
the relationship is keyed on a non-PK column (e.g.
`customer.mock_id = whitelist.mock_id`).

Also use physical columns (instead of visible columns) in `Lineage` and
the `WrenMDL` symbol table so hidden calculated columns participate in
lineage and qualified-name resolution. This lets hidden RLAC helper
fields be referenced inside RLAC expressions while remaining
non-selectable by users.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
RLAC conditions can now reference other Wren models via subqueries, e.g.
`c_custkey IN (SELECT allowed_id FROM allowed WHERE allowed_user = @session_user)`.

- A new `RlacContextProvider` resolves table references to MDL models
  during expression parsing. The default `SessionState::create_logical_expr`
  builds a context provider with an empty table map and never consults the
  catalog, so previously any table reference inside an RLAC condition failed
  with "table not found".
- RLAC parsing moves from `ModelGenerationRule` to `ModelAnalyzeRule`. The
  pre-parsed filter is carried on `ModelPlanNode::rlac_filter` and exposed via
  `expressions()`, so the analyzer's subquery traversal rewrites inner
  `TableScan`s into `ModelPlanNode`s — the referenced models' own RLAC/CLAC
  and remote table mapping apply transitively.
- Cycle detection via a shared `building_models` stack on `ModelAnalyzeRule`:
  A↔B and self-referential RLAC now report a planning error instead of looping.
- `collect_condition` skips bare identifiers inside subqueries (they belong to
  a different scope) while still collecting `@property` references everywhere.
- `extract_models` keeps models referenced solely by another model's RLAC
  subquery condition, recursively, so manifest extraction doesn't trim them.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added python Pull requests that update Python code rust Pull requests that update rust code core labels May 29, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 29, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e3795bb4-5e4a-4bf5-b687-fe5278cd34bc

📥 Commits

Reviewing files that changed from the base of the PR and between 12f30ef and cfcbf45.

📒 Files selected for processing (7)
  • core/wren-core-py/src/extractor.rs
  • core/wren-core/core/src/logical_plan/analyze/access_control.rs
  • core/wren-core/core/src/logical_plan/analyze/model_anlayze.rs
  • core/wren-core/core/src/logical_plan/analyze/model_generation.rs
  • core/wren-core/core/src/logical_plan/analyze/plan.rs
  • core/wren-core/core/src/mdl/lineage.rs
  • core/wren-core/core/src/mdl/mod.rs

Walkthrough

Extended row-level access control to parse and rewrite subqueries referencing other MDL models with cycle detection. Refactored RLAC filter computation to occur once during model analysis, storing the filter on ModelPlanNode and reusing it during generation. Updated manifest extraction to identify models kept only via RLAC subquery dependencies.

Changes

RLAC Subquery Support with Cycle Detection

Layer / File(s) Summary
RLAC Condition Visitor and Context Provider
core/wren-core/core/src/logical_plan/analyze/access_control.rs
Replaced closure-based expression visitation with ConditionVisitor that tracks subquery depth, validating outer-scope column references while collecting session properties anywhere in the tree. Added RlacContextProvider to resolve subquery table references against MDL models via DataFusion planning with custom context.
ModelPlanNode RLAC Filter Field and Expression Integration
core/wren-core/core/src/logical_plan/analyze/plan.rs
Added rlac_filter: Option<Expr> field to ModelPlanNode; integrated into expressions() as a trailing expression and reconstructed in with_exprs_and_inputs to survive DataFusion traversal. Added build_rlac_filter to compute combined AND-ed RLAC predicates during plan construction.
Model Analyzer RLAC Subquery Analysis and Cycle Detection
core/wren-core/core/src/logical_plan/analyze/model_anlayze.rs
Introduced build_model_plan_node with cycle detection via ModelStackGuard RAII; added analyze_rlac_subqueries and analyze_subquery_plan to recursively rewrite TableScan nodes inside RLAC filter subqueries, applying analyzer pipeline and RLAC/relationship logic to inner models.
Model Generation Simplified RLAC Filter Application
core/wren-core/core/src/logical_plan/analyze/model_generation.rs
Removed generate_row_level_access_control_filter and per-rule iteration; simplified filter application to reuse precomputed model_plan.rlac_filter from ModelPlanNode, eliminating duplicated RLAC computation.
MDL Column Resolution and Test Coverage
core/wren-core/core/src/mdl/lineage.rs, core/wren-core/core/src/mdl/mod.rs
Updated WrenMDL::new to build qualified references from model.columns instead of visible columns; changed Lineage::collect_source_columns to iterate physical columns. Extended test suite with RLAC/CLAC tests covering to-many calculated fields and multi-hop RLAC subquery chains with cycle detection.
Python Extractor RLAC Model Reference Extraction
core/wren-core-py/src/extractor.rs
Added internal helpers to parse RLAC condition SQL using GenericDialect and extract referenced model names by visiting ObjectNames and matching catalog/schema. Extended manifest extractor to identify and recursively keep models referenced only through RLAC subquery chains alongside relationship-derived models.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

rust, core, python

Suggested reviewers

  • douenergy

Poem

🐰 The Wren hops through access control's gate,
Where RLAC subqueries now validate,
Cycles caught with stacks held tight,
Filters parsed and expressions rewrite,
Multi-model queries, feature complete!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the two main changes: RLAC subqueries on MDL models and to-many calculated fields on non-PK columns.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@goldmedal goldmedal requested a review from douenergy June 1, 2026 01:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core python Pull requests that update Python code rust Pull requests that update rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant