feat: add flow authoring decorators#513
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughThis PR introduces a comprehensive provider-neutral flow authoring framework. It adds seven decorator types ( ChangesFlow Authoring Decorators
Possibly Related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Cairn Quality ReportCommit:
|
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0251da4b9f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (4)
docs/guides/developer-guide.md (4)
903-919: 💤 Low valueConsider clarifying the hardcoded partition date is illustrative.
The function returns a hardcoded date
"2026-05-18", which is clearly just an example. However, users might benefit from a comment indicating that real implementations would typically compute this dynamically (e.g., usingdatetime.date.today().isoformat()).📝 Suggested clarification
`@phlo.schedule`( name="daily_customer_health", cron="0 6 * * *", targets=["transform_silver_orders", "publish_gold_customer_health"], timezone="Europe/London", ) def daily_customer_health(): - return {"partition_date": "2026-05-18"} + # Return dynamic run parameters (e.g., current date for daily partitions) + return {"partition_date": "2026-05-18"} # Example: use datetime.date.today().isoformat()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/developer-guide.md` around lines 903 - 919, Clarify that the hardcoded partition date in the example schedule is illustrative: update the `@phlo.schedule` example (function daily_customer_health and its returned partition_date) to include a brief comment next to the return explaining this is just an example and that real implementations should compute the partition (e.g., using the current date or other logic) rather than hardcoding a static string.
812-828: ⚡ Quick winConsider documenting valid
materializedvalues.The example uses
materialized="incremental"without explaining what values are valid (e.g.,table,view,incremental) or what they mean. Users unfamiliar with this parameter might benefit from either inline comments or a reference to API documentation.💡 Suggested enhancement
`@phlo.transform.sql`( table="silver.orders", depends_on=["bronze.orders"], - materialized="incremental", + materialized="incremental", # Options: "table", "view", "incremental" )Or add a brief explanation after the example:
The `materialized` parameter supports: `"table"` (full refresh), `"view"` (virtual), or `"incremental"` (append/merge).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/developer-guide.md` around lines 812 - 828, The example for phlo.transform.sql using materialized="incremental" lacks documentation of valid materialized values; update the docs near the orders_sql example to list supported values for the materialized parameter (e.g., "table" = full refresh, "view" = virtual, "incremental" = append/merge) and include a short one-line description or pointer to the API reference so readers know the semantics and when to use each option.
830-841: ⚡ Quick winClarify the purpose of the return value.
The function returns
"gold.customer_health", which duplicates thetableparameter. Users might be unclear about whether a return value is required, what it should be, or how it's used by the framework. Consider adding a brief comment explaining the return semantics or usingpassif no return is needed.📝 Suggested clarification
Option 1 - If return is not needed:
`@phlo.publish`( table="gold.customer_health", audience=["cs", "sales"], owner="data-platform", freshness_hours=6, ) def customer_health(): - return "gold.customer_health" + pass # Decorator registers the publish specificationOption 2 - If return is meaningful, add a comment:
def customer_health(): - return "gold.customer_health" + return "gold.customer_health" # Returns the published table identifier🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/developer-guide.md` around lines 830 - 841, The example shows a phlo.publish-decorated function customer_health that returns "gold.customer_health" which duplicates the table="gold.customer_health" parameter; update the example to clarify the return semantics by either removing the return and using pass when no return is required, or keep the return but add an inline comment inside the customer_health function explaining what the returned string represents and how the framework uses it (e.g., whether it must match the table param or is optional), so readers know if a return value is required and what it should contain.
855-865: ⚡ Quick winCross-reference to
orders_sqlmight not be obvious.The function returns
orders_sql, which references the transform function from the earlier SQL Transform example (lines 822-827). Readers skimming the documentation might miss this connection. Consider adding a brief comment or rewording to make the relationship explicit.📝 Suggested clarification
`@phlo.backfill`( target="silver.orders", partitions={"start": "2026-01-01", "end": "2026-03-31"}, mode="replace-partitions", ) def orders_q1_backfill(): - return orders_sql + return orders_sql # References the transform function defined above🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/guides/developer-guide.md` around lines 855 - 865, Clarify that orders_q1_backfill returns the previously defined SQL transform by explicitly referencing orders_sql from the earlier SQL Transform example: update the surrounding text near the orders_q1_backfill example to add a brief sentence like "Here orders_q1_backfill returns the SQL transform defined earlier as orders_sql (see the SQL Transform example above)," so readers understand orders_sql refers to that transform; mention the symbol orders_sql and the function name orders_q1_backfill to make the cross-reference explicit.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/phlo/_flow_authoring.py`:
- Around line 45-56: The helper _call_with_optional_context currently
misclassifies callables with keyword-only, varargs, or multiple params and may
call them incorrectly; update it to first analyze inspect.signature(fn) and
allow only two safe shapes: no parameters (call fn()) or exactly one non-var
positional-or-keyword parameter (call fn(context)), and explicitly reject any
function that has KEYWORD_ONLY parameters, VAR_POSITIONAL, VAR_KEYWORD, or more
than one required positional parameter by raising a clear TypeError referencing
fn and its signature; keep RuntimeContext usage but guard unsupported signatures
before invoking the callable.
In `@src/phlo/transform.py`:
- Around line 58-69: The _static_sql_text helper incorrectly treats functions
with required keyword-only parameters as parameterless; update its
parameter-kind check in the list comprehension (used to build
required_parameters) to include inspect.Parameter.KEYWORD_ONLY along with
POSITIONAL_ONLY and POSITIONAL_OR_KEYWORD so that functions like def query(*,
ds: str) -> str are detected as requiring parameters and not invoked at
decoration time; modify the code path in _static_sql_text that builds
required_parameters to reference inspect.Parameter.KEYWORD_ONLY.
---
Nitpick comments:
In `@docs/guides/developer-guide.md`:
- Around line 903-919: Clarify that the hardcoded partition date in the example
schedule is illustrative: update the `@phlo.schedule` example (function
daily_customer_health and its returned partition_date) to include a brief
comment next to the return explaining this is just an example and that real
implementations should compute the partition (e.g., using the current date or
other logic) rather than hardcoding a static string.
- Around line 812-828: The example for phlo.transform.sql using
materialized="incremental" lacks documentation of valid materialized values;
update the docs near the orders_sql example to list supported values for the
materialized parameter (e.g., "table" = full refresh, "view" = virtual,
"incremental" = append/merge) and include a short one-line description or
pointer to the API reference so readers know the semantics and when to use each
option.
- Around line 830-841: The example shows a phlo.publish-decorated function
customer_health that returns "gold.customer_health" which duplicates the
table="gold.customer_health" parameter; update the example to clarify the return
semantics by either removing the return and using pass when no return is
required, or keep the return but add an inline comment inside the
customer_health function explaining what the returned string represents and how
the framework uses it (e.g., whether it must match the table param or is
optional), so readers know if a return value is required and what it should
contain.
- Around line 855-865: Clarify that orders_q1_backfill returns the previously
defined SQL transform by explicitly referencing orders_sql from the earlier SQL
Transform example: update the surrounding text near the orders_q1_backfill
example to add a brief sentence like "Here orders_q1_backfill returns the SQL
transform defined earlier as orders_sql (see the SQL Transform example above),"
so readers understand orders_sql refers to that transform; mention the symbol
orders_sql and the function name orders_q1_backfill to make the cross-reference
explicit.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 8403f96a-8b1e-4bf7-b1b2-09131f9f586a
📒 Files selected for processing (6)
docs/guides/developer-guide.mdsrc/phlo/__init__.pysrc/phlo/_flow_authoring.pysrc/phlo/flow.pysrc/phlo/transform.pytests/plugins/test_flow_authoring_decorators.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: docs / build
- GitHub Check: python / core tests (3.11)
- GitHub Check: frontend / observatory
- GitHub Check: python / core tests (3.12)
- GitHub Check: integration / required suites
🔇 Additional comments (8)
docs/guides/developer-guide.md (4)
805-810: LGTM!
843-853: LGTM!
867-884: LGTM!
886-901: LGTM!src/phlo/__init__.py (1)
119-149: LGTM!Also applies to: 197-241
tests/plugins/test_flow_authoring_decorators.py (3)
1-14: LGTM!
16-266: LGTM!
268-287: ⚡ Quick win[rewritten comment text]
[exactly ONE classification tag]
Summary
Tests