Skip to content

feat: add flow authoring decorators#513

Merged
iamgp merged 4 commits into
mainfrom
GWP/flow-authoring-decorators
May 18, 2026
Merged

feat: add flow authoring decorators#513
iamgp merged 4 commits into
mainfrom
GWP/flow-authoring-decorators

Conversation

@iamgp
Copy link
Copy Markdown
Collaborator

@iamgp iamgp commented May 18, 2026

Summary

  • add provider-neutral flow authoring decorators for transform SQL, publish surfaces, operational observability, repeatable backfills, governance contracts, access policies, and schedules
  • expose the new decorators through lazy top-level phlo exports
  • document the schedule model as static targets plus optional dynamic run parameters

Tests

  • uv run pytest tests/plugins -v
  • uv run ruff check src/phlo/flow.py src/phlo/init.py tests/plugins/test_flow_authoring_decorators.py
  • uv run ruff format --check src/phlo/flow.py src/phlo/init.py tests/plugins/test_flow_authoring_decorators.py

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 18, 2026

Review Change Stack

Warning

Rate limit exceeded

@iamgp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 52 minutes and 49 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 284ebd89-5b34-4380-8f4c-6e6075157677

📥 Commits

Reviewing files that changed from the base of the PR and between e267513 and 9328ae1.

📒 Files selected for processing (3)
  • src/phlo/_flow_authoring.py
  • src/phlo/transform.py
  • tests/plugins/test_flow_authoring_decorators.py
📝 Walkthrough

Walkthrough

This PR introduces a comprehensive provider-neutral flow authoring framework. It adds seven decorator types (@phlo.transform.sql, @phlo.publish, @phlo.observe, @phlo.backfill, @phlo.contract, @phlo.access, @phlo.schedule) with supporting utilities, dataclass specifications, module-level registries, and lazy top-level API exposure. Tests validate all decorators and exports; documentation demonstrates usage.

Changes

Flow Authoring Decorators

Layer / File(s) Summary
Helper utilities and specification types
src/phlo/_flow_authoring.py, src/phlo/flow.py
asset_key() normalises names into stable keys; contract_metadata() builds metadata payloads; build_run() wraps functions into RunSpec objects with success metadata; _call_with_optional_context() inspects function signatures; append_asset() appends to registries. Flow module defines frozen ContractSpec, AccessPolicySpec, and ScheduleSpec dataclasses and creates internal registries for all asset and spec types.
Transform SQL decorator and registry
src/phlo/transform.py
@phlo.transform.sql registers SQL transform assets with static or deferred SQL text determination via signature introspection; _static_sql_text() captures SQL at registration time when no required parameters exist, otherwise defers to runtime; registry accessors and clearers exposed.
Flow governance decorators and registries
src/phlo/flow.py
@publish, @observe, @backfill, @contract, @access, @schedule decorators register corresponding specs and assets with metadata (table, audience, freshness, roles, cron, etc.); each stores specs in module-level registries; getter and clear functions provided for each registry; __all__ exports all public APIs.
Top-level API lazy loading and exports
src/phlo/__init__.py
_FLOW_EXPORTS set lists new top-level symbols; _SUBMODULE_EXPORTS extended to include transform; __all__ updated to expose all flow and transform decorators/accessors; new __getattr__ branch resolves flow exports by dynamically importing from phlo.flow, caching in globals, and returning the symbol.
Decorator and lazy-export test validation
tests/plugins/test_flow_authoring_decorators.py
Nine test functions validate: @transform.sql asset registration and deferred SQL rendering; @publish data product registration; @observe operational check registration with conditional freshness/row-count specs; @backfill repeatable backfill job registration; @contract governance contract registration with SLA derivation; @access access policy registration; @schedule static schedule metadata with dynamic parameterisation; top-level phlo exports lazy-load and resolve correctly.
Flow authoring decorators user guide
docs/guides/developer-guide.md
New "Flow Authoring Decorators" section inserted before "Built-in Checks" with subsections and runnable examples for each decorator type, describing provider-neutral specification patterns.

Possibly Related PRs

  • phlohouse/phlo#108: Both PRs extend top-level phlo API exports via __all__ and lazy __getattr__ resolution; this PR adds flow-authoring decorators whilst that PR adds ingestion and quality decorators.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.04% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: add flow authoring decorators' accurately and concisely describes the main change: introducing new flow authoring decorators across multiple modules.
Description check ✅ Passed The description is directly related to the changeset, detailing the specific decorators added (transform SQL, publish, observe, backfill, contract, access, schedule), lazy exports, and documentation.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch GWP/flow-authoring-decorators

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 18, 2026

Cairn Quality Report

Commit: b4b927c · View full report

Checker Status ✅ Passed ❌ Failed Items
ruff passed 0 0 0
ruff-format pass 0 0 1
pytest-3.11 passed 968 0 969
pytest-3.12 passed 968 0 969

@iamgp
Copy link
Copy Markdown
Collaborator Author

iamgp commented May 18, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0251da4b9f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/phlo/transform.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
docs/guides/developer-guide.md (4)

903-919: 💤 Low value

Consider clarifying the hardcoded partition date is illustrative.

The function returns a hardcoded date "2026-05-18", which is clearly just an example. However, users might benefit from a comment indicating that real implementations would typically compute this dynamically (e.g., using datetime.date.today().isoformat()).

📝 Suggested clarification
 `@phlo.schedule`(
     name="daily_customer_health",
     cron="0 6 * * *",
     targets=["transform_silver_orders", "publish_gold_customer_health"],
     timezone="Europe/London",
 )
 def daily_customer_health():
-    return {"partition_date": "2026-05-18"}
+    # Return dynamic run parameters (e.g., current date for daily partitions)
+    return {"partition_date": "2026-05-18"}  # Example: use datetime.date.today().isoformat()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/developer-guide.md` around lines 903 - 919, Clarify that the
hardcoded partition date in the example schedule is illustrative: update the
`@phlo.schedule` example (function daily_customer_health and its returned
partition_date) to include a brief comment next to the return explaining this is
just an example and that real implementations should compute the partition
(e.g., using the current date or other logic) rather than hardcoding a static
string.

812-828: ⚡ Quick win

Consider documenting valid materialized values.

The example uses materialized="incremental" without explaining what values are valid (e.g., table, view, incremental) or what they mean. Users unfamiliar with this parameter might benefit from either inline comments or a reference to API documentation.

💡 Suggested enhancement
 `@phlo.transform.sql`(
     table="silver.orders",
     depends_on=["bronze.orders"],
-    materialized="incremental",
+    materialized="incremental",  # Options: "table", "view", "incremental"
 )

Or add a brief explanation after the example:

The `materialized` parameter supports: `"table"` (full refresh), `"view"` (virtual), or `"incremental"` (append/merge).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/developer-guide.md` around lines 812 - 828, The example for
phlo.transform.sql using materialized="incremental" lacks documentation of valid
materialized values; update the docs near the orders_sql example to list
supported values for the materialized parameter (e.g., "table" = full refresh,
"view" = virtual, "incremental" = append/merge) and include a short one-line
description or pointer to the API reference so readers know the semantics and
when to use each option.

830-841: ⚡ Quick win

Clarify the purpose of the return value.

The function returns "gold.customer_health", which duplicates the table parameter. Users might be unclear about whether a return value is required, what it should be, or how it's used by the framework. Consider adding a brief comment explaining the return semantics or using pass if no return is needed.

📝 Suggested clarification

Option 1 - If return is not needed:

 `@phlo.publish`(
     table="gold.customer_health",
     audience=["cs", "sales"],
     owner="data-platform",
     freshness_hours=6,
 )
 def customer_health():
-    return "gold.customer_health"
+    pass  # Decorator registers the publish specification

Option 2 - If return is meaningful, add a comment:

 def customer_health():
-    return "gold.customer_health"
+    return "gold.customer_health"  # Returns the published table identifier
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/developer-guide.md` around lines 830 - 841, The example shows a
phlo.publish-decorated function customer_health that returns
"gold.customer_health" which duplicates the table="gold.customer_health"
parameter; update the example to clarify the return semantics by either removing
the return and using pass when no return is required, or keep the return but add
an inline comment inside the customer_health function explaining what the
returned string represents and how the framework uses it (e.g., whether it must
match the table param or is optional), so readers know if a return value is
required and what it should contain.

855-865: ⚡ Quick win

Cross-reference to orders_sql might not be obvious.

The function returns orders_sql, which references the transform function from the earlier SQL Transform example (lines 822-827). Readers skimming the documentation might miss this connection. Consider adding a brief comment or rewording to make the relationship explicit.

📝 Suggested clarification
 `@phlo.backfill`(
     target="silver.orders",
     partitions={"start": "2026-01-01", "end": "2026-03-31"},
     mode="replace-partitions",
 )
 def orders_q1_backfill():
-    return orders_sql
+    return orders_sql  # References the transform function defined above
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/guides/developer-guide.md` around lines 855 - 865, Clarify that
orders_q1_backfill returns the previously defined SQL transform by explicitly
referencing orders_sql from the earlier SQL Transform example: update the
surrounding text near the orders_q1_backfill example to add a brief sentence
like "Here orders_q1_backfill returns the SQL transform defined earlier as
orders_sql (see the SQL Transform example above)," so readers understand
orders_sql refers to that transform; mention the symbol orders_sql and the
function name orders_q1_backfill to make the cross-reference explicit.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/phlo/_flow_authoring.py`:
- Around line 45-56: The helper _call_with_optional_context currently
misclassifies callables with keyword-only, varargs, or multiple params and may
call them incorrectly; update it to first analyze inspect.signature(fn) and
allow only two safe shapes: no parameters (call fn()) or exactly one non-var
positional-or-keyword parameter (call fn(context)), and explicitly reject any
function that has KEYWORD_ONLY parameters, VAR_POSITIONAL, VAR_KEYWORD, or more
than one required positional parameter by raising a clear TypeError referencing
fn and its signature; keep RuntimeContext usage but guard unsupported signatures
before invoking the callable.

In `@src/phlo/transform.py`:
- Around line 58-69: The _static_sql_text helper incorrectly treats functions
with required keyword-only parameters as parameterless; update its
parameter-kind check in the list comprehension (used to build
required_parameters) to include inspect.Parameter.KEYWORD_ONLY along with
POSITIONAL_ONLY and POSITIONAL_OR_KEYWORD so that functions like def query(*,
ds: str) -> str are detected as requiring parameters and not invoked at
decoration time; modify the code path in _static_sql_text that builds
required_parameters to reference inspect.Parameter.KEYWORD_ONLY.

---

Nitpick comments:
In `@docs/guides/developer-guide.md`:
- Around line 903-919: Clarify that the hardcoded partition date in the example
schedule is illustrative: update the `@phlo.schedule` example (function
daily_customer_health and its returned partition_date) to include a brief
comment next to the return explaining this is just an example and that real
implementations should compute the partition (e.g., using the current date or
other logic) rather than hardcoding a static string.
- Around line 812-828: The example for phlo.transform.sql using
materialized="incremental" lacks documentation of valid materialized values;
update the docs near the orders_sql example to list supported values for the
materialized parameter (e.g., "table" = full refresh, "view" = virtual,
"incremental" = append/merge) and include a short one-line description or
pointer to the API reference so readers know the semantics and when to use each
option.
- Around line 830-841: The example shows a phlo.publish-decorated function
customer_health that returns "gold.customer_health" which duplicates the
table="gold.customer_health" parameter; update the example to clarify the return
semantics by either removing the return and using pass when no return is
required, or keep the return but add an inline comment inside the
customer_health function explaining what the returned string represents and how
the framework uses it (e.g., whether it must match the table param or is
optional), so readers know if a return value is required and what it should
contain.
- Around line 855-865: Clarify that orders_q1_backfill returns the previously
defined SQL transform by explicitly referencing orders_sql from the earlier SQL
Transform example: update the surrounding text near the orders_q1_backfill
example to add a brief sentence like "Here orders_q1_backfill returns the SQL
transform defined earlier as orders_sql (see the SQL Transform example above),"
so readers understand orders_sql refers to that transform; mention the symbol
orders_sql and the function name orders_q1_backfill to make the cross-reference
explicit.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 8403f96a-8b1e-4bf7-b1b2-09131f9f586a

📥 Commits

Reviewing files that changed from the base of the PR and between 9f9b4c2 and e267513.

📒 Files selected for processing (6)
  • docs/guides/developer-guide.md
  • src/phlo/__init__.py
  • src/phlo/_flow_authoring.py
  • src/phlo/flow.py
  • src/phlo/transform.py
  • tests/plugins/test_flow_authoring_decorators.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: docs / build
  • GitHub Check: python / core tests (3.11)
  • GitHub Check: frontend / observatory
  • GitHub Check: python / core tests (3.12)
  • GitHub Check: integration / required suites
🔇 Additional comments (8)
docs/guides/developer-guide.md (4)

805-810: LGTM!


843-853: LGTM!


867-884: LGTM!


886-901: LGTM!

src/phlo/__init__.py (1)

119-149: LGTM!

Also applies to: 197-241

tests/plugins/test_flow_authoring_decorators.py (3)

1-14: LGTM!


16-266: LGTM!


268-287: ⚡ Quick win

[rewritten comment text]
[exactly ONE classification tag]

Comment thread src/phlo/_flow_authoring.py
Comment thread src/phlo/transform.py
@iamgp iamgp merged commit 1764d4f into main May 18, 2026
16 checks passed
@iamgp iamgp deleted the GWP/flow-authoring-decorators branch May 18, 2026 12:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant