diff --git a/README.md b/README.md
index ce11692..6e45c16 100644
--- a/README.md
+++ b/README.md
@@ -31,6 +31,7 @@ Type `/lfx-skills:lfx` and describe what you want in plain language:
 - **"Where does the meeting data flow live?"** — the router classifies the task and points at the owning repos plus the relevant central skill.
 - **"I'm adding a new V2 resource service"** — routes you to `/lfx-skills:lfx-platform-architecture` for platform flow, service class, and cross-service handoff points; the owning repo's path-scoped guidance handles Go conventions.
 - **"Does this API already exist?"** — `/lfx-skills:lfx` runs a read-only research pass to verify owning repos, contracts, examples, and blockers before implementation.
+- **"Generate a new silver dbt model"** — routes to `/lfx-skills:lfx-data-engineer` for medallion-layer conventions, sqlfluff formatting, tests, and dbt validation guidance.
 - **"Add or fix Intercom in this app"** — routes to `/lfx-skills:lfx-intercom`.
 - **"Add a CDP Snowflake connector"** — routes to `/lfx-skills:lfx-cdp-snowflake-connectors`.
 - **"Catch me up on my open PRs"** — routes to `/lfx-skills:lfx-pr-catchup`.
@@ -50,7 +51,7 @@ Canonical LFX knowledge that lives in this plugin and is referenced by every LFX
 | `/lfx-skills:lfx-itx-integration` | ITX wrapper patterns: OAuth2 M2M tokens, v1 KV sync, NATS ID mapping via `lfx.lookup_v1_mapping`. |
 | `/lfx-skills:lfx-intercom` | Retained central Intercom workflow from `main`, plus Fin AI optimization: Fin Guidance, Help Center content quality, and resolution rate. |
 
-### Workflow skills (7)
+### Workflow skills (8)
 
 Cross-repo developer workflows that apply across every LFX repo.
 
@@ -63,6 +64,7 @@ Cross-repo developer workflows that apply across every LFX repo.
 | `/lfx-skills:lfx-test-journey` | Combine feature branches across repos into git worktrees for end-to-end journey testing. |
 | `/lfx-skills:lfx-snowflake-access` | Request Snowflake access or service accounts via the `lfx-snowflake-terraform` repo. |
 | `/lfx-skills:lfx-cdp-snowflake-connectors` | Scaffold a CDP snowflake-connector data source in `crowd.dev`; retained centrally from `main`. |
+| `/lfx-skills:lfx-data-engineer` | Generate PR-ready dbt models, SQL transformations, and tests for `lf-dbt`, including medallion architecture, sqlfluff conventions, macros, and validation workflow guidance. |
 
 ### Platform skill (1)
 
@@ -111,6 +113,8 @@ Each agent locates its owning repo at runtime and uses repo-qualified paths for
 │   ├── lfx-test-journey/
 │   ├── lfx-snowflake-access/
 │   ├── lfx-cdp-snowflake-connectors/
+│   ├── lfx-data-engineer/       # dbt model + SQL transformation skill
+│   │   └── references/          # dbt setup, style, macros, testing, debugging
 │   └── lfx-v2-ticket-writer/
 ├── agents/
 │   ├── lfx-committee-service-code-reviewer.md
diff --git a/skills/lfx-data-engineer/SKILL.md b/skills/lfx-data-engineer/SKILL.md
new file mode 100644
index 0000000..0a55f93
--- /dev/null
+++ b/skills/lfx-data-engineer/SKILL.md
@@ -0,0 +1,497 @@
+---
+name: lfx-data-engineer
+description: >
+  Guide non-dbt developers through building PR-ready data models, tests, and
+  transformations in the lf-dbt repo. Encodes the medallion architecture
+  (bronze/silver/gold/platinum), Snowflake SQL conventions, sqlfluff formatting,
+  dbt testing patterns, key macros, and data governance rules. Use this skill
+  any time someone asks about writing dbt models, adding data tests, creating
+  SQL transformations, fixing pipeline failures, or contributing to the lf-dbt
+  repository.
+allowed-tools: Bash, Read, Write, Edit, Glob, Grep, AskUserQuestion
+---
+
+<!-- Copyright The Linux Foundation and each contributor to LFX. -->
+<!-- SPDX-License-Identifier: MIT -->
+<!-- Tool names in this file use Claude Code vocabulary. See docs/tool-mapping.md for other platforms. -->
+
+# LFX Data Engineering
+
+You are generating dbt models and SQL transformations that must be PR-ready. This skill encodes all conventions for the `lf-dbt` repository, which implements a medallion architecture data warehouse on Snowflake.
+
+**Prerequisites:** Snowflake access must be provisioned first (via `/lfx-skills:lfx-snowflake-access`).
+
+## Input Validation
+
+Before generating any code, verify that your arguments include:
+
+| Required | If Missing |
+|----------|------------|
+| Specific task (what to build/modify) | Stop and ask — do not guess |
+| Which medallion layer (bronze/silver/gold/platinum) | Infer from task, but confirm |
+| Data source name (for bronze) or upstream model (for silver+) | Stop and ask — never assume |
+| Target file path(s) | Infer from naming conventions, but verify they exist |
+| Example pattern to follow | Find one yourself (see Read Before Generating) |
+
+**If invoked with a `FIX:` prefix**, treat the request as an error correction: read the error, find the file, apply the targeted fix, and re-validate.
+
+## Read Before Generating — MANDATORY
+
+Before writing ANY code, you MUST:
+
+1. **Read the target file** (if modifying) — understand what's already there
+2. **Read one example file** in the same layer and domain — match the exact patterns
+3. **Read the relevant YML test file** — ensure your model will be tested consistently
+
+Do NOT generate code from memory alone. The codebase may have changed since your training data was collected.
+
+```bash
+# Example: before creating a new bronze model, read an existing one in the same source
+cat models/bronze/fivetran_platform/bronze_fivetran_platform_events.sql
+# And read the test file
+cat models/bronze/fivetran_platform/bronze_fivetran_platform_tests.yml
+```
+
+## License Header
+
+Every new `.sql` file MUST start with this header:
+
+```sql
+-- Copyright The Linux Foundation and each contributor to LFX.
+-- SPDX-License-Identifier: MIT
+```
+
+Every new `.yml` file MUST start with:
+
+```yaml
+# Copyright The Linux Foundation and each contributor to LFX.
+# SPDX-License-Identifier: MIT
+```
+
+## Completion Report
+
+When you finish, output a clear summary:
+
+```
+═══════════════════════════════════════════
+/lfx-data-engineer COMPLETE
+═══════════════════════════════════════════
+Files created:
+  - models/bronze/fivetran_platform/bronze_fivetran_platform_new_table.sql
+
+Files modified:
+  - models/bronze/fivetran_platform/bronze_fivetran_platform_tests.yml — added new_table tests
+
+Validation:
+  - Ran: sqlfluff lint models/bronze/fivetran_platform/bronze_fivetran_platform_new_table.sql
+  - Result: ✓ passed / ✗ failed with: <error>
+  - Ran: dbt compile --select bronze_fivetran_platform_new_table
+  - Result: ✓ passed / ✗ failed with: <error>
+
+Notes:
+  - Source table 'new_table' must exist in the fivetran_platform source definition
+
+Errors:
+  - (none)
+═══════════════════════════════════════════
+```
+
+**Always include the Validation section.** Run `sqlfluff lint` and `dbt compile` after creating or modifying files. Report the result.
+
+---
+
+## Medallion Architecture Quick Reference
+
+| Layer | Materialization | Schema | Purpose |
+|-------|----------------|--------|---------|
+| **Bronze** | `view` (default) | `bronze_*` (per source) | 1:1 with source data — column renames, type casting, filter deletes/test data |
+| **Silver** | `table` | `silver_dim`, `silver_fact` | Business logic, joins, reusable business objects |
+| **Gold** | `table` | `gold_*` (per domain) | Aggregated metrics for specific business use cases |
+| **Platinum** | `table` | `platinum*` (per product) | Pre-computed reports with time windows for dashboards |
+
+### References
+
+| Task | Reference |
+|------|-----------|
+| Environment setup, dbt commands, clone workflow | [references/getting-started.md](references/getting-started.md) |
+| Detailed layer guide with SQL examples and decision tree | [references/medallion-architecture.md](references/medallion-architecture.md) |
+| SQL formatting, keyword casing, indentation, CTEs, JOINs | [references/sql-style-guide.md](references/sql-style-guide.md) |
+| dbt test conventions, PII tagging, primary key tests | [references/testing-patterns.md](references/testing-patterns.md) |
+| Project macros: smart_source, format_timestamp, date ranges, deltas | [references/key-macros.md](references/key-macros.md) |
+| Troubleshooting build failures, sqlfluff, incremental issues | [references/debugging-pipelines.md](references/debugging-pipelines.md) |
+
+---
+
+## Creating a Model by Layer
+
+### Bronze — Source Extraction
+
+Bronze models are 1:1 with source tables. They rename columns, cast types, and filter out deleted/test records. No business logic.
+
+```sql
+-- Copyright The Linux Foundation and each contributor to LFX.
+-- SPDX-License-Identifier: MIT
+
+SELECT
+    id AS event_id,
+    event_title AS event_name,
+    event_start_date,
+    event_end_date,
+    created_date AS event_created_ts,
+    lastmodified_date AS event_updated_ts
+
+FROM {{ source('fivetran_platform', 'event') }}
+WHERE
+    NOT _fivetran_deleted
+    AND NOT is_test
+```
+
+**Bronze rules:**
+- Use `source()` to reference raw tables (or `smart_source()` for dev lookback)
+- Rename columns to snake_case with business-friendly names
+- Timestamps: suffix `_ts`; Dates: suffix `_date`; Booleans: prefix `is_` or `has_`
+- Filter `_fivetran_deleted` and test data rows
+- No JOINs — one source table per model
+- Use `get_warehouse('hourly')` in config if the source is large
+
+### Silver — Business Logic
+
+Silver models join bronze models, apply business rules, and create reusable objects. Split into `dim/` (dimensions) and `fact/` (facts).
+
+```sql
+-- Copyright The Linux Foundation and each contributor to LFX.
+-- SPDX-License-Identifier: MIT
+
+{% set warehouse = get_warehouse('hourly') %}
+
+{{ config(snowflake_warehouse=warehouse) }}
+
+/*
+Purpose:
+    Create a reusable project dimension with core Salesforce project attributes
+    and the latest project health score for downstream analytics.
+
+Questions answered:
+    - What are the canonical identifiers and names for each project?
+    - What is the current health score associated with each project?
+
+Data sources:
+    - bronze_fivetran_salesforce_projects
+    - silver_fact_crowd_dev_project_health_metrics
+*/
+
+WITH source_data AS (
+    SELECT
+        project_id,
+        project_name,
+        project_slug,
+        project_status
+    FROM {{ ref('bronze_fivetran_salesforce_projects') }}
+),
+
+enriched AS (
+    SELECT
+        s.project_id,
+        s.project_name,
+        s.project_slug,
+        s.project_status,
+        h.health_score
+    FROM source_data s
+    LEFT JOIN {{ ref('silver_fact_crowd_dev_project_health_metrics') }} h
+        ON s.project_slug = h.project_slug
+)
+
+SELECT
+    project_id,
+    project_name,
+    project_slug,
+    project_status,
+    health_score
+FROM enriched
+```
+
+**Silver rules:**
+- Use `ref()` to reference bronze or other silver models
+- CTEs for each logical step (one unit of work per CTE)
+- Verbose CTE names that describe what they do
+- Include a block comment at the top explaining purpose, questions answered, and data sources
+- `dim/` for slowly-changing attributes; `fact/` for events and transactions
+
+### Gold — Aggregated Metrics
+
+Gold models combine silver models into purpose-built datasets for specific use cases.
+
+```sql
+-- Copyright The Linux Foundation and each contributor to LFX.
+-- SPDX-License-Identifier: MIT
+
+{{ config(unique_key=["_key", "project_id"]) }}
+
+SELECT
+    ({{ dbt_utils.generate_surrogate_key(["c._key", "p.mapped_project_id"]) }}) AS activity_project_id,
+    c._key,
+    c.activity_id,
+    c.activity_ts,
+    p.mapped_project_id AS project_id,
+    p.mapped_project_slug AS project_slug
+
+FROM {{ ref("silver_fact_crowd_dev_activities") }} c
+LEFT JOIN {{ ref("_silver_dim_project_spine") }} p
+    ON c.project_id = p.base_project_id
+WHERE
+    p.mapped_project_id IS NOT NULL
+    AND {{ filter_code_contributions_non_bot('c') }}
+```
+
+**Gold rules:**
+- Use `dbt_utils.generate_surrogate_key()` for composite primary keys
+- Always specify `unique_key` in config for incremental models
+- Reference silver models via `ref()`, apply domain-specific macros
+- Final SELECT should explicitly list all columns — no `SELECT *`
+
+### Platinum — Pre-Computed Reports
+
+Platinum models produce dashboard-ready data with time-windowed aggregations.
+
+```sql
+-- Copyright The Linux Foundation and each contributor to LFX.
+-- SPDX-License-Identifier: MIT
+
+{% set warehouse = get_warehouse('hourly') %}
+
+{{ config(snowflake_warehouse=warehouse) }}
+
+WITH base AS (
+    SELECT
+        user_id,
+        event_id,
+        event_name,
+        event_start_date
+    FROM {{ ref('silver_fact_event_registrations') }}
+    WHERE event_name IS NOT NULL
+)
+
+SELECT
+    ({{ dbt_utils.generate_surrogate_key(['user_id', 'event_id']) }}) AS _key,
+    user_id,
+    event_id,
+    event_name,
+    event_start_date
+FROM base
+QUALIFY ROW_NUMBER() OVER (
+    PARTITION BY user_id, event_id
+    ORDER BY event_start_date
+) = 1
+```
+
+**Platinum rules:**
+- Use date range macros (`is_last_30_days`, `is_year_to_date`, etc.) for time windows
+- Use `get_warehouse()` for resource-intensive models
+- `GROUP BY ALL` is acceptable for complex aggregations
+- `QUALIFY` with `ROW_NUMBER()` for deduplication
+- Purpose-built for specific dashboards (PCC, Individual Dashboard, Org Dashboard)
+
+---
+
+## Writing Tests (YML)
+
+Every model needs a corresponding entry in a `*_tests.yml` file. Use `data_tests:` (not the deprecated `tests:`). Parameterized tests require the `arguments:` wrapper.
+
+```yaml
+# Copyright The Linux Foundation and each contributor to LFX.
+# SPDX-License-Identifier: MIT
+
+version: 2
+models:
+  - name: my_new_model
+    description: "What this model contains and its purpose."
+    columns:
+      - name: _key
+        description: "The unique primary key for the table."
+        data_tests:
+          - unique
+          - not_null
+          - dbt_utils.not_empty_string
+
+      - name: status
+        description: "The current status."
+        data_type: string
+        data_tests:
+          - not_null
+          - accepted_values:
+              arguments:
+                values: ["active", "inactive", "pending"]
+
+      - name: project_id
+        description: "Foreign key to the projects dimension."
+        data_type: string
+        data_tests:
+          - not_null
+          - relationships:
+              arguments:
+                to: ref('silver_dim_projects')
+                field: project_id
+
+      - name: email
+        description: "User email address"
+        data_type: string
+        config:
+          meta:
+            contains_pii: true
+            data_retention: "undefined"
+```
+
+See [references/testing-patterns.md](references/testing-patterns.md) for full conventions.
+
+---
+
+## SQL Style Rules (Summary)
+
+| Rule | Example |
+|------|---------|
+| Uppercase SQL keywords | `SELECT`, `FROM`, `WHERE`, `LEFT JOIN` |
+| Lowercase identifiers | `event_id`, `project_name` |
+| 4-space indentation | Indent columns under `SELECT`, conditions under `WHERE` |
+| Trailing commas | `event_id,` (not `, event_id`) |
+| CTEs over subqueries | Use `WITH ... AS (...)` instead of nested `SELECT` |
+| Default to `INNER JOIN` | Use `LEFT JOIN` only when right side may have no matches |
+| No `RIGHT JOIN` | Rewrite as `LEFT JOIN` |
+| No `SELECT DISTINCT` | Requires architect approval |
+| `GROUP BY` by position | `GROUP BY 1, 2` preferred over column names |
+| Explicit column lists | No `SELECT *` in final SELECT |
+| Pre-filter in CTEs | Complex filtering on joined tables belongs in a CTE before the join |
+
+See [references/sql-style-guide.md](references/sql-style-guide.md) for full formatting rules.
+
+---
+
+## Key Macros
+
+| Macro | Purpose | When to Use |
+|-------|---------|-------------|
+| `smart_source()` | Dev-friendly source wrapper with lookback | Bronze models reading from source tables |
+| `format_timestamp()` | Generate UTC `_ts` and local `_ts_local` columns | Bronze models normalizing timestamps |
+| `to_utc_timestamp()` | Convert local timestamp to UTC with dynamic timezone | When timezone is a column, not a constant |
+| `get_warehouse()` | Select warehouse by size (`default`, `hourly`, `medium`) | Large models needing specific compute |
+| `generate_alias_name` | Strips schema prefix from table name (e.g., `silver_dim_` → table name) | Automatic — configured in macros |
+| `is_last_7_days()`, `is_last_30_days()`, etc. | Date range filters for time windows | Platinum models with pre-computed periods |
+| `is_prev_7_days()`, `is_prev_30_days()`, etc. | Previous period for period-over-period comparison | Delta/change calculations |
+| `add_delta_columns()` | Generate `_prev`, `_diff`, `_delta` columns | Period-over-period metric comparisons |
+| `get_month()`, `get_quarter()` | Human-readable date labels | Display-friendly date columns |
+| `gdpr_filter_email()` | Exclude GDPR-suppressed emails | Any model exposing email addresses |
+| `filter_code_contributions_non_bot()` | Exclude bot code contributions | Code contribution models |
+| `format_country()` | Normalize country names to canonical values | Models with user-entered country data |
+| `comprehensive_email_filter()` | Validate email format + exclude test emails | Email-based models |
+
+See [references/key-macros.md](references/key-macros.md) for full documentation and usage examples.
+
+---
+
+## Data Governance
+
+### PII Tagging
+
+Columns containing personally identifiable information (names, emails, addresses, etc.) must be tagged in the YML file. Use `config.meta` — not top-level `meta`.
+
+```yaml
+columns:
+  - name: email
+    description: "User email address"
+    config:
+      meta:
+        contains_pii: true
+        data_retention: "undefined"
+```
+
+### Timestamp Normalization
+
+All timestamps must be normalized to UTC in the bronze layer:
+- Timestamps: `_ts` suffix, stored as `TIMESTAMP_NTZ` in UTC
+- Dates: `_date` suffix, stored as `DATE`
+- Use `format_timestamp()` macro for consistent conversion
+- Use Snowflake's `CONVERT_TIMEZONE()` for explicit timezone conversion
+
+### Primary Key Convention
+
+- Use `_key` suffix for primary key columns
+- Always add `unique`, `not_null`, and `dbt_utils.not_empty_string` tests
+
+---
+
+## Common Anti-Patterns — DO NOT DO THESE
+
+| Anti-Pattern | Correct Pattern |
+|-------------|-----------------|
+| Missing license header | Always add `-- Copyright The Linux Foundation...` |
+| `tests:` in YML | Use `data_tests:` (dbt v1.8+) |
+| `meta:` at top level in YML | Nest under `config:` → `meta:` |
+| Missing `arguments:` on parameterized tests | `accepted_values:` → `arguments:` → `values:` (dbt v1.10.5+) |
+| `tags:` at top level in YML | Nest under `config:` → `tags:` |
+| Duplicate `config:` keys in YML | Combine into a single `config:` block |
+| Custom keys directly in `config:` | Nest under `config:` → `meta:` |
+| `SELECT DISTINCT` | Use `GROUP BY` or `QUALIFY ROW_NUMBER()` |
+| `RIGHT JOIN` | Rewrite as `LEFT JOIN` |
+| Filtering right side of LEFT JOIN in `WHERE` | Filter in the `ON` clause or in a CTE |
+| `SELECT *` in final select | Explicitly list all columns |
+| Subqueries in `FROM` or `JOIN` | Use CTEs |
+| Raw `source()` in dev (large tables) | Use `smart_source()` with lookback |
+| Hardcoded warehouse name | Use `get_warehouse()` macro |
+| `console.log` / `print` debugging | Use `dbt compile` and `dbt show` |
+| Committing without `--signoff` or `-S` | Always use signed commits with DCO |
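+
+A minimal sketch of the `LEFT JOIN` filtering row above. The model names are borrowed from other examples in this skill, while the `project_id` join key and the date literal are assumptions made for illustration. Putting the right-side condition in `WHERE` discards unmatched rows and silently turns the join into an `INNER JOIN`; pre-filtering in a CTE keeps them.
+
+```sql
+-- Anti-pattern (shown as a comment): the WHERE clause drops every row
+-- where r.event_start_date is NULL, so projects without registrations vanish.
+-- SELECT p.project_id, r.event_id
+-- FROM projects p
+-- LEFT JOIN registrations r ON p.project_id = r.project_id
+-- WHERE r.event_start_date >= '2024-01-01'
+
+-- Preferred: filter the right side in a CTE, then join.
+WITH recent_registrations AS (
+    SELECT
+        project_id,  -- hypothetical column, assumed for this sketch
+        event_id
+    FROM {{ ref('silver_fact_event_registrations') }}
+    WHERE event_start_date >= '2024-01-01'
+)
+
+SELECT
+    p.project_id,
+    r.event_id
+FROM {{ ref('silver_dim_projects') }} p
+LEFT JOIN recent_registrations r
+    ON p.project_id = r.project_id
+```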
+
+---
+
+## Pre-PR Checklist
+
+### All Models
+- [ ] License header on all new `.sql` and `.yml` files
+- [ ] Model documented in corresponding `*_tests.yml` file
+- [ ] Primary key column(s) have `unique`, `not_null`, `dbt_utils.not_empty_string` tests
+- [ ] PII columns tagged with `config.meta.contains_pii: true` and `data_retention: "undefined"`
+- [ ] `sqlfluff lint` passes on all new/modified `.sql` files
+- [ ] `dbt compile --select +model_name` succeeds
+- [ ] Column naming follows conventions (`_ts`, `_date`, `is_`, `has_`, `_key`)
+- [ ] No `SELECT *` in final select statements
+- [ ] All timestamps normalized to UTC
+
+### Bronze Models
+- [ ] 1:1 with source table — no joins
+- [ ] Filters `_fivetran_deleted` and test data
+- [ ] Column renames to snake_case with business-friendly names
+- [ ] Uses `source()` or `smart_source()`
+
+### Silver Models
+- [ ] Uses `ref()` to reference upstream models
+- [ ] CTEs for each logical unit of work
+- [ ] Block comment explaining purpose and data sources
+- [ ] Placed in correct subfolder (`dim/` or `fact/`)
+
+### Gold Models
+- [ ] Surrogate key generated for composite keys
+- [ ] `unique_key` specified in config for incremental models
+- [ ] Final SELECT explicitly lists all columns
+
+### Platinum Models
+- [ ] Uses date range macros for time windows
+- [ ] `get_warehouse()` configured if resource-intensive
+- [ ] Purpose-built for a specific dashboard or use case
+
+---
+
+## Scope Boundaries
+
+**This skill DOES:**
+- Generate/modify dbt SQL models following medallion architecture
+- Create/update YML test files with proper data_tests format
+- Add source definitions for new data sources
+- Apply project macros (smart_source, format_timestamp, date ranges, etc.)
+- Run sqlfluff lint/fix validation after changes
+- Run dbt compile to verify model correctness
+
+**This skill does NOT:**
+- Run dbt build/test against the warehouse (use the `running-dbt-commands` skill)
+- Modify existing macros without architect review
+- Make architectural decisions about layer placement (ask the user)
+- Generate semantic layer definitions (use the `building-dbt-semantic-layer` skill)
+- Troubleshoot dbt Cloud job failures (use the `troubleshooting-dbt-job-errors` skill)
+- Modify protected infrastructure files (`dbt_project.yml`, `profiles.yml`, `packages.yml`) — flag for code owner
diff --git a/skills/lfx-data-engineer/references/debugging-pipelines.md b/skills/lfx-data-engineer/references/debugging-pipelines.md
new file mode 100644
index 0000000..9099262
--- /dev/null
+++ b/skills/lfx-data-engineer/references/debugging-pipelines.md
@@ -0,0 +1,328 @@
+<!-- Copyright The Linux Foundation and each contributor to LFX. -->
+<!-- SPDX-License-Identifier: MIT -->
+
+# Debugging Pipelines
+
+Common failure patterns in the lf-dbt project and how to resolve them.
+
+---
+
+## dbt compile Failures
+
+`dbt compile` catches Jinja errors and broken `ref()`/`source()` references
+without building anything in Snowflake. Always run it before `dbt build`.
+
+```bash
+dbt compile --select +model_name
+```
+
+### Missing `ref()` or `source()` Target
+
+**Error:** `Compilation Error: ... depends on a node named '...' which was not found`
+
+**Cause:** The model references a table or source that doesn't exist.
+
+**Fix:**
+1. Check spelling — model names must match exactly (case-sensitive in YAML)
+2. Verify the upstream model exists: `find models -name '*model_name*'`
+3. For sources, check the source definition YAML file exists
+4. Run `dbt deps` if the reference is to a package model
+
+### Jinja Syntax Error
+
+**Error:** `Compilation Error: ... unexpected '}'` or `expected token 'end of statement block'`
+
+**Cause:** Malformed Jinja template syntax.
+
+**Fix:**
+1. Check for unmatched `{% %}` or `{{ }}` blocks
+2. Verify macro calls have the right number of arguments
+3. Look for missing commas in macro arguments
+4. Check that `config()` calls use valid keyword-argument syntax (`key=value`, comma-separated)
+
+### Undefined Macro
+
+**Error:** `Compilation Error: 'macro_name' is undefined`
+
+**Cause:** The macro doesn't exist or packages aren't installed.
+
+**Fix:**
+1. Run `dbt deps` to install packages
+2. Check macro spelling — search `macros/` for the correct name
+3. Verify the macro is defined in `macros/` or in a package
+
+---
+
+## dbt build Failures
+
+### SQL Compilation Error in Snowflake
+
+**Error:** `Database Error: ... SQL compilation error`
+
+**Cause:** The generated SQL is invalid Snowflake syntax.
+
+**Fix:**
+1. Inspect the compiled SQL: `dbt compile --select model_name`
+2. Open the compiled file: `target/compiled/core_warehouse/models/.../model_name.sql`
+3. Copy the compiled SQL into a Snowflake worksheet and run it directly
+4. The Snowflake error message will point to the exact line
+
+### Missing Source Table
+
+**Error:** `Database Error: ... Object 'DATABASE.SCHEMA.TABLE' does not exist`
+
+**Cause:** The source table hasn't been cloned to your dev schema.
+
+**Fix:**
+```bash
+# Clone production tables to dev
+dbt run-operation clone_production_tables
+
+# Then rebuild excluding cloned data
+dbt build --select +model_name --exclude tag:cloned_data
+```
+
+### Permission Error
+
+**Error:** `Database Error: ... Insufficient privileges to operate on ...`
+
+**Cause:** Your Snowflake role doesn't have access to the source data.
+
+**Fix:**
+1. Verify your role in `.env` matches your provisioned access
+2. Check if the source table requires a specific role
+3. Contact CloudOps if you need additional permissions
+
+---
+
+## sqlfluff Lint Errors
+
+### Running the Linter
+
+```bash
+# Lint a file and see errors
+sqlfluff lint path/to/file.sql
+
+# Auto-fix what it can
+sqlfluff fix path/to/file.sql
+
+# Lint all staged files
+make lint-staged-files
+```
+
+### Common Lint Errors
+
+| Error Code | Description | Fix |
+|------------|-------------|-----|
+| `CP01` | Keyword not uppercase | Change `select` to `SELECT` |
+| `CP02` | Identifier not lowercase | Change `COLUMN_NAME` to `column_name` |
+| `CP03` | Function not uppercase | Change `count()` to `COUNT()` |
+| `CP04` | Literal not uppercase | Change `null` to `NULL`, `true` to `TRUE` |
+| `CP05` | Type cast not lowercase | Change `::INT` to `::int` |
+| `CV09` | Blocked data type | Use `int` not `integer`, `decimal` not `number` |
+| `CV11` | Non-shorthand cast | Use `::int` not `CAST(x AS INT)` |
+| `ST05` | Subquery in FROM/JOIN | Extract to a CTE |
+| `RF03` | Qualified single-table ref | Remove table alias prefix when only one table |
+| `AL01` | Implicit table alias style | `FROM users u` is fine (project allows implicit) |
+
+### Ignoring Specific Rules
+
+If a specific lint rule must be violated with good reason:
+
+```sql
+-- Example: using a blocked type because the source requires it
+column_name::number  -- noqa: CV09 - source returns NUMBER type
+```
+
+### Jinja Template Errors in sqlfluff
+
+**Error:** `WARNING: Could not parse ... Traceback ...`
+
+**Cause:** sqlfluff can't parse a Jinja expression.
+
+**Fix:**
+1. Ensure `dbt deps` has been run (macros from packages are needed)
+2. Check that `load_macros_from_path = macros` is in `.sqlfluff`
+3. Complex Jinja may need `-- noqa` to skip that line
+
+---
+
+## Incremental Model Issues
+
+### Full Refresh
+
+If an incremental model has bad data or schema changes:
+
+```bash
+# Rebuild from scratch (drops and recreates)
+dbt build --select model_name --full-refresh
+```
+
+### Unique Key Conflicts
+
+**Error:** `Database Error: ... Duplicate row detected during DML action`
+
+**Cause:** The `unique_key` in the model config doesn't produce unique rows.
+
+**Fix:**
+1. Check the `unique_key` in the model's `config()` block
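+2. Run `dbt show` to check for duplicates (replace `unique_key_col` with the model's `unique_key`):
+
+```bash
+dbt show --inline "SELECT unique_key_col, COUNT(*) AS row_count FROM {{ ref('model_name') }} GROUP BY 1 HAVING COUNT(*) > 1"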
+```
+
+3. Add `QUALIFY ROW_NUMBER()` to deduplicate before the final SELECT
+4. If the issue is in source data, add deduplication in a CTE
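+
+As a sketch of steps 3 and 4, deduplication in a CTE might look like the following. The upstream model name and the `updated_ts` ordering column are hypothetical; order by whichever column identifies the row you want to keep.
+
+```sql
+WITH deduplicated AS (
+    SELECT
+        user_id,
+        event_id,
+        updated_ts
+    FROM {{ ref('upstream_model') }}  -- hypothetical upstream model
+    -- Keep only the most recent row per unique_key combination
+    QUALIFY ROW_NUMBER() OVER (
+        PARTITION BY user_id, event_id
+        ORDER BY updated_ts DESC
+    ) = 1
+)
+
+SELECT
+    user_id,
+    event_id,
+    updated_ts
+FROM deduplicated
+```
+
+The `PARTITION BY` columns should match the model's `unique_key`, otherwise duplicates on that key can still reach the merge.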
+
+### Schema Changes
+
+If you add or remove columns from an incremental model:
+
+```bash
+# Full refresh to apply schema changes
+dbt build --select model_name --full-refresh
+```
+
+Without `--full-refresh`, new columns won't appear: by default
+(`on_schema_change='ignore'`) dbt keeps the existing table structure for incremental loads.
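+
+When a full refresh is too expensive, dbt's `on_schema_change` config is an alternative worth knowing about. The block below is a sketch only, not an lf-dbt convention: the `unique_key` value is a placeholder, and whether this setting is acceptable in the project is a question for the code owners.
+
+```sql
+{{ config(
+    materialized='incremental',
+    unique_key='_key',
+    on_schema_change='append_new_columns'
+) }}
+```
+
+With `append_new_columns`, added columns are created on the next incremental run, but rows loaded earlier are not backfilled, so a full refresh is still needed when historical values matter.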
+
+---
+
+## Quick Data Validation
+
+### `dbt show` — Preview Results
+
+Preview the output of a model without materializing it:
+
+```bash
+# Show first 5 rows (default)
+dbt show --select model_name
+
+# Show more rows
+dbt show --select model_name --limit 20
+
+# Show with inline SQL
+dbt show --inline "SELECT COUNT(*) FROM {{ ref('model_name') }}"
+```
+
+### `dbt compile` — Inspect Generated SQL
+
+See exactly what SQL dbt will execute:
+
+```bash
+dbt compile --select model_name
+```
+
+The compiled SQL is written to:
+`target/compiled/core_warehouse/models/.../model_name.sql`
+
+Open this file to see the fully-rendered SQL with all Jinja resolved.
+
+---
+
+## Missing Data in Dev
+
+### Symptom: Model Runs But Returns No Rows
+
+**Cause:** The `smart_source()` macro limits data to the last 30 days in dev.
+If the source table has no recent data, the query returns nothing.
+
+**Fix:**
+1. Increase the lookback window: `smart_source('source', 'table', 'date_col', 90)`
+2. Or clone production data:
+
+```bash
+dbt run-operation clone_production_tables
+dbt build --select +model_name --exclude tag:cloned_data
+```
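+
+For option 1, a bronze model with the wider lookback might look like this. The argument order follows the call shown above; placing the macro in the `FROM` clause and using `created_date` as the date column are assumptions, so check `references/key-macros.md` for the actual signature.
+
+```sql
+SELECT
+    id AS event_id,
+    created_date AS event_created_ts
+-- Assumed usage: 90-day lookback instead of the 30-day dev default
+FROM {{ smart_source('fivetran_platform', 'event', 'created_date', 90) }}
+WHERE NOT _fivetran_deleted
+```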
+
+### Symptom: Source Table Not Found
+
+**Cause:** Large tables (Kafka, Salesforce) aren't rebuilt in dev by default.
+
+**Fix:** Clone production data (see above). The cloned tables/views appear in
+your dev schema automatically.
+
+---
+
+## Test Failures
+
+### Running Tests
+
+```bash
+# Run all tests
+dbt test
+
+# Test a specific model
+dbt test --select model_name
+
+# Test a model and all its dependencies
+dbt test --select +model_name
+```
+
+### Debugging a Failed Test
+
+1. Read the test failure message — it tells you which test and column failed
+2. Check the compiled test SQL: `target/compiled/core_warehouse/tests/...`
+3. Run the test query directly in Snowflake to see the offending rows
+4. Use `dbt show` to inspect the model output:
+
+```bash
+dbt show --inline "
+SELECT column_name, COUNT(*)
+FROM {{ ref('model_name') }}
+GROUP BY 1
+HAVING COUNT(*) > 1
+"
+```
+
+### Known Edge Cases
+
+Some models have intentional test threshold overrides for known data quality
+issues:
+
+```yaml
+data_tests:
+  - unique:
+      config:
+        error_if: ">10"
+        warn_if: ">10"
+```
+
+If your model has a small number of expected duplicates from upstream data,
+use this pattern with a comment explaining why.
+
+---
+
+## dbt Cloud Job Failures
+
+For failures in dbt Cloud (production or staging jobs), use the
+`troubleshooting-dbt-job-errors` skill in the lf-dbt repository's
+`.agents/skills/` directory. That skill covers:
+
+- Reading job run logs via the dbt Cloud Admin API
+- Diagnosing intermittent failures
+- Checking git history for recent changes that may have caused the failure
+- Investigating data issues in source systems
+
+---
+
+## Common Debugging Workflow
+
+```text
+1. dbt compile --select model_name
+   └─ Fix Jinja/SQL syntax errors
+
+2. sqlfluff lint path/to/model.sql
+   └─ Fix formatting violations
+
+3. dbt build --select model_name
+   └─ Fix Snowflake runtime errors
+
+4. dbt test --select model_name
+   └─ Fix data quality issues
+
+5. dbt show --select model_name --limit 20
+   └─ Verify output looks correct
+```
diff --git a/skills/lfx-data-engineer/references/getting-started.md b/skills/lfx-data-engineer/references/getting-started.md
new file mode 100644
index 0000000..de1fb9e
--- /dev/null
+++ b/skills/lfx-data-engineer/references/getting-started.md
@@ -0,0 +1,222 @@
+<!-- Copyright The Linux Foundation and each contributor to LFX. -->
+<!-- SPDX-License-Identifier: MIT -->
+
+# Getting Started with lf-dbt
+
+## Prerequisites
+
+| Requirement | Details |
+|-------------|---------|
+| Python | 3.11+ with virtual environment |
+| Snowflake access | Provisioned via `lfx-snowflake-terraform` (see the `/lfx-skills:lfx-snowflake-access` skill) |
+| dbt | Installed via `pip install -r requirements.txt` |
+| Environment variables | Configured in `.env` file (see `.env.sample`) |
+
+## Initial Setup
+
+```bash
+# 1. Clone the repository
+git clone https://github.com/linuxfoundation/lf-dbt.git
+cd lf-dbt
+
+# 2. Create and activate a virtual environment
+python3 -m venv venv
+source venv/bin/activate
+
+# 3. Install dependencies
+pip install -r requirements.txt
+
+# 4. Configure environment variables
+cp .env.sample .env
+# Edit .env with your Snowflake credentials
+
+# 5. Install dbt packages
+dbt deps
+
+# 6. Verify your connection
+dbt compile
+```
+
+## Snowflake Connection
+
+The connection is configured in `profiles.yml` with the `dbt-snowflake` profile:
+
+| Setting | Source |
+|---------|--------|
+| Account | `SNOWFLAKE_ACCOUNT` env var |
+| User | `DBT_ENV_SECRET_USER` env var |
+| Password | `DBT_ENV_SECRET_PASS` env var |
+| Role | `DBT_ENV_ROLE` env var |
+| Database | `DBT_ENV_DATABASE` env var |
+| Warehouse | `DBT_ENV_WAREHOUSE` env var |
+| Default schema | `DBT_DEFAULT_SCHEMA` env var |
+
+For keypair authentication (required for CLI/programmatic access), see the
+[lf-dbt README — SnowSQL Keypair Authentication Setup](https://github.com/linuxfoundation/lf-dbt/blob/main/README.md#snowsql-keypair-authentication-setup).
+
+## Essential dbt Commands
+
+```bash
+# Install package dependencies
+dbt deps
+
+# Compile models without running them (catches Jinja and ref errors)
+dbt compile
+
+# Build all models and run tests
+dbt build
+
+# Build excluding cloned production data (use after cloning)
+dbt build --exclude tag:cloned_data
+
+# Build excluding large Kafka tables
+dbt build --exclude tag:kafka_crowd_dev
+
+# Build a specific model and all its upstream dependencies
+dbt build --select +model_name
+
+# Build by layer
+dbt build --select tag:bronze
+dbt build --select tag:silver
+dbt build --select tag:gold
+dbt build --select tag:platinum
+
+# Run tests only
+dbt test
+
+# Preview query results without materializing
+dbt show --select model_name
+
+# Inspect the compiled SQL for a model
+dbt compile --select model_name
+# Then check target/compiled/core_warehouse/models/...
+
+# Generate and view documentation
+dbt docs generate
+dbt docs serve
+```
+
+## Cloning Production Data for Development
+
+Some bronze tables (Kafka CDP, Salesforce) are too large to rebuild in dev.
+Clone production data to your dev schema instead:
+
+```bash
+# Clone tables and create views from production (run weekly)
+dbt run-operation clone_production_tables
+
+# With custom retention if Time Travel is needed
+dbt run-operation clone_production_tables --args '{retention_days: 7}'
+
+# Then exclude cloned data from your builds
+dbt build --exclude tag:cloned_data
+```
+
+This creates 179 objects across 19 schemas:
+
+- 112 Bronze views across 17 schemas
+- 21 Bronze cloned tables across 17 schemas
+- 39 Silver Dim cloned tables
+- 7 Silver Fact cloned tables
+
+Cloned tables use 0-day retention by default (no Time Travel history) to
+optimize storage costs.
+
+## Makefile Targets
+
+The project includes shortcuts for building specific data domains:
+
+| Command | What it builds |
+|---------|---------------|
+| `make edx` | EdX course and enrollment data |
+| `make easycla` | EasyCLA signature data |
+| `make bevy` | Bevy chapter and event data |
+| `make events` | Platform event registration data |
+| `make ti` | Training Institute data |
+| `make webinars` | Webinar attendance data |
+| `make individual_memberships` | Individual membership data |
+| `make docs` | Generate dbt documentation |
+
+### Linting
+
+```bash
+# Lint a specific file
+sqlfluff lint path/to/file.sql
+
+# Auto-fix formatting issues
+sqlfluff fix path/to/file.sql
+
+# Lint a specific file via Makefile
+make lint-fix file=path/to/file.sql
+
+# Lint all staged files (before commit)
+make lint-staged-files
+
+# Auto-fix all staged files
+make fix-lint-staged-files
+```
+
+## Schema Organization
+
+Each layer maps to specific Snowflake schemas. In production, the schema name
+is used directly. In dev, it is prefixed with your default schema
+(e.g., `your_schema_bronze_fivetran_platform`).
+
+| Layer | Schema Pattern | Example |
+|-------|---------------|---------|
+| Bronze | `bronze_*` (per source) | `bronze_fivetran_platform`, `bronze_salesforce` |
+| Silver Dim | `silver_dim` | `silver_dim` |
+| Silver Fact | `silver_fact` | `silver_fact` |
+| Gold | `gold_*` (per domain) | `gold_reporting`, `gold_fact` |
+| Platinum | `platinum*` (per product) | `platinum`, `platinum_organization_dashboard` |
+
+## Project Structure
+
+```text
+lf-dbt/
+├── dbt_project.yml          # Main project configuration
+├── profiles.yml             # Snowflake connection config
+├── packages.yml             # dbt package dependencies
+├── .sqlfluff                # SQL linting rules
+├── Makefile                 # Build shortcuts
+├── macros/                  # Reusable SQL fragments
+├── models/
+│   ├── bronze/              # Source-aligned raw data
+│   │   ├── fivetran_platform/
+│   │   ├── fivetran_salesforce/
+│   │   ├── kafka_crowd_dev/
+│   │   └── ...
+│   ├── silver/              # Business logic layer
+│   │   ├── dim/             # Dimensions
+│   │   │   └── helper_models/
+│   │   └── fact/            # Facts
+│   │       └── helper_models/
+│   ├── gold/                # Aggregated metrics
+│   │   ├── fact/
+│   │   ├── reporting/
+│   │   └── ...
+│   ├── platinum/            # Pre-computed reports
+│   │   ├── individual_dashboard/
+│   │   ├── organization_dashboard/
+│   │   ├── lfx_one/
+│   │   └── ...
+│   └── semantic/            # Semantic layer definitions
+├── data/                    # Seed data
+├── tests/                   # Custom data tests
+└── snapshots/               # dbt snapshots
+```
+
+## Git Workflow
+
+All commits must be signed and include DCO signoff:
+
+```bash
+git commit -S --signoff -m "Add new bronze model for event registrations"
+```
+
+Branch naming follows the convention:
+
+- `feature/{JIRA_TICKET}-{short-description}`
+- `bug/{JIRA_TICKET}-{short-description}`
+
+Example: `feature/DL-123-add-event-registrations-model`
diff --git a/skills/lfx-data-engineer/references/key-macros.md b/skills/lfx-data-engineer/references/key-macros.md
new file mode 100644
index 0000000..dd4ca4e
--- /dev/null
+++ b/skills/lfx-data-engineer/references/key-macros.md
@@ -0,0 +1,434 @@
+<!-- Copyright The Linux Foundation and each contributor to LFX. -->
+<!-- SPDX-License-Identifier: MIT -->
+
+# Key Macros Reference
+
+The lf-dbt project includes reusable macros in the `macros/` directory. This
+reference covers the macros developers use most frequently.
+
+---
+
+## Source and Environment Macros
+
+### `smart_source(source_name, table_name, timestamp_col, lookback_window)`
+
+**File:** `macros/smart_source.sql`
+
+A development-friendly wrapper around `source()` that limits data volume in
+non-production environments.
+
+| Environment | Behavior |
+|-------------|----------|
+| `no_data` (CI) | Wraps source in `WHERE 1=0` — validates schema only, no data |
+| Dev (with `timestamp_col`) | Filters to last N days (default 30) for faster builds |
+| `prod` / `stage` | Returns raw `source()` reference — full data |
+
+**Usage:**
+
+```sql
+-- Bronze model with dev lookback on a timestamp column
+FROM {{ smart_source('fivetran_platform', 'event', 'created_date', 30) }}
+
+-- Without timestamp lookback (full table in all environments except CI)
+FROM {{ smart_source('fivetran_platform', 'event') }}
+```
+
+**When to use:** Bronze models reading from large source tables. Use instead of
+raw `source()` when the source has a timestamp column suitable for filtering.
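+
+As a rough illustration of the table above, the three environments could
+render to SQL along these lines. This is a sketch, not the macro's actual
+output: `RAW_DB.FIVETRAN_PLATFORM.EVENT` is a hypothetical relation name, and
+the exact filter expression used by `smart_source` is an assumption.
+
+```sql
+-- Illustrative only: the relation name and filter shape are assumptions.
+
+-- no_data (CI): schema is validated, zero rows are returned
+SELECT * FROM (
+    SELECT * FROM RAW_DB.FIVETRAN_PLATFORM.EVENT WHERE 1 = 0
+) AS src;
+
+-- Dev with timestamp_col = 'created_date' and lookback_window = 30
+SELECT * FROM (
+    SELECT * FROM RAW_DB.FIVETRAN_PLATFORM.EVENT
+    WHERE created_date >= DATEADD(day, -30, CURRENT_DATE())
+) AS src;
+
+-- prod / stage: the plain source() relation, full data
+SELECT * FROM RAW_DB.FIVETRAN_PLATFORM.EVENT;
+```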
+
+---
+
+### `get_warehouse(warehouse_type)`
+
+**File:** `macros/get_environment_warehouse.sql`
+
+Selects the appropriate Snowflake warehouse based on model size and environment.
+
+| `warehouse_type` | Production Warehouse | Dev/CI Override |
+|-------------------|---------------------|-----------------|
+| `'default'` | `DBT_PROD` | `DBT_DEV` (dev), `DBT_STG` (CI) |
+| `'hourly'` | `DBT_HOURLY` | `DBT_DEV` (dev), `DBT_STG` (CI) |
+| `'medium'` | `DBT_PROD_MED` | `DBT_DEV` (dev), `DBT_STG` (CI) |
+
+**Convenience macros:**
+
+- `get_environment_warehouse()` — alias for `get_warehouse('default')`
+- `get_hourly_warehouse()` — alias for `get_warehouse('hourly')`
+- `get_medium_warehouse()` — alias for `get_warehouse('medium')`
+
+**Usage:**
+
+```sql
+{% set warehouse = get_warehouse('hourly') %}
+
+{{ config(snowflake_warehouse=warehouse) }}
+
+SELECT ...
+```
+
+**When to use:** Any model that reads from large tables or performs heavy
+aggregations. Most bronze and platinum models use `get_warehouse('hourly')`.
+
+---
+
+### `generate_alias_name` / `generate_schema_name`
+
+**Files:** `macros/generate_alias_name.sql`, `macros/generate_schema_name.sql`
+
+These macros control how dbt resolves table names in Snowflake.
+
+**`generate_alias_name`** strips the schema prefix from the model name. A model
+named `silver_dim_users.sql` configured with `+schema: silver_dim` becomes
+table `USERS` (not `SILVER_DIM_USERS`) in the `SILVER_DIM` schema.
+
+**`generate_schema_name`** handles environment-specific schema naming:
+- Production: uses the schema name directly (e.g., `SILVER_DIM`)
+- Dev: prepends your personal schema (e.g., `your_schema_SILVER_DIM`)
+
+These macros run automatically; you do not call them in model code. They still
+matter because they determine where your tables land.
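+
+The environment-specific naming can be pictured with a minimal sketch modeled
+on dbt's documented `generate_schema_name` override pattern. It is not the
+project's actual macro. In particular, detecting production with
+`target.name == 'prod'` is an assumption.
+
+```sql
+-- Sketch only: the real macro in macros/generate_schema_name.sql may differ.
+{% macro generate_schema_name(custom_schema_name, node) -%}
+    {%- if custom_schema_name is none -%}
+        {#- No +schema configured: fall back to the target schema -#}
+        {{ target.schema }}
+    {%- elif target.name == 'prod' -%}
+        {#- Production (assumed target name): use the schema as-is -#}
+        {{ custom_schema_name | trim }}
+    {%- else -%}
+        {#- Dev: prefix with the developer's default schema -#}
+        {{ target.schema }}_{{ custom_schema_name | trim }}
+    {%- endif -%}
+{%- endmacro %}
+```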
+
+---
+
+## Timestamp and Date Macros
+
+### `format_timestamp(original_column_name, target_column_name, data_type, local_tz, source_tz)`
+
+**File:** `macros/format_timestamp.sql`
+
+Generates standardized timestamp/date columns with proper naming conventions.
+
+| `data_type` | Output Columns |
+|-------------|---------------|
+| `'date'` | `{target_column_name}_date` (via `TO_DATE()`) |
+| `'timestamp'` | `{target_column_name}_ts` (UTC) + `{target_column_name}_ts_local` (local timezone) |
+
+**Usage:**
+
+```sql
+SELECT
+    {{ format_timestamp('created_at', 'created', 'timestamp', 'America/New_York') }},
+    {{ format_timestamp('birth_date', 'birth', 'date', 'UTC') }}
+FROM {{ source('my_source', 'my_table') }}
+```
+
+**Produces:**
+
+```sql
+convert_timezone('UTC', 'UTC', created_at) AS created_ts,
+convert_timezone('UTC', 'America/New_York', created_at) AS created_ts_local,
+to_date(birth_date) AS birth_date
+```
+
+**When to use:** Bronze models normalizing timestamps from source systems.
+
+---
+
+### `to_utc_timestamp(local_ts, local_tz)`
+
+**File:** `macros/format_timestamp.sql`
+
+Converts a local timestamp to UTC when the timezone is stored in a column
+rather than being a constant.
+
+**Usage:**
+
+```sql
+SELECT
+    {{ to_utc_timestamp('event_start_time', 'event_timezone') }} AS event_start_ts
+FROM {{ ref('bronze_events') }}
+```
+
+**When to use:** When the timezone varies per row (e.g., events in different
+timezones with the timezone stored as a column value).
+
+---
+
+## Date Range Filter Macros
+
+**File:** `macros/date_range_helpers.sql`
+
+These macros generate `WHERE` clause conditions for time-windowed filtering.
+They are the backbone of platinum models that pre-compute metrics over specific
+time periods.
+
+### Current-Period Macros (Exclude Today by Default)
+
+| Macro | Window |
+|-------|--------|
+| `is_last_x_days(date, days)` | Generic N-day lookback |
+| `is_last_7_days(date)` | Last 7 days (days -7 to -1) |
+| `is_last_14_days(date)` | Last 14 days |
+| `is_last_30_days(date)` | Last 30 days |
+| `is_last_90_days(date)` | Last 90 days |
+| `is_last_6_months(date)` | Last 6 calendar months |
+| `is_last_12_months(date)` | Last 12 calendar months |
+| `is_last_24_months(date)` | Last 24 calendar months |
+| `is_last_48_months(date)` | Last 48 calendar months |
+| `is_last_quarter(date)` | Most recently completed quarter |
+| `is_year_to_date(date)` | Jan 1 of current year through yesterday |
+| `is_current_year(date)` | Full current calendar year |
+| `is_specific_year(date, year)` | A specific calendar year |
+| `is_alltime(date)` | All dates up to today |
+| `is_before_today(date)` | Strictly before today |
+| `is_before_or_today(date)` | Up to and including today |
+
+**Usage:**
+
+```sql
+-- Filter to last 30 days
+WHERE {{ is_last_30_days('activity_date') }}
+
+-- Filter to year-to-date
+WHERE {{ is_year_to_date('event_start_date') }}
+
+-- Generic lookback
+WHERE {{ is_last_x_days('created_ts', 60) }}
+```
+
+### Completed Year Macros
+
+| Macro | Window |
+|-------|--------|
+| `is_last_completed_year(date)` | Previous full calendar year |
+| `is_prev_completed_year(date)` | 2 years ago (full year) |
+| `is_3rd_last_completed_year(date)` | 3 years ago |
+| `is_4th_last_completed_year(date)` | 4 years ago |
+| `is_5th_last_completed_year(date)` | 5 years ago |
+
+### Quarter Macros
+
+| Macro | Window |
+|-------|--------|
+| `is_last_x_quarters(date, quarters)` | Last N completed quarters |
+| `is_x_quarters_ago(date, quarters)` | A single completed quarter N quarters ago |
+| `is_current_quarter(date)` | Current calendar quarter (from `date_range_helpers_surveys.sql`) |
+
+### Cumulative / "Up To" Macros
+
+| Macro | Window |
+|-------|--------|
+| `is_up_to_year_to_date(date)` | Everything before today |
+| `is_up_to_last_completed_year(date)` | Everything through end of last year |
+| `is_up_to_prev_completed_year(date)` | Everything through end of 2 years ago |
+
+---
+
+### Previous-Period Macros (for Period-over-Period Comparisons)
+
+These macros define the period immediately before the corresponding
+`is_last_*` window, enabling percent-change and delta calculations.
+
+| Macro | Window |
+|-------|--------|
+| `is_prev_7_days(date)` | Days -14 to -8 (the week before `is_last_7_days`) |
+| `is_prev_14_days(date)` | Days -28 to -15 |
+| `is_prev_30_days(date)` | Days -60 to -31 |
+| `is_prev_90_days(date)` | Days -180 to -91 |
+| `is_prev_6_months(date)` | Months -12 to -7 |
+| `is_prev_12_months(date)` | Months -24 to -13 |
+| `is_prev_24_months(date)` | Months -48 to -25 |
+| `is_prev_quarter(date)` | The quarter before `is_last_quarter` |
+| `is_prev_year_to_date(date)` | Same YTD window, shifted back one year (handles leap years) |
+
+**Usage:**
+
+```sql
+-- Current period
+SUM(CASE WHEN {{ is_last_30_days('activity_date') }} THEN 1 ELSE 0 END) AS last_30_days_count,
+
+-- Previous period for comparison
+SUM(CASE WHEN {{ is_prev_30_days('activity_date') }} THEN 1 ELSE 0 END) AS prev_30_days_count
+```
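+
+To make the window boundaries concrete, the 7-day pair can be written out as
+plain Snowflake SQL. This is a hypothetical expansion for illustration; the
+conditions actually generated by `macros/date_range_helpers.sql` may be
+phrased differently, and `activities` / `activity_date` are placeholder names.
+
+```sql
+-- Hypothetical expansion; table and column names are placeholders.
+SELECT
+    -- is_last_7_days: days -7 through -1 (today excluded)
+    COUNT_IF(
+        activity_date >= DATEADD(day, -7, CURRENT_DATE())
+        AND activity_date < CURRENT_DATE()
+    ) AS last_7_days_count,
+    -- is_prev_7_days: days -14 through -8 (the week before)
+    COUNT_IF(
+        activity_date >= DATEADD(day, -14, CURRENT_DATE())
+        AND activity_date < DATEADD(day, -7, CURRENT_DATE())
+    ) AS prev_7_days_count
+FROM activities
+```
+
+Both windows cover exactly seven days and do not overlap, which is what makes
+the percent-change comparison meaningful.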
+
+---
+
+### "Through Today" Variants
+
+These macros shift the window forward by one day so that it includes today;
+the day count stays the same. They are used primarily by social listening
+models.
+
+| Macro | Window |
+|-------|--------|
+| `is_last_7_days_through_today(date)` | Days -6 to 0 (includes today) |
+| `is_last_30_days_through_today(date)` | Days -29 to 0 |
+| `is_last_90_days_through_today(date)` | Days -89 to 0 |
+| `is_last_12_months_through_today(date)` | 12 months back through today |
+| `is_year_to_date_through_today(date)` | Jan 1 through today |
+
+Matching previous-period macros exist:
+`is_prev_7_days_through_today(date)`, `is_prev_30_days_through_today(date)`, etc.
+
+---
+
+### Month-Overlap Macros
+
+For monthly-grain data where you need to check if a month falls within a window:
+
+| Macro | Purpose |
+|-------|---------|
+| `month_overlaps_last_x_days(date, days)` | Does the month containing `date` overlap the last N days? |
+| `month_overlaps_last_x_months(date, months)` | Does the month containing `date` overlap the last N months? |
+
+---
+
+### Unified Time Range Filter
+
+```sql
+-- Filters based on a time_range_name column
+WHERE {{ time_range_filter('date_column', 'time_range_column') }}
+```
+
+Supports `'past_365_days'`, `'past_2_years'`, and `'alltime'` values. Used by
+ecosystem influence models.
+
+---
+
+## Date/Time Formatting Macros
+
+**File:** `macros/format_helpers.sql`
+
+### `get_short_month(date)`
+
+Returns 3-letter month abbreviation: `'Jan'`, `'Feb'`, ..., `'Dec'`
+
+### `get_month(date)`
+
+Returns full month name: `'January'`, `'February'`, ..., `'December'`
+
+### `get_quarter(date)`
+
+Returns quarter label: `'Q1'`, `'Q2'`, `'Q3'`, `'Q4'`
+
+**Usage:**
+
+```sql
+SELECT
+    {{ get_month('event_start_date') }} AS event_month,
+    {{ get_quarter('event_start_date') }} AS event_quarter,
+    {{ get_short_month('event_start_date') }} AS event_month_short
+FROM {{ ref('silver_dim_events') }}
+```
+
+---
+
+## Delta / Period-over-Period Comparison Macros
+
+**File:** `macros/delta_helpers.sql`
+
+### `add_delta_columns(metrics)`
+
+Generates `_prev`, `_diff`, and `_delta` (percent change) columns for a list
+of metric names. Expects the query to have `curr.*` and `prev.*` aliases.
+
+**Usage:**
+
+```sql
+SELECT
+    curr.project_id
+    {{ add_delta_columns(['total_commits', 'total_contributors', 'total_prs']) }}
+FROM current_period curr
+LEFT JOIN previous_period prev
+    ON curr.project_id = prev.project_id
+```
+
+**Produces** (for each metric):
+- `total_commits` — current value
+- `total_commits_prev` — previous period value
+- `total_commits_diff` — absolute difference
+- `total_commits_delta` — percent change (100% if previous was 0)
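+
+For a single metric, the generated columns might look roughly like the
+hand-written query below. This is a sketch under stated assumptions, not the
+macro's real output: the percent scale (100 rather than 1.0) and the handling
+of a `NULL` previous value are guesses, and the two CTEs hold made-up sample
+rows.
+
+```sql
+-- Sketch only: the real output of add_delta_columns may differ.
+WITH current_period AS (
+    SELECT 1 AS project_id, 120 AS total_commits   -- sample row
+),
+
+previous_period AS (
+    SELECT 1 AS project_id, 100 AS total_commits   -- sample row
+)
+
+SELECT
+    curr.project_id,
+    curr.total_commits,
+    prev.total_commits AS total_commits_prev,
+    curr.total_commits - prev.total_commits AS total_commits_diff,
+    CASE
+        -- Documented behavior: report 100% when the previous value was 0
+        WHEN prev.total_commits = 0 THEN 100
+        ELSE 100.0 * (curr.total_commits - prev.total_commits) / prev.total_commits
+    END AS total_commits_delta
+FROM current_period curr
+LEFT JOIN previous_period prev
+    ON curr.project_id = prev.project_id
+```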
+
+### `add_share_of_total(metrics)`
+
+Generates `_share` (percent of total) and `_total_delta` columns.
+
+---
+
+## Data Quality and Filtering Macros
+
+### `gdpr_filter_email(email_field)`
+
+**File:** `macros/gdpr_filter.sql`
+
+Excludes rows where the email matches a GDPR suppression or deletion request.
+
+```sql
+WHERE {{ gdpr_filter_email('u.email') }}
+```
+
+### `gdpr_filter_email_list(email_list_field, delimiter)`
+
+Filters rows where any email in a delimited list matches a GDPR request.
+Supports `;`, `,`, `:`, `|` delimiters.
+
+```sql
+WHERE {{ gdpr_filter_email_list('cc_emails', ';') }}
+```
+
+---
+
+### Email Validation Macros
+
+**File:** `macros/email_validation.sql`
+
+| Macro | Purpose |
+|-------|---------|
+| `is_valid_email(email_field)` | Regex validation of email format |
+| `email_filter_clause(email_field)` | Not null + not empty + valid format |
+| `exclude_test_emails(email_field)` | Excludes test, example, noreply, retired addresses |
+| `comprehensive_email_filter(email_field)` | Combines `email_filter_clause` + `exclude_test_emails` |
+
+```sql
+-- Full email validation
+WHERE {{ comprehensive_email_filter('email') }}
+
+-- Just format check
+WHERE {{ is_valid_email('email') }}
+```
+
+---
+
+### Common Filters
+
+**File:** `macros/common_filters.sql`
+
+| Macro | Purpose |
+|-------|---------|
+| `filter_code_contributions_non_bot(table_alias)` | Excludes bot contributions from code activity data |
+| `exclude_individual_account(account)` | Filters out individual/no-account Salesforce records |
+| `is_organization_domain(domain)` | Checks that an email domain is not a consumer provider (gmail, yahoo, etc.) |
+
+```sql
+-- Filter to human code contributions only
+WHERE {{ filter_code_contributions_non_bot('c') }}
+
+-- Exclude individual Salesforce accounts
+WHERE {{ exclude_individual_account('account_id') }}
+```
+
+---
+
+### Formatting and Cleanup Macros
+
+**File:** `macros/format_helpers.sql`
+
+| Macro | Purpose |
+|-------|---------|
+| `format_country(country)` | Normalizes messy country names to canonical values (handles US/USA/U.S.A., UK variants, etc.) |
+| `clean_name_field(field)` | Cleans garbage/placeholder values from name fields (null, unknown, test, N/A, etc.) |
+| `format_repository_url(repository_url)` | Lowercases and strips `.git` suffix |
+| `email_to_domain(email)` | Extracts domain from an email address |
+| `extract_repo_name(url_column)` | Extracts repository name from a git URL |
+| `format_commit_url(repository_url, commit_id)` | Generates a clickable commit URL for GitHub, GitLab, Bitbucket, or kernel.org |
+| `parse_github_username(field)` | Extracts a GitHub username from a URL or raw value |
+| `parse_linkedin_username(field)` | Extracts a LinkedIn username from a URL or raw value |
+| `is_apac_country(billing_country_column)` | Checks if a country is in the APAC region (China, HK, Taiwan, Macao) |
+
+```sql
+SELECT
+    {{ format_country('raw_country') }} AS country,
+    {{ clean_name_field('first_name') }} AS first_name,
+    {{ email_to_domain('email') }} AS email_domain
+FROM {{ ref('bronze_source') }}
+```
diff --git a/skills/lfx-data-engineer/references/medallion-architecture.md b/skills/lfx-data-engineer/references/medallion-architecture.md
new file mode 100644
index 0000000..f4d0642
--- /dev/null
+++ b/skills/lfx-data-engineer/references/medallion-architecture.md
@@ -0,0 +1,433 @@
+<!-- Copyright The Linux Foundation and each contributor to LFX. -->
+<!-- SPDX-License-Identifier: MIT -->
+
+# Medallion Architecture Guide
+
+The lf-dbt project follows a four-layer medallion architecture. Each layer has
+a specific purpose, materialization strategy, and set of conventions.
+
+## Layer Overview
+
+```text
+┌──────────────────────────────────────────────────────────────────┐
+│  Platinum   │  Pre-computed reports with time windows            │
+│             │  Dashboard-ready data (PCC, ID, OD, Insights)      │
+├─────────────┼────────────────────────────────────────────────────┤
+│  Gold       │  Aggregated metrics for specific business cases    │
+│             │  Code contributions by org, enrollment counts      │
+├─────────────┼────────────────────────────────────────────────────┤
+│  Silver     │  Business logic, joins, transformations            │
+│             │  Reusable objects: Users, Projects, Activities     │
+├─────────────┼────────────────────────────────────────────────────┤
+│  Bronze     │  1:1 with source data                              │
+│             │  Column renames, type casting, delete filtering    │
+└─────────────┴────────────────────────────────────────────────────┘
+         ▲              ▲              ▲              ▲
+     Raw Sources    source()        ref()          ref()
+```
+
+---
+
+## Bronze Layer
+
+### Purpose
+
+Bronze models provide a clean, renamed view of raw source data. They are the
+only layer that reads from `source()` — all other layers use `ref()`.
+
+### Conventions
+
+- **Materialization:** `view` (default)
+- **Schema:** `bronze_*` per source system (e.g., `bronze_fivetran_platform`)
+- **One model per source table** — no joins
+- **No business logic** — only column renames, type casting, and filtering
+
+### What Belongs Here
+
+- Column renames from source naming to snake_case business names
+- Type casting (e.g., string to date)
+- Filtering deleted records (`_fivetran_deleted`)
+- Filtering test data (`is_test`)
+- Timestamp normalization to UTC using `format_timestamp()`
+
+### Example: Bronze Event Model
+
+```sql
+-- Copyright The Linux Foundation and each contributor to LFX.
+-- SPDX-License-Identifier: MIT
+
+{% set warehouse = get_warehouse('hourly') %}
+
+{{ config(snowflake_warehouse=warehouse) }}
+
+SELECT
+    id AS event_id,
+    event_start_date,
+    event_end_date,
+    event_title AS event_name,
+    currency,
+    project_id,
+    salesforce_id AS salesforce_event_id,
+    event_location,
+    IFF(event_status_name = 'Complete', 'Completed', event_status_name) AS event_status,
+    city AS event_city,
+    country AS event_country,
+    created_date AS event_created_ts,
+    event_category,
+    event_code,
+    account_stub AS event_account_stub,
+    source,
+    lastmodified_date AS updated_at
+
+FROM {{ source('fivetran_platform', 'event') }}
+WHERE
+    NOT _fivetran_deleted
+    AND NOT is_test
+```
+
+### Key Patterns
+
+- `source('source_name', 'table_name')` or `smart_source()` for dev lookback
+- `get_warehouse('hourly')` for large source tables
+- Column naming: `_ts` for timestamps, `_date` for dates, `is_`/`has_` for booleans
+- Filter `_fivetran_deleted` when the source has Fivetran soft deletes
+
+### File Naming
+
+`bronze_{source_system}_{table_name}.sql`
+
+Examples:
+- `bronze_fivetran_platform_events.sql`
+- `bronze_fivetran_salesforce_projects.sql`
+- `bronze_kafka_crowd_dev_activities.sql`
+
+---
+
+## Silver Layer
+
+### Purpose
+
+Silver models apply business logic, join multiple bronze models, and create
+reusable business objects. They are divided into two subfolders:
+
+- **`dim/`** — Dimensions: slowly-changing attributes (users, projects, organizations)
+- **`fact/`** — Facts: events and transactions (activities, registrations, contributions)
+
+### Conventions
+
+- **Materialization:** `table`
+- **Schema:** `silver_dim` or `silver_fact`
+- **Table naming:** The `generate_alias_name` macro strips the schema prefix.
+  A model named `silver_dim_users.sql` becomes table `USERS` in the
+  `SILVER_DIM` schema (not `SILVER_DIM_USERS`).
+- **Block comment** at the top explaining purpose, questions answered, and data sources
+
+### What Belongs Here
+
+- Joins across multiple bronze models
+- Business rules and transformations
+- Deduplication logic
+- Enrichment from reference data
+- Reusable objects that serve multiple downstream use cases
+
+### Example: Silver Dimension Model
+
+```sql
+-- Copyright The Linux Foundation and each contributor to LFX.
+-- SPDX-License-Identifier: MIT
+
+/*
+This model creates a standardized dimension table for projects.
+
+## Purpose:
+- Provides a comprehensive view of projects with all relevant attributes
+
+## Questions this model can help answer:
+1. What is the hierarchical structure of projects?
+2. Which projects belong to specific foundations?
+3. What is the current health score of a project?
+
+## Data sources:
+- bronze_fivetran_salesforce_projects
+- silver_fact_crowd_dev_project_health_metrics
+*/
+
+{% set warehouse = get_warehouse('hourly') %}
+
+{{ config(snowflake_warehouse=warehouse) }}
+
+WITH latest_health_metrics AS (
+    SELECT
+        project_slug,
+        metric_date AS health_metric_date,
+        health_score,
+        health_score_category
+    FROM {{ ref('silver_fact_crowd_dev_project_health_metrics') }}
+    QUALIFY ROW_NUMBER() OVER (
+        PARTITION BY project_slug
+        ORDER BY metric_date DESC
+    ) = 1
+),
+
+projects AS (
+    SELECT
+        project_id,
+        project_name,
+        project_slug,
+        project_status
+    FROM {{ ref('bronze_fivetran_salesforce_projects') }}
+)
+
+SELECT
+    p.project_id,
+    p.project_name,
+    p.project_slug,
+    p.project_status,
+    h.health_score,
+    h.health_score_category,
+    h.health_metric_date
+FROM projects p
+LEFT JOIN latest_health_metrics h
+    ON p.project_slug = h.project_slug
+```
+
+### Helper Models
+
+Silver includes `helper_models/` subfolders for reusable SQL fragments. These
+files start with a `_` prefix (e.g., `_silver_dim_project_spine.sql`) and are
+not full models — they serve as building blocks for other models.
+
+The `_silver_dim_project_spine.sql` helper is particularly important: it fans
+out projects to their parent hierarchy for downstream aggregation.
+
+### File Naming
+
+- Dimensions: `silver_dim_{entity}.sql` (e.g., `silver_dim_users.sql`)
+- Facts: `silver_fact_{domain}_{entity}.sql` (e.g., `silver_fact_event_registrations.sql`)
+- Helpers: `_silver_{dim|fact}_{name}.sql` (e.g., `_silver_dim_project_spine.sql`)
+
+---
+
+## Gold Layer
+
+### Purpose
+
+Gold models combine silver models into purpose-built datasets for specific
+business use cases. They answer specific analytical questions without requiring
+additional joins.
+
+### Conventions
+
+- **Materialization:** `table`
+- **Schema:** `gold_*` per domain (e.g., `gold_fact`, `gold_reporting`)
+- **Surrogate keys** via `dbt_utils.generate_surrogate_key()` for composite primary keys
+- **`unique_key`** in config for incremental models
+
+### What Belongs Here
+
+- Aggregated metrics (code contributions by org, enrollment counts)
+- Purpose-built datasets that downstream consumers query directly
+- Fan-out logic using the project spine helper
+
+### Example: Gold Fact Model
+
+```sql
+-- Copyright The Linux Foundation and each contributor to LFX.
+-- SPDX-License-Identifier: MIT
+
+{{ config(unique_key=["_key", "project_id"]) }}
+
+SELECT
+    ({{ dbt_utils.generate_surrogate_key(["c._key", "p.mapped_project_id"]) }}) AS activity_project_id,
+    c._key,
+    c.activity_id,
+    c.activity_ts,
+    c.activity_type,
+    c.activity_category,
+    c.member_id,
+    c.github_username,
+    c.repository_url,
+    p.mapped_project_id AS project_id,
+    p.mapped_project_slug AS project_slug,
+    p.mapped_project_name AS project_name,
+    c.additions,
+    c.deletions,
+    COALESCE(c.is_pr_approved, FALSE) AS is_pr_approved,
+    c.is_org_contribution,
+    c.member_is_bot,
+    (
+        ROW_NUMBER() OVER (
+            PARTITION BY p.mapped_project_id, c.member_id
+            ORDER BY c.activity_ts
+        ) = 1
+    ) AS is_members_first_project_contribution
+
+FROM {{ ref("silver_fact_crowd_dev_activities") }} c
+LEFT JOIN {{ ref("_silver_dim_project_spine") }} p
+    ON c.project_id = p.base_project_id
+WHERE
+    p.mapped_project_id IS NOT NULL
+    AND {{ filter_code_contributions_non_bot('c') }}
+```
+
+### File Naming
+
+`gold_fact_{domain}.sql` or `gold_{purpose}_{entity}.sql`
+
+Examples:
+- `gold_fact_code_contributions.sql`
+- `gold_fact_enrollments.sql`
+- `gold_fact_course_purchases.sql`
+
+---
+
+## Platinum Layer
+
+### Purpose
+
+Platinum models produce dashboard-ready data with pre-computed time windows.
+Consumers query platinum tables directly without needing date range filters.
+
+### Conventions
+
+- **Materialization:** `table`
+- **Schema:** `platinum*` per product (e.g., `platinum_organization_dashboard`)
+- **Date range macros** for time-windowed aggregations
+- **`get_warehouse()`** for resource-intensive computations
+- **`GROUP BY ALL`** is acceptable for complex aggregations
+- **`QUALIFY`** with `ROW_NUMBER()` for deduplication
+
+### What Belongs Here
+
+- Pre-computed metrics by time period (last 7 days, last 30 days, YTD)
+- Period-over-period comparisons (current vs previous period)
+- Dashboard-specific data shapes
+- Delta calculations using `add_delta_columns()`
+
+### Example: Platinum Dashboard Model
+
+```sql
+-- Copyright The Linux Foundation and each contributor to LFX.
+-- SPDX-License-Identifier: MIT
+
+{% set warehouse = get_warehouse('hourly') %}
+
+{{ config(snowflake_warehouse=warehouse) }}
+
+WITH sponsors AS (
+    SELECT
+        event_id,
+        contact_id
+    FROM {{ ref('silver_fact_event_sponsorships') }}
+    GROUP BY ALL
+),
+
+event_registrations AS (
+    SELECT
+        er.registration_id,
+        mu.user_id,
+        mu.user_name,
+        er.event_id,
+        er.event_name,
+        er.event_start_date,
+        er.event_end_date,
+        er.project_id,
+        er.user_attended,
+        er.registration_status,
+        CASE
+            WHEN sp.contact_id IS NOT NULL THEN 'Sponsor'
+            WHEN er.is_event_speaker THEN 'Speaker'
+            WHEN er.user_attended = TRUE THEN 'Attendee'
+            ELSE 'Registered'
+        END AS user_role
+    FROM {{ ref('silver_fact_event_registrations') }} er
+    INNER JOIN {{ ref('bronze_fivetran_salesforce_merged_user') }} mu
+        ON er.user_id = mu.user_id
+    LEFT JOIN sponsors sp
+        ON mu.user_id = sp.contact_id
+        AND er.event_id = sp.event_id
+    WHERE
+        er.event_name IS NOT NULL
+        AND er.event_start_date IS NOT NULL
+    GROUP BY ALL
+)
+
+SELECT
+    ({{ dbt_utils.generate_surrogate_key(['user_id', 'event_id']) }}) AS _key,
+    registration_id,
+    user_id,
+    user_name,
+    event_id,
+    event_name,
+    event_start_date,
+    event_end_date,
+    project_id,
+    user_attended,
+    user_role,
+    registration_status
+FROM event_registrations
+QUALIFY ROW_NUMBER() OVER (
+    PARTITION BY user_id, event_id
+    ORDER BY event_start_date
+) = 1
+```
+
+### Product Folders
+
+Platinum models are organized by dashboard/product:
+
+| Folder | Dashboard |
+|--------|-----------|
+| `individual_dashboard/` | Individual Dashboard (ID) |
+| `organization_dashboard/` | Organization Dashboard (OD) |
+| `lfx_one/` | LFX One platform |
+| `events/` | Events metrics |
+| `code_contributions/` | Code contribution analytics |
+| `enrollments/` | Training enrollment reports |
+| `membership/` | Membership metrics |
+| `marketing/` | Marketing analytics |
+| `sales_metrics/` | Sales pipeline reports |
+
+### File Naming
+
+`platinum_{product}_{entity}.sql`
+
+Examples:
+- `platinum_individual_dashboard_event_registrations.sql`
+- `platinum_organization_dashboard_overview.sql`
+- `platinum_lfx_one_project_code_commits.sql`
+
+---
+
+## Decision Tree: Which Layer?
+
+```text
+Is this reading directly from a raw source table?
+  └─ YES → Bronze
+  └─ NO  → Does it create a reusable business object (users, projects, activities)?
+              └─ YES → Silver (dim/ for attributes, fact/ for events)
+              └─ NO  → Does it aggregate metrics for a specific use case?
+                          └─ YES → Is it pre-computed with time windows for a dashboard?
+                                      └─ YES → Platinum
+                                      └─ NO  → Gold
+                          └─ NO  → Silver (it's probably a helper or intermediate model)
+```
+
+## Schema Mapping Reference
+
+| Layer + Folder | Snowflake Schema (Production) |
+|---------------|-------------------------------|
+| `bronze/fivetran_platform/` | `BRONZE_FIVETRAN_PLATFORM` |
+| `bronze/fivetran_salesforce/` | `BRONZE_SALESFORCE` |
+| `bronze/kafka_crowd_dev/` | `BRONZE_KAFKA_CROWD_DEV` |
+| `bronze/stripe/` | `BRONZE_STRIPE` |
+| `silver/dim/` | `SILVER_DIM` |
+| `silver/fact/` | `SILVER_FACT` |
+| `gold/fact/` | `GOLD_FACT` |
+| `gold/reporting/` | `GOLD_REPORTING` |
+| `platinum/individual_dashboard/` | `PLATINUM_INDIVIDUAL_DASHBOARD` |
+| `platinum/organization_dashboard/` | `PLATINUM_ORGANIZATION_DASHBOARD` |
+| `platinum/lfx_one/` | `PLATINUM_LFX_ONE` |
+
+In dev, schemas are prefixed with your personal schema:
+`{your_schema}_BRONZE_FIVETRAN_PLATFORM`, etc.
diff --git a/skills/lfx-data-engineer/references/sql-style-guide.md b/skills/lfx-data-engineer/references/sql-style-guide.md
new file mode 100644
index 0000000..f5bb3ef
--- /dev/null
+++ b/skills/lfx-data-engineer/references/sql-style-guide.md
@@ -0,0 +1,289 @@
+<!-- Copyright The Linux Foundation and each contributor to LFX. -->
+<!-- SPDX-License-Identifier: MIT -->
+
+# SQL Style Guide
+
+This guide consolidates the formatting rules enforced by `.sqlfluff` and the
+project's coding standards. All SQL files must pass `sqlfluff lint` before
+being committed.
+
+## Keyword and Identifier Casing
+
+| Element | Casing | Example |
+|---------|--------|---------|
+| SQL keywords | UPPERCASE | `SELECT`, `FROM`, `WHERE`, `LEFT JOIN`, `GROUP BY` |
+| Column names | lowercase | `event_id`, `project_name`, `created_ts` |
+| Table aliases | lowercase | `FROM users u`, `JOIN projects p` |
+| Functions | UPPERCASE | `SUM()`, `COUNT()`, `COALESCE()`, `ROW_NUMBER()` |
+| Literals | UPPERCASE | `TRUE`, `FALSE`, `NULL` |
+| Type casts | lowercase shorthand | `::int`, `::string`, `::date` (not `CAST()`) |
+
+## Indentation
+
+- Use **4 spaces** (not tabs)
+- Do not right-align aliases
+- Use **trailing commas** in SELECT statements
+
+```sql
+-- CORRECT
+SELECT
+    user_id,
+    user_name,
+    email,
+    created_ts
+
+-- WRONG (leading commas)
+SELECT
+    user_id
+    , user_name
+    , email
+    , created_ts
+
+-- WRONG (right-aligned aliases)
+SELECT
+    userId                                as user_id,
+    convert_timezone('UTC', createdDate)  as created_date
+```
+
+## SELECT Statements
+
+- Fields should be stated before aggregates and window functions
+- Group-by columns are always listed first in the SELECT
+- Final SELECT must explicitly list all columns — no `SELECT *`
+- `SELECT DISTINCT` is not allowed without architect approval
+- Use `GROUP BY` or `QUALIFY ROW_NUMBER()` instead of `DISTINCT`
+
+```sql
+-- CORRECT: explicit columns, group-by fields first
+SELECT
+    project_id,
+    project_name,
+    COUNT(*) AS total_events,
+    SUM(revenue) AS total_revenue
+FROM events
+GROUP BY 1, 2
+
+-- WRONG: SELECT *
+SELECT * FROM events
+```
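+
+The `DISTINCT` rule above can be sketched as follows. This is only an
+illustration: the `user_emails` table and its `updated_ts` column are
+hypothetical names, not objects from the project. Note that the two options are
+not interchangeable: `GROUP BY` collapses exact duplicate rows, while `QUALIFY`
+keeps a single chosen row per key.
+
+```sql
+-- Instead of: SELECT DISTINCT user_id, email FROM user_emails
+
+-- Option 1: collapse exact duplicates with GROUP BY
+SELECT
+    user_id,
+    email
+FROM user_emails
+GROUP BY 1, 2
+
+-- Option 2: keep one row per user_id (here, the most recently updated one)
+SELECT
+    user_id,
+    email,
+    updated_ts
+FROM user_emails
+QUALIFY ROW_NUMBER() OVER (
+    PARTITION BY user_id
+    ORDER BY updated_ts DESC
+) = 1
+```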
+
+## GROUP BY and ORDER BY
+
+- Prefer ordering and grouping **by number**: `GROUP BY 1, 2`
+- If grouping by more than a few columns, reconsider the model design
+- `GROUP BY ALL` is acceptable in platinum models for complex aggregations
+
+```sql
+-- CORRECT
+GROUP BY 1, 2, 3
+
+-- ACCEPTABLE in platinum models
+GROUP BY ALL
+```
+
+## JOINs
+
+- **Default to INNER JOIN** — use LEFT JOIN only when the right side may have
+  no matches and you still want rows from the left
+- **RIGHT JOIN is not allowed** — rewrite as LEFT JOIN
+- Specify join keys explicitly — **do not use `USING`** (Snowflake has
+  inconsistencies with `USING` results)
+- When joining two or more tables, always **prefix columns with the table alias**
+- Pre-filter complex conditions in a CTE before the join
+- Do **not** filter on the right side of a LEFT JOIN in the `WHERE` clause
+  (this negates the LEFT JOIN). Filter in the `ON` clause or in a CTE.
+
+```sql
+-- CORRECT: filter in ON clause
+SELECT
+    l.user_id,
+    r.event_name
+FROM users l
+LEFT JOIN events r
+    ON l.user_id = r.user_id
+    AND r.event_status = 'Active'
+
+-- WRONG: filtering right side in WHERE (turns LEFT JOIN into INNER JOIN)
+SELECT
+    l.user_id,
+    r.event_name
+FROM users l
+LEFT JOIN events r
+    ON l.user_id = r.user_id
+WHERE
+    r.event_status = 'Active'
+
+-- WRONG: using USING
+FROM users u
+JOIN events e USING (user_id)
+```
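+
+Two of the rules above have no example yet: pre-filtering in a CTE and
+rewriting a `RIGHT JOIN`. A minimal sketch, reusing the `users` and `events`
+tables from the examples above; the CTE name `active_conference_events` and the
+`event_type` filter are assumptions made for illustration.
+
+```sql
+-- Pre-filter the right side in a CTE, then join on keys only
+WITH active_conference_events AS (
+    SELECT
+        user_id,
+        event_name
+    FROM events
+    WHERE
+        event_status = 'Active'
+        AND event_type = 'Conference'
+)
+
+SELECT
+    u.user_id,
+    e.event_name
+FROM users u
+LEFT JOIN active_conference_events e
+    ON u.user_id = e.user_id
+
+-- WRONG: RIGHT JOIN
+SELECT
+    u.user_id,
+    e.event_name
+FROM events e
+RIGHT JOIN users u
+    ON e.user_id = u.user_id
+
+-- CORRECT: same rows (every user, event_name is NULL when nothing matches),
+-- with the preserved table moved to the left
+SELECT
+    u.user_id,
+    e.event_name
+FROM users u
+LEFT JOIN events e
+    ON u.user_id = e.user_id
+```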
+
+## CTEs (Common Table Expressions)
+
+- Use CTEs instead of subqueries in `FROM` or `JOIN` clauses (enforced by
+  sqlfluff rule `ST05`)
+- Each CTE should perform a **single, logical unit of work**
+- CTE names should be **verbose** enough to convey what they do
+- CTEs with confusing or notable logic should have a comment
+- CTEs duplicated across models should be pulled into their own models or macros
+
+```sql
+-- CORRECT: CTEs for logical units
+WITH active_events AS (
+    SELECT
+        event_id,
+        event_name,
+        event_start_date
+    FROM {{ ref('bronze_fivetran_platform_events') }}
+    WHERE event_status = 'Active'
+),
+
+event_registrations AS (
+    SELECT
+        event_id,
+        COUNT(*) AS registration_count
+    FROM {{ ref('silver_fact_event_registrations') }}
+    GROUP BY 1
+)
+
+SELECT
+    e.event_id,
+    e.event_name,
+    e.event_start_date,
+    COALESCE(r.registration_count, 0) AS registration_count
+FROM active_events e
+LEFT JOIN event_registrations r
+    ON e.event_id = r.event_id
+
+-- WRONG: subquery in FROM
+SELECT *
+FROM (
+    SELECT event_id, event_name
+    FROM events
+    WHERE event_status = 'Active'
+) e
+```
+
+## Table Aliasing
+
+- Use the `AS` keyword when aliasing columns
+- Table aliases do not require `AS` (implicit aliasing is allowed)
+- When selecting from a single table, do **not** prefix columns with the alias
+
+```sql
+-- CORRECT: single table, no prefix
+SELECT
+    user_id,
+    user_name,
+    email
+FROM users
+
+-- CORRECT: multiple tables, always prefix
+SELECT
+    u.user_id,
+    u.user_name,
+    e.event_name
+FROM users u
+INNER JOIN events e
+    ON u.user_id = e.organizer_id
+```
+
+## CASE Statements
+
+- `CASE` and `END` on their own lines
+- Conditions indented inside the block
+- Multiple boolean conditions on separate lines
+
+```sql
+-- CORRECT
+CASE
+    WHEN status = 'Active'
+    AND is_verified = TRUE
+    THEN 'Active Verified'
+    WHEN status = 'Inactive'
+    THEN 'Inactive'
+    ELSE 'Unknown'
+END AS status_label,
+```
+
+## WHERE Clauses
+
+- Single conditions can be inline: `WHERE event_status = 'Active'`
+- Multiple conditions on separate lines, indented
+- `OR` conditions enclosed in parentheses
+
+```sql
+-- CORRECT: multiple conditions
+WHERE
+    event_status = 'Active'
+    AND event_start_date >= CURRENT_DATE()
+    AND (
+        event_type = 'Conference'
+        OR event_type = 'Meetup'
+    )
+```
+
+## Data Types
+
+The project normalizes data types. These names are **blocked** by sqlfluff:
+
+| Blocked Type | Use Instead |
+|-------------|-------------|
+| `NUMBER`, `NUMERIC` | `DECIMAL` |
+| `INTEGER`, `BIGINT`, `SMALLINT`, `TINYINT`, `BYTEINT` | `INT` |
+| `DOUBLE`, `REAL` | `FLOAT` |
+| `CHARACTER` | `CHAR` |
+| `DATETIME` | `TIMESTAMP_NTZ` |
+
+If an exception is required, add `-- noqa: CV09` (legacy code `L062`) with a comment explaining why.
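+
+As a rough illustration of the table above, the casts below use only the
+allowed type names. The `raw_orders` table and all column names are
+hypothetical; the precision and scale on `decimal` are likewise just an example.
+
+```sql
+SELECT
+    order_total::decimal(18, 2) AS order_total, -- not ::number or ::numeric
+    attendee_count::int AS attendee_count, -- not ::integer or ::bigint
+    score::float AS score, -- not ::double or ::real
+    created_at::timestamp_ntz AS created_ts -- not ::datetime
+FROM raw_orders
+```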
+
+## Type Casting
+
+Use shorthand casting (enforced by sqlfluff):
+
+```sql
+-- CORRECT
+column_name::int
+column_name::date
+column_name::string
+
+-- WRONG
+CAST(column_name AS INT)
+CONVERT(INT, column_name)
+```
+
+## Newlines and Readability
+
+**DO NOT OPTIMIZE FOR A SMALLER NUMBER OF LINES OF CODE.**
+Newlines are cheap; brain time is expensive.
+
+- Long lines should be broken up if it improves readability
+- Any clause with more than one item should be listed on new lines, indented
+- Conform to the existing style in a file, even if it contradicts this guide
+
+## Running sqlfluff
+
+```bash
+# Lint a specific file
+sqlfluff lint models/bronze/fivetran_platform/bronze_fivetran_platform_events.sql
+
+# Auto-fix formatting issues
+sqlfluff fix models/bronze/fivetran_platform/bronze_fivetran_platform_events.sql
+
+# Lint and auto-fix via Makefile
+make lint-fix file=models/bronze/fivetran_platform/bronze_fivetran_platform_events.sql
+
+# Lint all staged files before commit
+make lint-staged-files
+
+# Auto-fix all staged files
+make fix-lint-staged-files
+```
+
+sqlfluff uses the `.sqlfluff` configuration at the repo root. Key settings:
+
+- Dialect: Snowflake
+- Templater: dbt (understands `ref()`, `source()`, Jinja)
+- No max line length
+- Macros loaded from `macros/` directory
+- Subqueries forbidden in `FROM` and `JOIN` (use CTEs)
diff --git a/skills/lfx-data-engineer/references/testing-patterns.md b/skills/lfx-data-engineer/references/testing-patterns.md
new file mode 100644
index 0000000..cfd733a
--- /dev/null
+++ b/skills/lfx-data-engineer/references/testing-patterns.md
@@ -0,0 +1,355 @@
+<!-- Copyright The Linux Foundation and each contributor to LFX. -->
+<!-- SPDX-License-Identifier: MIT -->
+
+# dbt Testing Patterns
+
+This guide covers the test conventions for the lf-dbt project, aligned with
+dbt v1.10.5+. All models must have corresponding tests in a `*_tests.yml` file
+co-located in the same directory as the model.
+
+## Test File Structure
+
+Test files use `version: 2` and the `models:` key. Each model entry includes
+a description and column definitions with data types and tests.
+
+```yaml
+# Copyright The Linux Foundation and each contributor to LFX.
+# SPDX-License-Identifier: MIT
+
+version: 2
+models:
+  - name: bronze_fivetran_platform_events
+    description: "Event data from the Fivetran Platform source."
+    config:
+      tags:
+        - "events"
+    columns:
+      - name: event_id
+        description: "Unique identifier for the event."
+        data_type: string
+        data_tests:
+          - unique
+          - not_null
+
+      - name: event_name
+        description: "The name of the event."
+        data_type: string
+        data_tests:
+          - not_null
+          - dbt_utils.not_empty_string
+
+      - name: event_start_date
+        description: "The start date of the event."
+        data_type: timestamp_tz
+        data_tests:
+          - not_null
+```
+
+---
+
+## Key Rules
+
+### Use `data_tests:` (not `tests:`)
+
+dbt renamed the `tests:` key to `data_tests:` in v1.8; `tests:` survives only
+as a legacy alias. Always use `data_tests:`.
+
+```yaml
+# CORRECT
+columns:
+  - name: event_id
+    data_tests:
+      - unique
+      - not_null
+
+# WRONG (deprecated)
+columns:
+  - name: event_id
+    tests:
+      - unique
+      - not_null
+```
+
+### Use `arguments:` for Parameterized Tests
+
+Tests that accept parameters (like `accepted_values`, `relationships`) must
+wrap their arguments under the `arguments:` property.
+
+```yaml
+# CORRECT
+columns:
+  - name: status
+    data_tests:
+      - accepted_values:
+          arguments:
+            values: ["active", "inactive", "pending"]
+
+  - name: project_id
+    data_tests:
+      - relationships:
+          arguments:
+            to: ref('silver_dim_projects')
+            field: project_id
+
+# WRONG (missing arguments: wrapper)
+columns:
+  - name: status
+    data_tests:
+      - accepted_values:
+          values: ["active", "inactive", "pending"]
+```
+
+Simple tests without arguments (`unique`, `not_null`, `dbt_utils.not_empty_string`)
+do NOT need the `arguments:` wrapper.
+
+---
+
+## Primary Key Tests
+
+Every column named `_key` or `_pk` must have these three tests:
+
+```yaml
+columns:
+  - name: _key
+    description: "The unique primary key for the table."
+    data_tests:
+      - unique
+      - not_null
+      - dbt_utils.not_empty_string
+```
+
+This pattern is enforced across all layers.
+
+---
+
+## PII Tagging
+
+Columns containing personally identifiable information must be tagged using
+`config.meta`. Do NOT put `meta` at the top level — it must be nested inside
+`config`.
+
+```yaml
+# CORRECT
+columns:
+  - name: email
+    description: "User email address"
+    data_type: string
+    config:
+      meta:
+        contains_pii: true
+        data_retention: "undefined"
+
+# WRONG (meta at top level — triggers deprecation warnings)
+columns:
+  - name: email
+    description: "User email address"
+    meta:
+      contains_pii: true
+      data_retention: "undefined"
+```
+
+Always include `data_retention: "undefined"` whenever `contains_pii` is set in `meta`.
+
+Do NOT duplicate PII information across `tags` and `meta`:
+
+```yaml
+# WRONG (redundant — tags and meta both indicate PII)
+columns:
+  - name: email
+    config:
+      tags:
+        - "contains_pii"
+      meta:
+        contains_pii: true
+        data_retention: "undefined"
+
+# CORRECT (meta is the single source of truth)
+columns:
+  - name: email
+    config:
+      meta:
+        contains_pii: true
+        data_retention: "undefined"
+```
+
+### What Counts as PII
+
+- Full, first, middle, or last name
+- Email addresses
+- Phone numbers
+- Physical addresses
+- Government IDs (SSN, passport numbers)
+- Financial information
+
+---
+
+## Model-Level Configuration
+
+Tags and meta at the model level also go under `config:`:
+
+```yaml
+models:
+  - name: my_model
+    description: "Model description"
+    config:
+      tags:
+        - "events"
+      meta:
+        contains_pii: false
+        data_retention: "undefined"
+    columns:
+      - name: _key
+        data_tests:
+          - unique
+          - not_null
+          - dbt_utils.not_empty_string
+```
+
+Never define `config:` twice in the same block:
+
+```yaml
+# WRONG (duplicate config key)
+models:
+  - name: my_model
+    config:
+      tags:
+        - "events"
+    config:
+      contract: { enforced: true }
+
+# CORRECT (single config block)
+models:
+  - name: my_model
+    config:
+      tags:
+        - "events"
+      contract: { enforced: true }
+```
+
+---
+
+## Test Configuration
+
+Use `config:` for test-level settings like `where`, `severity`, and error
+thresholds. Custom keys must go in `config.meta`:
+
+```yaml
+columns:
+  - name: order_id
+    data_tests:
+      - unique:
+          config:
+            error_if: ">10"
+            warn_if: ">10"
+      - not_null
+      - accepted_values:
+          arguments:
+            values: ["placed", "shipped", "completed", "returned"]
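+          config:
+            where: "order_date >= CURRENT_DATE - INTERVAL '30 days'"
+            severity: warn  # built-in data test config, so it sits directly under config
+            meta:
+              owner: "data-eng"  # hypothetical custom key; custom keys go under meta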
+```
+
+---
+
+## Common Test Types
+
+### Simple Tests (no arguments needed)
+
+```yaml
+data_tests:
+  - unique
+  - not_null
+  - dbt_utils.not_empty_string
+```
+
+### Accepted Values
+
+```yaml
+data_tests:
+  - accepted_values:
+      arguments:
+        values: ["Active", "Completed", "Cancelled", "Pending"]
+```
+
+### Relationships (Foreign Keys)
+
+```yaml
+data_tests:
+  - relationships:
+      arguments:
+        to: ref('silver_dim_projects')
+        field: project_id
+```
+
+### Custom Error Thresholds
+
+For known edge cases where a few duplicates are expected:
+
+```yaml
+data_tests:
+  - unique:
+      config:
+        error_if: ">10"
+        warn_if: ">10"
+```
+
+---
+
+## Unit Tests
+
+For unit tests, use the `unit_tests:` key. Custom keys like `severity` must
+go in `config.meta`:
+
+```yaml
+unit_tests:
+  - name: test_my_model_logic
+    model: my_model
+    config:
+      meta:
+        severity: warn
+    given:
+      - input: ref('source_model')
+        rows:
+          - { id: "123", status: "active" }
+          - { id: "456", status: "inactive" }
+    expect:
+      rows:
+        - { id: "123", status: "active" }
+```
+
+For detailed unit test patterns, see the `adding-dbt-unit-test` skill in the
+lf-dbt repository's `.agents/skills/` directory.
+
+---
+
+## Test File Organization
+
+Test files are co-located with models and follow this naming convention:
+
+| Layer | Test File |
+|-------|-----------|
+| Bronze | `models/bronze/{source}/bronze_{source}_tests.yml` |
+| Silver | `models/silver/dim/silver_dim_tests.yml` or `models/silver/fact/silver_fact_tests.yml` |
+| Gold | `models/gold/fact/gold_fact_tests.yml` |
+| Platinum | `models/platinum/platinum_tests.yml` or per-folder |
+
+Some layers use a single consolidated test file (like `silver_dim_tests.yml`),
+while others have per-source test files. Check the existing pattern in the
+target directory and follow it.
+
+---
+
+## Checklist for New Tests
+
+- [ ] License header at top of YML file
+- [ ] `version: 2` declared
+- [ ] `data_tests:` used (not deprecated `tests:`)
+- [ ] `arguments:` wrapper on parameterized tests
+- [ ] Primary key columns have `unique`, `not_null`, `dbt_utils.not_empty_string`
+- [ ] PII columns tagged with `config.meta.contains_pii: true`
+- [ ] `data_retention: "undefined"` included with PII tags
+- [ ] `meta` and `tags` nested under `config:` (not at top level)
+- [ ] No duplicate `config:` keys in the same block
+- [ ] Custom keys nested in `config.meta` (not directly in `config`)
+- [ ] Column `data_type` specified for key columns
+- [ ] Descriptions provided for all columns