Merged
28 changes: 24 additions & 4 deletions .opencode/skills/cost-report/SKILL.md
@@ -7,7 +7,7 @@ description: Analyze Snowflake query costs and identify optimization opportuniti

## Requirements
**Agent:** any (read-only analysis)
**Tools used:** sql_execute, sql_analyze, finops_analyze_credits, finops_expensive_queries, finops_warehouse_advice
**Tools used:** sql_execute, sql_analyze, finops_analyze_credits, finops_expensive_queries, finops_warehouse_advice, finops_unused_resources, finops_query_history

Analyze Snowflake warehouse query costs, identify the most expensive queries, detect anti-patterns, and recommend optimizations.

@@ -60,7 +60,17 @@ Analyze Snowflake warehouse query costs, identify the most expensive queries, de

5. **Warehouse analysis** - Run `finops_warehouse_advice` to check if warehouses used by the top offenders are right-sized.

6. **Output the final report** as a structured markdown document:
6. **Unused resource detection** - Run `finops_unused_resources` to find:
   - **Stale tables**: Tables not accessed in 30+ days (candidates for archival/drop)
- **Idle warehouses**: Warehouses with no query activity (candidates for suspension/removal)

Include findings in the report under a "Waste Detection" section.

7. **Query history enrichment** - Run `finops_query_history` to fetch recent execution patterns:
- Identify frequently-run expensive queries (high frequency × high cost = top optimization target)
- Find queries that could benefit from result caching or materialization
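
The frequency × cost prioritization in step 7 can be sketched as follows. This is a minimal illustration, not the actual `finops_query_history` output shape; the `fingerprint` and `credits` fields are assumed for the example:

```python
from collections import defaultdict

def rank_optimization_targets(history):
    """Rank queries by total spend: executions x cost per run."""
    totals = defaultdict(lambda: {"runs": 0, "credits": 0.0})
    for row in history:
        agg = totals[row["fingerprint"]]
        agg["runs"] += 1
        agg["credits"] += row["credits"]
    # High frequency x high cost floats to the top.
    return sorted(totals.items(), key=lambda kv: kv[1]["credits"], reverse=True)

history = [
    {"fingerprint": "daily_rollup", "credits": 2.0},
    {"fingerprint": "daily_rollup", "credits": 2.5},
    {"fingerprint": "one_off_backfill", "credits": 3.0},
]
print(rank_optimization_targets(history)[0][0])  # daily_rollup (4.5 credits total)
```

A query costing 2 credits that runs daily outranks a 3-credit one-off, which is exactly why frequency matters as much as per-run cost.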

8. **Output the final report** as a structured markdown document:

```
# Snowflake Cost Report (Last 30 Days)
@@ -99,10 +109,20 @@ Analyze Snowflake warehouse query costs, identify the most expensive queries, de

...

## Waste Detection
### Unused Tables
| Table | Last Accessed | Size | Recommendation |
|-------|--------------|------|----------------|

### Idle Warehouses
| Warehouse | Last Query | Size | Recommendation |
|-----------|-----------|------|----------------|

## Recommendations
1. Top priority optimizations
2. Warehouse sizing suggestions
3. Scheduling recommendations
3. Unused resource cleanup
4. Scheduling recommendations
```

## Usage
@@ -111,4 +131,4 @@ The user invokes this skill with:
- `/cost-report` -- Analyze the last 30 days
- `/cost-report 7` -- Analyze the last 7 days (adjust the DATEADD interval)

Use the tools: `sql_execute`, `sql_analyze`, `finops_analyze_credits`, `finops_expensive_queries`, `finops_warehouse_advice`.
Use the tools: `sql_execute`, `sql_analyze`, `finops_analyze_credits`, `finops_expensive_queries`, `finops_warehouse_advice`, `finops_unused_resources`, `finops_query_history`.
13 changes: 11 additions & 2 deletions .opencode/skills/dbt-analyze/SKILL.md
@@ -7,7 +7,7 @@ description: Analyze downstream impact of dbt model changes using column-level l

## Requirements
**Agent:** any (read-only analysis)
**Tools used:** bash (runs `altimate-dbt` commands), read, glob, dbt_manifest, lineage_check, sql_analyze
**Tools used:** bash (runs `altimate-dbt` commands), read, glob, dbt_manifest, lineage_check, dbt_lineage, sql_analyze, altimate_core_extract_metadata

## When to Use This Skill

@@ -45,10 +45,19 @@ For the full downstream tree, recursively call `children` on each downstream mod

### 3. Run Column-Level Lineage

Use the `lineage_check` tool on the changed model's SQL to understand:
**With manifest (preferred):** Use `dbt_lineage` to compute column-level lineage for a dbt model. This reads the manifest.json, extracts compiled SQL and upstream schemas, and traces column flow via the Rust engine. More accurate than raw SQL lineage because it resolves `ref()` and `source()` to actual schemas.

```
dbt_lineage(model: <model_name>)
```

**Without manifest (fallback):** Use `lineage_check` on the raw SQL to understand:
- Which source columns flow to which output columns
- Which columns were added, removed, or renamed
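
The added/removed/renamed classification can be sketched as a set diff over lineage output. The column-to-source-expression mapping shape is an assumption for illustration, not the tool's actual return format:

```python
def diff_columns(old, new):
    """Each side maps output column name -> source expression."""
    added = set(new) - set(old)
    removed = set(old) - set(new)
    # A remove+add pair sharing the same source expression is a rename.
    renamed = {(r, a) for r in removed for a in added if old[r] == new[a]}
    added -= {a for _, a in renamed}
    removed -= {r for r, _ in renamed}
    return {"added": added, "removed": removed, "renamed": renamed}

old = {"customer_id": "c.id", "full_name": "c.name"}
new = {"customer_id": "c.id", "customer_name": "c.name", "ltv": "sum(o.total)"}
result = diff_columns(old, new)
print(result["renamed"])  # {('full_name', 'customer_name')}
```

Renames matter most for downstream impact: consumers referencing `full_name` break even though no data was lost.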

**Extract structural metadata:** Use `altimate_core_extract_metadata` on the SQL to get tables referenced, columns used, CTEs, subqueries — useful for mapping the full dependency surface.


### 4. Cross-Reference with Downstream

For each downstream model:
8 changes: 6 additions & 2 deletions .opencode/skills/dbt-develop/SKILL.md
@@ -7,7 +7,7 @@ description: Create and modify dbt models — staging, intermediate, marts, incr

## Requirements
**Agent:** builder or migrator (requires file write access)
**Tools used:** bash (runs `altimate-dbt` commands), read, glob, write, edit
**Tools used:** bash (runs `altimate-dbt` commands), read, glob, write, edit, schema_search, dbt_profiles, sql_analyze, altimate_core_validate, altimate_core_column_lineage

## When to Use This Skill

@@ -41,11 +41,15 @@ altimate-dbt parents --model <upstream> # understand what feeds this model
altimate-dbt children --model <downstream> # understand what consumes it
```

**Check warehouse connection:** Run `dbt_profiles` to discover available profiles and map them to warehouse connections. This tells you which adapter (Snowflake, BigQuery, Postgres, etc.) and target the project uses — essential for dialect-aware SQL.


### 2. Discover — Understand the Data Before Writing

**Never write SQL without deeply understanding your data first.** The #1 cause of wrong results is writing SQL blind — assuming grain, relationships, column names, or values without checking.

**Step 2a: Read all documentation and schema definitions**
**Step 2a: Search for relevant tables and columns**
- Use `schema_search` with natural-language queries to find tables/columns in large warehouses (e.g., `schema_search(query: "customer orders")` returns matching tables and columns from the indexed schema cache)
- Read `sources.yml`, `schema.yml`, and any YAML files that describe the source/parent models
- These contain column descriptions, data types, tests, and business context
- Pay special attention to: primary keys, unique constraints, relationships between tables, and what each column represents
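
Conceptually, `schema_search` resolves a natural-language query against the indexed schema. A toy token-overlap version, purely illustrative (real implementations typically rank with embeddings over the schema cache):

```python
def schema_search(cache, query):
    """Toy natural-language schema search: token overlap between the
    query and table/column names, highest-scoring tables first."""
    tokens = set(query.lower().split())
    hits = []
    for table, columns in cache.items():
        names = {table.lower(), *map(str.lower, columns)}
        score = sum(any(t in n for n in names) for t in tokens)
        if score:
            hits.append((score, table))
    return [t for _, t in sorted(hits, reverse=True)]

cache = {"fct_orders": ["order_id", "customer_id"], "dim_dates": ["date_day"]}
print(schema_search(cache, "customer orders"))  # ['fct_orders']
```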
20 changes: 17 additions & 3 deletions .opencode/skills/dbt-test/SKILL.md
@@ -7,7 +7,7 @@ description: Add schema tests, unit tests, and data quality checks to dbt models

## Requirements
**Agent:** builder or migrator (requires file write access)
**Tools used:** bash (runs `altimate-dbt` commands), read, glob, write, edit
**Tools used:** bash (runs `altimate-dbt` commands), read, glob, write, edit, altimate_core_testgen, altimate_core_validate

## When to Use This Skill

@@ -52,13 +52,27 @@ read <yaml_file>

### 3. Generate Tests

Apply test rules based on column patterns — see [references/schema-test-patterns.md](references/schema-test-patterns.md).
**Auto-generate with `altimate_core_testgen`:** Pass the compiled SQL and schema to generate boundary-value, NULL-handling, and edge-case test assertions automatically. This produces executable test SQL covering cases you might miss manually.

```
altimate_core_testgen(sql: <compiled_sql>, schema_context: <schema_object>)
```

Review the generated tests — keep what makes sense, discard trivial ones. Then apply test rules based on column patterns — see [references/schema-test-patterns.md](references/schema-test-patterns.md).
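
To make the shape of auto-generated tests concrete, here is a hand-rolled sketch of the simplest class (NULL-handling assertions); it is not the testgen tool itself, and the schema dict format is assumed for illustration:

```python
def generate_not_null_tests(schema):
    """Emit one failing-rows assertion per non-nullable column, the
    simplest kind of generated data-quality test."""
    tests = []
    for table, columns in schema.items():
        for column, meta in columns.items():
            if not meta.get("nullable", True):
                tests.append(
                    f"select count(*) as failures from {table} where {column} is null"
                )
    return tests

schema = {"stg_orders": {"order_id": {"nullable": False}, "note": {"nullable": True}}}
print(generate_not_null_tests(schema))
```

A real generator layers boundary values and edge cases on top of this, which is why reviewing and pruning its output still matters.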

### 4. Write YAML

Merge into existing schema.yml (don't duplicate). Use `edit` for existing files, `write` for new ones.

### 5. Run Tests
### 5. Validate SQL

Before running, validate the compiled model SQL to catch syntax and schema errors early:

```
altimate_core_validate(sql: <compiled_sql>, schema_context: <schema_object>)
```

### 6. Run Tests

```bash
altimate-dbt test --model <name> # run tests for this model
15 changes: 14 additions & 1 deletion .opencode/skills/dbt-troubleshoot/SKILL.md
@@ -7,7 +7,7 @@ description: Debug dbt errors — compilation failures, runtime database errors,

## Requirements
**Agent:** any (read-only diagnosis), builder (if applying fixes)
**Tools used:** bash (runs `altimate-dbt` commands), read, glob, edit, altimate_core_semantics, altimate_core_column_lineage, altimate_core_correct
**Tools used:** bash (runs `altimate-dbt` commands), read, glob, edit, altimate_core_semantics, altimate_core_column_lineage, altimate_core_correct, altimate_core_fix, sql_fix

## When to Use This Skill

@@ -81,6 +81,19 @@ altimate_core_column_lineage --sql <compiled_sql>
altimate_core_correct --sql <compiled_sql>
```

**Quick-fix tools** — use these when the error type is clear:

```
# Schema-based fix: fuzzy-matches table/column names against schema to fix typos and wrong references
altimate_core_fix(sql: <compiled_sql>, schema_context: <schema_object>)

# Error-message fix: given a failing query + database error, analyzes root cause and proposes corrections
sql_fix(sql: <compiled_sql>, error_message: <error_message>, dialect: <dialect>)
```

`altimate_core_fix` is best for compilation errors (wrong names, missing objects). `sql_fix` is best for runtime errors (the database told you what's wrong). Use `altimate_core_correct` for iterative multi-round correction when the first fix doesn't resolve the issue.
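
The schema-based fuzzy matching can be sketched with the standard library; this is the conceptual mechanism, not the actual `altimate_core_fix` implementation:

```python
import difflib

def suggest_column(wrong_name, schema_columns, cutoff=0.6):
    """Fuzzy-match a failing identifier against known columns, the way
    a schema-based fixer resolves typos and wrong references."""
    matches = difflib.get_close_matches(
        wrong_name.lower(), [c.lower() for c in schema_columns], n=1, cutoff=cutoff
    )
    return matches[0] if matches else None

print(suggest_column("custmer_id", ["customer_id", "order_id", "created_at"]))
# customer_id
```

A low `cutoff` yields more (riskier) suggestions; a fixer should surface the match for confirmation rather than silently rewrite.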


Common findings:
- **Wrong join type**: `INNER JOIN` dropping rows that should appear → switch to `LEFT JOIN`
- **Fan-out**: One-to-many join inflating row counts → add deduplication or aggregate
117 changes: 117 additions & 0 deletions .opencode/skills/pii-audit/SKILL.md
@@ -0,0 +1,117 @@
---
name: pii-audit
description: Classify schema columns for PII (SSN, email, phone, name, address, credit card) and check whether queries expose them. Use for GDPR/CCPA/HIPAA compliance audits.
---

# PII Audit

## Requirements
**Agent:** any (read-only analysis)
**Tools used:** altimate_core_classify_pii, altimate_core_query_pii, schema_detect_pii, schema_inspect, read, glob

## When to Use This Skill

**Use when the user wants to:**
- Scan a database schema for PII columns (SSN, email, phone, name, address, credit card, IP)
- Check if a specific query exposes PII data
- Audit dbt models for PII leakage before production deployment
- Generate a PII inventory for compliance (GDPR, CCPA, HIPAA)

**Do NOT use for:**
- SQL injection scanning -> use `sql-review`
- General SQL quality checks -> use `sql-review`
- Access control auditing -> finops role tools in `cost-report`

## Workflow

### 1. Classify Schema for PII

**Option A — From schema YAML/JSON:**

```
altimate_core_classify_pii(schema_context: <schema_object>)
```

Analyzes column names, types, and patterns to detect PII categories:
- **Direct identifiers**: SSN, email, phone, full name, credit card number
- **Quasi-identifiers**: Date of birth, zip code, IP address, device ID
- **Sensitive data**: Salary, health records, religious affiliation

**Option B — From warehouse connection:**

First index the schema, inspect it, then classify:
```
schema_index(warehouse: <name>)
schema_inspect(warehouse: <name>, database: <db>, schema: <schema>, table: <table>)
schema_detect_pii(warehouse: <name>)
```

`schema_detect_pii` scans all indexed columns using pattern matching against the schema cache (requires `schema_index` to have been run).
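
The name-based pattern matching looks roughly like this. The patterns below are illustrative only; a real classifier also inspects data types and sampled values:

```python
import re

# Illustrative column-name patterns, not the tool's actual rule set.
PII_PATTERNS = {
    "EMAIL": re.compile(r"e[-_]?mail", re.I),
    "PHONE": re.compile(r"phone|mobile", re.I),
    "SSN": re.compile(r"\bssn\b|social[-_]?security", re.I),
    "CREDIT_CARD": re.compile(r"card[-_]?(number|num)", re.I),
}

def classify_column(name):
    """Return every PII category whose pattern matches the column name."""
    return [label for label, pat in PII_PATTERNS.items() if pat.search(name)]

print(classify_column("customer_email"))  # ['EMAIL']
```

Name matching alone produces false negatives (`contact_info`) and false positives (`email_opt_out`), which is why the report should be reviewed, not applied blindly.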

### 2. Check Query PII Exposure

For each query or dbt model, check which PII columns it accesses:

```
altimate_core_query_pii(sql: <sql>, schema_context: <schema_object>)
```

Returns:
- Which PII-classified columns are selected, filtered, or joined on
- Risk level per column (HIGH for direct identifiers, MEDIUM for quasi-identifiers)
- Whether PII is exposed in the output (SELECT) vs only used internally (WHERE/JOIN)
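
One possible way to combine category and exposure context into a risk level. This specific demotion rule (internal-only use drops one level) is an assumption for illustration, not the tool's documented behavior:

```python
DIRECT = {"SSN", "EMAIL", "PHONE", "FULL_NAME", "CREDIT_CARD"}
QUASI = {"DOB", "ZIP", "IP_ADDRESS", "DEVICE_ID"}

def risk_level(category, in_select):
    """HIGH only when a direct identifier reaches the output; columns
    used purely in WHERE/JOIN are flagged one level lower."""
    if category in DIRECT:
        return "HIGH" if in_select else "MEDIUM"
    if category in QUASI:
        return "MEDIUM" if in_select else "LOW"
    return "LOW"

print(risk_level("SSN", in_select=True))  # HIGH
```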

### 3. Audit dbt Models (Batch)

For a full project audit:
```bash
glob models/**/*.sql
```

For each model:
1. Read the compiled SQL
2. Run `altimate_core_query_pii` against the project schema
3. Classify the model's PII risk level

### 4. Present the Audit Report

```
PII Audit Report
================

Schema: analytics.public (42 tables, 380 columns)

PII Columns Found: 18

HIGH RISK (direct identifiers):
customers.email -> EMAIL
customers.phone_number -> PHONE
customers.ssn -> SSN
payments.card_number -> CREDIT_CARD

MEDIUM RISK (quasi-identifiers):
customers.date_of_birth -> DOB
customers.zip_code -> ZIP
events.ip_address -> IP_ADDRESS

Model PII Exposure:

| Model | PII Columns Exposed | Risk | Action |
|-------|-------------------|------|--------|
| stg_customers | email, phone, ssn | HIGH | Mask or hash before mart layer |
| mart_user_profile | email | HIGH | Requires access control |
| int_order_summary | (none) | SAFE | No PII in output |
| mart_daily_revenue | zip_code | MEDIUM | Aggregation reduces risk |

Recommendations:
1. Hash SSN and credit_card in staging layer (never expose raw)
2. Add column-level masking policy for email and phone
3. Restrict mart_user_profile to authorized roles only
4. Document PII handling in schema.yml column descriptions
```

## Usage

- `/pii-audit` -- Scan the full project schema for PII
- `/pii-audit models/marts/mart_customers.sql` -- Check a specific model for PII exposure
- `/pii-audit --schema analytics.public` -- Audit a specific database schema
20 changes: 15 additions & 5 deletions .opencode/skills/query-optimize/SKILL.md
@@ -7,7 +7,7 @@ description: Analyze and optimize SQL queries for better performance

## Requirements
**Agent:** any (read-only analysis)
**Tools used:** sql_optimize, sql_analyze, read, glob, schema_inspect, warehouse_list
**Tools used:** sql_optimize, sql_analyze, sql_explain, altimate_core_equivalence, read, glob, schema_inspect, warehouse_list

Analyze SQL queries for performance issues and suggest concrete optimizations including rewritten SQL.

@@ -27,7 +27,17 @@ Analyze SQL queries for performance issues and suggest concrete optimizations in
4. **Run detailed analysis**:
- Call `sql_analyze` with the same SQL and dialect to get the full anti-pattern breakdown with recommendations

5. **Present findings** in a structured format:
5. **Get execution plan** (if warehouse connected):
- Call `sql_explain` to run EXPLAIN on the query and get the execution plan
- Look for: full table scans, sort operations on large datasets, inefficient join strategies, missing partition pruning
- Include key findings in the report under "Execution Plan Insights"

6. **Verify rewrites preserve correctness**:
- If `sql_optimize` produced a rewritten query, call `altimate_core_equivalence` to verify the original and optimized queries produce the same result set
- If not equivalent, flag the difference and present both versions for the user to decide
- This prevents "optimization" that silently changes query semantics
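
A cheap local spot-check of the same idea: run both queries against sample data and compare result sets. Unlike a real equivalence checker this is not a proof, only a smoke test, but it catches obvious semantic drift:

```python
import sqlite3

def same_results(original, rewritten, setup_sql):
    """Run both queries on an in-memory database seeded with sample
    data and compare sorted result sets."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(setup_sql)
    a = sorted(conn.execute(original).fetchall())
    b = sorted(conn.execute(rewritten).fetchall())
    return a == b

setup = (
    "CREATE TABLE orders(id INT, total REAL);"
    "INSERT INTO orders VALUES (1, 10.0), (2, NULL);"
)
print(same_results(
    "SELECT COUNT(*) FROM orders",
    "SELECT COUNT(total) FROM orders",  # NOT equivalent: skips NULL totals
    setup,
))  # False
```

NULL handling is a classic way an "optimization" silently changes semantics, so seed the sample data with NULLs and duplicates.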

7. **Present findings** in a structured format:

```
Query Optimization Report
@@ -62,9 +72,9 @@ Anti-Pattern Details:
-> Consider selecting only the columns you need.
```

6. **If schema context is available**, mention that the optimization used real table schemas for more accurate suggestions (e.g., expanding SELECT * to actual columns).
8. **If schema context is available**, mention that the optimization used real table schemas for more accurate suggestions (e.g., expanding SELECT * to actual columns).

7. **If no issues are found**, confirm the query looks well-optimized and briefly explain why (no anti-patterns, proper use of limits, explicit columns, etc.).
9. **If no issues are found**, confirm the query looks well-optimized and briefly explain why (no anti-patterns, proper use of limits, explicit columns, etc.).

## Usage

@@ -73,4 +83,4 @@ The user invokes this skill with SQL or a file path:
- `/query-optimize models/staging/stg_orders.sql` -- Optimize SQL from a file
- `/query-optimize` -- Optimize the most recently discussed SQL in the conversation

Use the tools: `sql_optimize`, `sql_analyze`, `read` (for file-based SQL), `glob` (to find SQL files), `schema_inspect` (for schema context), `warehouse_list` (to check connections).
Use the tools: `sql_optimize`, `sql_analyze`, `sql_explain` (execution plans), `altimate_core_equivalence` (rewrite verification), `read` (for file-based SQL), `glob` (to find SQL files), `schema_inspect` (for schema context), `warehouse_list` (to check connections).