diff --git a/agents/monitoring/api-runtime/SKILL.md b/agents/monitoring/api-runtime/SKILL.md index 84ece4a..f3d579a 100644 --- a/agents/monitoring/api-runtime/SKILL.md +++ b/agents/monitoring/api-runtime/SKILL.md @@ -1,84 +1,263 @@ --- name: api-runtime-monitor -description: Monitors LTX API runtime performance, latency, error rates, and throughput. Alerts on performance degradation or errors. -tags: [monitoring, api, performance, latency, errors] +description: "Monitor LTX API runtime performance, latency, error rates, and throughput. Detects performance degradation and errors. Use when: (1) detecting API latency issues, (2) alerting on error rate spikes, (3) investigating throughput drops by endpoint/model/org." +tags: [monitoring, api, performance, latency, errors, throughput] --- # API Runtime Monitor -## When to use - -- "Monitor API latency" -- "Alert on API errors" -- "Track API throughput" -- "Monitor inference time" -- "Alert on API performance degradation" - -## What it monitors - -- **Latency**: Request processing time, inference time, queue time -- **Error rates**: % of failed requests, error types, error sources -- **Throughput**: Requests per hour/day, by endpoint/model -- **Performance**: P50/P95/P99 latency, success rate -- **Utilization**: API usage by org, model, resolution - -## Steps - -1. **Gather requirements from user:** - - Which performance metric to monitor (latency, errors, throughput) - - Alert threshold (e.g., "P95 latency > 30s", "error rate > 5%", "throughput drops > 20%") - - Time window (hourly, daily) - - Scope (all requests, specific endpoint, specific org) - - Notification channel - -2. **Read shared files:** - - `shared/bq-schema.md` — GPU cost table (has API runtime data) and ltxvapi tables - - `shared/metric-standards.md` — Performance metric patterns - -3. 
**Identify data source:** - - For LTX API: Use `ltxvapi_api_requests_with_be_costs` or `gpu_request_attribution_and_cost` - - **Key columns explained:** - - `request_processing_time_ms`: Total time from request submission to completion - - `request_inference_time_ms`: GPU processing time (actual model inference) - - `request_queue_time_ms`: Time waiting in queue before processing starts - - `result`: Request outcome (success, failed, timeout, etc.) - - `error_type`: Classification of errors (infrastructure vs applicative) - - `endpoint`: API endpoint called (e.g., /generate, /upscale) - - `model_type`: Model used (ltxv2, retake, etc.) - - `org_name`: Customer organization making the request - -4. **Write monitoring SQL:** - - Query relevant performance metric - - Calculate percentiles (P50, P95, P99) for latency - - Calculate error rate (failed / total requests) - - Compare against baseline - -5. **Present to user:** - - Show SQL query - - Show example alert format with performance breakdown - - Confirm threshold values - -6. 
**Set up alert** (manual for now): - - Document SQL - - Configure notification to engineering team - -## Reference files - -| File | Read when | -|------|-----------| -| `shared/product-context.md` | LTX products and business context | -| `shared/bq-schema.md` | API tables and GPU cost table schema | -| `shared/metric-standards.md` | Performance metric patterns | -| `shared/event-registry.yaml` | Feature events (if analyzing event-driven metrics) | -| `shared/gpu-cost-query-templates.md` | GPU cost queries (if analyzing cost-related performance) | -| `shared/gpu-cost-analysis-patterns.md` | Cost analysis patterns (if analyzing cost-related performance) | - -## Rules - -- DO use APPROX_QUANTILES for percentile calculations (P50, P95, P99) -- DO separate errors by error_source (infrastructure vs applicative) -- DO filter by result = 'success' for success rate calculations -- DO break down by endpoint, model, and resolution for detailed analysis -- DO compare current performance against historical baseline -- DO alert engineering team for infrastructure errors, product team for applicative errors -- DO partition by dt for performance +## 1. Overview (Why?) + +LTX API performance varies by endpoint, model, and customer organization. Latency issues, error rate spikes, and throughput drops can indicate infrastructure problems, model regressions, or customer-specific issues that require engineering intervention. + +This skill provides **autonomous API runtime monitoring** that detects performance degradation (P95 latency spikes), error rate increases, throughput drops, and queue time issues — with breakdown by endpoint, model, and organization for root cause analysis. + +**Problem solved**: Detect API performance problems and errors before they impact customer experience — with segment-level (endpoint/model/org) root cause identification. + +## 2. Requirements (What?) 
+ +Monitor these outcomes autonomously: + +- [ ] P95 latency spikes (> 2x baseline or > 60s) +- [ ] Error rate increases (> 5% or DoD increase > 50%) +- [ ] Throughput drops (> 30% DoD/WoW) +- [ ] Queue time excessive (> 50% of processing time) +- [ ] Infrastructure errors (> 10 requests/hour) +- [ ] Alerts include breakdown by endpoint, model, organization +- [ ] Results formatted by priority (infrastructure vs applicative errors) +- [ ] Findings routed to appropriate team (API team or Engineering) + +## 3. Progress Tracker + +* [ ] Read shared knowledge (schema, metrics, performance patterns) +* [ ] Identify data source (ltxvapi tables or GPU cost table) +* [ ] Write monitoring SQL with percentile calculations +* [ ] Execute query for target date range +* [ ] Analyze results by endpoint, model, organization +* [ ] Separate infrastructure vs applicative errors +* [ ] Present findings with performance breakdown +* [ ] Route alerts to appropriate team + +## 4. Implementation Plan + +### Phase 1: Read Alert Thresholds + +**Generic thresholds** (data-driven analysis pending): +- P95 latency > 2x baseline or > 60s +- Error rate > 5% or DoD increase > 50% +- Throughput drops > 30% DoD/WoW +- Queue time > 50% of processing time +- Infrastructure errors > 10 requests/hour + +[!IMPORTANT] These are generic thresholds. Consider creating production thresholds based on endpoint/model-specific analysis (similar to usage/GPU cost monitoring). 
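+
+As an illustrative sketch only, the latency and error-rate checks above can be expressed as boolean flags over daily metrics with a precomputed 7-day baseline. Column names follow the `metrics_with_baseline` CTE defined in Phase 4; thresholds are the generic ones pending production analysis:
+
+```sql
+-- Sketch: apply the generic Phase 1 thresholds to one day of metrics.
+SELECT
+  dt,
+  endpoint,
+  model_type,
+  org_name,
+  p95_latency_ms,
+  error_rate_pct,
+  -- P95 latency: > 2x baseline or > 60s absolute (60,000 ms)
+  (p95_latency_ms > 2 * p95_latency_baseline_7d
+    OR p95_latency_ms > 60000) AS p95_latency_alert,
+  -- Error rate: > 5% absolute, or > 50% increase vs baseline
+  (error_rate_pct > 5
+    OR SAFE_DIVIDE(error_rate_pct - error_rate_baseline_7d,
+                   error_rate_baseline_7d) > 0.5) AS error_rate_alert
+FROM metrics_with_baseline
+WHERE dt = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY);
+```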
+ +### Phase 2: Read Shared Knowledge + +Before writing SQL, read: +- **`shared/product-context.md`** — LTX products, user types, business model, API context +- **`shared/bq-schema.md`** — GPU cost table (has API runtime data), ltxvapi tables, API request schema +- **`shared/metric-standards.md`** — Performance metric patterns (latency, error rates, throughput) +- **`shared/event-registry.yaml`** — Feature events (if analyzing event-driven metrics) +- **`shared/gpu-cost-query-templates.md`** — GPU cost queries (if analyzing cost-related performance) +- **`shared/gpu-cost-analysis-patterns.md`** — Cost analysis patterns (if analyzing cost-related performance) + +**Data nuances**: +- Primary table: `ltx-dwh-prod-processed.web.ltxvapi_api_requests_with_be_costs` +- Alternative: `ltx-dwh-prod-processed.gpu_costs.gpu_request_attribution_and_cost` +- Partitioned by `action_ts` (TIMESTAMP) or `dt` (DATE) — filter for performance +- Key columns: `request_processing_time_ms`, `request_inference_time_ms`, `request_queue_time_ms`, `result`, `endpoint`, `model_type`, `org_name` + +### Phase 3: Identify Data Source + +✅ **PREFERRED: Use ltxvapi_api_requests_with_be_costs for API runtime metrics** + +**Key columns**: +- `request_processing_time_ms`: Total time from request submission to completion +- `request_inference_time_ms`: GPU processing time (actual model inference) +- `request_queue_time_ms`: Time waiting in queue before processing starts +- `result`: Request outcome (success, failed, timeout, etc.) +- `error_type` or `error_source`: Classification of errors (infrastructure vs applicative) +- `endpoint`: API endpoint called (e.g., /generate, /upscale) +- `model_type`: Model used (ltxv2, retake, etc.) 
+- `org_name`: Customer organization making the request
+
+[!IMPORTANT] Verify column name: `error_type` vs `error_source` in actual schema
+
+### Phase 4: Write Monitoring SQL
+
+✅ **PREFERRED: Calculate percentiles and error rates with baseline comparisons**
+
+```sql
+WITH api_metrics AS (
+  SELECT
+    DATE(action_ts) AS dt,
+    endpoint,
+    model_type,
+    org_name,
+    COUNT(*) AS total_requests,
+    COUNTIF(result = 'success') AS successful_requests,
+    COUNTIF(result != 'success') AS failed_requests,
+    SAFE_DIVIDE(COUNTIF(result != 'success'), COUNT(*)) * 100 AS error_rate_pct,
+    APPROX_QUANTILES(request_processing_time_ms, 100)[OFFSET(50)] AS p50_latency_ms,
+    APPROX_QUANTILES(request_processing_time_ms, 100)[OFFSET(95)] AS p95_latency_ms,
+    APPROX_QUANTILES(request_processing_time_ms, 100)[OFFSET(99)] AS p99_latency_ms,
+    AVG(request_queue_time_ms) AS avg_queue_time_ms,
+    AVG(request_inference_time_ms) AS avg_inference_time_ms
+  FROM `ltx-dwh-prod-processed.web.ltxvapi_api_requests_with_be_costs`
+  -- 8 full days: 7 baseline days plus the target day; exclude today's partial data
+  WHERE action_ts >= TIMESTAMP(DATE_SUB(CURRENT_DATE(), INTERVAL 8 DAY))
+    AND action_ts < TIMESTAMP(CURRENT_DATE())
+  GROUP BY dt, endpoint, model_type, org_name
+),
+metrics_with_baseline AS (
+  SELECT
+    *,
+    -- One row per (endpoint, model, org) per day, so 7 preceding rows = 7 prior days
+    AVG(p95_latency_ms) OVER (
+      PARTITION BY endpoint, model_type, org_name
+      ORDER BY dt ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING
+    ) AS p95_latency_baseline_7d,
+    AVG(error_rate_pct) OVER (
+      PARTITION BY endpoint, model_type, org_name
+      ORDER BY dt ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING
+    ) AS error_rate_baseline_7d
+  FROM api_metrics
+)
+SELECT * FROM metrics_with_baseline
+WHERE dt = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY);
+```
+
+**Key patterns**:
+- **Percentiles**: Use `APPROX_QUANTILES` for P50/P95/P99
+- **Error rate**: `SAFE_DIVIDE(failed, total) * 100`
+- **Baseline**: 7-day rolling average per endpoint, model, and organization
+- **Time window**: Last 8 full days — 7 baseline days plus the target day (shorter than usage monitoring due to higher frequency data)
+
+### Phase 5: Execute Query
+
+Run query using: 
+```bash +bq --project_id=ltx-dwh-explore query --use_legacy_sql=false --format=pretty " + +" +``` + +### Phase 6: Analyze Results + +**For latency trends**: +- Compare P95 latency vs baseline (7-day avg) +- Flag if P95 > 2x baseline or > 60s absolute +- Identify which endpoint/model/org drove spikes + +**For error rate analysis**: +- Compare error rate vs baseline +- Separate errors by `error_type`/`error_source` (infrastructure vs applicative) +- Flag if error rate > 5% or DoD increase > 50% + +**For throughput**: +- Track requests per hour/day by endpoint +- Flag throughput drops > 30% DoD/WoW +- Identify which endpoints lost traffic + +**For queue analysis**: +- Calculate queue time as % of total processing time +- Flag if queue time > 50% of processing time +- Indicates capacity/scaling issues + +### Phase 7: Present Findings + +Format results with: +- **Summary**: Key finding (e.g., "P95 latency spiked to 85s for /v1/text-to-video") +- **Root cause**: Which endpoint/model/org drove the issue +- **Breakdown**: Performance metrics by dimension +- **Error classification**: Infrastructure vs applicative errors +- **Recommendation**: Route to API team (applicative) or Engineering team (infrastructure) + +**Alert format**: +``` +⚠️ API PERFORMANCE ALERT: + • Endpoint: /v1/text-to-video + Model: ltxv2 + Metric: P95 Latency + Current: 85s | Baseline: 30s + Change: +183% + + Error rate: 8.2% (baseline: 2.1%) + Error type: Infrastructure + +Recommendation: Alert Engineering team for infrastructure issue +``` + +### Phase 8: Route Alert + +For ongoing monitoring: +1. Save SQL query +2. Set up in BigQuery scheduled query or Hex Thread +3. Configure notification by error type: + - Infrastructure errors → Engineering team + - Applicative errors → API/Product team +4. Include endpoint, model, and org details in alert + +## 5. 
Context & References + +### Shared Knowledge +- **`shared/product-context.md`** — LTX products and API context +- **`shared/bq-schema.md`** — API tables and GPU cost table schema +- **`shared/metric-standards.md`** — Performance metric patterns +- **`shared/event-registry.yaml`** — Feature events (if analyzing event-driven metrics) +- **`shared/gpu-cost-query-templates.md`** — GPU cost queries (if analyzing cost-related performance) +- **`shared/gpu-cost-analysis-patterns.md`** — Cost analysis patterns + +### Data Sources + +**Primary table**: `ltx-dwh-prod-processed.web.ltxvapi_api_requests_with_be_costs` +- Partitioned by `action_ts` (TIMESTAMP) +- Key columns: `request_processing_time_ms`, `request_inference_time_ms`, `request_queue_time_ms`, `result`, `endpoint`, `model_type`, `org_name` + +**Alternative**: `ltx-dwh-prod-processed.gpu_costs.gpu_request_attribution_and_cost` +- Contains API runtime data but not primary source for performance metrics + +### Endpoints +Common endpoints: `/v1/text-to-video`, `/v1/image-to-video`, `/v1/upscale`, `/generate` + +### Models +Common models: `ltxv2`, `retake`, etc. + +## 6. 
Constraints & Done + +### DO NOT + +- **DO NOT** use absolute thresholds without baseline comparison +- **DO NOT** mix infrastructure and applicative errors in same alert +- **DO NOT** skip partition filtering — always filter on `action_ts` or `dt` for performance +- **DO NOT** forget to separate errors by error type/source + +[!IMPORTANT] Verify column name in schema: `error_type` vs `error_source` + +### DO + +- **DO** use `APPROX_QUANTILES` for percentile calculations (P50, P95, P99) +- **DO** separate errors by error_source (infrastructure vs applicative) +- **DO** filter by `result = 'success'` for success rate calculations +- **DO** break down by endpoint, model, and organization for detailed analysis +- **DO** compare current performance against historical baseline (7-day rolling avg) +- **DO** alert engineering team for infrastructure errors +- **DO** alert product/API team for applicative errors +- **DO** partition on `action_ts` or `dt` for performance +- **DO** use `ltx-dwh-explore` as execution project +- **DO** calculate error rate with `SAFE_DIVIDE(failed, total) * 100` +- **DO** flag P95 latency > 2x baseline or > 60s +- **DO** flag error rate > 5% or DoD increase > 50% +- **DO** flag throughput drops > 30% DoD/WoW +- **DO** flag queue time > 50% of processing time +- **DO** flag infrastructure errors > 10 requests/hour +- **DO** include endpoint, model, org details in all alerts +- **DO** validate unusual patterns with API/Engineering team before alerting + +### Completion Criteria + +✅ All performance metrics monitored (latency, errors, throughput, queue time) +✅ Alerts fire with thresholds (generic pending production analysis) +✅ Endpoint/model/org breakdown provided +✅ Errors separated by type (infrastructure vs applicative) +✅ Findings routed to appropriate team +✅ Partition filtering applied for performance +✅ Column name verified (error_type vs error_source) diff --git a/agents/monitoring/be-cost/SKILL.md b/agents/monitoring/be-cost/SKILL.md 
index c2f177e..fb5dd29 100644 --- a/agents/monitoring/be-cost/SKILL.md +++ b/agents/monitoring/be-cost/SKILL.md @@ -1,6 +1,6 @@ --- name: be-cost-monitoring -description: Monitor and analyze backend GPU costs for LTX API and LTX Studio. Use when analyzing cost trends, detecting anomalies, breaking down costs by endpoint/model/org/process, monitoring utilization, or investigating cost efficiency drift. +description: "Monitor backend GPU costs for LTX API and LTX Studio with data-driven thresholds. Detects cost anomalies and utilization issues. Use when: (1) detecting cost spikes, (2) analyzing idle vs inference costs, (3) investigating efficiency drift by endpoint/model/org." tags: [monitoring, costs, gpu, infrastructure, alerts] compatibility: - BigQuery access (ltx-dwh-prod-processed) @@ -9,52 +9,98 @@ compatibility: # Backend Cost Monitoring -## When to use +## 1. Overview (Why?) -- "Monitor GPU costs" -- "Analyze cost trends (daily/weekly/monthly)" -- "Detect cost anomalies or spikes" -- "Break down API costs by endpoint/model/org" -- "Break down Studio costs by process/workspace" -- "Compare API vs Studio cost distribution" -- "Investigate cost-per-request efficiency drift" -- "Monitor GPU utilization and idle costs" -- "Alert on cost budget breaches" -- "Day-over-day or week-over-week cost comparisons" +LTX GPU infrastructure spending is dominated by idle costs (~72% of total), with actual inference representing only ~27%. Cost patterns vary significantly by vertical (API vs Studio) and day-of-week. Generic alerting thresholds miss true anomalies while flagging normal variance. -## Steps +This skill provides **autonomous cost monitoring** with data-driven thresholds derived from 60-day statistical analysis. It detects genuine cost problems (autoscaler issues, efficiency regression, volume surges) while suppressing noise from expected patterns. -### 1. 
Gather Requirements +**Problem solved**: Detect cost spikes, utilization degradation, and efficiency drift that indicate infrastructure problems or wasteful spending — without manual threshold tuning or false alarms. -Ask the user: -- **What to monitor**: Total costs, cost by product/feature, cost efficiency, utilization, anomalies -- **Scope**: LTX API, LTX Studio, or both? Specific endpoint/org/process? -- **Time window**: Daily, weekly, monthly? How far back? -- **Analysis type**: Trends, comparisons (DoD/WoW), anomaly detection, breakdowns -- **Alert threshold** (if setting up alerts): Absolute ($X/day) or relative (spike > X% vs baseline) +**Critical timing**: Analyzes data from **3 days ago** (cost data needs time to finalize). -### 2. Read Shared Knowledge +## 2. Requirements (What?) -Before writing SQL: -- **`shared/product-context.md`** — LTX products and business context +Monitor these outcomes autonomously: + +- [ ] Idle cost spikes (over-provisioning or traffic drop without scaledown) +- [ ] Inference cost spikes (volume surge, costlier model, heavy customer) +- [ ] Idle-to-inference ratio degradation (utilization dropping) +- [ ] Failure rate spikes (wasted compute + service quality issue) +- [ ] Cost-per-request drift (model regression or resolution creep) +- [ ] Day-over-day cost jumps (early warning signals) +- [ ] Volume drops (potential outage) +- [ ] Overhead spikes (system anomalies) +- [ ] Alerts prioritized by tier (High/Medium/Low) +- [ ] Vertical-specific thresholds (API vs Studio) + +## 3. Progress Tracker + +* [ ] Read production thresholds and shared knowledge +* [ ] Select appropriate query template(s) +* [ ] Execute query for 3 days ago +* [ ] Analyze results by tier priority +* [ ] Identify root cause (endpoint, model, org, process) +* [ ] Present findings with cost breakdown +* [ ] Route alerts to appropriate team (API/Studio/Engineering) + +## 4. 
Implementation Plan + +### Phase 1: Read Production Thresholds + +Production thresholds from `/Users/dbeer/workspace/dwh-data-model-transforms/queries/gpu_cost_alerting_thresholds.md`: + +#### Tier 1 — High Priority (daily monitoring) +| Alert | Threshold | Signal | +|-------|-----------|--------| +| **Idle cost spike** | > $15,600/day | Autoscaler over-provisioning or traffic drop without scaledown | +| **Inference cost spike** | > $5,743/day | Volume surge, costlier model, or heavy new customer | +| **Idle-to-inference ratio** | > 4:1 | GPU utilization degrading (baseline ~2.7:1) | + +#### Tier 2 — Medium Priority (daily monitoring) +| Alert | Threshold | Signal | +|-------|-----------|--------| +| **Failure rate spike** | > 20.4% overall | Wasted compute + service quality issue | +| **Cost-per-request drift** | API > $0.70, Studio > $0.46 | Model regression or duration/resolution creep | +| **DoD cost jump** | API > 30%, Studio > 15% | Early warning before absolute thresholds breach | + +#### Tier 3 — Low Priority (weekly review) +| Alert | Threshold | Signal | +|-------|-----------|--------| +| **Volume drop** | < 18,936 requests/day | Possible outage or upstream feed issue | +| **Overhead spike** | > $138/day | System overhead anomaly (review weekly) | + +**Per-Vertical Thresholds:** +- **LTX API**: Total daily > $5,555, Failure rate > 5.7%, DoD change > 30% +- **LTX Studio**: Total daily > $11,928, Failure rate > 22.6%, DoD change > 15% + +**Alert logic**: Cost metric exceeds WARNING (avg+2σ) or CRITICAL (avg+3σ) threshold + +### Phase 2: Read Shared Knowledge + +Before writing SQL, read: +- **`shared/product-context.md`** — LTX products, user types, business model - **`shared/bq-schema.md`** — GPU cost table schema (lines 418-615) - **`shared/metric-standards.md`** — GPU cost metric patterns (section 13) -- **`shared/event-registry.yaml`** — Feature events (if analyzing feature-level costs) - **`shared/gpu-cost-query-templates.md`** — 11 production-ready SQL 
queries - **`shared/gpu-cost-analysis-patterns.md`** — Analysis workflows and benchmarks +- **`shared/event-registry.yaml`** — Feature events (if analyzing feature-level costs) -Key learnings: +**Data nuances**: - Table: `ltx-dwh-prod-processed.gpu_costs.gpu_request_attribution_and_cost` - Partitioned by `dt` (DATE) — always filter for performance -- `cost_category`: inference (requests), idle, overhead, unused -- Total cost = `row_cost + attributed_idle_cost + attributed_overhead_cost` (inference rows only) -- For infrastructure cost: `SUM(row_cost)` across all categories +- **Target date**: `DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)` (cost data from 3 days ago) +- Cost categories: inference, idle, overhead, unused +- Total cost per request = `row_cost + attributed_idle_cost + attributed_overhead_cost` (inference only) +- Infrastructure cost = `SUM(row_cost)` across all categories -### 3. Run Query Templates +### Phase 3: Select Query Template -**If user didn't specify anything:** Run all 11 query templates from `shared/gpu-cost-query-templates.md` to provide comprehensive cost overview. +✅ **PREFERRED: Run all 11 templates for comprehensive overview** -**If user specified specific analysis:** Select appropriate template: +If user didn't specify, run all templates from `shared/gpu-cost-query-templates.md`: + +**If user specified specific analysis**, select appropriate template: | User asks... | Use template | |-------------|-------------| @@ -68,9 +114,7 @@ Key learnings: | "Cost efficiency by model" | Cost per Request by Model | | "Which orgs cost most?" | API Cost by Organization | -See `shared/gpu-cost-query-templates.md` for all 11 query templates. - -### 4. Execute Query +### Phase 4: Execute Query Run query using: ```bash @@ -79,84 +123,107 @@ bq --project_id=ltx-dwh-explore query --use_legacy_sql=false --format=pretty " " ``` -Or use BigQuery console with project `ltx-dwh-explore`. 
+**Or** use BigQuery console with project `ltx-dwh-explore` + +**Critical**: Always filter on `dt = DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)` -### 5. Analyze Results +### Phase 5: Analyze Results -**For cost trends:** +**For cost trends**: - Compare current period vs baseline (7-day avg, prior week, prior month) - Calculate % change and flag significant shifts (>15-20%) -**For anomaly detection:** +**For anomaly detection**: - Flag days with Z-score > 2 (cost or volume deviates > 2 std devs from rolling avg) - Investigate root cause: specific endpoint/model/org, error rate spike, billing type change -**For breakdowns:** +**For breakdowns**: - Identify top cost drivers (endpoint, model, org, process) - Calculate cost per request to spot efficiency issues - Check failure costs (wasted spend on errors) -### 6. Present Findings +### Phase 6: Present Findings Format results with: -- **Summary**: Key finding (e.g., "GPU costs spiked 45% yesterday") +- **Summary**: Key finding (e.g., "GPU costs spiked 45% 3 days ago") - **Root cause**: What drove the change (e.g., "LTX API /v1/text-to-video requests +120%") -- **Breakdown**: Top contributors by dimension +- **Breakdown**: Top contributors by dimension (endpoint, model, org, process) - **Recommendation**: Action to take (investigate org X, optimize model Y, alert team) +- **Priority tier**: Tier 1 (High), Tier 2 (Medium), or Tier 3 (Low) -### 7. Set Up Alert (if requested) +### Phase 7: Set Up Alert (Optional) For ongoing monitoring: 1. Save SQL query 2. Set up in BigQuery scheduled query or Hex Thread -3. Configure notification threshold -4. Route alerts to Slack channel or Linear issue +3. Configure notification threshold by tier +4. Route alerts to appropriate team: + - API cost spikes → API team + - Studio cost spikes → Studio team + - Infrastructure issues → Engineering team -## Schema Reference +## 5. 
Context & References -For detailed table schema including all dimensions, columns, and cost calculations, see `references/schema-reference.md`. +### Production Thresholds +- **`/Users/dbeer/workspace/dwh-data-model-transforms/queries/gpu_cost_alerting_thresholds.md`** — Production thresholds for LTX division (60-day analysis) -## Reference Files +### Shared Knowledge +- **`shared/product-context.md`** — LTX products and business context +- **`shared/bq-schema.md`** — GPU cost table schema (lines 418-615) +- **`shared/metric-standards.md`** — GPU cost metric patterns (section 13) +- **`shared/gpu-cost-query-templates.md`** — 11 production-ready SQL queries +- **`shared/gpu-cost-analysis-patterns.md`** — Analysis workflows, benchmarks, investigation playbooks +- **`shared/event-registry.yaml`** — Feature events (if analyzing feature-level costs) + +### Data Source +Table: `ltx-dwh-prod-processed.gpu_costs.gpu_request_attribution_and_cost` +- Partitioned by `dt` (DATE) +- Division: `LTX` (LTX API + LTX Studio) +- Cost categories: inference (27%), idle (72%), overhead (0.3%) +- Key columns: `cost_category`, `row_cost`, `attributed_idle_cost`, `attributed_overhead_cost`, `result`, `endpoint`, `model_type`, `org_name`, `process_name` + +### Query Templates +See `shared/gpu-cost-query-templates.md` for 11 production-ready queries -| File | Read when | -|------|-----------| -| `references/schema-reference.md` | GPU cost table dimensions, columns, and cost calculations | -| `shared/bq-schema.md` | Understanding GPU cost table schema (lines 418-615) | -| `shared/metric-standards.md` | GPU cost metric SQL patterns (section 13) | -| `shared/gpu-cost-query-templates.md` | Selecting query template for analysis (11 production-ready queries) | -| `shared/gpu-cost-analysis-patterns.md` | Interpreting results, workflows, benchmarks, investigation playbooks | +## 6. 
Constraints & Done -## Rules +### DO NOT -### Query Best Practices +- **DO NOT** analyze yesterday's data — use 3 days ago (cost data needs time to finalize) +- **DO NOT** sum row_cost + attributed_* across all cost_categories — causes double-counting +- **DO NOT** mix inference and non-inference rows in same aggregation without filtering +- **DO NOT** use absolute thresholds — always compare to baseline (avg+2σ/avg+3σ) +- **DO NOT** skip partition filtering — always filter on `dt` for performance + +### DO - **DO** always filter on `dt` partition column for performance +- **DO** use `DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY)` as the target date - **DO** filter `cost_category = 'inference'` for request-level analysis -- **DO** exclude Lightricks team requests with `is_lt_team IS FALSE` for customer-facing cost analysis -- **DO** include LT team requests only when analyzing total infrastructure spend or debugging +- **DO** exclude Lightricks team with `is_lt_team IS FALSE` for customer-facing cost analysis +- **DO** include LT team only when analyzing total infrastructure spend or debugging - **DO** use `ltx-dwh-explore` as execution project - **DO** calculate cost per request with `SAFE_DIVIDE` to avoid division by zero -- **DO** compare against baseline (7-day avg, prior period) for trends -- **DO** round cost values to 2 decimal places for readability - -### Cost Calculation - - **DO** sum all three cost columns (row_cost + attributed_idle + attributed_overhead) for fully loaded cost per request - **DO** use `SUM(row_cost)` across all rows for total infrastructure cost -- **DO NOT** sum row_cost + attributed_* across all cost_categories (double-counting) -- **DO NOT** mix inference and non-inference rows in same aggregation without filtering - -### Analysis - +- **DO** compare against baseline (7-day avg, prior period) for trends +- **DO** round cost values to 2 decimal places for readability - **DO** flag anomalies with Z-score > 2 (cost or volume deviation > 2 
std devs) - **DO** investigate failure costs (wasted spend on errors) - **DO** break down by endpoint/model for API, by process for Studio - **DO** check cost per request trends to spot efficiency degradation - **DO** validate results against total infrastructure spend - -### Alerts - - **DO** set thresholds based on historical baseline, not absolute values - **DO** alert engineering team for cost spikes > 30% vs baseline - **DO** include cost breakdown and root cause in alerts - **DO** route API cost alerts to API team, Studio alerts to Studio team + +### Completion Criteria + +✅ All cost categories analyzed (idle, inference, overhead) +✅ Production thresholds applied by tier (High/Medium/Low) +✅ Alerts prioritized and routed to appropriate teams +✅ Root cause investigation completed (endpoint/model/org/process) +✅ Findings presented with cost breakdown and recommendations +✅ Query uses 3-day lookback (not yesterday) +✅ Partition filtering applied for performance diff --git a/agents/monitoring/enterprise/SKILL.md b/agents/monitoring/enterprise/SKILL.md index 2cf15ce..39c2403 100644 --- a/agents/monitoring/enterprise/SKILL.md +++ b/agents/monitoring/enterprise/SKILL.md @@ -1,67 +1,255 @@ --- name: enterprise-monitor -description: Monitors enterprise account health, usage, and contract compliance. Alerts on low engagement, quota breaches, or churn risk. -tags: [monitoring, enterprise, accounts, contracts] +description: "Monitor enterprise account health, usage, and contract compliance. Detects low engagement, quota breaches, and churn risk. Use when: (1) detecting enterprise account usage drops, (2) alerting on inactive accounts, (3) investigating power user engagement changes." 
+tags: [monitoring, enterprise, accounts, contracts, churn-risk] --- # Enterprise Monitor -## When to use - -- "Monitor enterprise account usage" -- "Monitor enterprise churn risk" -- "Alert when enterprise account is inactive" - -## What it monitors - -- **Account usage**: DAU, WAU, MAU per enterprise org -- **Token consumption**: Usage vs contracted quota, historical consumption trends -- **User activation**: % of seats active -- **Engagement**: Video generations, image generations, downloads per org -- **Churn signals**: Declining usage, inactive users - -## Steps - -1. **Gather requirements from user:** - - Which enterprise org(s) to monitor (or all) - - Alert threshold based on historical usage of each account (e.g., "usage drops > 30% vs their baseline", "MAU below their 30-day average", "< 50% of contracted quota") - - Time window (weekly, monthly) - - Notification channel (Slack, email, Linear issue) - -2. **Read shared files:** - - `shared/product-context.md` — LTX products, enterprise business model, user types - - `shared/bq-schema.md` — Enterprise user segmentation queries - - `shared/metric-standards.md` — Enterprise metrics, quota tracking - - `shared/event-registry.yaml` — Feature events (if analyzing engagement) - - `shared/gpu-cost-query-templates.md` — GPU cost queries (if analyzing infrastructure costs) - - `shared/gpu-cost-analysis-patterns.md` — Cost analysis patterns (if analyzing infrastructure costs) - -3. **Identify enterprise users:** - - Use enterprise segmentation CTE from bq-schema.md (lines 441-461) - - Apply McCann split (McCann_NY vs McCann_Paris) - - Exclude Lightricks and Popular Pays - -4. 
**Write monitoring SQL:** - - Query org-level usage metrics - - Set baseline for each org based on their historical usage (e.g., 30-day average, 90-day trend) - - Compare current usage against org-specific baseline or contracted quota - - Flag orgs below threshold or showing decline - - Flag meaningful drops for power users (users with top usage within each org) - -5. **Present to user:** - - Show SQL query - - Show example alert format with org name and metrics - - Confirm threshold values and alert logic - -6. **Set up alert** (manual for now): - - Document SQL - - Configure notification to customer success team - -## Rules - -- DO use EXACT enterprise segmentation CTE from bq-schema.md without modification -- DO apply McCann split (McCann_NY vs McCann_Paris) -- DO exclude Lightricks and Popular Pays from enterprise orgs -- DO break out pilot vs contracted accounts -- DO NOT alert on free/self-serve users — this agent is enterprise-only -- DO include org name in alert for easy customer success follow-up +## 1. Overview (Why?) + +Enterprise accounts (contract and pilot) represent high-value customers with negotiated quotas and specific engagement patterns. Unlike self-serve users, enterprise usage should be monitored per-organization with org-specific baselines, since each has different team sizes, use cases, and contract terms. + +This skill provides **autonomous enterprise account monitoring** that detects declining usage, underutilization of quotas, inactive periods, and power user drops — all of which signal churn risk or engagement problems that require customer success intervention. + +**Problem solved**: Identify enterprise churn risk early through usage signals — before contracts end or accounts go completely inactive — with org-level root cause analysis. + +## 2. Requirements (What?) 
+
+Monitor these outcomes autonomously:
+
+- [ ] DAU/WAU/MAU drops per enterprise org (> 30% vs org baseline)
+- [ ] Token consumption vs contracted quota (underutilization < 50%)
+- [ ] User activation (% of seats active)
+- [ ] Video/image generation engagement per org
+- [ ] Power user drops within org (> 20% decline)
+- [ ] Zero activity for 7+ consecutive days
+- [ ] Alerts include org name for customer success follow-up
+- [ ] Pilot vs contract accounts separated
+- [ ] McCann split applied (McCann_NY vs McCann_Paris)
+- [ ] Lightricks and Popular Pays excluded
+
+## 3. Progress Tracker
+
+* [ ] Read shared knowledge (enterprise segmentation, schema, metrics)
+* [ ] Identify enterprise users with segmentation CTE
+* [ ] Write monitoring SQL with org-specific baselines
+* [ ] Execute query for target date range
+* [ ] Analyze results by org and account type
+* [ ] Identify power user drops within each org
+* [ ] Present findings with org-level details
+* [ ] Route alerts to customer success team
+
+## 4. Implementation Plan
+
+### Phase 1: Read Alert Thresholds
+
+**Generic thresholds** (data-driven analysis pending):
+- DoD/WoW usage drops > 30% vs org's baseline
+- MAU below org's 30-day average
+- Token consumption < 50% of contracted quota (underutilization)
+- Power user drops > 20% within org
+- Zero activity for 7+ consecutive days
+
+> [!IMPORTANT]
+> These are generic thresholds. Consider creating production thresholds based on org-specific analysis.
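Applied per org, the generic thresholds above reduce to a small rule check. A minimal sketch follows; the function name and input shape are hypothetical illustrations, not part of the shared pipeline:

```python
# Hypothetical sketch of the generic Phase 1 thresholds; the function
# name and inputs are illustrative, not the production alert pipeline.
def check_org_thresholds(dau, dau_baseline_30d, tokens_used, contracted_quota,
                         power_user_drop_pct, consecutive_inactive_days):
    """Return the list of generic threshold breaches for one enterprise org."""
    breaches = []
    # DoD/WoW usage drop > 30% vs the org's own 30-day baseline
    if dau_baseline_30d and (dau_baseline_30d - dau) / dau_baseline_30d > 0.30:
        breaches.append("usage drop > 30% vs 30-day baseline")
    # Token consumption < 50% of contracted quota (underutilization)
    if contracted_quota and tokens_used / contracted_quota < 0.50:
        breaches.append("quota underutilization (< 50% consumed)")
    # Power user drop > 20% within the org
    if power_user_drop_pct > 0.20:
        breaches.append("power user drop > 20%")
    # Zero activity for 7+ consecutive days
    if consecutive_inactive_days >= 7:
        breaches.append("zero activity for 7+ consecutive days")
    return breaches
```

An org at 50 DAU against a 100-DAU baseline with only a third of its quota consumed would trip the first two rules.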
+ +### Phase 2: Read Shared Knowledge + +Before writing SQL, read: +- **`shared/product-context.md`** — LTX products, enterprise business model, user types +- **`shared/bq-schema.md`** — Enterprise user segmentation queries (lines 441-461) +- **`shared/metric-standards.md`** — Enterprise metrics, quota tracking +- **`shared/event-registry.yaml`** — Feature events (if analyzing engagement) + +**Data nuances**: +- Use EXACT enterprise segmentation CTE from bq-schema.md (lines 441-461) without modification +- Apply McCann split: `McCann_NY` vs `McCann_Paris` +- Exclude: `Lightricks`, `Popular Pays`, `None` +- Contract accounts: Indegene, HearWell_BeWell, Novig, Cylndr Studios, Miroma, Deriv, McCann_Paris +- Pilot accounts: All other enterprise orgs + +### Phase 3: Identify Enterprise Users + +✅ **PREFERRED: Use exact segmentation CTE from bq-schema.md** + +```sql +WITH ent_users AS ( + SELECT DISTINCT + lt_id, + CASE + WHEN COALESCE(enterprise_name_at_purchase, current_enterprise_name, organization_name) = 'McCann_NY' THEN 'McCann_NY' + WHEN COALESCE(enterprise_name_at_purchase, current_enterprise_name, organization_name) LIKE '%McCann%' THEN 'McCann_Paris' + ELSE COALESCE(enterprise_name_at_purchase, current_enterprise_name, organization_name) + END AS org + FROM `ltx-dwh-prod-processed.web.ltxstudio_users` + WHERE is_enterprise_user + AND current_customer_plan_type IN ('contract', 'pilot') + AND COALESCE(enterprise_name_at_purchase, current_enterprise_name, organization_name) NOT IN ('Lightricks', 'Popular Pays', 'None') +), +enterprise_users AS ( + SELECT DISTINCT + lt_id, + CASE + WHEN org IN ('Indegene', 'HearWell_BeWell', 'Novig', 'Cylndr Studios', 'Miroma', 'Deriv', 'McCann_Paris') + THEN org + ELSE CONCAT(org, ' Pilot') + END AS org, + CASE + WHEN org IN ('Indegene', 'HearWell_BeWell', 'Novig', 'Cylndr Studios', 'Miroma', 'Deriv', 'McCann_Paris') + THEN 'Contract' + ELSE 'Pilot' + END AS account_type + FROM ent_users + WHERE org NOT IN ('Lightricks', 'Popular 
Pays', 'None')
+)
+```
+
+### Phase 4: Write Monitoring SQL
+
+✅ **PREFERRED: Monitor all enterprise orgs with org-specific baselines**
+
+```sql
+-- NOTE: run with the ent_users / enterprise_users CTEs from Phase 3
+-- prepended; enterprise_users is referenced in the JOIN below.
+WITH org_metrics AS (
+  SELECT
+    a.dt,
+    e.org,
+    e.account_type,
+    COUNT(DISTINCT a.lt_id) AS dau,
+    SUM(a.num_tokens_consumed) AS tokens,
+    SUM(a.num_generate_image) AS image_gens,
+    SUM(a.num_generate_video) AS video_gens
+  FROM `ltx-dwh-prod-processed.web.ltxstudio_agg_user_date` a
+  JOIN enterprise_users e ON a.lt_id = e.lt_id
+  WHERE a.dt >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
+    AND a.dt <= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY)
+  GROUP BY a.dt, e.org, e.account_type
+),
+metrics_with_baseline AS (
+  SELECT
+    *,
+    AVG(dau) OVER (
+      PARTITION BY org
+      ORDER BY dt ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING
+    ) AS dau_baseline_30d,
+    AVG(tokens) OVER (
+      PARTITION BY org
+      ORDER BY dt ROWS BETWEEN 30 PRECEDING AND 1 PRECEDING
+    ) AS tokens_baseline_30d
+  FROM org_metrics
+)
+SELECT * FROM metrics_with_baseline
+WHERE dt = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY);
+```
+
+**Key patterns**:
+- **Org-specific baselines**: Each org compared to its own 30-day average
+- **McCann split**: Separate McCann_NY and McCann_Paris
+- **Account type**: Contract vs Pilot
+
+### Phase 5: Analyze Results
+
+**For usage trends**:
+- Compare org's current usage vs their 30-day baseline
+- Flag orgs with DoD/WoW drops > 30% vs baseline
+- Flag orgs with MAU below their 30-day average
+
+**For quota analysis**:
+- Compare token consumption vs contracted quota
+- Flag underutilization (< 50% of quota)
+- Identify orgs approaching or exceeding quota
+
+**For engagement**:
+- Track power users within each org (top 20% by token usage)
+- Flag power user drops > 20% within org
+- Flag zero activity for 7+ consecutive days
+
+### Phase 6: Present Findings
+
+Format results with:
+- **Summary**: Key finding (e.g., "Novig enterprise account inactive for 7 days")
+- **Org details**: Org name, account type (Contract/Pilot), baseline usage
+- 
**Metrics**: DAU, tokens, generations vs baseline +- **Recommendation**: Customer success action (reach out, investigate, adjust quota) + +**Alert format**: +``` +⚠️ ENTERPRISE ALERT: + • Org: Novig (Contract) + Metric: Token consumption + Current: 0 tokens | Baseline: 27K/day + Drop: -100% | Zero activity for 7 days + +Recommendation: Contact account manager immediately +``` + +### Phase 7: Route Alert + +For ongoing monitoring: +1. Save SQL query +2. Set up in BigQuery scheduled query or Hex Thread +3. Configure notification for customer success team +4. Include org name and account manager contact in alert + +## 5. Context & References + +### Shared Knowledge +- **`shared/product-context.md`** — LTX products, enterprise business model, user types +- **`shared/bq-schema.md`** — Enterprise user segmentation queries (lines 441-461) +- **`shared/metric-standards.md`** — Enterprise metrics, quota tracking +- **`shared/event-registry.yaml`** — Feature events for engagement analysis + +### Data Sources +- **Users table**: `ltx-dwh-prod-processed.web.ltxstudio_users` +- **Usage table**: `ltx-dwh-prod-processed.web.ltxstudio_agg_user_date` +- Key columns: `lt_id`, `enterprise_name_at_purchase`, `current_enterprise_name`, `organization_name`, `current_customer_plan_type` + +### Enterprise Orgs + +**Contract accounts**: +- Indegene +- HearWell_BeWell +- Novig +- Cylndr Studios +- Miroma +- Deriv +- McCann_Paris + +**Pilot accounts**: All other enterprise orgs (suffixed with " Pilot") + +**Excluded**: Lightricks, Popular Pays, None + +## 6. 
Constraints & Done + +### DO NOT + +- **DO NOT** modify enterprise segmentation CTE — use exact version from bq-schema.md +- **DO NOT** alert on free/self-serve users — this agent is enterprise-only +- **DO NOT** combine McCann_NY and McCann_Paris — keep them separate +- **DO NOT** include Lightricks or Popular Pays in enterprise monitoring +- **DO NOT** use generic baselines — each org compared to its own historical usage + +### DO + +- **DO** use EXACT enterprise segmentation CTE from bq-schema.md (lines 441-461) without modification +- **DO** apply McCann split (McCann_NY vs McCann_Paris) +- **DO** exclude Lightricks, Popular Pays, and None from enterprise orgs +- **DO** break out pilot vs contracted accounts +- **DO** include org name in alert for customer success follow-up +- **DO** use org-specific baselines (each org's 30-day average) +- **DO** flag DoD/WoW usage drops > 30% vs org baseline +- **DO** flag MAU below org's 30-day average +- **DO** flag token consumption < 50% of contracted quota +- **DO** flag power user drops > 20% within org +- **DO** flag zero activity for 7+ consecutive days +- **DO** route alerts to customer success team with org details +- **DO** validate unusual patterns with customer success before alerting + +### Completion Criteria + +✅ All enterprise orgs monitored (Contract and Pilot) +✅ Org-specific baselines applied +✅ McCann split applied (NY vs Paris) +✅ Lightricks and Popular Pays excluded +✅ Alerts include org name and account type +✅ Usage drops, quota issues, and engagement drops detected +✅ Findings routed to customer success team diff --git a/agents/monitoring/revenue/SKILL.md b/agents/monitoring/revenue/SKILL.md index 07f45b4..804d688 100644 --- a/agents/monitoring/revenue/SKILL.md +++ b/agents/monitoring/revenue/SKILL.md @@ -1,64 +1,224 @@ --- name: revenue-monitor -description: Monitors revenue metrics, tracks subscription changes, and alerts on revenue anomalies or threshold breaches. 
-tags: [monitoring, revenue, subscriptions] +description: "Monitor LTX Studio revenue metrics and subscription changes. Detects revenue anomalies, churn spikes, and refund issues. Use when: (1) detecting revenue drops, (2) alerting on churn rate changes, (3) investigating subscription tier movements." +tags: [monitoring, revenue, subscriptions, churn, mrr] --- # Revenue Monitor -## When to use +## 1. Overview (Why?) -- "Monitor revenue trends" -- "Alert when revenue drops" -- "Track subscription churn" -- "Monitor MRR/ARR changes" -- "Alert on refund spikes" +LTX Studio revenue varies by tier (Free/Lite/Standard/Pro/Enterprise) and plan type (self-serve/contract/pilot). Revenue monitoring requires tracking both top-line metrics (MRR, ARR) and operational indicators (churn, refunds, tier movements) to detect problems early. -## What it monitors +This skill provides **autonomous revenue monitoring** with alerting on revenue drops, churn rate spikes, refund increases, and subscription changes. It monitors all segments and identifies which tiers or plan types drive changes. -- **Revenue metrics**: MRR, ARR, daily revenue -- **Subscription metrics**: New subscriptions, cancellations, churns, renewals -- **Refunds**: Refund rate, refund amount -- **Tier changes**: Upgrades, downgrades -- **Enterprise contracts**: Contract value, renewals +**Problem solved**: Detect revenue problems, churn risk, and subscription health issues before they compound — with segment-level root cause analysis. -## Steps +## 2. Requirements (What?) -1. **Gather requirements from user:** - - Which revenue metric to monitor - - Alert threshold (e.g., "drop > 10%", "< $X per day") - - Time window (daily, weekly, monthly) - - Notification channel (Slack, email) +Monitor these outcomes autonomously: -2. **Read shared files:** - - `shared/bq-schema.md` — Subscription tables (ltxstudio_user_tiers_dates, etc.) 
- - `shared/metric-standards.md` — Revenue metric definitions +- [ ] Revenue drops by segment (tier, plan type) +- [ ] MRR and ARR trend changes +- [ ] Churn rate increases above baseline +- [ ] Refund rate spikes +- [ ] New subscription volume drops +- [ ] Tier movements (upgrades vs downgrades) +- [ ] Enterprise contract renewals approaching (< 90 days) +- [ ] Alerts fire only when changes exceed thresholds +- [ ] Root cause identifies which tiers/plans drove changes -3. **Write monitoring SQL:** - - Query current metric value - - Compare against historical baseline or threshold - - Flag anomalies or breaches +## 3. Progress Tracker -4. **Present to user:** - - Show SQL query - - Show example alert format - - Confirm threshold values +* [ ] Read shared knowledge (schema, metrics, business context) +* [ ] Write monitoring SQL with baseline comparisons +* [ ] Execute query for target date range +* [ ] Analyze results by segment (tier, plan type) +* [ ] Identify root cause (which segments drove changes) +* [ ] Present findings with severity and recommendations +* [ ] Set up ongoing alerts (if requested) -5. **Set up alert** (manual for now): - - Document SQL in Hex or BigQuery scheduled query - - Configure Slack webhook or notification +## 4. Implementation Plan -## Reference files +### Phase 1: Read Alert Thresholds -| File | Read when | -|------|-----------| -| `shared/bq-schema.md` | Writing SQL for subscription/revenue tables | -| `shared/metric-standards.md` | Defining revenue metrics | +**Generic thresholds** (data-driven analysis pending): +- Revenue drop > 15% DoD or > 10% WoW +- Churn rate > 5% or increase > 2x baseline +- Refund rate > 3% or spike > 50% DoD +- New subscriptions drop > 20% WoW +- Enterprise renewals < 90 days out -## Rules +[!IMPORTANT] These are generic thresholds. Consider creating production thresholds based on 60-day analysis (similar to usage/GPU cost monitoring). 
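The generic revenue thresholds above can likewise be expressed as a small rule check. This is a sketch only; the function name and inputs are hypothetical, not the production alert pipeline:

```python
# Hypothetical sketch of the generic revenue thresholds; names are
# illustrative, not the production alert pipeline.
def revenue_threshold_flags(mrr, mrr_prev_day, churned_subs, active_subs,
                            churn_rate_baseline, refund_rate):
    """Return the list of generic revenue threshold breaches for one segment."""
    flags = []
    # Revenue drop > 15% DoD
    if mrr_prev_day and (mrr_prev_day - mrr) / mrr_prev_day > 0.15:
        flags.append("MRR drop > 15% DoD")
    # Churn rate > 5% or more than 2x the segment's baseline
    churn_rate = churned_subs / active_subs if active_subs else 0.0
    if churn_rate > 0.05 or (churn_rate_baseline and churn_rate > 2 * churn_rate_baseline):
        flags.append("churn rate above baseline")
    # Refund rate > 3%
    if refund_rate > 0.03:
        flags.append("refund rate > 3%")
    return flags
```

In practice these checks run per segment (tier and plan type), so a drop confined to one tier is flagged against that tier's own baseline rather than diluted across the total.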
-- DO use LTX Studio subscription tables from bq-schema.md -- DO exclude is_lt_team unless explicitly requested -- DO validate thresholds with user before setting alerts -- DO NOT hardcode dates — use rolling windows -- DO account for timezone differences in daily revenue calculations +### Phase 2: Read Shared Knowledge + +Before writing SQL, read: +- **`shared/product-context.md`** — LTX products, user types, business model, enterprise context +- **`shared/bq-schema.md`** — Subscription tables (ltxstudio_user_tiers_dates, ltxstudio_subscriptions, etc.), user segmentation queries +- **`shared/metric-standards.md`** — Revenue metric definitions (MRR, ARR, churn, LTV) +- **`shared/event-registry.yaml`** — Feature events (if analyzing feature-driven revenue) + +**Data nuances**: +- Tables: `ltxstudio_user_tiers_dates`, `ltxstudio_subscriptions` +- Key columns: `lt_id`, `griffin_tier_name`, `subscription_start_date`, `subscription_end_date`, `plan_type` +- Exclude `is_lt_team` unless explicitly requested +- Revenue calculations vary by plan type (self-serve vs contract/pilot) + +### Phase 3: Write Monitoring SQL + +✅ **PREFERRED: Monitor all segments with baseline comparisons** + +```sql +WITH revenue_metrics AS ( + SELECT + dt, + griffin_tier_name, + plan_type, + COUNT(DISTINCT lt_id) AS active_subscribers, + SUM(CASE WHEN subscription_start_date = dt THEN 1 ELSE 0 END) AS new_subs, + SUM(CASE WHEN subscription_end_date = dt THEN 1 ELSE 0 END) AS churned_subs, + SUM(mrr_amount) AS total_mrr, + SUM(arr_amount) AS total_arr + FROM subscription_table + WHERE dt >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) + AND dt <= DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) + AND is_lt_team IS FALSE + GROUP BY dt, griffin_tier_name, plan_type +), +metrics_with_baseline AS ( + SELECT + *, + AVG(total_mrr) OVER ( + PARTITION BY griffin_tier_name, plan_type + ORDER BY dt ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING + ) AS mrr_baseline_7d, + AVG(churned_subs) OVER ( + PARTITION BY 
griffin_tier_name, plan_type + ORDER BY dt ROWS BETWEEN 7 PRECEDING AND 1 PRECEDING + ) AS churn_baseline_7d + FROM revenue_metrics +) +SELECT * FROM metrics_with_baseline +WHERE dt = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY); +``` + +**Key patterns**: +- **Segmentation**: By tier (Free/Lite/Standard/Pro/Enterprise) and plan type (self-serve/contract/pilot) +- **Baseline**: 7-day rolling average for comparison +- **Time window**: Yesterday only + +### Phase 4: Execute Query + +Run query using: +```bash +bq --project_id=ltx-dwh-explore query --use_legacy_sql=false --format=pretty " + +" +``` + +### Phase 5: Analyze Results + +**For revenue trends**: +- Compare current period vs baseline (7-day avg, prior week, prior month) +- Calculate % change and flag significant shifts (>15% DoD, >10% WoW) +- Identify which tiers/plans drove changes + +**For churn analysis**: +- Calculate churn rate = churned_subs / active_subscribers +- Compare to baseline and flag if > 5% or increase > 2x baseline +- Identify which tiers have highest churn + +**For subscription health**: +- Track new subscription volume by tier +- Monitor upgrade vs downgrade ratios +- Flag enterprise renewal dates < 90 days out + +### Phase 6: Present Findings + +Format results with: +- **Summary**: Key finding (e.g., "MRR dropped 18% yesterday") +- **Root cause**: Which segment drove the change (e.g., "Standard tier churned 12 users") +- **Breakdown**: Metrics by tier and plan type +- **Recommendation**: Action to take (investigate churn reason, contact at-risk accounts) + +**Alert format**: +``` +⚠️ REVENUE ALERT: + • Metric: MRR Drop + Current: $X | Baseline: $Y + Change: -Z% + Segment: Standard tier, self-serve + +Recommendation: Investigate recent Standard tier churns +``` + +### Phase 7: Set Up Alert (Optional) + +For ongoing monitoring: +1. Save SQL query +2. Set up in BigQuery scheduled query or Hex Thread +3. Configure notification threshold +4. Route alerts to Revenue/Growth team + +## 5. 
Context & References + +### Shared Knowledge +- **`shared/product-context.md`** — LTX products, user types, business model, enterprise context +- **`shared/bq-schema.md`** — Subscription tables, user segmentation queries +- **`shared/metric-standards.md`** — Revenue metric definitions (MRR, ARR, churn, LTV, retention) +- **`shared/event-registry.yaml`** — Feature events for feature-driven revenue analysis + +### Data Sources +Tables: `ltxstudio_user_tiers_dates`, `ltxstudio_subscriptions` +- Key columns: `lt_id`, `griffin_tier_name`, `subscription_start_date`, `subscription_end_date`, `plan_type`, `mrr_amount`, `arr_amount` +- Filter: `is_lt_team IS FALSE` for customer revenue + +### Tiers +- Free (no revenue) +- Lite (lowest paid tier) +- Standard (mid-tier) +- Pro (high-tier) +- Enterprise (contract/pilot) + +### Plan Types +- Self-serve (automated subscription) +- Contract (enterprise contract) +- Pilot (enterprise pilot) + +## 6. Constraints & Done + +### DO NOT + +- **DO NOT** include is_lt_team users unless explicitly requested +- **DO NOT** hardcode dates — use rolling windows +- **DO NOT** use absolute thresholds — compare to baseline +- **DO NOT** mix plan types without proper segmentation +- **DO NOT** ignore timezone differences in daily revenue calculations + +### DO + +- **DO** use LTX Studio subscription tables from bq-schema.md +- **DO** exclude is_lt_team unless explicitly requested +- **DO** validate thresholds with user before setting alerts +- **DO** use rolling windows (7-day, 30-day baselines) +- **DO** account for timezone differences in daily revenue calculations +- **DO** segment by tier AND plan type for root cause analysis +- **DO** compare against baseline (7-day avg, prior period) for trends +- **DO** calculate churn rate = churned / active subscribers +- **DO** flag MRR drops > 15% DoD or > 10% WoW +- **DO** flag churn rate > 5% or increase > 2x baseline +- **DO** flag refund rate > 3% or spike > 50% DoD +- **DO** flag new subscription 
drops > 20% WoW +- **DO** track enterprise renewal dates < 90 days out +- **DO** include segment breakdown in all alerts +- **DO** validate unusual patterns with Revenue/Growth team before alerting + +### Completion Criteria + +✅ All revenue metrics monitored (MRR, ARR, churn, refunds, new subs) +✅ Alerts fire with thresholds (generic pending production analysis) +✅ Segment-level root cause identified +✅ Findings presented with recommendations +✅ Timezone handling applied to daily revenue +✅ Enterprise renewals tracked diff --git a/agents/monitoring/usage/README.md b/agents/monitoring/usage/README.md new file mode 100644 index 0000000..2fff6c2 --- /dev/null +++ b/agents/monitoring/usage/README.md @@ -0,0 +1,184 @@ +# Usage Monitor + +Autonomous usage monitoring for LTX Studio using statistical anomaly detection. + +## Overview + +Detects **data spikes** (both increases and decreases) in user engagement metrics by comparing yesterday's values against the last 10 same-day-of-week data points using 2 standard deviations (2σ) threshold. + +**Problem solved:** Early detection of significant changes in user behavior, product adoption, feature launches, enterprise churn risk, or engagement shifts. + +## Quick Start + +```bash +# Install dependency (one-time) +pip3 install google-cloud-bigquery + +# Monitor yesterday (default) +python3 usage_monitor.py + +# Monitor specific date +python3 usage_monitor.py --date 2026-03-09 + +# Show help +python3 usage_monitor.py --help +``` + +## What It Monitors + +### Segments (prioritized) +1. **Enterprise Contract** - Contracted enterprise accounts (weekdays only) +2. **Enterprise Pilot** - Pilot enterprise accounts (weekdays only) +3. **Heavy Users** - Active 4+ weeks, paying, consuming tokens +4. **Paying non-Enterprise** - Paid tiers excluding enterprise +5. 
**Free** - Free tier users + +### Metrics per Segment +- **DAU** (Daily Active Users) +- **Token consumption** +- **Image generations** +- **Video generations** (skipped for Enterprise - too volatile) + +## Statistical Method + +**Alert Logic:** `|yesterday_value - μ| > 2σ` + +Where: +- **μ (mean)** = average of last 10 same-day-of-week values +- **σ (stddev)** = standard deviation of last 10 same-day-of-week values +- **z-score** = (yesterday - μ) / σ + +**Why same-day-of-week?** +- Monday usage differs from Friday usage +- Comparing Monday to last 10 Mondays (not all days) +- Accounts for weekly seasonality + +**Why 2 standard deviations?** +- Captures 95.4% of normal variance +- Auto-adapts to each segment's natural patterns +- Balances early detection with false positive reduction + +**Severity Levels:** +- **NOTICE**: `2 < |z| ≤ 3` - Minor deviation, monitor +- **WARNING**: `3 < |z| ≤ 4.5` - Significant anomaly, investigate +- **CRITICAL**: `|z| > 4.5` - Extreme anomaly, immediate action + +## Example Output + +``` +================================================================================ + LTX STUDIO USAGE MONITORING - 2026-03-09 + Day: WEEKDAY + Method: 2 Standard Deviations (2σ) from last 10 same-day-of-week +================================================================================ + +⏳ Running BigQuery query... 
+✅ Query complete (12,345,678 bytes processed) + +🔴 3 ALERTS DETECTED + +================================================================================ + +⚠️ WARNING ALERTS (2): + + • Free - Tokens + Current: 4,497,947 | Mean (μ): 3,068,455 | Std Dev (σ): 426,074 + Z-score: 3.36 (3σ < |z| ≤ 4.5σ) + Change: +46.6% from mean + + • Heavy Users - Tokens + Current: 2,157,891 | Mean (μ): 1,246,532 | Std Dev (σ): 263,847 + Z-score: 3.45 (3σ < |z| ≤ 4.5σ) + Change: +73.0% from mean + +ℹ️ NOTICE ALERTS (1): + + • Paying non-Enterprise - Dau + Current: 1,234 | Mean (μ): 1,450 | Std Dev (σ): 89 + Z-score: -2.43 (2σ < |z| ≤ 3σ) + Change: -14.9% from mean + +================================================================================ +Total: 0 CRITICAL, 2 WARNING, 1 NOTICE +================================================================================ + +Note: 2σ threshold captures 95.4% of normal variance + CRITICAL: |z| > 4.5, WARNING: 3 < |z| ≤ 4.5, NOTICE: 2 < |z| ≤ 3 +``` + +## Root Cause Investigation + +When alerts fire for **Enterprise** segments, use `investigate_root_cause.sql` to drill down to organization level: + +```bash +# Copy SQL to BigQuery console or bq CLI +# Shows week-over-week change by organization +``` + +For other segments (Heavy, Paying, Free), investigate: +- Tier distribution (Standard vs Pro vs Lite) +- Feature launches or product changes +- Marketing campaigns or promotional activity + +## Exception Handling + +**Enterprise weekends are suppressed:** +- Enterprise Contract and Pilot alerts only fire on weekdays (Mon-Fri) +- Weekend usage is too sparse and volatile for statistical reliability +- Prevents false positives from low weekend activity + +## Files + +- **`SKILL.md`** - Agent instructions and workflow documentation +- **`usage_monitor.py`** - Combined SQL + alerting logic (303 lines) +- **`investigate_root_cause.sql`** - Org-level drill-down for Enterprise alerts +- **`README.md`** - This file + +## Interpreting Results + 
+**Positive spikes (increase):** +- Feature launches driving adoption +- Marketing campaigns succeeding +- Product improvements resonating +- Enterprise expansion + +**Negative spikes (decrease):** +- Churn events +- Product issues or bugs +- Competitive threats +- Enterprise account going dormant + +**For CRITICAL alerts:** +- Immediate investigation required +- Contact account managers (if Enterprise) +- Check for system outages or data pipeline issues + +**For WARNING alerts:** +- Monitor for persistence (repeats next day?) +- Investigate root cause during business hours + +**For NOTICE alerts:** +- Track trend over next few days +- Document for pattern analysis + +## Integration + +**Future enhancements:** +- Slack notifications for CRITICAL/WARNING alerts +- Linear issue creation for persistent alerts +- Historical alert log and trend analysis +- Automated root cause recommendations + +## Requirements + +- Python 3.7+ +- `google-cloud-bigquery` package +- BigQuery access to `ltx-dwh-prod-processed` dataset +- Execution project: `ltx-dwh-explore` + +## Data Source + +- **Table:** `ltx-dwh-prod-processed.web.ltxstudio_agg_user_date` +- **Partitioned by:** `dt` (DATE) +- **Lookback window:** 70 days (ensures 10+ same-DOW data points) +- **LT team excluded:** Already filtered at table level