11 changes: 11 additions & 0 deletions demos/smart_factory/.gitignore
@@ -0,0 +1,11 @@
__pycache__/
*.pyc
.env
.venv/
venv/
node_modules/
# frontend/dist/ — intentionally tracked for Databricks Apps deployment
.databricks/
*.egg-info/
.DS_Store
uv.lock
17 changes: 17 additions & 0 deletions demos/smart_factory/Makefile
@@ -0,0 +1,17 @@
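.PHONY: build-frontend deploy deploy-prod dev clean

# Install frontend dependencies and produce frontend/dist
build-frontend:
	cd frontend && npm ci && npm run build

# Deploy the bundle to the dev target
deploy: build-frontend
	databricks bundle deploy -t dev

# Deploy the bundle to the prod target
deploy-prod: build-frontend
	databricks bundle deploy -t prod

# Run the Vite dev server in the background and the API in the foreground
dev:
	cd frontend && npm run dev &
	uvicorn app:app --reload --port 8000

# Remove build output and installed frontend dependencies
clean:
	rm -rf frontend/dist frontend/node_modules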
152 changes: 152 additions & 0 deletions demos/smart_factory/README.md
@@ -0,0 +1,152 @@
# SmartFactory — IoT Streaming Demo

A customer-facing demo showing how Databricks turns factory sensor data into actionable insights — from machine floor to decision — with no Kafka, no infrastructure to manage, and full governance out of the box.

## What This Demonstrates

| Capability | Business Value |
|---|---|
| **ZeroBus Ingest** | Eliminate Kafka and message bus infrastructure. Sensor data flows directly into governed Delta tables. |
| **SDP Streaming Pipeline** | Catch equipment anomalies as they happen, not in tomorrow's batch report. Continuous, serverless, pure SQL. |
| **ML in the Pipeline** | Every sensor reading is scored for anomalies inline (SQL threshold rules in the Silver layer), so predictive maintenance needs no separate ML platform. |
| **Live Operations Dashboard** | Plant managers and technicians see machine health the moment it changes. Faster response, less downtime. |
| **Unity Catalog Governance** | Every table governed from the first byte. Lineage, access control, audit — ready for compliance on day one. |
| **Databricks Apps** | Full-stack app deployed and managed by Databricks. No separate hosting. |
| **DABs** | Entire demo — app, pipeline, dashboard — deploys with a single command. |

## Architecture

![Architecture](docs/architecture.png)

## The Machines

| Machine | Sensors | Fault Scenario |
|---|---|---|
| **CNC Mill** | Temperature, Vibration, Spindle RPM | Overheating, vibration spike |
| **Hydraulic Press** | Pressure, Temperature, Cycle Count | Pressure surge, cycle slowdown |
| **Conveyor Belt** | Belt Speed, Load Weight, Motor Current | Speed drop, overcurrent |
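
The fault scenarios in the table can be pictured with a small simulator sketch. This is not the repository's `simulator.py`; the class name `MachineSimulator`, the baseline values, the 1% noise level, and the 5%-per-tick drift are all assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical baselines: the README names the sensors but not their normal ranges.
CNC_MILL_BASELINES = {"temperature": 65.0, "vibration": 2.0, "spindle_rpm": 8000.0}


@dataclass
class MachineSimulator:
    machine_id: str
    baselines: Dict[str, float]
    fault_sensor: Optional[str] = None  # sensor currently drifting, if any
    fault_ticks: int = 0                # ticks elapsed since the fault began
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def inject_fault(self, sensor: str) -> None:
        """Start drifting one sensor upward, e.g. 'temperature' for overheating."""
        if sensor not in self.baselines:
            raise KeyError(f"unknown sensor: {sensor}")
        self.fault_sensor = sensor
        self.fault_ticks = 0

    def clear_fault(self) -> None:
        """Return every sensor to its baseline behaviour."""
        self.fault_sensor = None
        self.fault_ticks = 0

    def tick(self) -> dict:
        """Produce one reading per sensor: baseline + noise (+ drift while faulted)."""
        if self.fault_sensor is not None:
            self.fault_ticks += 1
        readings = {}
        for sensor, base in self.baselines.items():
            noise = self.rng.gauss(0.0, 0.01 * base)  # assumed 1% jitter
            drift = 0.05 * base * self.fault_ticks if sensor == self.fault_sensor else 0.0
            readings[sensor] = round(base + noise + drift, 2)
        return {"machine_id": self.machine_id, "readings": readings}
```

Calling `inject_fault("temperature")` and then `tick()` repeatedly makes the temperature climb while the other sensors stay near baseline, which is the behaviour Act 3 of the demo relies on. Downward faults such as a conveyor speed drop would need a negative drift, which this sketch does not model.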

## Demo Story (6 minutes)

> **Pre-flight**: Start streaming + pipeline 30s before presenting. Confirm dashboard has data.

### Act 1 — "This is your factory" (IoT Simulation tab)
> "3 machines, IoT sensors streaming every 2 seconds, directly into Databricks. No Kafka."

- Gauges are already updating, event feed scrolling
- Point out "Streaming" and "Pipeline Running" in the header
- Expand ZeroBus info panel — highlight ≤200ms ack, 10 GB/s, Joby Aviation quote

### Act 2 — "Here's the pipeline" (SDP in Databricks UI)
> "Declarative SQL. Streaming and batch in one pipeline. Fully serverless."

- Switch to Databricks workspace, open the SDP pipeline DAG
- Show Bronze → Silver → Gold with streaming indicators
- Click into Silver SQL — "anomaly detection is a SQL JOIN. Any SQL developer can own this."
- Three SDP benefits: declarative, streaming+batch unified, serverless

### Act 3 — "Let's break something" (Inject a fault)
> "Watch what happens when the CNC Mill starts overheating."

- Click **Fault: CNC Mill** — watch temperature climb, gauges go red
- Event feed lights up with warnings and criticals
- Switch to **Operations Dashboard** — health scores dropping, anomaly log filling

### Act 4 — "Everything is governed" (Unity Catalog)
> "Every table governed. Full lineage from raw sensor event to dashboard."

- Open Catalog Explorer, click Gold table → show lineage graph
- "One command to deploy. No Kafka. No ML infrastructure. Just push and go."

### Act 5 — "Clear the fault" (Resolution)
- Click **Clear All** — readings normalize, health scores recover

See [docs/demo-script.md](docs/demo-script.md) for the full script with talking points and objection handling.

## Quick Start

### Prerequisites
- [Databricks CLI](https://docs.databricks.com/dev-tools/cli/index.html) v0.288+ installed
- CLI profile configured for your target workspace
- Node.js 18+ and npm installed
- An existing Unity Catalog catalog with cloud storage configured

### One-command setup
```bash
git clone <repo-url>
cd smartfactory-demo
./setup.sh <databricks-cli-profile> <catalog_name>
```

This script handles everything:
1. Finds and starts a SQL warehouse
2. Creates the `smartfactory` schema and landing table
3. Builds the React frontend
4. Deploys all resources via DABs (app, pipeline, dashboard)
5. Detects the app service principal and grants all permissions
6. Starts the app and deploys code
7. Sets the SDP pipeline to continuous mode

### After setup
1. Open the app URL printed by the script
2. Click **Streaming** in the header to start the data simulator
3. Click **Start Pipeline** to begin continuous SDP processing
4. Switch between **IoT Simulation** and **Operations Dashboard** tabs
5. Inject faults and watch anomalies flow through the pipeline

> **Important: When you're done demoing, stop the streaming and pipeline!**
> Both consume compute resources. Click **Streaming** (to pause) and **Pipeline Running** (to stop) in the header bar.
> The simulator and pipeline start paused by default — you must manually start them each demo session.

### Redeploying after code changes
```bash
cd frontend && npm run build && cd ..
databricks bundle deploy -t dev
databricks apps deploy smartfactory-app \
  --source-code-path /Workspace/Users/<you>/.bundle/smartfactory-demo/dev/files
```

## Project Structure

```
smartfactory-demo/
├── setup.sh                       # One-command setup script
├── databricks.yml                 # DABs bundle (app + pipeline + dashboard)
├── app.yaml                       # Databricks App config
├── app.py                         # FastAPI backend (WebSocket + REST + pipeline control)
├── simulator.py                   # 3-machine IoT sensor simulator with fault injection
├── zerobus_client.py              # ZeroBus SDK wrapper with SQL INSERT fallback
├── pipeline/
│   ├── bronze.sql                 # Validated ingestion (streaming table)
│   ├── silver.sql                 # Anomaly scoring via threshold JOIN (streaming table)
│   └── gold.sql                   # Health KPIs + anomaly timeline (materialized views)
├── frontend/
│   ├── src/
│   │   ├── App.tsx                # Tabbed layout (IoT Simulation + Dashboard)
│   │   ├── components/
│   │   │   ├── FactoryFloor       # SVG machine visuals with live sensor readouts
│   │   │   ├── MachineCard        # Per-machine gauge cards
│   │   │   ├── SensorGauge        # Circular SVG gauge component
│   │   │   ├── ControlPanel       # Fault injection buttons
│   │   │   ├── EventFeed          # Live scrolling event log
│   │   │   ├── DashboardView      # Charts, KPI tables, anomaly log
│   │   │   └── PipelineBanner     # Pipeline flow + UC governance badge
│   │   └── hooks/
│   │       └── useWebSocket       # Auto-reconnecting WebSocket hook
│   └── dist/                      # Pre-built frontend (deployed with app)
├── dashboard.lvdash.json          # Lakeview dashboard definition
└── CLAUDE.md                      # Development notes and known issues
```
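
The "SDK wrapper with SQL INSERT fallback" role of `zerobus_client.py` can be sketched as a small try-primary-then-fallback client. This is a minimal illustration, not the repository's code: `FallbackIngestClient` and both sender callables are hypothetical, and no real ZeroBus SDK or SQL connector calls are shown because their signatures are not given here.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
Sender = Callable[[List[Event]], None]


class FallbackIngestClient:
    """Send a batch through a primary path; on any error, use a fallback path.

    Both senders are injected, so this sketch makes no claim about the actual
    ZeroBus SDK or SQL INSERT calls (hypothetical stand-ins only).
    """

    def __init__(self, primary: Sender, fallback: Sender) -> None:
        self.primary = primary
        self.fallback = fallback
        self.fallback_count = 0  # how many batches took the fallback path

    def send(self, events: List[Event]) -> str:
        """Return which path delivered the batch: 'primary' or 'fallback'."""
        try:
            self.primary(events)
            return "primary"
        except Exception:
            # The primary path failed, so route the same batch to the fallback.
            # An error raised by the fallback itself propagates to the caller.
            self.fallback_count += 1
            self.fallback(events)
            return "fallback"
```

Injecting the two senders keeps the retry decision testable without a workspace. A production wrapper would likely also log the primary failure and bound how long it stays on the fallback path.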

## Tech Stack

| Layer | Technology |
|---|---|
| Frontend | React 18, TypeScript, Vite, TailwindCSS, Recharts |
| Backend | FastAPI, Uvicorn, WebSocket |
| Ingestion | ZeroBus SDK (with SQL INSERT fallback) |
| Pipeline | SDP (Spark Declarative Pipelines), serverless |
| ML | SQL threshold-based anomaly detection in SDP Silver layer |
| Governance | Unity Catalog (lineage, access control, audit) |
| Deployment | Databricks Asset Bundles (DABs) |
| Dashboard | Lakeview (AI/BI) + in-app React dashboard |
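
The threshold-based anomaly detection named in the ML row amounts to joining each reading against a per-sensor limits table. The sketch below mirrors that idea in plain Python; it is not the contents of `silver.sql`. The field names `sensor_name`, `value`, `anomaly_status`, and `critical_threshold` follow the dashboard definition, while `machine_type` keys, `warning_threshold`, and every numeric limit are assumptions.

```python
from typing import Dict, Optional, Tuple

# Hypothetical limits table keyed by (machine_type, sensor_name).
THRESHOLDS: Dict[Tuple[str, str], Dict[str, float]] = {
    ("cnc_mill", "temperature"): {"warning_threshold": 80.0, "critical_threshold": 95.0},
    ("cnc_mill", "vibration"): {"warning_threshold": 4.0, "critical_threshold": 6.0},
}


def score_event(event: dict, thresholds: Optional[dict] = None) -> dict:
    """Attach anomaly_status to one reading, like a LEFT JOIN on the limits table."""
    table = THRESHOLDS if thresholds is None else thresholds
    limits = table.get((event["machine_type"], event["sensor_name"]))
    if limits is None:
        status = "unknown"  # no matching row in the limits table
    elif event["value"] >= limits["critical_threshold"]:
        status = "critical"
    elif event["value"] >= limits["warning_threshold"]:
        status = "warning"
    else:
        status = "normal"
    return {
        **event,
        "anomaly_status": status,
        "critical_threshold": limits["critical_threshold"] if limits else None,
    }
```

This sketch only checks upper bounds. Fault scenarios that move downward, such as a belt speed drop, would need lower-bound columns in the limits table as well.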
19 changes: 19 additions & 0 deletions demos/smart_factory/app.yaml
@@ -0,0 +1,19 @@
command: ["uvicorn", "src.app:app", "--host", "0.0.0.0", "--port", "8000"]

env:
  - name: CATALOG_NAME
    value: dilan_catalog
  - name: LANDING_SCHEMA
    value: smartfactory
  - name: ZEROBUS_TABLE
    value: dilan_catalog.smartfactory.raw_sensor_events
  - name: WAREHOUSE_ID
    value: "01c7fe04f060528e"
  - name: PIPELINE_SCHEMA
    value: dev_dilan_patel_smartfactory
  - name: PIPELINE_ID
    value: "4b993ed3-336f-40b5-8bb7-55ff9f056707"
  - name: ENABLE_SIMULATOR
    value: "false"
  - name: SIMULATOR_INTERVAL_MS
    value: "1000"
145 changes: 145 additions & 0 deletions demos/smart_factory/dashboards/smartfactory.lvdash.json
@@ -0,0 +1,145 @@
{
"pages": [
{
"name": "machine-health-overview",
"displayName": "Machine Health Overview",
"layout": [
{
"widget": {
"name": "machine_health_scores",
"queries": [
{
"name": "main_query",
"query": {
"datasetName": "machine_summary",
"disaggregated": false
}
}
],
"spec": {
"version": 2,
"widgetType": "bar",
"encodings": {
"x": { "fieldName": "machine_id", "displayName": "Machine" },
"y": { "fieldName": "avg_health_score", "displayName": "Health Score" },
"color": { "fieldName": "machine_type", "displayName": "Type" }
}
}
},
"position": { "x": 0, "y": 0, "width": 3, "height": 2 }
},
{
"widget": {
"name": "anomaly_counts",
"queries": [
{
"name": "main_query",
"query": {
"datasetName": "machine_summary",
"disaggregated": false
}
}
],
"spec": {
"version": 2,
"widgetType": "bar",
"encodings": {
"x": { "fieldName": "machine_id", "displayName": "Machine" },
"y": { "fieldName": "total_criticals", "displayName": "Critical Alerts" },
"color": { "fieldName": "machine_id", "displayName": "Machine" }
}
}
},
"position": { "x": 3, "y": 0, "width": 3, "height": 2 }
}
]
},
{
"name": "sensor-trends",
"displayName": "Sensor Trends",
"layout": [
{
"widget": {
"name": "sensor_timeseries",
"queries": [
{
"name": "main_query",
"query": {
"datasetName": "enriched_events",
"disaggregated": false
}
}
],
"spec": {
"version": 2,
"widgetType": "line",
"encodings": {
"x": { "fieldName": "timestamp", "displayName": "Time" },
"y": { "fieldName": "value", "displayName": "Sensor Value" },
"color": { "fieldName": "sensor_name", "displayName": "Sensor" }
}
}
},
"position": { "x": 0, "y": 0, "width": 6, "height": 3 }
}
]
},
{
"name": "anomaly-log",
"displayName": "Anomaly Log",
"layout": [
{
"widget": {
"name": "anomaly_table",
"queries": [
{
"name": "main_query",
"query": {
"datasetName": "anomaly_timeline",
"disaggregated": false
}
}
],
"spec": {
"version": 2,
"widgetType": "table",
"encodings": {
"columns": [
{ "fieldName": "timestamp", "displayName": "Time" },
{ "fieldName": "machine_id", "displayName": "Machine" },
{ "fieldName": "sensor_name", "displayName": "Sensor" },
{ "fieldName": "value", "displayName": "Value" },
{ "fieldName": "anomaly_status", "displayName": "Status" },
{ "fieldName": "critical_threshold", "displayName": "Threshold" }
]
}
}
},
"position": { "x": 0, "y": 0, "width": 6, "height": 4 }
}
]
}
],
"datasets": [
{
"name": "machine_summary",
"displayName": "Machine Summary",
"query": "SELECT * FROM dilan_catalog.smartfactory.gold.machine_summary"
},
{
"name": "enriched_events",
"displayName": "Enriched Events",
"query": "SELECT * FROM dilan_catalog.smartfactory.silver.enriched_events WHERE timestamp > current_timestamp() - INTERVAL 1 HOUR ORDER BY timestamp DESC LIMIT 5000"
},
{
"name": "anomaly_timeline",
"displayName": "Anomaly Timeline",
"query": "SELECT * FROM dilan_catalog.smartfactory.gold.anomaly_timeline LIMIT 500"
},
{
"name": "health_kpis",
"displayName": "Health KPIs",
"query": "SELECT * FROM dilan_catalog.smartfactory.gold.machine_health_kpis"
}
]
}