feat: canonical alert-triage app + k3d/local-Temporal on-ramp #74

@lex00

chant's tutorials all currently start with "first, get an AWS account / GCP project / Azure subscription." That's a meaningful adoption barrier. A canonical app that runs end-to-end locally on k3d + Docker Compose Temporal — clone the repo, npm run dev, see it work in 60 seconds — fixes the on-ramp problem.

This issue ships the foundational pieces: the canonical app itself, the local stack (k3d + Temporal docker-compose), and the rename of temporal-self-hosted → temporal-stack to reflect what it actually is. Substrate-specific tutorial standardization and lexicon pruning come in a follow-up issue once this is closer to done.

What the app does

Working name: alert-triage. A Temporal-driven incident-response bot that consumes events from two sources and runs an agentic triage workflow over each:

  1. External alerts: a webhook receiver accepts incoming alerts (Datadog / PagerDuty shape; the tutorial uses a synthetic source)
  2. Internal drift events: WatchOp periodically diffs declared vs cloud state and emits a drift event when they diverge; the same triage workflow handles those too
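
One way to let a single workflow serve both sources is to normalize them into one event type before triage starts. The sketch below is only an illustration: `TriageEvent`, `AlertPayload`, `DriftReport`, `fromAlert`, and `fromDrift` are hypothetical names, and the payload fields are assumptions rather than the real Datadog, PagerDuty, or WatchOp shapes.

```typescript
// Hypothetical shapes: neither chant nor this issue defines these names.
type Severity = "low" | "medium" | "high" | "unknown";

interface TriageEvent {
  source: "alert" | "drift";
  id: string; // could double as a stable workflow ID
  summary: string;
  severity: Severity;
  resource?: string; // set for drift events, e.g. "deployment/worker"
}

// Assumed minimal webhook payload, not any vendor's real schema.
interface AlertPayload {
  alertId: string;
  title: string;
  priority?: string;
}

// Assumed drift report: one resource whose observed state differs from the declared one.
interface DriftReport {
  resource: string;
  declared: string;
  observed: string;
}

function fromAlert(p: AlertPayload): TriageEvent {
  const known: Severity[] = ["low", "medium", "high"];
  const priority = (p.priority ?? "").toLowerCase() as Severity;
  return {
    source: "alert",
    id: `alert-${p.alertId}`,
    summary: p.title,
    // Anything outside the known set is left for the classifier to decide.
    severity: known.indexOf(priority) !== -1 ? priority : "unknown",
  };
}

function fromDrift(d: DriftReport): TriageEvent {
  return {
    source: "drift",
    id: `drift-${d.resource}`,
    summary: `${d.resource}: declared ${d.declared}, observed ${d.observed}`,
    severity: "unknown", // drift carries no priority; the classifier assigns one
    resource: d.resource,
  };
}
```

With this shape, the webhook handler and the WatchOp consumer would each call their own normalizer and then start the same workflow with a `TriageEvent`.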

Per event, the workflow:

  1. Classifies severity and gathers context via a ToolRegistry (kubectl, logs, dig, etc.)
  2. Proposes a remediation
  3. Gates on human approval via Temporal signal when severity is ambiguous or the remediation is risky
  4. Posts the diagnosis + outcome to Slack (or stdout in local dev)
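
The four steps above can be sketched as plain TypeScript with every side effect injected. This is not the Temporal SDK or a chant Op: `TriageDeps`, `needsApproval`, and `triage` are hypothetical names, and the outcome strings are assumptions. In a real Temporal workflow, `waitForApproval` would be backed by a signal handler rather than a plain promise.

```typescript
// All names here are hypothetical stand-ins, not chant or Temporal APIs.
type Severity = "low" | "high" | "ambiguous";

interface Remediation {
  action: string;
  risky: boolean;
}

interface TriageDeps {
  classify(event: string): Promise<Severity>;
  gatherContext(event: string): Promise<string[]>;
  propose(event: string, context: string[]): Promise<Remediation>;
  waitForApproval(event: string): Promise<boolean>; // a Temporal signal would resolve this
  post(message: string): Promise<void>; // Slack, or stdout in local dev
}

// Step 3's gate: involve a human when severity is ambiguous or the remediation is risky.
function needsApproval(severity: Severity, remediation: Remediation): boolean {
  return severity === "ambiguous" || remediation.risky;
}

async function triage(event: string, deps: TriageDeps): Promise<string> {
  const severity = await deps.classify(event); // step 1: classify
  const context = await deps.gatherContext(event); // step 1: gather context
  const remediation = await deps.propose(event, context); // step 2: propose

  let outcome = "auto-approved";
  if (needsApproval(severity, remediation)) {
    const approved = await deps.waitForApproval(event); // step 3: human gate
    outcome = approved ? "approved" : "rejected";
  }

  // Step 4: report the diagnosis and outcome.
  await deps.post(`[${severity}] ${event}: ${remediation.action} (${outcome})`);
  return outcome;
}
```

Keeping the gate in a pure function like `needsApproval` makes the approval policy unit-testable without a Temporal server, which fits the "activity unit tests" item in the task list.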

Why this app vs alternatives:

  • Chant-native by default: WatchOp already exists; this is what consumes its output. Showcases drift detection as first-class.
  • Useful as a real template: anyone building on-call tooling can fork it and replace the synthetic alert source with their actual monitoring webhook.
  • Demonstrates real lexicons in concert: worker = K8s Deployment; webhook = Service + Ingress; secrets via ESO; Temporal namespace + search attributes from the temporal lexicon.

Architecture

Single source tree under examples/alert-triage/:

```
examples/alert-triage/
├── chant.config.ts # one config, multiple temporal profiles (local, cloud)
├── src/
│ ├── alert-triage.op.ts # Temporal Op declaration (the triage workflow)
│ ├── activities/ # agent + tool implementations
│ ├── k8s.ts # manifests: worker, webhook, ingress, ESO
│ └── temporal.ts # TemporalNamespace + SearchAttribute resources
├── docker-compose.yml # local Temporal (generated by chant build)
├── k3d-config.yaml # local cluster definition
└── package.json
```
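
To make "one config, multiple temporal profiles" concrete, here is a rough sketch of profile selection. It does not show chant's real config schema: `TemporalProfile`, `profiles`, and `selectProfile` are hypothetical, and the cloud values are placeholders to be replaced with a real Temporal Cloud namespace and endpoint.

```typescript
// Hypothetical profile shape; chant's actual chant.config.ts schema may differ.
interface TemporalProfile {
  address: string;
  namespace: string;
  tls: boolean;
}

const profiles: Record<"local" | "cloud", TemporalProfile> = {
  // 7233 is the Temporal frontend's default gRPC port.
  local: { address: "localhost:7233", namespace: "default", tls: false },
  // Placeholders only: substitute your own Temporal Cloud namespace and endpoint.
  cloud: { address: "<your-cloud-endpoint>:7233", namespace: "<your-namespace>", tls: true },
};

// Default to the local on-ramp; reject unknown names instead of guessing.
function selectProfile(name: string = "local"): TemporalProfile {
  if (name === "local" || name === "cloud") {
    return profiles[name];
  }
  throw new Error(`unknown Temporal profile: ${name}`);
}
```

The point of the sketch is that application code only ever sees a `TemporalProfile`, so switching from local to cloud stays a config change.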

npm run dev does:

  1. k3d cluster create -c k3d-config.yaml
  2. docker compose up -d (local Temporal + UI)
  3. chant build
  4. kubectl apply -f dist/k8s.yaml
  5. npm run worker
  6. npm run alert demo

Target: about 60 seconds wall-clock if Docker is warm. The Temporal Web UI at localhost:8080 shows the workflow running.
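
The six steps could be driven by a small script that runs each command in order and stops at the first failure. The sketch below assumes nothing about how the repo actually wires `npm run dev`: `DEV_STEPS`, `Runner`, and `runSteps` are hypothetical names, and the runner is injected so the sequence can be dry-run without Docker or k3d installed.

```typescript
// The six commands from the list above, in order.
const DEV_STEPS: string[] = [
  "k3d cluster create -c k3d-config.yaml",
  "docker compose up -d",
  "chant build",
  "kubectl apply -f dist/k8s.yaml",
  "npm run worker",
  "npm run alert demo",
];

// A runner executes one command and returns its exit code.
type Runner = (command: string) => number;

// Run steps in order; return the first command that fails, or null if all succeed.
function runSteps(steps: string[], run: Runner): string | null {
  for (const step of steps) {
    if (run(step) !== 0) {
      return step;
    }
  }
  return null;
}

// Dry run: record each command instead of executing it.
const planned: string[] = [];
const failedStep = runSteps(DEV_STEPS, (command) => {
  planned.push(command);
  return 0;
});
```

A real runner might wrap `execSync` from `node:child_process`. Note that `npm run worker` is long-running, so a real script would presumably need to start it in the background before sending the demo alert.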

Rename: temporal-self-hosted → temporal-stack

Current name implies self-hosted-only; the example actually covers both local Docker and prod Helm. Rename and clarify the scope ("Temporal infrastructure: local Docker for dev, Helm for prod"). The agentic-gate showcase does NOT belong here — it belongs in the canonical app.

Tasks

  • Rename examples/temporal-self-hosted/ → examples/temporal-stack/; update README, sidebar entry, internal links
  • Create examples/alert-triage/ with the structure above
  • Implement worker + webhook + workflow + activities — target under 500 LOC across the lot
  • Wire up ToolRegistry (Anthropic API key via env in local; ESO + Secret Manager in cloud profile)
  • Generate docker-compose.yml via TemporalDevStack (already exists in temporal lexicon)
  • Provide k3d-config.yaml and npm run dev orchestration
  • Provide npm run alert demo synthetic source
  • Wire WatchOp against the deployed namespace as the second event source
  • Tutorial: docs/src/content/docs/tutorials/alert-triage-local.mdx — k3d + Temporal local only
  • CI smoke test: build, lint, activity unit tests
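
For the ToolRegistry task, the core idea is an allowlist: the agent can only call tools that were registered by name. The issue does not show chant's ToolRegistry API, so the class below is a hypothetical stand-in (`SketchToolRegistry`, `Tool`) meant only to illustrate that idea and to show how activities could be unit-tested with fake tools.

```typescript
// Hypothetical stand-in; chant's actual ToolRegistry API is not shown in this issue.
type Tool = (args: string[]) => Promise<string>;

class SketchToolRegistry {
  private tools = new Map<string, Tool>();

  // Register a tool under a unique name.
  register(name: string, tool: Tool): void {
    if (this.tools.has(name)) {
      throw new Error(`tool already registered: ${name}`);
    }
    this.tools.set(name, tool);
  }

  // Only registered tools can run, so the agent cannot invoke arbitrary commands.
  async call(name: string, args: string[]): Promise<string> {
    const tool = this.tools.get(name);
    if (!tool) {
      throw new Error(`unknown tool: ${name}`);
    }
    return tool(args);
  }

  // Sorted tool names, e.g. for listing available tools to the agent.
  names(): string[] {
    return Array.from(this.tools.keys()).sort();
  }
}
```

In a unit test, `kubectl`, `logs`, and `dig` can be registered as fakes that return canned output, which keeps the CI smoke test free of cluster dependencies.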

Done when

  • New user with Docker installed clones repo, runs npm run dev in examples/alert-triage, sees the workflow execute in the Temporal UI within 60 seconds
  • Same source tree is deployable to plain K8s + Temporal Cloud with only chant.config.ts profile changes (no app-code changes)
  • temporal-stack is the renamed, scope-clarified infra example with no overlap with the canonical app
  • Tutorial reads: this is the canonical app; substrate variants come separately

Out of scope (this issue)

  • Cloud-substrate tutorials (AKS, EKS, GKE) — follow-up issue
  • Lexicon pruning — follow-up issue
  • Tutorial-template standardization across all examples — follow-up issue
  • Multi-region / HA versions — KISS for now

Why now

On-ramp friction is the biggest "huh let me try this" barrier in chant's tutorials. Local stack solves it. The canonical app gives the rest of the doc rollup something concrete to standardize against.
