Loki: client, deterministic signal, and investigation skill #11

@ooloth

Why

Production logs are the most time-sensitive signal that hub is missing. When something goes wrong in production, you need to know without having to remember to check Grafana. And when you do notice a signal, investigating it today means navigating to the project repo, configuring logcli, forming queries by hand, and iterating without any pre-loaded context.

What this adds

Three thin, complementary layers:

1. clients/loki — HTTP client

A thin Rust client that fetches raw log entries for a configured LogQL query. No persistence, no aggregation — just the data.
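As a rough sketch of the request such a client might issue, the snippet below builds a URL for Loki's `/loki/api/v1/query_range` endpoint from the configured endpoint, LogQL query, and lookback. The function names (`percent_encode`, `query_range_url`) and the `limit` parameter are illustrative assumptions, not part of hub; the real client would presumably follow the pattern in clients/src/github.rs and use whatever HTTP library that client uses.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Percent-encode a string for use as a URL query value.
/// RFC 3986 unreserved characters pass through; every other byte becomes %XX.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build the range-query URL (hypothetical helper).
/// `start` and `end` are sent as Unix timestamps in nanoseconds.
fn query_range_url(
    endpoint: &str,
    logql: &str,
    end: SystemTime,
    lookback: Duration,
    limit: u32,
) -> String {
    let end_ns = end
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let start_ns = end_ns.saturating_sub(lookback.as_nanos());
    format!(
        "{}/loki/api/v1/query_range?query={}&start={}&end={}&limit={}",
        endpoint.trim_end_matches('/'),
        percent_encode(logql),
        start_ns,
        end_ns,
        limit
    )
}
```

Keeping URL construction separate from the network call would make this part of the client testable without a live Loki instance.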

2. workflows/loki — deterministic signal

A workflow that calls the Loki client and applies simple rules to surface a signal in hub status. Starting point: if the number of error-level entries in the lookback window exceeds error_threshold, emit a High urgency item. No SQLite persistence yet; each hub status call performs a live fetch.
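The starting rule could look something like the sketch below. The `LogEntry` and `LokiAlert` shapes and the `evaluate` function are hypothetical, as is the assumption that each entry carries a `level` string; "exceed" is read here as strictly greater than the threshold.

```rust
/// One fetched log entry (hypothetical shape; the real client may differ).
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
struct LogEntry {
    level: String,
    line: String,
}

/// The signal surfaced when the rule fires (hypothetical shape).
#[derive(Debug, PartialEq)]
struct LokiAlert {
    error_count: usize,
    threshold: usize,
}

/// Return an alert only when the error count is strictly above the threshold.
fn evaluate(entries: &[LogEntry], threshold: usize) -> Option<LokiAlert> {
    let error_count = entries
        .iter()
        .filter(|e| e.level.eq_ignore_ascii_case("error"))
        .count();
    (error_count > threshold).then(|| LokiAlert {
        error_count,
        threshold,
    })
}
```

Because the function is pure, the same entries and threshold always give the same result, which is what keeps the signal deterministic.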

3. .claude/skills/loki-investigate.md — investigation skill

A Claude Code skill that lives in hub's repo and reads its configuration (endpoint, LogQL query, project name, environment) from hub.toml context. When invoked, Claude iterates over as many logcli queries as needed to answer the question at hand — forming hypotheses, validating them, surfacing a diagnosis. No hub binary involvement; this is a conversation skill.

Config

[[project.environment]]
env = "prod"

[[project.environment.workflow]]
name = "loki-logs"
endpoint = "https://loki.example.com"
query = '{app="my-app", env="prod"}'
lookback = "1h"
error_threshold = 10
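The `lookback` value has to become a duration before it can bound a query. A minimal sketch, assuming a single integer followed by one unit suffix (`s`, `m`, `h`, or `d`); the accepted units and the `parse_lookback` name are assumptions, not something the issue specifies:

```rust
use std::time::Duration;

/// Parse strings such as "1h" or "15m" into a Duration (hypothetical helper).
/// Returns None for a missing number, a missing unit, or an unknown unit.
fn parse_lookback(s: &str) -> Option<Duration> {
    let s = s.trim();
    // Index of the first non-digit character, where the unit suffix begins.
    let unit_start = s.find(|c: char| !c.is_ascii_digit())?;
    let (digits, unit) = s.split_at(unit_start);
    let value: u64 = digits.parse().ok()?;
    let secs_per_unit: u64 = match unit {
        "s" => 1,
        "m" => 60,
        "h" => 3_600,
        "d" => 86_400,
        _ => return None,
    };
    value.checked_mul(secs_per_unit).map(Duration::from_secs)
}
```

Rejecting malformed values at config-load time would let hub report a clear error instead of silently querying the wrong window.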

Secrets

  • LOKI_TOKEN — bearer token, injected via op run --env-file=.env
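One way the client might turn the injected secret into an `Authorization` header value is sketched below. The function names and the error message are illustrative assumptions; only the `LOKI_TOKEN` variable name and the bearer scheme come from this issue.

```rust
/// Build the Authorization header value from an optional token.
/// Split out from the environment lookup so it can be tested directly.
fn bearer_header_from(token: Option<String>) -> Result<String, String> {
    match token {
        Some(t) if !t.trim().is_empty() => Ok(format!("Bearer {}", t.trim())),
        _ => Err("LOKI_TOKEN is not set; run via `op run --env-file=.env`".to_string()),
    }
}

/// Read LOKI_TOKEN from the process environment (hypothetical helper).
#[allow(dead_code)]
fn bearer_header() -> Result<String, String> {
    bearer_header_from(std::env::var("LOKI_TOKEN").ok())
}
```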

Out of scope

  • SQLite persistence (save for later)
  • Trend history or count-over-time (save for later)
  • Positive signals or non-error log levels (save for later)
  • agents/ crate automation (save for later)

Starting points

  • clients/src/github.rs — pattern to follow for a new HTTP client
  • config/src/toml.rs — where WorkflowConfig variants are defined
  • workflows/src/status.rs — add a StatusItem::Loki(LokiAlert) variant here and push items in run(); the unified list handles sort and render (see Urgency-ranked unified output #26)
  • ui/cli/src/commands/status.rs — add a render_line match arm for the new variant
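The last two starting points might fit together roughly as follows. This is a standalone sketch: the fields on `LokiAlert`, the output format, and the `render_line` signature are assumptions, and the real `StatusItem` enum already has other variants that are omitted here.

```rust
/// Payload for the new variant (field names are hypothetical).
struct LokiAlert {
    project: String,
    env: String,
    error_count: usize,
    lookback: String,
}

/// Stand-in for the enum in workflows/src/status.rs; existing variants omitted.
enum StatusItem {
    Loki(LokiAlert),
}

/// Stand-in for the renderer in ui/cli/src/commands/status.rs.
fn render_line(item: &StatusItem) -> String {
    match item {
        // New match arm for the Loki variant.
        StatusItem::Loki(alert) => format!(
            "[high] {} ({}): {} error log entries in the last {}",
            alert.project, alert.env, alert.error_count, alert.lookback
        ),
    }
}
```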

Done when

  1. hub status shows a Loki error signal for a configured environment when errors exceed the threshold
  2. hub status omits the signal when errors are at or below the threshold
  3. The loki-investigate Claude Code skill can be invoked from hub's repo and successfully queries the configured Loki endpoint using context from hub.toml
