feat(db): discovery_snapshots table — daily Discovery-Tracking storage (P2 sub-step)#55

Merged
MoltyCel merged 1 commit into main from feat/discovery-snapshots-migration on May 20, 2026

Conversation

@MoltyCel
Owner

Summary

P2 sub-step (a) per the Discovery-Tracking-Baseline SPEC (PR #54, merged): new table discovery_snapshots for daily discovery-surface snapshots.

Small and focused: only CREATE TABLE + INDEX + COMMENTs. No code change. Idempotent (IF NOT EXISTS).

Schema

CREATE TABLE discovery_snapshots (
  id           BIGSERIAL PRIMARY KEY,
  snapshot_at  DATE        NOT NULL UNIQUE,        -- one row per day
  generated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
  payload      JSONB       NOT NULL,
  source_run_status TEXT   NOT NULL DEFAULT 'ok'
    CHECK (source_run_status IN ('ok','partial','failed'))
);

CREATE INDEX idx_discovery_snapshots_at ON discovery_snapshots(snapshot_at DESC);

JSONB payload per SPEC §3.5 (5 sub-sections: self_probes, gsc, bot_hits, github, errors).
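A minimal sketch of what a payload skeleton with those five sub-sections could look like before it is serialized into the JSONB column. Only the five section names come from the PR text. The inner shapes (empty objects, `errors` as a list) and the helper names `empty_payload` and `validate_payload` are assumptions for illustration, not the SPEC §3.5 schema.

```python
import json

# Section names are taken from the PR text (SPEC §3.5).
PAYLOAD_SECTIONS = ("self_probes", "gsc", "bot_hits", "github", "errors")


def empty_payload() -> dict:
    """Return a payload skeleton with the five top-level sub-sections."""
    payload = {name: {} for name in PAYLOAD_SECTIONS if name != "errors"}
    payload["errors"] = []  # assumption: errors is a list of messages
    return payload


def validate_payload(payload: dict) -> list[str]:
    """Return the names of any missing top-level sub-sections, in order."""
    return [name for name in PAYLOAD_SECTIONS if name not in payload]


if __name__ == "__main__":
    # Serialized form, as it would be handed to the database layer.
    print(json.dumps(empty_payload(), sort_keys=True))
```

Because the column is JSONB, new fields inside a section need no ALTER TABLE. A check like `validate_payload` only guards the top-level shape.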

Pre-Commit-Diff (§8)

 migrations/2026-05-21_create_discovery_snapshots.sql | 30 ++++++++++++++++++++++
 1 file changed, 30 insertions(+)

Exactly 1 file, following the established migrations/YYYY-MM-DD_<desc>.sql naming scheme, nothing out of scope.

Branch-Hygiene (§11.4)

Branched from a fresh origin/main (cd1b0e5, 0 behind), worktree ~/moltrust-api-I.

§2.3 Cross-Review

Skipped: pure schema definition, no auth/credential/token path. The table is a container for our own aggregated metrics.

Test plan

  • Merge via merge commit
  • Apply via psql -d moltstack -f migrations/2026-05-21_create_discovery_snapshots.sql
  • Verify: \d discovery_snapshots shows 5 columns + index + comments
  • Follow-up sub-step: collect the baseline snapshot manually and INSERT it (one-off, not tracked in the repo)

Migration per Discovery-Tracking-Baseline SPEC (PR #54 merged 2026-05-20)
§3.5 + §4 + §6 P2.

Schema:
- BIGSERIAL primary key
- DATE-unique (one row per day; UPSERT-safe)
- JSONB payload (schema-flexible — V1 fields can grow without ALTER TABLE)
- source_run_status enum (ok/partial/failed) for daily-cron-health visibility

Idempotent (IF NOT EXISTS on table + index).

Baseline row for 2026-05-21 will be INSERTed manually post-migration-apply
(SPEC §3.6: mandatory deadline tonight so the delta is measurable tomorrow).
The cron script (scripts/discovery_snapshot.py) follows in P3.
@MoltyCel MoltyCel merged commit 2298618 into main May 20, 2026
10 checks passed
@MoltyCel MoltyCel deleted the feat/discovery-snapshots-migration branch May 20, 2026 23:09
HaraldeRoessler pushed a commit to HaraldeRoessler/moltrust-api that referenced this pull request May 22, 2026
Discovery-Tracking P3.1 per SPEC
docs/specs/2026-05-21_discovery-tracking-baseline-SPEC.md §3.5 + §5.2.

Self-contained daily cron script. Captures 5 sources into the
discovery_snapshots table (migration in PR MoltyCel#55):
- self_probes : GET 4 Discovery surfaces (sitemap.xml URL-count,
  llms.txt MoltGuard-block, /guard/openapi.json path-count,
  /extendedAgentCard MoltGuard-extensions)
- bot_hits    : parse /var/log/nginx/access.log* (last 7d), bot-UA ×
  endpoint-class. moltstack is in `adm` group → cron reads logs
  without sudo. Privacy §3.7: no IPs persisted, only UA-counts.
- github      : GH_TOKEN-authenticated repo + traffic API, 6 MoltyCel
  repos. Graceful "pat-not-configured" if GH_TOKEN absent.
- gsc         : manual-pending (V0 per §9.1).
- errors      : non-fatal failures collected; source_run_status
  ok/partial/failed computed accordingly.
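
The bot_hits source above can be sketched roughly as follows: parse access-log lines, keep only the bot name and an endpoint class, and discard the client IP. This is an illustration under assumptions, not the code in scripts/discovery_snapshot.py. It assumes nginx's default "combined" log format, and the bot marker list, the endpoint classes, and the names `count_bot_hits` and `endpoint_class` are hypothetical.

```python
import re
from collections import Counter

# Matches the request, status, size, referer and user-agent fields of
# nginx's default "combined" log format (assumed, not confirmed by the PR).
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Hypothetical marker list; the real script tracks its own set of bot UAs.
BOT_MARKERS = ("GPTBot", "ClaudeBot", "Googlebot")


def endpoint_class(path: str) -> str:
    """Map a request path to a coarse endpoint class (hypothetical classes)."""
    if path.startswith("/guard/"):
        return "guard"
    if path in ("/sitemap.xml", "/llms.txt"):
        return "discovery"
    return "other"


def count_bot_hits(lines) -> dict:
    """Count hits per 'bot|endpoint-class'; the client IP is never stored."""
    hits = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a combined-format line
        ua = match.group("ua").lower()
        bot = next((b for b in BOT_MARKERS if b.lower() in ua), None)
        if bot is None:
            continue  # not a tracked bot
        hits[f"{bot}|{endpoint_class(match.group('path'))}"] += 1
    return dict(hits)
```

Keying the counter on bot name and endpoint class alone means the persisted result contains no address data, which matches the "no IPs persisted, only UA-counts" constraint.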

Idempotency: UPSERT ON CONFLICT (snapshot_at) DO UPDATE. Repeated
same-day runs refresh the row and never create a second one. The DB
literal is dollar-quoted ($disco$), so it is injection-safe without escaping.
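
A hedged sketch of how such an UPSERT statement could be assembled. The table, column names, conflict target, and `$disco$` tag come from this PR. The function name `build_upsert_sql`, the choice of columns refreshed on conflict, and the refresh of `generated_at` are assumptions. In PostgreSQL a dollar-quoted literal ends at the next occurrence of its tag, so the sketch also rejects a payload that contains the tag itself.

```python
import json
from datetime import date

DOLLAR_TAG = "$disco$"
VALID_STATUSES = ("ok", "partial", "failed")


def build_upsert_sql(snapshot_at: str, payload: dict, status: str) -> str:
    """Build an idempotent UPSERT for one daily snapshot row."""
    # Round-trip through date so only a real calendar date reaches the SQL.
    day = date.fromisoformat(snapshot_at).isoformat()
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    body = json.dumps(payload, sort_keys=True)
    # Guard: the literal would terminate early if the tag appeared inside it.
    if DOLLAR_TAG in body:
        raise ValueError("payload contains the dollar-quote tag")
    return (
        "INSERT INTO discovery_snapshots (snapshot_at, payload, source_run_status)\n"
        f"VALUES (DATE '{day}', {DOLLAR_TAG}{body}{DOLLAR_TAG}::jsonb, '{status}')\n"
        "ON CONFLICT (snapshot_at) DO UPDATE SET\n"
        "  payload = EXCLUDED.payload,\n"
        "  source_run_status = EXCLUDED.source_run_status,\n"
        "  generated_at = NOW();"  # assumption: refresh the timestamp on re-run
    )


if __name__ == "__main__":
    print(build_upsert_sql("2099-12-31", {"errors": []}, "ok"))
```

Running the statement twice for the same date leaves a single row, because the UNIQUE constraint on snapshot_at routes the second run into the DO UPDATE branch.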

Alerts: Telegram on partial/failed status (TELEGRAM_BOT_TOKEN/CHAT_ID
from ~/.moltrust_secrets).
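
One possible way to derive source_run_status from the collected errors and to decide whether an alert is due. The commit message states only that the status is computed from non-fatal failures and that partial/failed triggers a Telegram alert. The exact rule below (ok with no errors, failed when every automated source errored, partial otherwise), the exclusion of gsc because it is manual, and the function names are assumptions.

```python
# Automated sources considered for the status; gsc is manual-pending in V0,
# so it is left out here (assumption).
SOURCES = ("self_probes", "bot_hits", "github")


def compute_status(errors_by_source: dict) -> str:
    """Return 'ok', 'partial' or 'failed' from per-source error lists."""
    failed = [name for name in SOURCES if errors_by_source.get(name)]
    if not failed:
        return "ok"
    if len(failed) == len(SOURCES):
        return "failed"
    return "partial"


def should_alert(status: str) -> bool:
    """Alerts are sent only for partial or failed runs."""
    return status in ("partial", "failed")


if __name__ == "__main__":
    status = compute_status({"github": ["timeout"]})
    print(status, should_alert(status))
```

The three possible return values match the CHECK constraint on source_run_status in the migration, so the computed status can be written to the table unchanged.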

Flags:
- --dry-run        assemble + print, no DB write
- --date YYYY-MM-DD  override snapshot_at (backfill / throwaway test)
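
The two flags could be wired up with argparse roughly like this. The flag names and their meaning come from the list above. The parser layout, the help texts, and the helper names `parse_iso_date` and `build_parser` are assumptions about a script whose source is not shown here.

```python
import argparse
from datetime import date


def parse_iso_date(value: str) -> date:
    """Parse an ISO date for --date, with a readable argparse error."""
    try:
        return date.fromisoformat(value)
    except ValueError as exc:
        raise argparse.ArgumentTypeError(
            f"expected YYYY-MM-DD, got {value!r}"
        ) from exc


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Daily discovery snapshot")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="assemble and print the snapshot, skip the DB write",
    )
    parser.add_argument(
        "--date",
        type=parse_iso_date,
        default=None,  # None means: use today's date as snapshot_at
        metavar="YYYY-MM-DD",
        help="override snapshot_at (backfill or throwaway test)",
    )
    return parser
```

A throwaway run like the one described in the test notes would then be invoked with `--date 2099-12-31`, optionally combined with `--dry-run` to inspect the payload first.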

Test-Run verified 2026-05-21 against throwaway date 2099-12-31:
4/4 probes, 16 bots / 1664 hits, 6/6 GitHub repos, upsert ok,
throwaway row deleted, baseline 2026-05-21 untouched.

Crontab entry (server-side, NOT repo-managed per CLAUDE.md §Geltungsbereich
— applied manually post-merge with audit note):
  30 0 * * * set -a && source /home/moltstack/.moltrust_secrets && set +a \
    && cd /home/moltstack/moltstack \
    && /home/moltstack/moltstack/venv/bin/python scripts/discovery_snapshot.py \
    >> logs/discovery_snapshot.log 2>&1