feat: community mode + trustworthy telemetry (v0.12.0.0)#416

Open
garrytan wants to merge 28 commits into main from garrytan/community-mode

@garrytan
Owner

Summary

  • Trustworthy telemetry data. Source field (live/test/dev) tags every event. All dashboard and edge function queries filter source=live. E2E test noise (~230 of 232 events) is now separated from production data.
  • You can now count real users. UUID install fingerprint for all tiers. Update-check pings are no longer gated on telemetry opt-in (they send only version + OS + random UUID).
  • 56-year durations fixed. _TEL_START is now persisted to a file keyed by $PPID instead of being held in a shell variable. Duration capped at 86,400s with a CHECK constraint.
  • Community tier. Email OTP auth, cloud backup, benchmarks, skill recommendations.
  • Growth funnel metrics. SQL views for install → activate → retain, version adoption velocity, daily active installs.
  • One-liner installer. bash <(curl -fsSL https://raw.githubusercontent.com/garrytan/gstack/main/install.sh)
  • Dead code cleanup. Deleted unused telemetry-ingest edge function.
  • 8 new telemetry tests (source field, duration caps, fingerprint persistence).
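
One plausible reading of the "56-year" figure, offered as an assumption rather than something this PR states: if the start timestamp is lost between shell blocks and read as 0, then end minus start equals the full Unix epoch time, which in 2026 is roughly 56 years. The sketch below works through that arithmetic and applies the 86,400 s guard. The function names are hypothetical; only the cap value and the "out of range becomes null" rule come from this PR.

```typescript
// Hypothetical helper names. Only the 86,400 s cap and the null-on-invalid
// behavior are taken from the PR description.
const MAX_DURATION_S = 86_400; // 24 hours
const SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60; // 31,557,600

/** Duration as a naive logger might compute it: a missing start is read as 0. */
function naiveDuration(endEpochS: number, startEpochS: number | undefined): number {
  return endEpochS - (startEpochS ?? 0);
}

/** Duration with the guard applied: missing, negative, or >24h values become null. */
function guardedDuration(endEpochS: number, startEpochS: number | undefined): number | null {
  if (startEpochS === undefined) return null;
  const d = endEpochS - startEpochS;
  return d < 0 || d > MAX_DURATION_S ? null : d;
}

// Midnight UTC on 2026-03-19, expressed in epoch seconds.
const end = Date.UTC(2026, 2, 19) / 1000;

// With a lost start time the "duration" is the whole epoch: a bit over 56 years.
const years = naiveDuration(end, undefined) / SECONDS_PER_YEAR;
```

Under these assumptions `years` comes out a little above 56, while `guardedDuration` returns null for a missing start, a negative span, or anything longer than one day.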

Reviews

  • CEO Review: CLEAR (SELECTIVE EXPANSION — 3 cherry-picks accepted)
  • Eng Review: CLEAR (3 migration bugs caught + fixed)
  • Codex Review: 12 issues found, all addressed (expand-contract migration, growth funnel logic, transparency timing, etc.)

Test Coverage

585 tests pass, 0 failures. 8 new telemetry tests covering source field, duration validation, UUID fingerprint generation/persistence, backward-compat field mapping.

Pre-Landing Review

No issues found (review at commit b437b53).

Deployment Notes

Requires Supabase admin access (manual steps):

  1. Run Phase 4A: UPDATE telemetry_events SET duration_s = NULL WHERE duration_s > 86400 OR duration_s < 0
  2. Run migration 003 (supabase/migrations/003_source_and_guards.sql)
  3. Run Phase 4B: source backfill SQL
  4. Deploy updated edge functions (community-pulse, community-benchmarks)

Test plan

  • 585 tests pass (telemetry + gen-skill-docs + skill-validation)
  • Source field in JSONL output (--source flag + env fallback)
  • Duration capping (>86400 → null, <0 → null)
  • UUID fingerprint generated, persisted, lowercase-normalized
  • install_fingerprint in JSON (replaces installation_id)
  • Backward-compat: old JSONL installation_id mapped to install_fingerprint in sync
  • Update-check pings fire with telemetry=off
  • Dashboard filters source=eq.live on all queries
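
The source checks in the plan above can be sketched as follows. This is a minimal illustration, not the code in gstack-telemetry-log: the precedence (flag, then GSTACK_TELEMETRY_SOURCE, then a default of "live") and the function names are assumptions, and the sample data only mirrors the ratio quoted in the Summary.

```typescript
type Source = "live" | "test" | "dev";

function isSource(v: string | undefined): v is Source {
  return v === "live" || v === "test" || v === "dev";
}

// Assumed precedence: explicit --source flag, then the GSTACK_TELEMETRY_SOURCE
// environment variable, then "live" as the default. Invalid values are ignored.
function resolveSource(
  flag: string | undefined,
  env: Record<string, string | undefined>,
): Source {
  if (isSource(flag)) return flag;
  const fromEnv = env["GSTACK_TELEMETRY_SOURCE"];
  if (isSource(fromEnv)) return fromEnv;
  return "live";
}

interface TelemetryEvent {
  skill: string;
  source: Source;
}

// In-memory equivalent of the source=eq.live filter used by the dashboard queries.
function liveOnly(events: TelemetryEvent[]): TelemetryEvent[] {
  return events.filter((e) => e.source === "live");
}

// Sample shaped like the Summary's figure: 230 test events plus 2 live ones.
const sample: TelemetryEvent[] = [];
for (let i = 0; i < 230; i++) sample.push({ skill: "/qa", source: "test" });
sample.push({ skill: "/qa", source: "live" }, { skill: "/review", source: "live" });

const liveCount = liveOnly(sample).length;
```

With this sample, `liveCount` is 2 out of 232 events, which is the kind of separation the source field is meant to provide.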

🤖 Generated with Claude Code

garrytan and others added 27 commits March 19, 2026 22:54
Adds user_id, email, config/analytics/retro snapshots, and backup
versioning to installations. Creates community_benchmarks table with
public read + service-role write RLS. Foundation for authenticated
backup and community intelligence features.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two-path authentication: enter 6-digit code in terminal OR click magic
link in email. Races both paths — whichever completes first wins.
Saves JWT to ~/.gstack/auth-token.json with auto-refresh. Includes
status and logout subcommands.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
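
The "races both paths" behavior described in this commit can be sketched with `Promise.race`. This is an illustrative sketch only: `raceAuth` and `tag` are hypothetical names, each path is modeled as a promise that resolves to a JWT string, and a later commit on this branch removes the magic-link path in favor of OTP alone.

```typescript
type AuthPath = "otp" | "magic-link";

interface AuthResult {
  via: AuthPath;
  jwt: string;
}

// Label a pending login so the winner of the race is identifiable.
function tag(via: AuthPath, pending: Promise<string>): Promise<AuthResult> {
  return pending.then((jwt) => ({ via, jwt }));
}

// Settles with whichever path settles first. Each argument stands in for a
// real flow (terminal code entry, magic-link callback) that yields a JWT.
function raceAuth(otp: Promise<string>, magicLink: Promise<string>): Promise<AuthResult> {
  return Promise.race([tag("otp", otp), tag("magic-link", magicLink)]);
}
```

Note that `Promise.race` also settles on the first rejection, so a failure on one path would end the race; `Promise.any` is the variant that waits for the first success instead.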
Three bug fixes:
- Telemetry-sync now pings update_checks on successful event sync
  (previously only in gstack-update-check on cache-miss path)
- community-pulse falls back to distinct session_id count when
  update_checks is empty
- Dashboard queries session_id and shows unique session count

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- gstack-community-backup: syncs config/analytics/retro to Supabase
  using auth JWT, rate-limited to 30min intervals
- gstack-community-restore: pulls backup from Supabase, merges with
  local state (local wins on conflicts), supports --dry-run
- gstack-community-benchmarks: compares your per-skill duration avg
  against community median with delta percentages

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
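
The three behaviors in this commit (30-minute rate limit, local-wins merge, delta percentages) can be sketched as small pure functions. The names and the flat key/value state shape are assumptions made for illustration; the actual scripts may merge nested structures differently.

```typescript
type State = Record<string, string>;

const BACKUP_INTERVAL_MS = 30 * 60 * 1000; // 30 minutes, as stated in the commit

// A backup is due if none has run yet or the interval has elapsed.
function backupDue(lastBackupMs: number | null, nowMs: number): boolean {
  return lastBackupMs === null || nowMs - lastBackupMs >= BACKUP_INTERVAL_MS;
}

// Shallow merge in which the local value wins whenever both sides define a key.
function mergeLocalWins(remote: State, local: State): State {
  return { ...remote, ...local };
}

// Percentage by which a user's average differs from the community median.
// Returns null when the median is not positive, to avoid dividing by zero.
function deltaPercent(ownAvgS: number, communityMedianS: number): number | null {
  if (communityMedianS <= 0) return null;
  return ((ownAvgS - communityMedianS) / communityMedianS) * 100;
}
```

For example, an average of 150 s against a community median of 100 s yields a delta of 50, and 75 s against 100 s yields -25.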
- community-benchmarks: computes per-skill median/p25/p75 duration,
  total runs, and success rate from last 30 days of telemetry events.
  Upserts into community_benchmarks table, cached 1 hour.
- community-recommendations: co-occurrence-based skill suggestions
  ("used by X% of /qa users"). Cached 24 hours.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
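
As a rough illustration of the two computations named in this commit, the sketch below derives p25/median/p75 from a list of durations and computes "used by X% of /qa users" style shares from per-session skill lists. The quantile method (linear interpolation), the rounding, and all function names are assumptions; the commit does not say how the edge functions compute these values.

```typescript
// Quantile by linear interpolation over an ascending-sorted sample.
function quantile(sorted: number[], q: number): number {
  if (sorted.length === 0) throw new Error("empty sample");
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

interface DurationSummary {
  p25: number;
  median: number;
  p75: number;
}

function summarize(durationsS: number[]): DurationSummary {
  const sorted = [...durationsS].sort((a, b) => a - b);
  return {
    p25: quantile(sorted, 0.25),
    median: quantile(sorted, 0.5),
    p75: quantile(sorted, 0.75),
  };
}

// For sessions that used `base`, the rounded percentage that also used each
// other skill. Each inner array is the list of skills seen in one session.
function coOccurrence(sessions: string[][], base: string): Record<string, number> {
  const withBase = sessions.filter((s) => s.indexOf(base) !== -1);
  const counts: Record<string, number> = {};
  for (const skills of withBase) {
    for (const skill of Array.from(new Set(skills))) {
      if (skill !== base) counts[skill] = (counts[skill] ?? 0) + 1;
    }
  }
  const pct: Record<string, number> = {};
  for (const skill of Object.keys(counts)) {
    pct[skill] = Math.round((counts[skill] / withBase.length) * 100);
  }
  return pct;
}
```

With sessions `[["/qa", "/review"], ["/qa", "/ship", "/review"], ["/qa"], ["/review"]]` and base `/qa`, three sessions qualify, so `/review` comes out at 67 and `/ship` at 33.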
Telemetry prompt now offers Community (backup/benchmarks/email),
Anonymous, or Off. Community tier triggers gstack-auth OTP flow.
Adds one-time upgrade prompt for existing anonymous users.
Preamble emits EMAIL, COMM_PROMPTED, AUTH status vars.
All 33 SKILL.md files regenerated for Claude Code + Codex/agents.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
E2E test runner now sets GSTACK_STATE_DIR to a temp directory so
skill preamble telemetry goes to /tmp/ instead of ~/.gstack/. Prevents
test runs from polluting production Supabase with fake crash events
(was causing 252 spurious "timeout" crashes from a single test session).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds error_message (max 200 chars, e.g. "bun test: 3 tests failed")
and failed_step (e.g. "run_tests", "create_pr") to telemetry events.
Schema, ingest function, and local logger all updated. Makes crash
reports actionable instead of just "timeout — 252 occurrences".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Magic link requires matching the Supabase Site URL to a dynamic local
port, which doesn't work reliably. OTP is the right UX for a CLI tool
— user is already in a terminal, typing 6 digits is fast. Removes
bun callback server, nc listener, port detection, and cleanup traps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Crash clusters now grouped by error_class (not duplicated per version).
Shows errors with skill, error class, count, failed step, example
message, and unique session count — so you can tell if it's one user
or widespread.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
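
The grouping described in this commit can be sketched as below. It assumes clusters are keyed on skill plus error_class, with version deliberately left out of the key; the interface and function names are hypothetical, and the real dashboard query may differ.

```typescript
interface CrashEvent {
  skill: string;
  version: string;
  error_class: string;
  error_message: string;
  failed_step: string;
  session_id: string;
}

interface CrashCluster {
  skill: string;
  error_class: string;
  count: number;
  unique_sessions: number;
  failed_step: string;
  example_message: string;
}

function clusterCrashes(events: CrashEvent[]): CrashCluster[] {
  const clusters = new Map<string, CrashCluster>();
  const sessions = new Map<string, Set<string>>();
  for (const e of events) {
    // Version is not part of the key, so one error is not repeated per release.
    const key = `${e.skill}|${e.error_class}`;
    let c = clusters.get(key);
    if (c === undefined) {
      c = {
        skill: e.skill,
        error_class: e.error_class,
        count: 0,
        unique_sessions: 0,
        failed_step: e.failed_step, // taken from the first event seen
        example_message: e.error_message, // likewise the first message seen
      };
      clusters.set(key, c);
      sessions.set(key, new Set<string>());
    }
    c.count += 1;
    const seen = sessions.get(key);
    if (seen !== undefined) {
      seen.add(e.session_id);
      c.unique_sessions = seen.size;
    }
  }
  // Most frequent clusters first.
  return Array.from(clusters.values()).sort((a, b) => b.count - a.count);
}
```

A cluster with a high count but a single unique session points at one user's environment, while a similar count spread over many sessions suggests a widespread problem.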
Epilogue now instructs Claude to classify errors (error_class from a
defined taxonomy), write a one-line error_message, and identify the
failed_step. All 33 SKILL.md files regenerated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Accept main's generated SKILL.md files (will be regenerated by bun run build).
Resolve gen-skill-docs.ts: keep community tier 3-option prompt from branch,
keep error context fields from branch, add PLAN MODE EXCEPTION from main.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve conflict in gen-skill-docs.ts by keeping both the detailed
error field instructions (community-mode) and the new Plan Status
Footer section (main).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ration guards

- Add source field (live/test/dev) to telemetry pipeline: --source flag in
  gstack-telemetry-log, GSTACK_TELEMETRY_SOURCE env fallback, pass-through
  in telemetry-sync, source=eq.live filter on all dashboard queries
- Replace SHA-256 installation_id with UUID install_fingerprint for all tiers
  (not just community). Expand-contract migration: ADD new column + trigger
  to copy installation_id, preserving backward compat with old clients
- Fix duration bug: persist _TEL_START to file via $PPID (stable across bash
  blocks), cap durations at 86400s, reject negative values
- Ungate update-check pings from telemetry=off — sends only version + OS +
  random UUID. Generate .install-id in update-check for telemetry=off users
- Migration 003: source columns, install_fingerprint, duration CHECK
  constraint, indexes, recreated views with source filter, growth funnel
  (first-seen based), materialized views for daily installs + version adoption
- E2E test isolation: session-runner sets GSTACK_TELEMETRY_SOURCE=test
- 8 new telemetry tests (source field, duration caps, fingerprint persistence)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
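
The client-side half of the installation_id to install_fingerprint transition might look like the sketch below, which maps the old field name onto the new one while reading JSONL events. It is a sketch under assumptions: the function name is hypothetical, and whether the real sync drops the legacy key or lowercases legacy values is not stated in this PR.

```typescript
type JsonlEvent = Record<string, unknown>;

// Returns a copy of the event that uses install_fingerprint only.
// A value already stored under install_fingerprint takes precedence over
// the legacy installation_id; the chosen value is lowercase-normalized.
function mapLegacyFingerprint(event: JsonlEvent): JsonlEvent {
  const { installation_id, ...rest } = event;
  const current = rest["install_fingerprint"];
  const chosen =
    typeof current === "string"
      ? current
      : typeof installation_id === "string"
        ? installation_id
        : undefined;
  if (chosen === undefined) return rest;
  return { ...rest, install_fingerprint: chosen.toLowerCase() };
}
```

An old line such as `{"skill": "/qa", "installation_id": "ABC123"}` would come out as `{"skill": "/qa", "install_fingerprint": "abc123"}`, so both old and new log files produce rows the server can treat uniformly.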
Regenerated via bun run gen:skill-docs. Preamble now persists TEL_START
and SESSION_ID to $PPID files + echoes them. Epilogue reads from files
and passes --source flag.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- install.sh: curl-pipe-bash installer with prereq checks (git, bun),
  upgrade detection (git pull if already installed), transparency note
  about update-check pings
- setup: add install ping at end (gstack-update-check --force) to
  register day-zero installs in Supabase
- Install ping only in setup (not install.sh) to avoid double-counting
  (Codex review fix #7)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- community-benchmarks: add .eq("source", "live") to telemetry_events query
- community-pulse: use distinct install_fingerprint count instead of raw
  count, add source=live filter to all queries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
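
The difference between a raw count and a distinct install_fingerprint count can be sketched in memory as follows. The row shape and function names are assumptions for illustration; the edge function itself queries Supabase rather than filtering arrays.

```typescript
interface UpdateCheckRow {
  install_fingerprint: string | null;
  source: string;
}

// Raw count: every live ping counts, so one install pinging daily inflates it.
function rawLiveCount(rows: UpdateCheckRow[]): number {
  return rows.filter((r) => r.source === "live").length;
}

// Distinct count: each live install is counted once; rows without a
// fingerprint are skipped.
function distinctLiveInstalls(rows: UpdateCheckRow[]): number {
  const seen = new Set<string>();
  for (const r of rows) {
    if (r.source === "live" && r.install_fingerprint) seen.add(r.install_fingerprint);
  }
  return seen.size;
}
```

For three live pings from one install, one live ping from another, and one test ping, the raw live count is 4 while the distinct count is 2.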
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Community tier auth, backup/restore, and test updates that were already
on this branch before the telemetry sprint. Includes updated telemetry
prompt test to match 3-option community tier flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add update-check transparency note to telemetry prompt (Codex fix #9):
  users see the disclosure about version pings at first telemetry prompt
- Add one-liner install to README: bash <(curl -fsSL .../install.sh)
  alongside the existing Claude Code paste-in-terminal approach

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
telemetry-sync POSTs directly to Supabase REST API (/rest/v1/telemetry_events),
not through this edge function. Two ingest paths = maintenance burden for zero
value. Identified during eng review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Community mode + trustworthy telemetry: source tagging, UUID fingerprinting,
duration guards, growth funnel metrics, one-liner installer, edge function
source filtering, dead code cleanup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve conflicts in VERSION, package.json, and CHANGELOG.md.
Keep 0.12.0.0 version with community mode entry on top,
followed by 0.11.12.0 and 0.11.11.0 entries from main.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions

github-actions bot commented Mar 24, 2026

E2E Evals: ❌ FAIL

77/113 tests passed | $19.69 total cost | 12 parallel runners

Suite            Result  Cost
e2e-browse       7/7     $0.34
e2e-deploy       4/4     $0.56
e2e-design       7/7     $2.11
e2e-plan         6/6     $2.68
e2e-qa-bugs      3/3     $1.56
e2e-qa-workflow  4/4     $1.16
e2e-review       7/7     $1.85
e2e-routing      6/21    $4.09
e2e-workflow     3/9     $0.77
llm-judge        24/24   $0.48
e2e-routing      6/21    $4.09

12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

Failures

  • ❌ journey-ideation: success
  • ❌ journey-plan-eng: success
  • ❌ journey-visual-qa: success
  • ❌ journey-qa: success
  • ❌ journey-debug: success
  • ❌ journey-ideation: success
  • ❌ journey-qa: success
  • ❌ journey-visual-qa: success
  • ❌ journey-ideation: success
  • ❌ journey-debug: success
  • ❌ journey-visual-qa: success
  • ❌ journey-design-system: success
  • ❌ journey-qa: success
  • ❌ journey-debug: success
  • ❌ journey-design-system: success
  • ❌ /ship local workflow: success
  • ❌ /ship local workflow: success
  • ❌ /ship local workflow: success
  • ❌ /setup-browser-cookies detect: error_max_turns
  • ❌ /setup-browser-cookies detect: error_max_turns
  • ❌ /setup-browser-cookies detect: error_max_turns
  • ❌ journey-ideation: success
  • ❌ journey-plan-eng: success
  • ❌ journey-visual-qa: success
  • ❌ journey-qa: success
  • ❌ journey-debug: success
  • ❌ journey-ideation: success
  • ❌ journey-qa: success
  • ❌ journey-visual-qa: success
  • ❌ journey-ideation: success
  • ❌ journey-debug: success
  • ❌ journey-visual-qa: success
  • ❌ journey-design-system: success
  • ❌ journey-qa: success
  • ❌ journey-debug: success
  • ❌ journey-design-system: success

Resolve conflicts from v0.11.13.0 merge (worktree isolation + resolver
refactor). Keep 0.12.0.0 version, take main's modular gen-skill-docs
resolvers, regenerate all SKILL.md files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>