Integrations: Telegram workstation journey, onboarding, and admin panel #181
Open
scion-gteam[bot] wants to merge 88 commits into
added 30 commits
June 9, 2026 14:01
Add workstation-only system endpoints gated by requireWorkstation:
- GET /system/check: GatherDiagnostics returns structured results (git, runtime, config checks) with a ready flag
- GET /system/runtime: detect available runtime, return configured profile
- PUT /system/runtime: validate and persist runtime choice
- GET /system/status: ComputeOnboardingStatus for first-run wizard
- POST /system/init: call InitMachine with user-selected harnesses
- PUT /system/identity: already wired in Phase 1
All endpoints require loopback origin and return JSON.
Adds comprehensive tests for the workstation onboarding endpoints and security primitives introduced in phases 0-5:
- requireWorkstation middleware: 404 when disabled, pass-through when enabled
- assertLoopback: table-driven IPv4/IPv6 loopback validation
- ClassifyPath: managed path detection, legacy groves path, AlreadyLinked via store
- POST /system/init: valid harnesses, unknown harness rejection, empty list rejection
- PUT /system/identity: writes and echoes display name and email
- POST /system/fs/validate-path: managed-path overlap error, normal path classification
- GET /system/fs/list: home directory listing, hidden file filtering, outside-home rejection
Also adds missing server.auth.display_name, server.auth.email, and server.auth.username key handling in UpdateVersionedSetting, which the identity endpoint depends on.
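For reviewers unfamiliar with the loopback gate, the check that a table-driven test like the one above exercises could look roughly like this. This is a minimal sketch, not the PR's code: `isLoopbackAddr` is a hypothetical stand-in, and the real `assertLoopback` signature is not shown in this thread.

```go
package main

import (
	"fmt"
	"net"
)

// isLoopbackAddr reports whether a "host:port" remote address (the shape of
// http.Request.RemoteAddr) refers to a loopback IP.
// Hypothetical helper; the PR's assertLoopback may differ.
func isLoopbackAddr(remoteAddr string) bool {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		return false // malformed address: fail closed
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	// Table-driven cases covering IPv4, IPv6, and non-loopback input.
	cases := []struct {
		addr string
		want bool
	}{
		{"127.0.0.1:8080", true},
		{"[::1]:8080", true},
		{"192.168.1.10:8080", false},
		{"not-an-address", false},
	}
	for _, c := range cases {
		fmt.Printf("%-20s got=%v want=%v\n", c.addr, isLoopbackAddr(c.addr), c.want)
	}
}
```

The sketch fails closed: anything that does not parse as an IP is treated as non-loopback.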
Replace the "not yet able to provide pre-built binaries" note with a Homebrew quick start path. Lead with `brew install scion` + `scion server start` which opens the onboarding wizard, then keep the existing go install path as "Install from Source". Add a tip noting that the wizard handles machine init automatically.
…rofile GET/PUT inconsistency (N1)
…n dir-browser (H2, H3, M1)
Fix 2: Reorder setup handler to hot-start the plugin BEFORE persisting to settings.yaml. If hot-start fails, return a clear error (Success:false) and leave no half-applied state. Config is only written after the plugin is confirmed running.
Fix 3: Add wireBrokerMu to serialize the get-or-create proxy path in WireBrokerPlugin. Prevents concurrent setups from both seeing proxy==nil and creating separate FanOutBrokers (the second StartMessageBroker is a no-op, losing that spoke).
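The race that Fix 3 closes can be illustrated with a small sketch. The types below (`hub`, `fanOut`, `wire`) are hypothetical and only mirror the shape of the problem; they are not the PR's `WireBrokerPlugin`.

```go
package main

import (
	"fmt"
	"sync"
)

// fanOut is a hypothetical stand-in for the fan-out broker proxy.
type fanOut struct{ spokes []string }

// hub holds the lazily created proxy. wireMu serializes the whole
// get-or-create path so two callers cannot both observe proxy == nil.
type hub struct {
	wireMu  sync.Mutex
	proxy   *fanOut
	created int // number of proxies ever created (for the demo)
}

// wire returns the shared proxy, creating it on first use, and attaches
// the named spoke to it.
func (h *hub) wire(spoke string) *fanOut {
	h.wireMu.Lock()
	defer h.wireMu.Unlock()
	if h.proxy == nil {
		h.proxy = &fanOut{}
		h.created++
	}
	h.proxy.spokes = append(h.proxy.spokes, spoke)
	return h.proxy
}

func main() {
	h := &hub{}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			h.wire(fmt.Sprintf("spoke-%d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println("proxies created:", h.created, "spokes:", len(h.proxy.spokes))
}
```

Without the mutex, two goroutines could each create a proxy and one set of spokes would be attached to an object nothing else references.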
The telegram plugin (broker_v2.go) only accepts "poll" or "webhook" for inbound_mode — anything else makes Configure() return an error, silently preventing the bot from starting getUpdates polling. Our setup code was writing "polling" in both the hot-start entry and settings.yaml persistence, causing the bot to never respond. Add regression test asserting the value matches the plugin's accepted set.
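A regression guard of the kind described here might boil down to the following. This is a sketch under the assumption that only the two values named above are accepted; `validateInboundMode` and the constants are illustrative names, not identifiers from broker_v2.go.

```go
package main

import "fmt"

// Accepted inbound_mode values, per the commit message above.
const (
	inboundPoll    = "poll"
	inboundWebhook = "webhook"
)

// validateInboundMode rejects anything outside the accepted set.
// Hypothetical helper; the plugin's Configure() performs its own check.
func validateInboundMode(mode string) error {
	switch mode {
	case inboundPoll, inboundWebhook:
		return nil
	default:
		return fmt.Errorf("invalid inbound_mode %q: want %q or %q", mode, inboundPoll, inboundWebhook)
	}
}

func main() {
	for _, m := range []string{"poll", "webhook", "polling"} {
		fmt.Printf("%-8s -> %v\n", m, validateInboundMode(m))
	}
}
```

Sharing the constants between the setup code and the test would keep the writer and the validator from drifting apart again.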
The telegram plugin runs v1 by default (silently ignores inbound_mode, no /setup, no group links). The hub requires v2 behavior. Without SCION_TELEGRAM_V2=1 in the subprocess env, the bot never polls.
Add per-plugin Env support:
- PluginEntry.Env (config.go): extra env vars for plugin subprocess
- DiscoveredPlugin.Env (discovery.go): propagated through discovery
- loadPlugin (manager.go): sets cmd.Env when Env is non-empty
- V1PluginEntryLike.Env (settings.go): keeps adapter in sync
Set SCION_TELEGRAM_V2=1 for the telegram plugin in both paths:
- Hot-start: system_telegram.go PluginEntry.Env
- Startup: server_foreground.go initPluginManager conversion loop
Add tests verifying the env var is set and propagated.
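As a rough illustration of the loadPlugin change, the sketch below builds a subprocess command with extra environment variables. It assumes the extras are appended to the parent environment and that Env is a map; neither detail is confirmed by this thread, and `buildPluginCmd` is a hypothetical name.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// buildPluginCmd prepares a plugin subprocess. When extra is non-empty the
// child gets the parent's environment plus the extras (an assumption here);
// otherwise cmd.Env stays nil and the child inherits the parent's
// environment unchanged.
func buildPluginCmd(path string, extra map[string]string) *exec.Cmd {
	cmd := exec.Command(path)
	if len(extra) > 0 {
		env := os.Environ()
		for k, v := range extra {
			env = append(env, k+"="+v)
		}
		cmd.Env = env
	}
	return cmd
}

func main() {
	// "scion-plugin-telegram" is a placeholder binary name.
	cmd := buildPluginCmd("scion-plugin-telegram", map[string]string{"SCION_TELEGRAM_V2": "1"})
	fmt.Println("env entries passed to child:", len(cmd.Env))
}
```

Leaving `cmd.Env` nil in the empty case matters: an empty non-nil slice would start the child with no environment at all.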
Polling was gated on Subscribe() being called, but on hot-start, the telegram spoke's Subscribe() is never reached (existing subscriptions were snapshotted before the spoke was added). Even on fresh install with no running agents, bootstrapExistingProjects finds nothing to subscribe.
Fix: start polling idempotently at the end of Configure() when inbound_mode==poll AND hub credentials (hub_url + hmac_key) are present. Configure() is called twice: once at load (no creds), once via ConfigureBroker (with creds). startPolling() is already idempotent (no-op if pollCancel != nil or webhook mode).
This is correct because inbound messages are delivered via HTTP POST to /api/v1/broker/inbound, independent of Subscribe() handlers. Subscribe() was an incidental gate; the broker should poll whenever it's configured and credentialed.
Add tests:
- TestV2_Configure_PollStartsWithHubCreds: no polling without creds, polling starts after second Configure with creds
- TestV2_Configure_PollIdempotent: re-configure doesn't restart polling
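The two-phase Configure() behavior described above can be sketched as follows. The `broker` type, its fields, and the map-based config are simplified assumptions for illustration; only the gating conditions (poll mode, hub_url and hmac_key present, pollCancel nil) come from the commit message.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// broker is a simplified stand-in for the telegram v2 broker.
type broker struct {
	mu          sync.Mutex
	inboundMode string
	hubURL      string
	hmacKey     string
	pollCancel  context.CancelFunc
	starts      int // how many times polling actually started (for the demo)
}

// startPolling is idempotent. Caller must hold b.mu.
func (b *broker) startPolling() {
	if b.pollCancel != nil || b.inboundMode != "poll" {
		return // already polling, or not in poll mode
	}
	ctx, cancel := context.WithCancel(context.Background())
	b.pollCancel = cancel
	b.starts++
	go func() { <-ctx.Done() }() // placeholder for the real getUpdates loop
}

// Configure merges cfg and starts polling once credentials are present.
func (b *broker) Configure(cfg map[string]string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if v, ok := cfg["inbound_mode"]; ok {
		b.inboundMode = v
	}
	if v, ok := cfg["hub_url"]; ok {
		b.hubURL = v
	}
	if v, ok := cfg["hmac_key"]; ok {
		b.hmacKey = v
	}
	if b.inboundMode == "poll" && b.hubURL != "" && b.hmacKey != "" {
		b.startPolling()
	}
}

func main() {
	b := &broker{}
	b.Configure(map[string]string{"inbound_mode": "poll"}) // load: no creds yet
	fmt.Println("starts after load:", b.starts)
	b.Configure(map[string]string{"hub_url": "http://127.0.0.1:8080", "hmac_key": "secret"})
	b.Configure(map[string]string{"hub_url": "http://127.0.0.1:8080", "hmac_key": "secret"})
	fmt.Println("starts after two credentialed calls:", b.starts)
	b.pollCancel()
}
```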
Telegram rejects http:// and localhost URLs in inline keyboard buttons (BUTTON_URL_INVALID), causing /register to silently fail on workstation.
Fix: check if hub URL is a valid public HTTPS URL. If not (localhost, http), send a plain-text message with the linking code and local URL instead of an inline button. If even the keyboard send fails on a public URL, fall back to plain text so the user is never left in silence.
Also: use a fresh 10s context for the Telegram send (was reusing the 15s ctx already partly consumed by the hub POST), and drop the Markdown parse mode to avoid parse edge cases.
Add isPublicHTTPS helper with tests covering https/http/localhost/loopback. Add TestHandleRegister_LocalhostUsesPlainText verifying the plain-text code path with no inline keyboard.
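One plausible shape for the helper named above is sketched below. Only the name `isPublicHTTPS` and the covered cases (https, http, localhost, loopback) come from the commit message; the body is an assumption and may differ from what the PR ships.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strings"
)

// isPublicHTTPS reports whether raw is an https URL whose host is neither
// "localhost" nor a loopback IP. Sketch only; it does not try to detect
// private (RFC 1918) ranges or unresolvable hostnames.
func isPublicHTTPS(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil || u.Scheme != "https" {
		return false
	}
	host := u.Hostname() // strips port and IPv6 brackets
	if host == "" || strings.EqualFold(host, "localhost") {
		return false
	}
	if ip := net.ParseIP(host); ip != nil && ip.IsLoopback() {
		return false
	}
	return true
}

func main() {
	for _, raw := range []string{
		"https://hub.example.com",
		"http://hub.example.com",
		"https://localhost:8080",
		"https://127.0.0.1:8080",
		"https://[::1]:8443",
	} {
		fmt.Printf("%-28s public-https=%v\n", raw, isPublicHTTPS(raw))
	}
}
```

When the helper returns false, the handler described above would skip the inline button and send the linking code as plain text.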
The wizard resume logic only advanced past steps 0-3 using server-side status flags (identitySet, runtimeOK, harnessesSeeded). Steps 4-6 (Images, Workspace, Telegram) had no resume path, and the onboardingStarted flag used sessionStorage which is lost on browser close. Now currentStep is saved to localStorage on every change and restored on load (capped by the saved value and gated by onboardingStarted). Both onboardingStarted and onboardingStep are cleared when the wizard completes.
Two issues caused the onboarding image step to hang:
1. Race condition: the pull handler started a goroutine that published SSE events immediately, but the frontend only opened its EventSource after receiving the HTTP response. When images already existed, all events fired before the subscriber connected and were lost, so the frontend waited forever. Fix: pre-check image existence synchronously in the handler, return results inline (initialResults/needPull), and only start the SSE goroutine for images that actually need pulling.
2. PullImage on all runtimes (podman, docker, apple container) used runInteractiveCommand which attaches stdin/stdout/stderr, which is wrong when called from a headless server goroutine. Changed to runSimpleCommand which captures output properly.
The macOS auto-detection path in GetRuntime only checked for the Apple 'container' CLI and fell back directly to docker, skipping podman entirely. This caused the runtime broker to select 'container' (or docker) even when podman was the only installed/configured runtime on macOS — breaking agent create/dispatch/list and heartbeat (#177). Fix: check container → podman → docker on macOS, mirroring the Linux path which already checks podman before docker. Each candidate is verified via exec.LookPath so only an actually-present binary is selected. Add a test for settings-based podman resolution.
GetDefaultSettingsData and GetDefaultSettingsDataYAML hardcoded "container" as the default runtime on macOS regardless of whether the binary existed. These functions produce the BASE layer of the settings merge chain (loaded before global/project settings), so even with correct user settings the embedded "profiles.local.runtime: container" would be used whenever profile resolution fell back to the "local" profile. Fix: call DetectLocalRuntime() (which probes podman → container → docker by actual binary availability) instead of a bare OS check. The OS-only fallback is retained as a last resort if no runtime is found. This is the second part of the runtime-selection regression fix — the first commit fixed the auto-detection in factory.go, but the embedded defaults were a separate path that also hardcoded "container" on macOS.
Reorder macOS auto-detection in GetRuntime to match the priority used by DetectLocalRuntime (podman → container → docker). This ensures consistent behavior across both the settings defaults layer and the factory auto-detection fallback.
…e of truth Replace the inline LookPath-based auto-detection in factory.go with a call to config.DetectLocalRuntime(), which is the authoritative runtime probe (podman → container → docker, with --version functional checks). This eliminates the duplicated detection logic and ensures consistent behavior across all runtime selection paths. Update tests to use expectedDefaultRuntime() helper instead of hardcoded OS-based expectations, matching the probing behavior.
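The single-source-of-truth probe described in the last few commits can be pictured like this. It is a simplified sketch: `detectRuntime` and `versionProbe` are hypothetical names, the probe is injected so the ordering can be tested without real binaries, and any OS-specific gating inside the real `config.DetectLocalRuntime()` is omitted.

```go
package main

import (
	"fmt"
	"os/exec"
)

// candidates lists runtimes in the priority order stated above.
var candidates = []string{"podman", "container", "docker"}

// detectRuntime returns the first candidate the probe accepts.
// ok is false when no runtime passes; the caller picks the fallback.
func detectRuntime(probe func(name string) bool) (name string, ok bool) {
	for _, c := range candidates {
		if probe(c) {
			return c, true
		}
	}
	return "", false
}

// versionProbe checks that the binary exists on PATH and that
// "<binary> --version" exits successfully (a functional check).
func versionProbe(name string) bool {
	path, err := exec.LookPath(name)
	if err != nil {
		return false
	}
	return exec.Command(path, "--version").Run() == nil
}

func main() {
	if name, ok := detectRuntime(versionProbe); ok {
		fmt.Println("detected runtime:", name)
	} else {
		fmt.Println("no container runtime found")
	}
}
```

Injecting the probe is what lets tests assert priority with a fake instead of hardcoding OS-based expectations.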
…nue UX When no image_registry is configured, the Images step now shows an editable input pre-filled with 'ghcr.io/homebrew-scion' and an "Accept & Continue" button. On accept, it persists the value via a new PUT /api/v1/system/image-registry endpoint (loopback-only), then proceeds to the normal image pull flow. This replaces the previous error-only block that required the user to run a CLI command and restart the server.
Add FanOutEventBus.RemoveSpoke(name) to remove a spoke from the fan-out at runtime, mirroring AddSpoke. Idempotent (no-op if the name doesn't exist). Add plugin.Manager.StopPlugin(type, name) to kill a single plugin subprocess and remove it from the manager's maps. Idempotent (no-op if already stopped). Config and external state (telegram_v2.db) are preserved for re-enable. These primitives enable the integration admin disable flow: StopPlugin → RemoveSpoke → persist enabled:false.
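A minimal picture of the idempotent add/remove pair, assuming spokes are keyed by name in a mutex-guarded map. The `fanOut` and `spoke` types and the boolean return value are illustrative, not the PR's `FanOutEventBus` API.

```go
package main

import (
	"fmt"
	"sync"
)

// spoke is a placeholder for a downstream event bus.
type spoke struct{ name string }

// fanOut keeps named spokes behind a mutex so they can be added and
// removed while the server is running.
type fanOut struct {
	mu     sync.Mutex
	spokes map[string]*spoke
}

func newFanOut() *fanOut { return &fanOut{spokes: make(map[string]*spoke)} }

// AddSpoke registers (or replaces) the spoke with the given name.
func (f *fanOut) AddSpoke(name string, s *spoke) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.spokes[name] = s
}

// RemoveSpoke drops the named spoke. It is a no-op if the name is unknown
// and reports whether anything was removed.
func (f *fanOut) RemoveSpoke(name string) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	_, ok := f.spokes[name]
	delete(f.spokes, name)
	return ok
}

func main() {
	f := newFanOut()
	f.AddSpoke("telegram", &spoke{name: "telegram"})
	fmt.Println("first remove:", f.RemoveSpoke("telegram"))
	fmt.Println("second remove:", f.RemoveSpoke("telegram"))
}
```

Idempotence on both sides is what lets the disable flow be retried safely after a partial failure.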
Add multi-integration admin API:
- GET /api/v1/admin/integrations: list all integrations with status (merges settings.yaml config + plugin HealthCheck + runtime state)
- POST /api/v1/admin/integrations/telegram/enable: hot-start via WireBrokerPlugin, persists telegram to message_broker.types
- POST /api/v1/admin/integrations/telegram/disable: StopPlugin + RemoveSpoke + removes from message_broker.types (preserves config)
Disable persistence survives restarts: startup's initPluginManager loads the binary, but the broker wiring loop only wires types listed in message_broker.types. Removing "telegram" from types means it won't be wired or credentialed on restart.
Enable/disable are both idempotent.
Auth: requireWorkstation (loopback) for now, with comment noting future switch to requireAdmin.
Add /admin/integrations page with a multi-integration card layout:
- Shell renders a list of IntegrationStatus cards, each showing type
icon, status badge, details grid, and an enable/disable toggle
- Telegram is the first integration panel (status + toggle)
- INTEGRATION_META registry makes adding future integrations
(Google Chat, GitHub App) a single entry addition
- Cards are data-driven from GET /api/v1/admin/integrations
UI wiring:
- Route: /admin/integrations in main.ts ROUTES + ADMIN_ROUTES
- Nav: "Integrations" item in admin sidebar (nav.ts)
- Page title: "Integrations" in app-shell.ts PAGE_TITLES
The shell is deliberately generic — no telegram-specific logic in
the card renderer. Per-integration detail pages (groups, users) can
be added later as /admin/integrations/{type} routes.
…for local-dir projects When a local-directory project's .scion marker resolves to an external config path (~/.scion/project-configs/<slug>__<uuid>/.scion), the fallback workspace source was computed as filepath.Dir(projectDir), which gave the external config parent instead of the user's actual project directory. Use the original projectPath input to derive the correct workspace mount source.
GROUP 1: Resume step init. updated() now triggers step loaders when currentStep changes (loadSystemCheck for step 1, loadRuntime for step 2, loadImagesStep for step 4). selectedHarnesses persisted to localStorage in handleHarnessesNext and restored when entering step 4.
GROUP 2: Stale error banner. this.error cleared on every step transition in updated().
GROUP 3: Render side-effects. Storage cleanup moved from renderDone() into updated() when currentStep===7. render() is now pure.
GROUP 4: Small cleanups. SSE onerror sets user-facing error message; removed unused runtimeAvailable state and its fetch; progress bar uses (currentStep+1)/TOTAL_STEPS to reach 100% on the last visible step.
The image pull goroutine used s.ctx (server lifetime) with no per-image timeout, so a hung PullImage would leak the goroutine forever. Now each pull gets a 5-minute context.WithTimeout derived from s.ctx.
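The per-pull deadline can be sketched with the standard `context.WithTimeout` pattern. `pullFunc`, `pullWithTimeout`, and the simulated hung pull are hypothetical; the demo uses a short timeout in place of the 5 minutes mentioned above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pullFunc stands in for a runtime's PullImage method (hypothetical signature).
type pullFunc func(ctx context.Context, image string) error

// pullWithTimeout bounds a single pull with its own deadline derived from the
// parent (server-lifetime) context, so one hung pull cannot block forever.
func pullWithTimeout(parent context.Context, image string, timeout time.Duration, pull pullFunc) error {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel() // release the timer even when the pull finishes early
	return pull(ctx, image)
}

// hungPull simulates a pull that never finishes on its own and only returns
// once its context is cancelled or expires.
func hungPull(ctx context.Context, _ string) error {
	<-ctx.Done()
	return ctx.Err()
}

// hungPullTimesOut runs hungPull under a short deadline and reports whether
// it ended with context.DeadlineExceeded.
func hungPullTimesOut() bool {
	err := pullWithTimeout(context.Background(), "example/image:latest", 20*time.Millisecond, hungPull)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("hung pull timed out:", hungPullTimesOut())
}
```

The deadline only helps if the underlying pull honors its context, for example by running the CLI through `exec.CommandContext`.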
Adds a test that simulates the web-created local-dir project flow where the .scion marker resolves to an external config dir without workspace_path in settings. Verifies the /workspace volume mounts to the project directory, not the config directory parent.
1. Cache DetectLocalRuntime result with sync.Once so repeated calls from GetDefaultSettingsData/YAML don't re-exec binaries each time. OverrideRuntimeDetection and mock helpers reset the cache so tests still get fresh probes per test case.
2. Align fallback when DetectLocalRuntime finds no runtime: both init.go and koanf.go now fall back to "docker" (matching factory.go), instead of diverging with "container" on macOS.
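The caching in item 1 might follow a pattern like the one below: a `sync.Once` guards the probe, and the test override swaps the probe and resets the Once. The names `cachedRuntime` and `overrideProbe` are placeholders for `DetectLocalRuntime` and `OverrideRuntimeDetection`, whose real implementations are not shown here.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	detectOnce sync.Once
	detected   string
	// probeRuntime is the expensive probe (it would exec binaries in real code).
	probeRuntime = func() string { return "docker" }
)

// cachedRuntime runs the probe at most once per reset and returns the
// cached answer afterwards.
func cachedRuntime() string {
	detectOnce.Do(func() { detected = probeRuntime() })
	return detected
}

// overrideProbe swaps the probe and clears the cache. It is meant for
// tests and is not safe to call concurrently with cachedRuntime.
func overrideProbe(fn func() string) {
	probeRuntime = fn
	detectOnce = sync.Once{} // fresh Once: the next call probes again
}

func main() {
	calls := 0
	overrideProbe(func() string { calls++; return "podman" })
	fmt.Println(cachedRuntime(), cachedRuntime(), "probe calls:", calls)
}
```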
After creating a new folder in the dir-browser component, set selectedPath and emit path-selected so the parent form immediately reflects the new folder without requiring an extra click.
MUST-FIX:
- WireBrokerPlugin: rollback (StopPlugin) if GetBroker fails after LoadOne, preventing orphaned plugin processes.
- broker_v2.go Configure(): make idempotent on re-call. Second call (with hub creds) skips store/api/sendQueue/getMe/inboundMode init, only updates hub client + component handlers + project slug map. Prevents leaking old SQLite connections and duplicate resources.
- fanout.go: AddSpoke closes the replaced spoke's Bus; RemoveSpoke closes the removed spoke's Bus. Prevents resource leaks on re-setup and disable.
- Unify SCION_TELEGRAM_V2: add Env to V1PluginEntry (settings_v1.go), persist it in PersistTelegramConfig, read it in cold-start conversion (server_foreground.go) and admin enable handler. Removes the hardcoded name=="telegram" check, leaving a single source of truth.
- system_telegram.go: return generic error to client on hot-start failure, log detailed error server-side.
SHOULD-FIX:
- CheckObserver: use strings.EqualFold for case-insensitive match; add warning log when GetInfo fails.
- ValidateTelegramToken: url.PathEscape the token in getMe URL.
- startPolling: add "caller must hold b.mu" comment.
- system_telegram.go: add http.MaxBytesReader on both endpoints.
Add EnsureTelegramEnv(name, entry) as the SINGLE mechanism for setting SCION_TELEGRAM_V2=1. It derives the env from the plugin name (name=="telegram" && !selfManaged), not from persisted settings.
All three launch paths call it:
- Cold-start: server_foreground.go initPluginManager loop
- Hot-start: system_telegram.go setup handler
- Admin enable: system_integrations.go enable handler
This ensures existing settings.yaml files WITHOUT the Env entry still launch telegram as v2 after a restart; no migration needed.
Tests:
- TestEnsureTelegramEnv_ExistingSettingsWithoutEnv: nil Env → v2 set
- TestEnsureTelegramEnv_SetsV2: basic functionality
- TestEnsureTelegramEnv_SkipsNonTelegram: other plugins untouched
- TestEnsureTelegramEnv_SkipsSelfManaged: self-managed untouched
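The derivation rule above (name=="telegram" && !selfManaged) could be sketched as follows. The `pluginEntry` struct and the map-typed Env are assumptions made for the example; the real PluginEntry fields may be shaped differently.

```go
package main

import "fmt"

// pluginEntry is a pared-down, hypothetical stand-in for PluginEntry.
type pluginEntry struct {
	SelfManaged bool
	Env         map[string]string
}

// ensureTelegramEnv sets SCION_TELEGRAM_V2=1 for a hub-managed telegram
// plugin, deriving the decision from the plugin name rather than from
// whatever Env was persisted. Other plugins are left untouched.
func ensureTelegramEnv(name string, entry *pluginEntry) {
	if name != "telegram" || entry.SelfManaged {
		return
	}
	if entry.Env == nil {
		entry.Env = make(map[string]string) // settings written before Env existed
	}
	entry.Env["SCION_TELEGRAM_V2"] = "1"
}

func main() {
	legacy := &pluginEntry{} // existing settings.yaml without an Env entry
	ensureTelegramEnv("telegram", legacy)
	fmt.Println("telegram env:", legacy.Env)

	other := &pluginEntry{}
	ensureTelegramEnv("github", other)
	fmt.Println("other plugin env:", other.Env)
}
```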
- Add PublishRaw to PostgresEventPublisher (EventPublisher interface)
- Migrate system_handlers_test.go and fs_safety_test.go from removed sqlite.New to newTestStore (ent-backed composite store)
- Fix test data to use valid UUIDs (ent schema requires them)
- Add BrokerName to ProjectProvider test fixtures (ent validator)
- Remove stale conflict marker in server_foreground.go
- Restore encoding/json import
Run gofmt on files with formatting issues from rebase conflict resolution (hub_config.go, server.go, fs_safety_test.go, root.go, types.go) and pre-existing main issues (worktree_eligibility_test.go, models.go).
This file uses newTestStore (from teststore_test.go which has the !no_sqlite tag). Without the matching tag, go vet fails when the sqlite driver is excluded.
The function gained a needsOnboarding bool parameter but the 3 test call sites were not updated, causing go vet and golangci-lint to fail.
- system_integrations.go: check json.Encode, StopPlugin, SaveVersionedSettings return values; remove unused isTelegramInBrokerTypes
- telegram_setup.go: check resp.Body.Close and StopPlugin return values
- root.go: remove ineffectual name assignment in usesWorktrees
- system_telegram.go: check json.Encode return values
- daemon/ports.go: check ln.Close return value
- agent/provision_test.go: check os.Chdir/Setenv/MkdirAll/WriteFile return values (all test instances, consistent style)
5c2ee85 to
8fb003f
Summary
First increment of the integrations architecture (design: .design/integrations-admin.md, #115). Delivers the complete Workstation Telegram journey end-to-end, the onboarding flow it builds on (workstation-improvements work, delivered here), an integrations admin panel, and supporting runtime/project fixes.
Highlights
Testing
Notes