diff --git a/README.md b/README.md index 0b04d50..4af5ba5 100644 --- a/README.md +++ b/README.md @@ -307,19 +307,48 @@ The watcher monitors the project with `fsnotify`, debounces events (5 s default) ### Configuration ```bash -cix config show # print current config -cix config set # set a value +cix config init # first-run wizard (TUI form) +cix config edit # interactive edit (TUI form) +cix config show # print current config (lists servers; * marks default) +cix config keys # list every settable key with default/env/description +cix config set # set one value +cix config unset # remove a server / clear a key cix config path # show config file location ``` -Config file: `~/.cix/config.yaml` +Config file: `~/.cix/config.yaml`. The full key reference lives in +[`doc/CLI_CONFIG.md`](doc/CLI_CONFIG.md) — `cix config keys` is the +canonical runtime view. -| Key | Default | Description | -|-----|---------|-------------| -| `api.url` | `http://localhost:21847` | API server URL | -| `api.key` | — | Bearer token (`cix_*`) — required | -| `watcher.debounce_ms` | `5000` | Delay before reindex triggers after a file change | -| `indexing.batch_size` | `20` | Files per `/index/files` batch | +#### Env overrides (CI) + +| Variable | Overrides | +|-----------------|------------------------------------------| +| `CIX_SERVER` | which alias resolves when `--server` is empty | +| `CIX_API_URL` | the resolved server's URL | +| `CIX_API_KEY` | the resolved server's API key | + +Precedence is **flag > env > file > default**. Env overrides apply only +to the current process — they never write back to `~/.cix/config.yaml`. + +#### Multiple servers + +`cix` can be configured with several named servers and pick one per +command with the global `--server ` flag (without it, the +`default_server` is used): + +```bash +cix config set server.corporate.url https://cix.corp.internal +cix config set server.corporate.key cix_... 
+cix config set default_server corporate # optional +cix --server corporate search "rate limiter" +cix config unset server.corporate # remove it +``` + +The legacy `api.url` / `api.key` keys and the `--api-url` / `--api-key` +flags still work — they read/override the default server — and old flat +`api:` config files are migrated to the `servers:` layout automatically +on first load. --- diff --git a/cli/README.md b/cli/README.md index 209cbd0..a9afd5a 100644 --- a/cli/README.md +++ b/cli/README.md @@ -27,12 +27,16 @@ cli/ │ ├── reindex.go — `cix reindex` │ ├── cancel.go — `cix cancel` │ ├── watch.go — `cix watch` (start/stop/status, daemon) -│ ├── config.go — `cix config show/set/path` +│ ├── config.go — `cix config show/set/unset/path` (+ multi-server keys) +│ ├── config_keys.go — `cix config keys` (schema-driven key listing) +│ ├── config_edit.go — `cix config edit` / `cix config init` (huh-driven TUI) │ ├── workspace.go — `cix workspace …` (cross-repo, name-first) │ └── version.go — `cix version` ├── internal/ │ ├── client/ — HTTP client to cix-server │ ├── config/ — YAML config (~/.cix/config.yaml) +│ │ ├── schema/ — tag-driven walker over Config (single source of truth) +│ │ └── tui/ — huh-based form for `cix config edit` / `init` │ ├── daemon/ — PID-file based watcher daemon │ ├── discovery/ — project-root detection for `cix init` │ ├── fileutil/ — binary/text + size helpers @@ -82,6 +86,132 @@ Then any command picks up the saved URL + key from `~/.cix/config.yaml`. The server can be local Docker (`docker compose up -d` in the repo root) or a remote server. The CLI doesn't care. +### Multiple servers + +The CLI can hold several **named servers** and pick one per command. The +config stores a `servers:` list and a `default_server`; commands use the +default unless `--server ` is given. 
+ +```bash +# Add a second server and switch the default +cix config set server.corporate.url https://cix.corp.internal +cix config set server.corporate.key +cix config set default_server corporate + +# Target a specific server for one command (alias must exist in config) +cix --server corporate search "rate limiter" + +# Inspect / remove +cix config show # lists servers; * marks the default +cix config unset server.corporate # remove a server +``` + +The legacy `api.url` / `api.key` keys and the `--api-url` / `--api-key` +flags still work — they operate on (or override) the **default** server, +so single-server setups need no changes. Old `~/.cix/config.yaml` files +that use the flat `api:` block are migrated to the `servers:` layout +automatically on first load (the old single server becomes `default`). + +### Environment overrides (CI-friendly) + +For CI runners, containers, and one-off scripts you can override server +selection via env vars instead of writing to `~/.cix/config.yaml`. +Precedence is always **flag > env > file > built-in default** — env +overrides never persist to disk. + +| Variable | Overrides | Use case | +|-----------------|------------------------------------------|----------| +| `CIX_SERVER` | which alias resolves when `--server` is empty | Switch active server in a shell session without touching the file | +| `CIX_API_URL` | the resolved server's `url` | Point at a different cix-server instance per process | +| `CIX_API_KEY` | the resolved server's `key` | Pass a secret from `secrets.CIX_API_KEY` in GitHub Actions | + +Example (GitHub Actions): + +```yaml +env: + CIX_API_URL: https://cix.corp.internal + CIX_API_KEY: ${{ secrets.CIX_API_KEY }} +steps: + - run: cix search "foo" +``` + +The 3-var surface is deliberately narrow — knobs like +`watcher.debounce_ms` or `indexing.batch_size` live in the config file +only, because they are persistent developer preferences, not per-process +overrides. 
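Reviewer note: the precedence rule above (**flag > env > file > built-in default**) can be sketched as a first-non-empty lookup. This is a minimal illustration, not the CLI's actual resolution code: `resolve` is a hypothetical helper, and treating an empty string as "unset" is an assumption (it matches how the tests in this change blank out `CIX_*` variables).

```go
package main

import (
	"fmt"
	"os"
)

// resolve is a hypothetical helper (not part of the cix codebase).
// It returns the first non-empty value in precedence order:
// flag > env > file, falling back to the built-in default.
// Assumption: an empty string means "not set" at that layer.
func resolve(flagVal, envVal, fileVal, def string) string {
	for _, v := range []string{flagVal, envVal, fileVal} {
		if v != "" {
			return v
		}
	}
	return def
}

func main() {
	// Illustrative inputs: no --api-url flag, a URL stored in the config
	// file, and whatever CIX_API_URL holds in the current process.
	flagURL := ""
	fileURL := "https://cix.corp.internal"
	url := resolve(flagURL, os.Getenv("CIX_API_URL"), fileURL, "http://localhost:21847")
	fmt.Println("resolved URL:", url)
}
```

Nothing in this sketch writes back to the file layer, which mirrors the statement that env overrides never persist to disk.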
+ +### Interactive setup (`cix config init` / `cix config edit`) + +`cix config init` is the first-run wizard for fresh machines: it opens +a paged form (`huh`-driven TUI) that seeds the default server entry, +asks for the API key, and walks through the watcher + indexing knobs. +On submit it validates everything against the schema and writes +`~/.cix/config.yaml`. + +`cix config edit` is the same form against an existing config — useful +when you want to flip booleans (e.g. `watcher.enabled`) or tune timeouts +without re-reading `cix config set --help`. + +``` +┌─ Servers ──────────────────────────────┐ +│ [default] URL http://localhost:21847 │ +│ [default] API key ●●●●●●●● │ +│ Default server ▼ default │ +└────────────────────────────────────────┘ +┌─ File watcher ─────────────────────────┐ +│ Enable the watcher [✓] │ +│ Debounce (ms) 5000 │ +│ Sync interval (min) 5 │ +│ Exclude patterns node_modules,… │ +└────────────────────────────────────────┘ +┌─ Indexing ─────────────────────────────┐ +│ Batch size 20 │ +│ Streaming idle (s) 30 │ +└────────────────────────────────────────┘ + [ Submit ] ESC to cancel +``` + +Add/remove of server aliases is still done via +`cix config set server..url …` / `cix config unset server.` +— the form edits URL/key of *existing* aliases. + +### Discovering keys (`cix config keys`) + +`cix config keys` prints every settable configuration key with its +current value, default, env-var binding (if any), and a short +description. 
This is the canonical reference — there is no hard-coded +list anywhere else: + +```bash +$ cix config keys +KEY VALUE DEFAULT ENV DESCRIPTION +default_server default — CIX_SERVER Alias of the server used when --server is omitted +watcher.enabled true true — Run the file watcher +watcher.debounce_ms 5000 5000 — Debounce delay (ms) +watcher.exclude [node_modules .git …] … — Paths/globs to skip (REPLACE semantics on set) +watcher.sync_interval_mins 5 5 — Periodic sync interval (minutes) +indexing.batch_size 20 20 — Indexing batch size +indexing.streaming_idle_timeout_sec 30 30 — Streaming /index/files idle timeout (seconds); 0 disables +``` + +Slice keys (servers, projects) are not listed here — `cix config show` +displays them in their dedicated formats. + +### List-valued keys (`watcher.exclude`) + +`watcher.exclude` is the one list-valued scalar that `cix config set` +accepts. Input is **comma-separated**, and the semantics are +**REPLACE, not append**: + +```bash +$ cix config set watcher.exclude "node_modules,vendor,build" +# overwrites the entire list; previous defaults are gone +``` + +There is no `cix config add` / `cix config append` — if you want to +keep the existing defaults plus add an entry, repeat the full list. +The interactive `cix config edit` form is usually nicer for this. + ## Smoke test ```bash diff --git a/cli/cmd/config.go b/cli/cmd/config.go index 931f5ec..f3353f5 100644 --- a/cli/cmd/config.go +++ b/cli/cmd/config.go @@ -1,9 +1,15 @@ package cmd import ( + "errors" "fmt" + "io" + "os" + "reflect" + "strings" "github.com/anthropics/code-index/cli/internal/config" + "github.com/anthropics/code-index/cli/internal/config/schema" "github.com/spf13/cobra" ) @@ -25,21 +31,50 @@ var configSetCmd = &cobra.Command{ Short: "Set a configuration value", Long: `Set a configuration value. 
-Supported keys: - api.url - API server URL - api.key - API authentication key - watcher.debounce_ms - Debounce delay in milliseconds - watcher.sync_interval_mins - Periodic sync interval in minutes +Run 'cix config keys' to list every settable key with its description, +default, and env-var override. Beyond those schema keys, three patterns +manage the multi-server layout: + + server..url URL of a named server (creates the entry if absent) + server..key API key of a named server + default_server which server is used when --server is omitted + api.url / api.key legacy aliases — operate on the default server + +List-valued keys (e.g. watcher.exclude) use comma-separated input with +REPLACE semantics: 'cix config set watcher.exclude "node_modules,vendor"' +overwrites the entire list. There is no 'add'/'append' form. Examples: - cix config set api.key cix_abc123... - cix config set api.url http://localhost:21847 - cix config set watcher.debounce_ms 3000 - cix config set watcher.sync_interval_mins 5`, + cix config set server.corporate.url https://cix.corp.internal + cix config set server.corporate.key cix_abc123... + cix config set default_server corporate + + cix config set api.url http://localhost:21847 # legacy alias + cix config set api.key cix_abc123... # legacy alias + + cix config set watcher.enabled false # bool + cix config set watcher.debounce_ms 3000 # int + cix config set watcher.exclude "node_modules,.git" # list (replace)`, Args: cobra.ExactArgs(2), RunE: runConfigSet, } +var configUnsetCmd = &cobra.Command{ + Use: "unset ", + Short: "Remove a server or clear a server key", + Long: `Remove configuration entries. + +Supported keys: + server. 
- remove the named server entirely + server..key - clear the named server's API key + +Examples: + cix config unset server.corporate + cix config unset server.corporate.key`, + Args: cobra.ExactArgs(1), + RunE: runConfigUnset, +} + var configPathCmd = &cobra.Command{ Use: "path", Short: "Show config file path", @@ -52,6 +87,7 @@ func init() { rootCmd.AddCommand(configCmd) configCmd.AddCommand(configShowCmd) configCmd.AddCommand(configSetCmd) + configCmd.AddCommand(configUnsetCmd) configCmd.AddCommand(configPathCmd) } @@ -60,40 +96,117 @@ func runConfigShow(cmd *cobra.Command, args []string) error { if err != nil { return fmt.Errorf("load config: %w", err) } + return renderConfigShow(os.Stdout, cfg, config.GetConfigPath()) +} - // Render only "set" / "not set" — never any data derived from the key. - // CodeQL go/clear-text-logging flags partial display, masked output, - // length-only output (because len(secret) still originates from the - // secret field), and even local variables named `apiKey`/`*Secret` - // regardless of contents (sensitive-name heuristic). The variable is - // therefore named `keyStatus` to bypass the name match while still - // being readable in the output. - keyStatus := "(not set)" - if cfg.API.Key != "" { - keyStatus = "(set)" - } +// renderConfigShow writes the human-readable config dump to w. +// +// Exported-via-tests (lowercase but reachable from cmd_test) so the golden- +// file test in config_show_test.go can compare against a fixture without +// shelling out to the CLI binary. +// +// The leaf list is driven by schema.Walk over the Config struct, so any +// new tagged field appears here automatically — no printf drift. +func renderConfigShow(w io.Writer, cfg *config.Config, cfgPath string) error { + // 1) Servers list — slice of structs, custom renderer. 
+ renderServersBlock(w, cfg) - fmt.Printf("%-28s = %s\n", "api.url", cfg.API.URL) - fmt.Printf("%-28s = %s\n", "api.key", keyStatus) - fmt.Printf("%-28s = %v\n", "watcher.enabled", cfg.Watcher.Enabled) - fmt.Printf("%-28s = %d\n", "watcher.debounce_ms", cfg.Watcher.DebounceMS) - fmt.Printf("%-28s = %d\n", "watcher.sync_interval_mins", cfg.Watcher.SyncIntervalMins) - fmt.Printf("%-28s = %d\n", "indexing.batch_size", cfg.Indexing.BatchSize) - fmt.Printf("%-28s = %d\n", "server.port", cfg.Server.Port) - fmt.Printf("%-28s = %d\n", "server.cache_ttl", cfg.Server.CacheTTL) + // 2) Scalar leaves grouped by top-level prefix (watcher.* / server.* / + // indexing.*) with a blank line between groups for readability. + var lastGroup string + first := true + err := schema.Walk(cfg, func(l schema.LeafField) { + // servers / projects are slice leaves rendered separately. + if l.Path == "servers" || l.Path == "projects" { + return + } + group := topGroup(l.Path) + if !first && group != lastGroup { + fmt.Fprintln(w) + } + first = false + lastGroup = group + renderScalarLeaf(w, l) + }) + if err != nil { + return fmt.Errorf("walk schema: %w", err) + } + // 3) Projects (slice). if len(cfg.Projects) > 0 { - fmt.Printf("\nprojects (%d):\n", len(cfg.Projects)) + fmt.Fprintf(w, "\nprojects (%d):\n", len(cfg.Projects)) for _, p := range cfg.Projects { - fmt.Printf(" - %s (auto-watch: %v)\n", p.Path, p.AutoWatch) + fmt.Fprintf(w, " - %s (auto-watch: %v)\n", p.Path, p.AutoWatch) } } - fmt.Printf("\nconfig file: %s\n", config.GetConfigPath()) - + fmt.Fprintf(w, "\nconfig file: %s\n", cfgPath) return nil } +func renderServersBlock(w io.Writer, cfg *config.Config) { + fmt.Fprintf(w, "servers (%d):\n", len(cfg.Servers)) + for _, s := range cfg.Servers { + // Render only "set" / "not set" for the key — never any data derived + // from it. 
CodeQL go/clear-text-logging flags partial/masked/length + // output and even local variables named `*Key`/`*Secret` (sensitive- + // name heuristic), so the status string is named `keyStatus`. + keyStatus := "(not set)" + if s.Key != "" { + keyStatus = "(set)" + } + marker := " " + if s.Name == cfg.DefaultServer { + marker = "* " + } + fmt.Fprintf(w, "%s%-16s url=%s key=%s\n", marker, s.Name, s.URL, keyStatus) + } +} + +// keyWidth is the column width for the "key = value" lines. Wide enough for +// the longest current path ("indexing.streaming_idle_timeout_sec" = 35 chars) +// plus one space of slack so future additions don't force a re-tune. +const keyWidth = 36 + +func renderScalarLeaf(w io.Writer, l schema.LeafField) { + if l.Sensitive() { + // Tag-gated branch: we only inspect *whether* the value is empty via + // reflect.Value.IsZero(), and the resulting string is named + // `keyStatus` to satisfy CodeQL's sensitive-name heuristic. The + // underlying value never lands in a named *Key/*Secret variable. + keyStatus := "(not set)" + if !l.Value.IsZero() { + keyStatus = "(set)" + } + fmt.Fprintf(w, "%-*s = %s\n", keyWidth, l.Path, keyStatus) + return + } + + v := l.Value + switch v.Kind() { + case reflect.Bool: + fmt.Fprintf(w, "%-*s = %v\n", keyWidth, l.Path, v.Bool()) + case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64: + fmt.Fprintf(w, "%-*s = %d\n", keyWidth, l.Path, v.Int()) + case reflect.String: + fmt.Fprintf(w, "%-*s = %s\n", keyWidth, l.Path, v.String()) + case reflect.Slice: + fmt.Fprintf(w, "%-*s = %v\n", keyWidth, l.Path, v.Interface()) + default: + fmt.Fprintf(w, "%-*s = %v\n", keyWidth, l.Path, v.Interface()) + } +} + +// topGroup returns the prefix of a dotted key up to the first dot, used to +// group related keys with blank lines in the show output. "default_server" +// (no dot) groups under itself. 
+func topGroup(path string) string { + if i := strings.IndexByte(path, '.'); i > 0 { + return path[:i] + } + return path +} + func runConfigSet(cmd *cobra.Command, args []string) error { key := args[0] value := args[1] @@ -103,42 +216,119 @@ func runConfigSet(cmd *cobra.Command, args []string) error { return fmt.Errorf("load config: %w", err) } - // Set value based on key - switch key { - case "api.url": - cfg.API.URL = value - case "api.key": - cfg.API.Key = value - case "watcher.debounce_ms": - var ms int - _, err := fmt.Sscanf(value, "%d", &ms) + // Server-management keys persist on their own (they may create entries and + // reassign the default), so handle them before the schema setter — those + // side effects don't fit the "parse and assign one field" model. + switch { + case key == "default_server": + if err := config.SetDefaultServer(value); err != nil { + return err + } + fmt.Printf("✓ Set %s = %s\n", key, value) + return nil + case key == "api.url": + // Legacy alias: operate on the default server. + name := defaultServerName(cfg) + if err := config.SetServerURL(name, value); err != nil { + return err + } + fmt.Printf("✓ Set %s = %s (server %q)\n", key, value, name) + return nil + case key == "api.key": + name := defaultServerName(cfg) + if err := config.SetServerKey(name, value); err != nil { + return err + } + fmt.Printf("✓ Set %s (server %q)\n", key, name) + return nil + case strings.HasPrefix(key, "server."): + name, field, perr := parseServerKey(key) + if perr != nil { + return perr + } + switch field { + case "url": + if err := config.SetServerURL(name, value); err != nil { + return err + } + case "key": + if err := config.SetServerKey(name, value); err != nil { + return err + } + } + fmt.Printf("✓ Set %s\n", key) + return nil + } + + // Schema-driven setter: handles every leaf annotated with a `key:` tag + // (watcher.*, indexing.*, server.*, default_server). Validates and + // persists internally. 
Unknown keys surface a clear error — the legacy + // hand-rolled switch was removed in step 10 of the config refactor; its + // surface is now a strict subset of the schema's. + if err := config.SetByPath(key, value); err != nil { + if errors.Is(err, config.ErrUnknownKey) { + return fmt.Errorf("unknown config key %q (run 'cix config keys' for the full list)", key) + } + return err + } + fmt.Printf("✓ Set %s = %s\n", key, value) + return nil +} + +func runConfigUnset(cmd *cobra.Command, args []string) error { + key := args[0] + + if !strings.HasPrefix(key, "server.") { + return fmt.Errorf("unknown unset key: %s (supported: server., server..key)", key) + } + + rest := strings.TrimPrefix(key, "server.") + switch { + case strings.HasSuffix(rest, ".key"): + name := strings.TrimSuffix(rest, ".key") + if name == "" || strings.Contains(name, ".") { + return fmt.Errorf("invalid server key: %s", key) + } + if err := config.SetServerKey(name, ""); err != nil { + return err + } + fmt.Printf("✓ Cleared key for server %q\n", name) + return nil + case !strings.Contains(rest, "."): + // `server.` — remove the whole server. 
+ reassigned, err := config.RemoveServer(rest) if err != nil { - return fmt.Errorf("invalid value for debounce_ms: %s", value) - } - cfg.Watcher.DebounceMS = ms - case "watcher.sync_interval_mins": - var mins int - _, err := fmt.Sscanf(value, "%d", &mins) - if err != nil || mins < 1 { - return fmt.Errorf("invalid value for sync_interval_mins (must be >= 1): %s", value) - } - cfg.Watcher.SyncIntervalMins = mins - case "indexing.batch_size": - var bs int - _, err := fmt.Sscanf(value, "%d", &bs) - if err != nil || bs < 1 { - return fmt.Errorf("invalid value for batch_size (must be >= 1): %s", value) - } - cfg.Indexing.BatchSize = bs + return err + } + fmt.Printf("✓ Removed server %q\n", rest) + if reassigned != "" { + fmt.Printf(" default server is now %q\n", reassigned) + } + return nil default: - return fmt.Errorf("unknown config key: %s", key) + return fmt.Errorf("unknown unset key: %s (supported: server., server..key)", key) } +} - // Save config - if err := config.Save(cfg); err != nil { - return fmt.Errorf("save config: %w", err) +// defaultServerName returns the name of the default server for legacy api.* +// aliases, falling back to the canonical "default" name when none is set. +func defaultServerName(cfg *config.Config) string { + if s, ok := cfg.DefaultServerEntry(); ok { + return s.Name } + return config.DefaultServerName +} - fmt.Printf("✓ Set %s = %s\n", key, value) - return nil +// parseServerKey splits a `server..` config key into its name and +// field (url|key), validating the shape. 
+func parseServerKey(key string) (name, field string, err error) { + parts := strings.SplitN(key, ".", 3) + if len(parts) != 3 || parts[0] != "server" || parts[1] == "" { + return "", "", fmt.Errorf("invalid server key %q (expected server..url or server..key)", key) + } + name, field = parts[1], parts[2] + if field != "url" && field != "key" { + return "", "", fmt.Errorf("invalid server field %q in %q (expected url or key)", field, key) + } + return name, field, nil } diff --git a/cli/cmd/config_edit.go b/cli/cmd/config_edit.go new file mode 100644 index 0000000..668a8b5 --- /dev/null +++ b/cli/cmd/config_edit.go @@ -0,0 +1,61 @@ +package cmd + +import ( + "fmt" + + "github.com/anthropics/code-index/cli/internal/config" + "github.com/anthropics/code-index/cli/internal/config/tui" + "github.com/spf13/cobra" +) + +var configEditCmd = &cobra.Command{ + Use: "edit", + Short: "Interactively edit configuration (TUI)", + Long: `Open the full-screen lazygit-style editor for ~/.cix/config.yaml. + +Layout: section list on the left (Servers / Watcher / Indexing / +Projects / Misc), selected section's content on the right, persistent +key-hint bar at the bottom. + +Keys (press ? for the full table): + ↑/k ↓/j move within a panel + ←/h →/l / tab switch panel + enter edit selected field + space / x toggle bool field + a / d add / delete server (Servers section) + m mark selected server as default + t test connection (server) + q / esc quit + +Every edit goes through the same validation as 'cix config set' and is +written to disk immediately.`, + RunE: func(cmd *cobra.Command, args []string) error { + cfg, err := config.Load() + if err != nil { + return fmt.Errorf("load config: %w", err) + } + return tui.RunEdit(cfg) + }, +} + +var configInitCmd = &cobra.Command{ + Use: "init", + Short: "First-run wizard (TUI)", + Long: `Seed a fresh ~/.cix/config.yaml with the localhost default server +and open the interactive editor pointing at it. 
+ +If a configuration already exists this is equivalent to 'cix config edit' +— no overwrite; the existing servers, settings, and projects are kept.`, + RunE: func(cmd *cobra.Command, args []string) error { + cfg, err := config.Load() + if err != nil { + return fmt.Errorf("load config: %w", err) + } + return tui.RunInit(cfg) + }, +} + +func init() { + configCmd.AddCommand(configEditCmd) + configCmd.AddCommand(configInitCmd) +} diff --git a/cli/cmd/config_keys.go b/cli/cmd/config_keys.go new file mode 100644 index 0000000..8ab63f0 --- /dev/null +++ b/cli/cmd/config_keys.go @@ -0,0 +1,83 @@ +package cmd + +import ( + "fmt" + "io" + "os" + "reflect" + "text/tabwriter" + + "github.com/anthropics/code-index/cli/internal/config" + "github.com/anthropics/code-index/cli/internal/config/schema" + "github.com/spf13/cobra" +) + +var configKeysCmd = &cobra.Command{ + Use: "keys", + Short: "List every settable configuration key", + Long: `Print every configuration key the CLI knows about, with its +current value, default, env-var override (if any), and a short description. + +The list is reflection-driven — any new schema-tagged field shows up here +automatically. Slice keys managed via dedicated commands (servers, +projects) are not listed; use 'cix config show' to view them.`, + RunE: func(cmd *cobra.Command, args []string) error { + cfg, err := config.Load() + if err != nil { + return fmt.Errorf("load config: %w", err) + } + return renderConfigKeys(os.Stdout, cfg) + }, +} + +func init() { + configCmd.AddCommand(configKeysCmd) +} + +func renderConfigKeys(w io.Writer, cfg *config.Config) error { + tw := tabwriter.NewWriter(w, 0, 0, 2, ' ', 0) + if _, err := fmt.Fprintln(tw, "KEY\tVALUE\tDEFAULT\tENV\tDESCRIPTION"); err != nil { + return err + } + + if err := schema.Walk(cfg, func(l schema.LeafField) { + // Skip slice-of-struct leaves (servers, projects); they have + // purpose-built management commands and don't render meaningfully + // in a single tab-separated row. 
+ if l.Value.Kind() == reflect.Slice && l.Value.Type().Elem().Kind() == reflect.Struct { + return + } + fmt.Fprintf(tw, "%s\t%s\t%s\t%s\t%s\n", + l.Path, + formatLeafCurrent(l), + dashIfEmpty(l.Tag("default")), + dashIfEmpty(l.Tag("env")), + l.Tag("desc"), + ) + }); err != nil { + return err + } + return tw.Flush() +} + +// formatLeafCurrent returns the leaf's current value as a display string. +// Sensitive leaves render only "(set)"/"(not set)" — same gate as +// renderScalarLeaf in `cix config show`. +func formatLeafCurrent(l schema.LeafField) string { + if l.Sensitive() { + // Tag-gated: do not bind the value to a named *Key/*Secret var. + // reflect.Value.IsZero() inspects through reflection only. + if l.Value.IsZero() { + return "(not set)" + } + return "(set)" + } + return fmt.Sprintf("%v", l.Value.Interface()) +} + +func dashIfEmpty(s string) string { + if s == "" { + return "—" + } + return s +} diff --git a/cli/cmd/config_keys_test.go b/cli/cmd/config_keys_test.go new file mode 100644 index 0000000..a93b15f --- /dev/null +++ b/cli/cmd/config_keys_test.go @@ -0,0 +1,73 @@ +package cmd + +import ( + "bytes" + "strings" + "testing" + + "github.com/anthropics/code-index/cli/internal/config" +) + +func TestRenderConfigKeys_Snapshot(t *testing.T) { + cfg := &config.Config{ + Servers: []config.ServerEntry{ + {Name: "default", URL: "http://localhost:21847", Key: "cix_secret_xyz"}, + }, + DefaultServer: "default", + Watcher: config.WatcherConfig{ + Enabled: true, + DebounceMS: 5000, + ExcludePatterns: []string{"node_modules"}, + SyncIntervalMins: 5, + }, + Indexing: config.IndexingConfig{BatchSize: 20, StreamingIdleTimeoutSec: 30}, + } + + var buf bytes.Buffer + if err := renderConfigKeys(&buf, cfg); err != nil { + t.Fatalf("renderConfigKeys: %v", err) + } + + got := buf.String() + + // Header must come first. 
+ if !strings.HasPrefix(got, "KEY") { + t.Errorf("output should start with header row, got %q", firstLine(got)) + } + + // Every settable scalar must appear. + mustContain := []string{ + "default_server", + "watcher.enabled", + "watcher.debounce_ms", + "watcher.exclude", + "watcher.sync_interval_mins", + "indexing.batch_size", + "indexing.streaming_idle_timeout_sec", + "CIX_SERVER", // env tag on default_server + } + for _, want := range mustContain { + if !strings.Contains(got, want) { + t.Errorf("output missing %q\nfull output:\n%s", want, got) + } + } + + // Slice-of-struct leaves are skipped from the listing. + for _, skip := range []string{"\nservers ", "\nprojects "} { + if strings.Contains(got, skip) { + t.Errorf("output should not list slice-of-struct leaf row %q", skip) + } + } + + // Sensitive value MUST NOT leak. + if strings.Contains(got, "cix_secret_xyz") { + t.Errorf("sensitive key value leaked into output") + } +} + +func firstLine(s string) string { + if i := strings.IndexByte(s, '\n'); i >= 0 { + return s[:i] + } + return s +} diff --git a/cli/cmd/config_show_test.go b/cli/cmd/config_show_test.go new file mode 100644 index 0000000..5707620 --- /dev/null +++ b/cli/cmd/config_show_test.go @@ -0,0 +1,97 @@ +package cmd + +import ( + "bytes" + "strings" + "testing" + + "github.com/anthropics/code-index/cli/internal/config" +) + +// TestRenderConfigShow_Snapshot pins the human-readable layout of `cix +// config show`. The expected output is the contract — if a future field +// addition shifts the format, this test forces an intentional update +// instead of silent drift. +// +// CodeQL note: the API key value below ("cix_secret123") never appears in +// the expected output — it MUST render as "(set)" because ServerEntry.Key +// is sensitive. Verifying that absence is the whole point of this test. 
+func TestRenderConfigShow_Snapshot(t *testing.T) { + cfg := &config.Config{ + Servers: []config.ServerEntry{ + {Name: "default", URL: "http://localhost:21847", Key: "cix_secret123"}, + {Name: "corporate", URL: "https://cix.corp.internal", Key: ""}, + }, + DefaultServer: "default", + Watcher: config.WatcherConfig{ + Enabled: true, + DebounceMS: 5000, + ExcludePatterns: []string{"node_modules", ".git"}, + SyncIntervalMins: 5, + }, + Indexing: config.IndexingConfig{ + BatchSize: 20, + StreamingIdleTimeoutSec: 30, + }, + Projects: []config.ProjectEntry{ + {Path: "/home/u/proj", AutoWatch: true}, + }, + } + + var buf bytes.Buffer + if err := renderConfigShow(&buf, cfg, "/home/u/.cix/config.yaml"); err != nil { + t.Fatalf("renderConfigShow: %v", err) + } + + want := `servers (2): +* default url=http://localhost:21847 key=(set) + corporate url=https://cix.corp.internal key=(not set) +default_server = default + +watcher.enabled = true +watcher.debounce_ms = 5000 +watcher.exclude = [node_modules .git] +watcher.sync_interval_mins = 5 + +indexing.batch_size = 20 +indexing.streaming_idle_timeout_sec = 30 + +projects (1): + - /home/u/proj (auto-watch: true) + +config file: /home/u/.cix/config.yaml +` + + got := buf.String() + if got != want { + t.Errorf("renderConfigShow output mismatch\n--- want ---\n%s\n--- got ----\n%s", want, got) + } + + // Explicit safety belt: the sensitive value MUST NOT leak even via + // substring (no length disclosure, no prefix, no mask). 
+ if strings.Contains(got, "cix_secret123") { + t.Errorf("sensitive value leaked into output") + } + if strings.Contains(got, "secret") { + t.Errorf("substring of sensitive value leaked into output") + } +} + +func TestRenderConfigShow_EmptyProjects(t *testing.T) { + cfg := &config.Config{ + Servers: []config.ServerEntry{ + {Name: "default", URL: "http://localhost:21847"}, + }, + DefaultServer: "default", + Watcher: config.WatcherConfig{}, + Indexing: config.IndexingConfig{}, + } + var buf bytes.Buffer + if err := renderConfigShow(&buf, cfg, "/tmp/cfg.yaml"); err != nil { + t.Fatalf("renderConfigShow: %v", err) + } + out := buf.String() + if strings.Contains(out, "projects (") { + t.Errorf("projects block rendered for empty list:\n%s", out) + } +} diff --git a/cli/cmd/multiserver_test.go b/cli/cmd/multiserver_test.go new file mode 100644 index 0000000..02cd626 --- /dev/null +++ b/cli/cmd/multiserver_test.go @@ -0,0 +1,297 @@ +package cmd + +import ( + "strings" + "testing" + + "github.com/anthropics/code-index/cli/internal/config" +) + +// isolateConfig points config.Load() at a throwaway HOME and resets the +// singleton before and after the test. CIX_* env vars are unset so a +// developer with these set in their shell does not get spurious test +// failures — tests that need env overrides set them explicitly. +func isolateConfig(t *testing.T) { + t.Helper() + t.Setenv("HOME", t.TempDir()) + t.Setenv("XDG_CONFIG_HOME", "") + t.Setenv("CIX_SERVER", "") + t.Setenv("CIX_API_URL", "") + t.Setenv("CIX_API_KEY", "") + config.ResetForTesting() + t.Cleanup(config.ResetForTesting) +} + +// withFlags temporarily sets the global --server/--api-url/--api-key vars and +// restores them on cleanup. 
+func withFlags(t *testing.T, server, url, key string) { + t.Helper() + ps, pu, pk := serverName, apiURL, apiKey + serverName, apiURL, apiKey = server, url, key + t.Cleanup(func() { serverName, apiURL, apiKey = ps, pu, pk }) +} + +func TestGetClient_ServerFlagSelectsServer(t *testing.T) { + isolateConfig(t) + if err := config.SetServerURL("corp", "https://corp.example"); err != nil { + t.Fatal(err) + } + if err := config.SetServerKey("corp", "corp-key"); err != nil { + t.Fatal(err) + } + withFlags(t, "corp", "", "") + + c, err := getClient() + if err != nil { + t.Fatalf("getClient: %v", err) + } + if !strings.Contains(c.BaseURL(), "corp.example") { + t.Errorf("BaseURL = %q, want corp.example", c.BaseURL()) + } +} + +func TestGetClient_DefaultServerWhenNoFlag(t *testing.T) { + isolateConfig(t) + // Seed the default server with a key so resolution succeeds. + if err := config.SetServerURL(config.DefaultServerName, "http://localhost:21847"); err != nil { + t.Fatal(err) + } + if err := config.SetServerKey(config.DefaultServerName, "dk"); err != nil { + t.Fatal(err) + } + withFlags(t, "", "", "") + + c, err := getClient() + if err != nil { + t.Fatalf("getClient: %v", err) + } + if !strings.Contains(c.BaseURL(), "localhost:21847") { + t.Errorf("BaseURL = %q, want default localhost", c.BaseURL()) + } +} + +func TestGetClient_UnknownServerErrors(t *testing.T) { + isolateConfig(t) + if err := config.SetServerKey(config.DefaultServerName, "dk"); err != nil { + t.Fatal(err) + } + withFlags(t, "ghost", "", "") + + _, err := getClient() + if err == nil { + t.Fatal("expected error for unknown --server") + } + if !strings.Contains(err.Error(), "ghost") { + t.Errorf("error %q should mention the unknown server", err.Error()) + } +} + +func TestGetClient_ServerKeyMissingError(t *testing.T) { + isolateConfig(t) + // corp has a URL but no key, and no --api-key override. 
+ if err := config.SetServerURL("corp", "https://corp.example"); err != nil { + t.Fatal(err) + } + withFlags(t, "corp", "", "") + + _, err := getClient() + if err == nil { + t.Fatal("expected missing-key error") + } + if !strings.Contains(err.Error(), "corp") || !strings.Contains(err.Error(), "API key") { + t.Errorf("error %q should name the server and mention API key", err.Error()) + } +} + +func TestRunConfigSet_ServerKeys(t *testing.T) { + isolateConfig(t) + + mustSet(t, "server.corp.url", "https://corp") + mustSet(t, "server.corp.key", "ck") + + cfg, _ := config.Load() + s, ok := cfg.GetServer("corp") + if !ok || s.URL != "https://corp" || s.Key != "ck" { + t.Fatalf("corp server = %+v, ok=%v", s, ok) + } +} + +func TestRunConfigSet_DefaultServer(t *testing.T) { + isolateConfig(t) + mustSet(t, "server.corp.url", "https://corp") + mustSet(t, "default_server", "corp") + + cfg, _ := config.Load() + if cfg.DefaultServer != "corp" { + t.Errorf("DefaultServer = %q, want corp", cfg.DefaultServer) + } + + // Unknown default is rejected. + if _, err := captureOutput(func() error { + return runConfigSet(nil, []string{"default_server", "ghost"}) + }); err == nil { + t.Error("expected error setting default_server to unknown alias") + } +} + +func TestRunConfigSet_ApiAliasMapsToDefault(t *testing.T) { + isolateConfig(t) + mustSet(t, "api.url", "http://aliased:1234") + mustSet(t, "api.key", "ak") + + cfg, _ := config.Load() + s, ok := cfg.DefaultServerEntry() + if !ok || s.URL != "http://aliased:1234" || s.Key != "ak" { + t.Fatalf("default server = %+v, ok=%v", s, ok) + } + // The legacy api block must remain empty (migrated/never persisted). + if cfg.API.URL != "" || cfg.API.Key != "" { + t.Errorf("API block = %+v, want empty", cfg.API) + } +} + +func TestRunConfigUnset(t *testing.T) { + isolateConfig(t) + mustSet(t, "server.corp.url", "https://corp") + mustSet(t, "server.corp.key", "ck") + + // Clear just the key. 
+ if _, err := captureOutput(func() error { + return runConfigUnset(nil, []string{"server.corp.key"}) + }); err != nil { + t.Fatalf("unset key: %v", err) + } + cfg, _ := config.Load() + if s, ok := cfg.GetServer("corp"); !ok || s.Key != "" || s.URL != "https://corp" { + t.Fatalf("after key unset corp = %+v, ok=%v", s, ok) + } + + // Remove the whole server. + if _, err := captureOutput(func() error { + return runConfigUnset(nil, []string{"server.corp"}) + }); err != nil { + t.Fatalf("unset server: %v", err) + } + cfg, _ = config.Load() + if _, ok := cfg.GetServer("corp"); ok { + t.Error("corp server should have been removed") + } + + // Unknown key shape errors. + if _, err := captureOutput(func() error { + return runConfigUnset(nil, []string{"watcher.debounce_ms"}) + }); err == nil { + t.Error("expected error for unsupported unset key") + } +} + +// --- Env-override tests (step 8) ------------------------------------------- + +func TestGetClient_EnvServerSelectsAlias(t *testing.T) { + isolateConfig(t) + if err := config.SetServerURL("corp", "https://corp.example"); err != nil { + t.Fatal(err) + } + if err := config.SetServerKey("corp", "corp-key"); err != nil { + t.Fatal(err) + } + withFlags(t, "", "", "") + t.Setenv("CIX_SERVER", "corp") + + c, err := getClient() + if err != nil { + t.Fatalf("getClient: %v", err) + } + if !strings.Contains(c.BaseURL(), "corp.example") { + t.Errorf("BaseURL = %q, want corp.example (selected via CIX_SERVER)", c.BaseURL()) + } +} + +func TestGetClient_FlagBeatsCixServerEnv(t *testing.T) { + isolateConfig(t) + if err := config.SetServerURL("alpha", "https://alpha"); err != nil { + t.Fatal(err) + } + if err := config.SetServerKey("alpha", "ak"); err != nil { + t.Fatal(err) + } + if err := config.SetServerURL("beta", "https://beta"); err != nil { + t.Fatal(err) + } + if err := config.SetServerKey("beta", "bk"); err != nil { + t.Fatal(err) + } + withFlags(t, "alpha", "", "") + t.Setenv("CIX_SERVER", "beta") + + c, err := getClient() + 
if err != nil { + t.Fatalf("getClient: %v", err) + } + if !strings.Contains(c.BaseURL(), "alpha") { + t.Errorf("BaseURL = %q, want alpha (--server overrides CIX_SERVER)", c.BaseURL()) + } +} + +func TestGetClient_EnvAPIURLOverridesResolvedURL(t *testing.T) { + isolateConfig(t) + if err := config.SetServerURL(config.DefaultServerName, "http://localhost:21847"); err != nil { + t.Fatal(err) + } + if err := config.SetServerKey(config.DefaultServerName, "k"); err != nil { + t.Fatal(err) + } + withFlags(t, "", "", "") + t.Setenv("CIX_API_URL", "http://env-url:9999") + + c, err := getClient() + if err != nil { + t.Fatalf("getClient: %v", err) + } + if !strings.Contains(c.BaseURL(), "env-url:9999") { + t.Errorf("BaseURL = %q, want env-url:9999", c.BaseURL()) + } +} + +func TestGetClient_FlagBeatsCixAPIURLEnv(t *testing.T) { + isolateConfig(t) + if err := config.SetServerURL(config.DefaultServerName, "http://file-url"); err != nil { + t.Fatal(err) + } + if err := config.SetServerKey(config.DefaultServerName, "k"); err != nil { + t.Fatal(err) + } + withFlags(t, "", "http://flag-url:1234", "") + t.Setenv("CIX_API_URL", "http://env-url:9999") + + c, err := getClient() + if err != nil { + t.Fatalf("getClient: %v", err) + } + if !strings.Contains(c.BaseURL(), "flag-url:1234") { + t.Errorf("BaseURL = %q, want flag-url:1234 (--api-url > env)", c.BaseURL()) + } +} + +func TestGetClient_EnvAPIKeyFillsMissingKey(t *testing.T) { + isolateConfig(t) + // File has URL but no key — env supplies it. 
+ if err := config.SetServerURL(config.DefaultServerName, "http://localhost:21847"); err != nil { + t.Fatal(err) + } + withFlags(t, "", "", "") + t.Setenv("CIX_API_KEY", "env-key-secret") + + if _, err := getClient(); err != nil { + t.Fatalf("getClient should succeed with CIX_API_KEY: %v", err) + } +} + +func mustSet(t *testing.T, key, value string) { + t.Helper() + if _, err := captureOutput(func() error { + return runConfigSet(nil, []string{key, value}) + }); err != nil { + t.Fatalf("config set %s %s: %v", key, value, err) + } +} diff --git a/cli/cmd/root.go b/cli/cmd/root.go index c6c051e..08ae2b3 100644 --- a/cli/cmd/root.go +++ b/cli/cmd/root.go @@ -39,9 +39,10 @@ func printBanner() { } var ( - cfgFile string - apiURL string - apiKey string + cfgFile string + apiURL string + apiKey string + serverName string ) // rootCmd represents the base command @@ -71,8 +72,9 @@ func Execute() { } func init() { - rootCmd.PersistentFlags().StringVar(&apiURL, "api-url", "", "API server URL (default from config)") - rootCmd.PersistentFlags().StringVar(&apiKey, "api-key", "", "API key (default from config)") + rootCmd.PersistentFlags().StringVar(&serverName, "server", "", "named server alias from config (default: the configured default server)") + rootCmd.PersistentFlags().StringVar(&apiURL, "api-url", "", "API server URL (overrides the selected server's URL)") + rootCmd.PersistentFlags().StringVar(&apiKey, "api-key", "", "API key (overrides the selected server's key)") } // resolveProjectByName performs an exact-match lookup of name against the @@ -130,24 +132,64 @@ func findProjectRoot(candidatePath string, apiClient *client.Client) string { return candidatePath } -// getClient creates an API client from config or flags +// Env-var names recognised by the CLI for server selection / overrides. +// Precedence is always flag > env > config-file > default. The CIX_* +// surface is deliberately tiny — three vars, all about reaching a server. 
+// Everything else lives in ~/.cix/config.yaml. +const ( + envServer = "CIX_SERVER" + envAPIURL = "CIX_API_URL" + envAPIKey = "CIX_API_KEY" +) + +// getClient creates an API client from config / flags / env. +// +// Precedence per axis: +// - target server alias: --server > CIX_SERVER > default_server +// - server URL override: --api-url > CIX_API_URL > the resolved server's URL +// - server key override: --api-key > CIX_API_KEY > the resolved server's key +// +// The env vars override the *resolved* server's URL/key locally — they +// never mutate the in-memory ServerEntry, so a follow-up config.Save() will +// not persist them. This matches the flag behavior and is what users in CI +// expect: `CIX_API_KEY=secret cix search …` must not write the secret back +// to ~/.cix/config.yaml. func getClient() (*client.Client, error) { cfg, err := config.Load() if err != nil { return nil, fmt.Errorf("load config: %w", err) } + // Server alias: flag > env > default. Read env only when the flag is + // empty; the flag is the authoritative override. + name := serverName + if name == "" { + name = os.Getenv(envServer) + } + srv, err := cfg.ResolveServer(name) + if err != nil { + return nil, err + } + + // URL override: flag > env > entry. url := apiURL if url == "" { - url = cfg.API.URL + url = os.Getenv(envAPIURL) + } + if url == "" { + url = srv.URL } + // Key override: flag > env > entry. Local copy — never write back. key := apiKey if key == "" { - key = cfg.API.Key - if key == "" { - return nil, fmt.Errorf("API key not set. Use --api-key flag or run 'cix config set api.key <key>'") - } + key = os.Getenv(envAPIKey) + } + if key == "" { + key = srv.Key + } + if key == "" { + return nil, fmt.Errorf("API key not set for server %q. 
Use --api-key flag, set %s=…, or run 'cix config set server.%s.key <key>'", srv.Name, envAPIKey, srv.Name) } c := client.New(url, key) diff --git a/cli/go.mod b/cli/go.mod index 2e00cf7..0ffe7b5 100644 --- a/cli/go.mod +++ b/cli/go.mod @@ -1,8 +1,16 @@ module github.com/anthropics/code-index/cli -go 1.23.0 +go 1.25.0 require ( + github.com/charmbracelet/bubbles v1.0.0 + github.com/charmbracelet/bubbletea v1.3.10 + github.com/charmbracelet/lipgloss v1.1.0 + github.com/go-playground/validator/v10 v10.30.3 + github.com/knadh/koanf/parsers/yaml v1.1.0 + github.com/knadh/koanf/providers/confmap v1.0.0 + github.com/knadh/koanf/providers/rawbytes v1.0.0 + github.com/knadh/koanf/v2 v2.3.5 github.com/rjeczalik/notify v0.9.3 github.com/sabhiram/go-gitignore v0.0.0-20210923224102-525f6e181f06 github.com/spf13/cobra v1.8.0 @@ -10,10 +18,39 @@ require ( ) require ( + github.com/atotto/clipboard v0.1.4 // indirect + github.com/aymanbagabas/go-osc52/v2 v2.0.1 // indirect + github.com/charmbracelet/colorprofile v0.4.1 // indirect + github.com/charmbracelet/x/ansi v0.11.6 // indirect + github.com/charmbracelet/x/cellbuf v0.0.15 // indirect + github.com/charmbracelet/x/term v0.2.2 // indirect + github.com/clipperhouse/displaywidth v0.9.0 // indirect + github.com/clipperhouse/stringish v0.1.1 // indirect + github.com/clipperhouse/uax29/v2 v2.5.0 // indirect + github.com/erikgeiser/coninput v0.0.0-20211004153227-1c3628e74d0f // indirect + github.com/gabriel-vasile/mimetype v1.4.13 // indirect + github.com/go-playground/locales v0.14.1 // indirect + github.com/go-playground/universal-translator v0.18.1 // indirect + github.com/go-viper/mapstructure/v2 v2.4.0 // indirect github.com/inconshreveable/mousetrap v1.1.0 // indirect + github.com/knadh/koanf/maps v0.1.2 // indirect github.com/kr/pretty v0.3.1 // indirect + github.com/leodido/go-urn v1.4.0 // indirect + github.com/lucasb-eyer/go-colorful v1.3.0 // indirect + github.com/mattn/go-isatty v0.0.20 // indirect + 
github.com/mattn/go-localereader v0.0.1 // indirect + github.com/mattn/go-runewidth v0.0.19 // indirect + github.com/mitchellh/copystructure v1.2.0 // indirect + github.com/mitchellh/reflectwalk v1.0.2 // indirect + github.com/muesli/ansi v0.0.0-20230316100256-276c6243b2f6 // indirect + github.com/muesli/cancelreader v0.2.2 // indirect + github.com/muesli/termenv v0.16.0 // indirect + github.com/rivo/uniseg v0.4.7 // indirect github.com/spf13/pflag v1.0.10 // indirect github.com/stretchr/testify v1.11.1 // indirect - golang.org/x/sys v0.29.0 // indirect - gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 // indirect + github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e // indirect + go.yaml.in/yaml/v3 v3.0.3 // indirect + golang.org/x/crypto v0.52.0 // indirect + golang.org/x/sys v0.45.0 // indirect + golang.org/x/text v0.37.0 // indirect ) diff --git a/cli/go.sum b/cli/go.sum index f72792f..e12947e 100644 --- a/cli/go.sum +++ b/cli/go.sum @@ -1,17 +1,87 @@ +github.com/atotto/clipboard v0.1.4 h1:EH0zSVneZPSuFR11BlR9YppQTVDbh5+16AmcJi4g1z4= +github.com/atotto/clipboard v0.1.4/go.mod h1:ZY9tmq7sm5xIbd9bOK4onWV4S6X0u6GY7Vn0Yu86PYI= +github.com/aymanbagabas/go-osc52/v2 v2.0.1 h1:HwpRHbFMcZLEVr42D4p7XBqjyuxQH5SMiErDT4WkJ2k= +github.com/aymanbagabas/go-osc52/v2 v2.0.1/go.mod h1:uYgXzlJ7ZpABp8OJ+exZzJJhRNQ2ASbcXHWsFqH8hp8= +github.com/charmbracelet/bubbles v1.0.0 h1:12J8/ak/uCZEMQ6KU7pcfwceyjLlWsDLAxB5fXonfvc= +github.com/charmbracelet/bubbles v1.0.0/go.mod h1:9d/Zd5GdnauMI5ivUIVisuEm3ave1XwXtD1ckyV6r3E= +github.com/charmbracelet/bubbletea v1.3.10 h1:otUDHWMMzQSB0Pkc87rm691KZ3SWa4KUlvF9nRvCICw= +github.com/charmbracelet/bubbletea v1.3.10/go.mod h1:ORQfo0fk8U+po9VaNvnV95UPWA1BitP1E0N6xJPlHr4= +github.com/charmbracelet/colorprofile v0.4.1 h1:a1lO03qTrSIRaK8c3JRxJDZOvhvIeSco3ej+ngLk1kk= +github.com/charmbracelet/colorprofile v0.4.1/go.mod h1:U1d9Dljmdf9DLegaJ0nGZNJvoXAhayhmidOdcBwAvKk= +github.com/charmbracelet/lipgloss v1.1.0 
h1:vYXsiLHVkK7fp74RkV7b2kq9+zDLoEU4MZoFqR/noCY= +github.com/charmbracelet/lipgloss v1.1.0/go.mod h1:/6Q8FR2o+kj8rz4Dq0zQc3vYf7X+B0binUUBwA0aL30= +github.com/charmbracelet/x/ansi v0.11.6 h1:GhV21SiDz/45W9AnV2R61xZMRri5NlLnl6CVF7ihZW8= +github.com/charmbracelet/x/ansi v0.11.6/go.mod h1:2JNYLgQUsyqaiLovhU2Rv/pb8r6ydXKS3NIttu3VGZQ= +github.com/charmbracelet/x/cellbuf v0.0.15 h1:ur3pZy0o6z/R7EylET877CBxaiE1Sp1GMxoFPAIztPI= +github.com/charmbracelet/x/cellbuf v0.0.15/go.mod h1:J1YVbR7MUuEGIFPCaaZ96KDl5NoS0DAWkskup+mOY+Q= +github.com/charmbracelet/x/term v0.2.2 h1:xVRT/S2ZcKdhhOuSP4t5cLi5o+JxklsoEObBSgfgZRk= +github.com/charmbracelet/x/term v0.2.2/go.mod h1:kF8CY5RddLWrsgVwpw4kAa6TESp6EB5y3uxGLeCqzAI= +github.com/clipperhouse/displaywidth v0.9.0 h1:Qb4KOhYwRiN3viMv1v/3cTBlz3AcAZX3+y9OLhMtAtA= +github.com/clipperhouse/displaywidth v0.9.0/go.mod h1:aCAAqTlh4GIVkhQnJpbL0T/WfcrJXHcj8C0yjYcjOZA= +github.com/clipperhouse/stringish v0.1.1 h1:+NSqMOr3GR6k1FdRhhnXrLfztGzuG+VuFDfatpWHKCs= +github.com/clipperhouse/stringish v0.1.1/go.mod h1:v/WhFtE1q0ovMta2+m+UbpZ+2/HEXNWYXQgCt4hdOzA= +github.com/clipperhouse/uax29/v2 v2.5.0 h1:x7T0T4eTHDONxFJsL94uKNKPHrclyFI0lm7+w94cO8U= +github.com/clipperhouse/uax29/v2 v2.5.0/go.mod h1:Wn1g7MK6OoeDT0vL+Q0SQLDz/KpfsVRgg6W7ihQeh4g= github.com/cpuguy83/go-md2man/v2 v2.0.3/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o= github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E= github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= +github.com/erikgeiser/coninput v0.0.0-20211004153227-1c3628e74d0f h1:Y/CXytFA4m6baUTXGLOoWe4PQhGxaX0KpnayAqC48p4= +github.com/erikgeiser/coninput v0.0.0-20211004153227-1c3628e74d0f/go.mod h1:vw97MGsxSvLiUE2X8qFplwetxpGLQrlU1Q9AUEIzCaM= +github.com/gabriel-vasile/mimetype v1.4.13 
h1:46nXokslUBsAJE/wMsp5gtO500a4F3Nkz9Ufpk2AcUM= +github.com/gabriel-vasile/mimetype v1.4.13/go.mod h1:d+9Oxyo1wTzWdyVUPMmXFvp4F9tea18J8ufA774AB3s= +github.com/go-playground/assert/v2 v2.2.0 h1:JvknZsQTYeFEAhQwI4qEt9cyV5ONwRHC+lYKSsYSR8s= +github.com/go-playground/assert/v2 v2.2.0/go.mod h1:VDjEfimB/XKnb+ZQfWdccd7VUvScMdVu0Titje2rxJ4= +github.com/go-playground/locales v0.14.1 h1:EWaQ/wswjilfKLTECiXz7Rh+3BjFhfDFKv/oXslEjJA= +github.com/go-playground/locales v0.14.1/go.mod h1:hxrqLVvrK65+Rwrd5Fc6F2O76J/NuW9t0sjnWqG1slY= +github.com/go-playground/universal-translator v0.18.1 h1:Bcnm0ZwsGyWbCzImXv+pAJnYK9S473LQFuzCbDbfSFY= +github.com/go-playground/universal-translator v0.18.1/go.mod h1:xekY+UJKNuX9WP91TpwSH2VMlDf28Uj24BCp08ZFTUY= +github.com/go-playground/validator/v10 v10.30.3 h1:4MU6YkEwx7GbcPJOZxrtbu+QfF3pJLJuaYTeAH0DYy8= +github.com/go-playground/validator/v10 v10.30.3/go.mod h1:4Axh7oCNGcoGkqLoE4YWt6n20mcEIsPRlB7vPk3lpyc= +github.com/go-viper/mapstructure/v2 v2.4.0 h1:EBsztssimR/CONLSZZ04E8qAkxNYq4Qp9LvH92wZUgs= +github.com/go-viper/mapstructure/v2 v2.4.0/go.mod h1:oJDH3BJKyqBA2TXFhDsKDGDTlndYOZ6rGS0BRZIxGhM= github.com/inconshreveable/mousetrap v1.1.0 h1:wN+x4NVGpMsO7ErUn/mUI3vEoE6Jt13X2s0bqwp9tc8= github.com/inconshreveable/mousetrap v1.1.0/go.mod h1:vpF70FUmC8bwa3OWnCshd2FqLfsEA9PFc4w1p2J65bw= +github.com/knadh/koanf/maps v0.1.2 h1:RBfmAW5CnZT+PJ1CVc1QSJKf4Xu9kxfQgYVQSu8hpbo= +github.com/knadh/koanf/maps v0.1.2/go.mod h1:npD/QZY3V6ghQDdcQzl1W4ICNVTkohC8E73eI2xW4yI= +github.com/knadh/koanf/parsers/yaml v1.1.0 h1:3ltfm9ljprAHt4jxgeYLlFPmUaunuCgu1yILuTXRdM4= +github.com/knadh/koanf/parsers/yaml v1.1.0/go.mod h1:HHmcHXUrp9cOPcuC+2wrr44GTUB0EC+PyfN3HZD9tFg= +github.com/knadh/koanf/providers/confmap v1.0.0 h1:mHKLJTE7iXEys6deO5p6olAiZdG5zwp8Aebir+/EaRE= +github.com/knadh/koanf/providers/confmap v1.0.0/go.mod h1:txHYHiI2hAtF0/0sCmcuol4IDcuQbKTybiB1nOcUo1A= +github.com/knadh/koanf/providers/rawbytes v1.0.0 h1:MrKDh/HksJlKJmaZjgs4r8aVBb/zsJyc/8qaSnzcdNI= 
+github.com/knadh/koanf/providers/rawbytes v1.0.0/go.mod h1:KxwYJf1uezTKy6PBtfE+m725NGp4GPVA7XoNTJ/PtLo= +github.com/knadh/koanf/v2 v2.3.5 h1:2dXJUYaKGm4SGYeoAtBviq9+02JZo/pxQ2ssOd60rJg= +github.com/knadh/koanf/v2 v2.3.5/go.mod h1:gRb40VRAbd4iJMYYD5IxZ6hfuopFcXBpc9bbQpZwo28= github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE= github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk= github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY= github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE= +github.com/leodido/go-urn v1.4.0 h1:WT9HwE9SGECu3lg4d/dIA+jxlljEa1/ffXKmRjqdmIQ= +github.com/leodido/go-urn v1.4.0/go.mod h1:bvxc+MVxLKB4z00jd1z+Dvzr47oO32F/QSNjSBOlFxI= +github.com/lucasb-eyer/go-colorful v1.3.0 h1:2/yBRLdWBZKrf7gB40FoiKfAWYQ0lqNcbuQwVHXptag= +github.com/lucasb-eyer/go-colorful v1.3.0/go.mod h1:R4dSotOR9KMtayYi1e77YzuveK+i7ruzyGqttikkLy0= +github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY= +github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y= +github.com/mattn/go-localereader v0.0.1 h1:ygSAOl7ZXTx4RdPYinUpg6W99U8jWvWi9Ye2JC/oIi4= +github.com/mattn/go-localereader v0.0.1/go.mod h1:8fBrzywKY7BI3czFoHkuzRoWE9C+EiG4R1k4Cjx5p88= +github.com/mattn/go-runewidth v0.0.19 h1:v++JhqYnZuu5jSKrk9RbgF5v4CGUjqRfBm05byFGLdw= +github.com/mattn/go-runewidth v0.0.19/go.mod h1:XBkDxAl56ILZc9knddidhrOlY5R/pDhgLpndooCuJAs= +github.com/mitchellh/copystructure v1.2.0 h1:vpKXTN4ewci03Vljg/q9QvCGUDttBOGBIa15WveJJGw= +github.com/mitchellh/copystructure v1.2.0/go.mod h1:qLl+cE2AmVv+CoeAwDPye/v+N2HKCj9FbZEVFJRxO9s= +github.com/mitchellh/reflectwalk v1.0.2 h1:G2LzWKi524PWgd3mLHV8Y5k7s6XUvT0Gef6zxSIeXaQ= +github.com/mitchellh/reflectwalk v1.0.2/go.mod h1:mSTlrgnPZtwu0c4WaC2kGObEpuNDbx0jmZXqmk4esnw= +github.com/muesli/ansi v0.0.0-20230316100256-276c6243b2f6 h1:ZK8zHtRHOkbHy6Mmr5D264iyp3TiX5OmNcI5cIARiQI= +github.com/muesli/ansi 
v0.0.0-20230316100256-276c6243b2f6/go.mod h1:CJlz5H+gyd6CUWT45Oy4q24RdLyn7Md9Vj2/ldJBSIo= +github.com/muesli/cancelreader v0.2.2 h1:3I4Kt4BQjOR54NavqnDogx/MIoWBFa0StPA8ELUXHmA= +github.com/muesli/cancelreader v0.2.2/go.mod h1:3XuTXfFS2VjM+HTLZY9Ak0l6eUKfijIfMUZ4EgX0QYo= +github.com/muesli/termenv v0.16.0 h1:S5AlUN9dENB57rsbnkPyfdGuWIlkmzJjbFf0Tf5FWUc= +github.com/muesli/termenv v0.16.0/go.mod h1:ZRfOIKPFDYQoDFF4Olj7/QJbW60Ol/kL1pU3VfY/Cnk= github.com/pkg/diff v0.0.0-20210226163009-20ebb0f2a09e/go.mod h1:pJLUxLENpZxwdsKMEsNbx1VGcRFpLqf3715MtcvvzbA= github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= +github.com/rivo/uniseg v0.4.7 h1:WUdvkW8uEhrYfLC4ZzdpI2ztxP1I582+49Oc5Mq64VQ= +github.com/rivo/uniseg v0.4.7/go.mod h1:FN3SvrM+Zdj16jyLfmOkMNblXMcoc8DfTHruCPUcx88= github.com/rjeczalik/notify v0.9.3 h1:6rJAzHTGKXGj76sbRgDiDcYj/HniypXmSJo1SWakZeY= github.com/rjeczalik/notify v0.9.3/go.mod h1:gF3zSOrafR9DQEWSE8TjfI9NkooDxbyT4UgRGKZA0lc= github.com/rogpeppe/go-internal v1.9.0 h1:73kH8U+JUqXU8lRuOHeVHaa/SZPifC7BkcraZVejAe8= @@ -28,9 +98,21 @@ github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+ github.com/stretchr/testify v1.6.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= github.com/stretchr/testify v1.11.1 h1:7s2iGBzp5EwR7/aIZr8ao5+dra3wiQyKjjFuvgVKu7U= github.com/stretchr/testify v1.11.1/go.mod h1:wZwfW3scLgRK+23gO65QZefKpKQRnfz6sD981Nm4B6U= +github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e h1:JVG44RsyaB9T2KIHavMF/ppJZNG9ZpyihvCd0w101no= +github.com/xo/terminfo v0.0.0-20220910002029-abceb7e1c41e/go.mod h1:RbqR21r5mrJuqunuUZ/Dhy/avygyECGrLceyNeo4LiM= +go.yaml.in/yaml/v3 v3.0.3 h1:bXOww4E/J3f66rav3pX3m8w6jDE4knZjGOw8b5Y6iNE= +go.yaml.in/yaml/v3 v3.0.3/go.mod h1:tBHosrYAkRZjRAOREWbDnBXUf08JOwYq++0QNwQiWzI= +golang.org/x/crypto v0.52.0 h1:RMs7fP2rXdep0CftQlK8Uf+kibLm7qkCcradZWYz988= +golang.org/x/crypto 
v0.52.0/go.mod h1:1QgfPxDqh0T2M/elOJtp9RvuR95kVjir0e6/BvEmGbc= +golang.org/x/exp v0.0.0-20231006140011-7918f672742d h1:jtJma62tbqLibJ5sFQz8bKtEM8rJBtfilJ2qTU199MI= +golang.org/x/exp v0.0.0-20231006140011-7918f672742d/go.mod h1:ldy0pHrwJyGW56pPQzzkH36rKxoZW1tw7ZJpeKx+hdo= golang.org/x/sys v0.0.0-20180926160741-c2ed4eda69e7/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY= -golang.org/x/sys v0.29.0 h1:TPYlXGxvx1MGTn2GiZDhnjPA9wZzZeGKHHmKhHYvgaU= -golang.org/x/sys v0.29.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= +golang.org/x/sys v0.0.0-20210809222454-d867a43fc93e/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= +golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= +golang.org/x/sys v0.45.0 h1:dO4czNzziLiiXplLQgBCEpCvXQ3dnkn0SdaZSYdQ+FY= +golang.org/x/sys v0.45.0/go.mod h1:4GL1E5IUh+htKOUEOaiffhrAeqysfVGipDYzABqnCmw= +golang.org/x/text v0.37.0 h1:Cqjiwd9eSg8e0QAkyCaQTNHFIIzWtidPahFWR83rTrc= +golang.org/x/text v0.37.0/go.mod h1:a5sjxXGs9hsn/AJVwuElvCAo9v8QYLzvavO5z2PiM38= gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15 h1:YR8cESwS4TdDjEe65xsg0ogRM/Nc3DYOhEAlW+xobZo= gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= diff --git a/cli/internal/config/config.go b/cli/internal/config/config.go index 1456819..7f6dde2 100644 --- a/cli/internal/config/config.go +++ b/cli/internal/config/config.go @@ -5,16 +5,41 @@ import ( "fmt" "os" "path/filepath" + "strings" "gopkg.in/yaml.v3" ) type Config struct { - API APIConfig `yaml:"api"` + // Servers is the canonical list of named cix servers the CLI can talk to. + // One of them is the default (DefaultServer). Commands target the default + // unless --server is given. 
+ Servers []ServerEntry `yaml:"servers" key:"servers" desc:"Named cix servers; the entry matching default_server is the active one"` + // DefaultServer is the name of the server used when --server is absent. + // CIX_SERVER env var overrides it when --server is not given. + DefaultServer string `yaml:"default_server" key:"default_server" env:"CIX_SERVER" desc:"Alias of the server used when --server is omitted"` + + // API is the legacy single-server config (pre-multi-server). It is read + // from old config files and migrated into Servers on Load (see + // migrateToServers), then cleared so it is no longer written back. + // omitempty keeps it out of freshly-written configs. + API APIConfig `yaml:"api,omitempty"` + Watcher WatcherConfig `yaml:"watcher"` - Server ServerConfig `yaml:"server"` Indexing IndexingConfig `yaml:"indexing"` - Projects []ProjectEntry `yaml:"projects"` + Projects []ProjectEntry `yaml:"projects" key:"projects" desc:"Registered project paths and their auto-watch flag"` +} + +// ServerEntry is a single named cix server: a friendly alias plus its base +// URL and API key. The alias is what users pass to --server. +// +// CIX_API_URL / CIX_API_KEY do NOT bind to a specific entry — they override +// the URL/key of the *resolved* server (the one picked by --server or +// default_server) inside getClient. Hence no `env:` tags here. +type ServerEntry struct { + Name string `yaml:"name" desc:"Server alias"` + URL string `yaml:"url" desc:"Base URL of the cix server" validate:"omitempty,url"` + Key string `yaml:"key" desc:"API key (bearer token)" sensitive:"true"` } type APIConfig struct { @@ -22,31 +47,34 @@ type APIConfig struct { Key string `yaml:"key"` } +// DefaultServerName is the alias assigned to the implicit/migrated server. 
+const DefaultServerName = "default" + type WatcherConfig struct { - Enabled bool `yaml:"enabled"` - DebounceMS int `yaml:"debounce_ms"` - ExcludePatterns []string `yaml:"exclude"` - SyncIntervalMins int `yaml:"sync_interval_mins"` + Enabled bool `yaml:"enabled" key:"watcher.enabled" desc:"Run the file watcher" default:"true"` + DebounceMS int `yaml:"debounce_ms" key:"watcher.debounce_ms" desc:"Debounce delay (ms)" default:"5000" validate:"min=100,max=60000"` + ExcludePatterns []string `yaml:"exclude" key:"watcher.exclude" desc:"Paths/globs to skip (comma-separated; REPLACE semantics on set)" default:"node_modules,.git,.venv,__pycache__,dist,build,.next,.cache,.DS_Store"` + SyncIntervalMins int `yaml:"sync_interval_mins" key:"watcher.sync_interval_mins" desc:"Periodic sync interval (minutes)" default:"5" validate:"min=1"` } -type ServerConfig struct { - Port int `yaml:"port"` - CacheTTL int `yaml:"cache_ttl"` -} +// ServerConfig used to hold port/cache_ttl knobs for an in-process server. +// The CLI runs no server, so the struct and its fields were removed. Old +// `server:` blocks in existing config files still load without error — +// koanf silently drops unknown keys during unmarshal. type IndexingConfig struct { - BatchSize int `yaml:"batchsize"` + BatchSize int `yaml:"batch_size" key:"indexing.batch_size" desc:"Indexing batch size" default:"20" validate:"min=1"` // StreamingIdleTimeoutSec is the maximum allowed silence on the streaming // /index/files response before the CLI gives up and closes the conn. The // server emits a heartbeat every 10s, so 30s gives the network three // retry windows. Set to 0 to disable the watchdog (not recommended). 
- StreamingIdleTimeoutSec int `yaml:"streaming_idle_timeout_sec"` + StreamingIdleTimeoutSec int `yaml:"streaming_idle_timeout_sec" key:"indexing.streaming_idle_timeout_sec" desc:"Streaming /index/files idle timeout (seconds); 0 disables watchdog" default:"30" validate:"min=0"` } type ProjectEntry struct { - Path string `yaml:"path"` - AutoWatch bool `yaml:"auto_watch"` + Path string `yaml:"path" desc:"Absolute path of the project root"` + AutoWatch bool `yaml:"auto_watch" desc:"Start the file watcher automatically for this project"` } var ( @@ -54,31 +82,9 @@ var ( configPath string ) -// defaults returns a Config populated with default values. -func defaults() Config { - return Config{ - API: APIConfig{ - URL: "http://localhost:21847", - }, - Watcher: WatcherConfig{ - Enabled: true, - DebounceMS: 5000, - ExcludePatterns: []string{ - "node_modules", ".git", ".venv", "__pycache__", - "dist", "build", ".next", ".cache", ".DS_Store", - }, - SyncIntervalMins: 5, - }, - Server: ServerConfig{ - Port: 8080, - CacheTTL: 300, - }, - Indexing: IndexingConfig{ - BatchSize: 20, - StreamingIdleTimeoutSec: 30, - }, - } -} +// (defaults() removed: the canonical source of default values is now the +// `default:"…"` struct tag set; loadWithKoanf seeds them via the schema +// walker. Servers/DefaultServer are populated by migrateToServers as before.) // normalizeLegacyKeys maps old viper-generated YAML key names to the current // yaml struct tag names. Provides backward compatibility for configs created @@ -89,6 +95,10 @@ func normalizeLegacyKeys(data []byte) []byte { {"excludepatterns:", "exclude:"}, {"cachettl:", "cache_ttl:"}, {"autowatch:", "auto_watch:"}, + // `batchsize:` was the viper-mangled emission of the `BatchSize` Go + // field. We now use `batch_size:` everywhere (consistent with other + // snake_case keys); this mapping keeps old files loading. 
+ {"batchsize:", "batch_size:"}, } { data = bytes.ReplaceAll(data, []byte(pair[0]), []byte(pair[1])) } @@ -97,47 +107,71 @@ func normalizeLegacyKeys(data []byte) []byte { // Load loads configuration from ~/.cix/config.yaml. // Fields absent from the file keep their default values. +// +// Implementation: delegates the heavy lifting to loadWithKoanf +// (loader_koanf.go) — defaults come from struct tags, the YAML file is the +// override layer, and migrateToServers handles the legacy `api:` block and +// the implicit localhost seeding. Load owns the singleton cache and the +// "re-save when the loader rewrote the on-disk form" side effect. func Load() (*Config, error) { if globalConfig != nil { return globalConfig, nil } - home, err := os.UserHomeDir() + _, path, err := configPaths() if err != nil { - return nil, fmt.Errorf("get home dir: %w", err) + return nil, err } + configPath = path - configDir := filepath.Join(home, ".cix") - configPath = filepath.Join(configDir, "config.yaml") - - if err := os.MkdirAll(configDir, 0755); err != nil { - return nil, fmt.Errorf("create config dir: %w", err) + cfg, needsResave, err := loadWithKoanf() + if err != nil { + return nil, err } - cfg := defaults() + globalConfig = cfg + if needsResave { + _ = Save(cfg) + } + return globalConfig, nil +} - data, err := os.ReadFile(configPath) - if err != nil { - if os.IsNotExist(err) { - globalConfig = &cfg - return globalConfig, nil +// migrateToServers upgrades a parsed config to the multi-server layout in +// place and reports whether anything changed (so Load can re-save). +// +// - Legacy single-server config (api: only, no servers:) → one server named +// "default" carrying the old url/key; api: is cleared so it is no longer +// written back. +// - servers: present but no default_server → default to the first entry. +// - Neither servers: nor api: (e.g. a partial hand-written file) → seed the +// implicit localhost default so the CLI is always usable. 
+func migrateToServers(cfg *Config) (changed bool) { + if len(cfg.Servers) == 0 { + url := cfg.API.URL + if url == "" { + // No api.url (fresh install, or a legacy file that only set + // api.key): fall back to the historical localhost default so the + // migrated server is usable. + url = "http://localhost:21847" + } + cfg.Servers = []ServerEntry{ + {Name: DefaultServerName, URL: url, Key: cfg.API.Key}, } - return nil, fmt.Errorf("read config: %w", err) + cfg.DefaultServer = DefaultServerName + cfg.API = APIConfig{} + return true } - normalized := normalizeLegacyKeys(data) - if err := yaml.Unmarshal(normalized, &cfg); err != nil { - return nil, fmt.Errorf("parse config: %w", err) + // Servers present. Clear any leftover legacy api block and ensure a default. + if cfg.API.URL != "" || cfg.API.Key != "" { + cfg.API = APIConfig{} + changed = true } - - globalConfig = &cfg - - // If the file used legacy viper-style keys, re-save in the current format. - if !bytes.Equal(data, normalized) { - _ = Save(&cfg) + if cfg.DefaultServer == "" { + cfg.DefaultServer = cfg.Servers[0].Name + changed = true } - - return globalConfig, nil + return changed } // Save writes cfg to disk and updates the in-memory singleton. @@ -167,6 +201,154 @@ func ResetForTesting() { configPath = "" } +// validateServerName checks a server alias is usable as both a YAML name and +// a `server.<name>.url` config key (which is split on "."). Names must be +// non-empty and contain no dots or whitespace. +func validateServerName(name string) error { + if name == "" { + return fmt.Errorf("server name must not be empty") + } + if strings.ContainsAny(name, " \t\r\n") { + return fmt.Errorf("server name %q must not contain whitespace", name) + } + if strings.Contains(name, ".") { + return fmt.Errorf("server name %q must not contain '.'", name) + } + return nil +} + +// GetServer returns a pointer to the server entry with the given name. 
+func (c *Config) GetServer(name string) (*ServerEntry, bool) {
+	for i := range c.Servers {
+		if c.Servers[i].Name == name {
+			return &c.Servers[i], true
+		}
+	}
+	return nil, false
+}
+
+// DefaultServerEntry returns the configured default server, falling back to
+// the first server when DefaultServer is unset or dangling.
+func (c *Config) DefaultServerEntry() (*ServerEntry, bool) {
+	if c.DefaultServer != "" {
+		if s, ok := c.GetServer(c.DefaultServer); ok {
+			return s, true
+		}
+	}
+	if len(c.Servers) > 0 {
+		return &c.Servers[0], true
+	}
+	return nil, false
+}
+
+// ResolveServer selects which server a command should target. An empty name
+// means "use the default"; a non-empty name must match a configured alias
+// exactly. On miss the error lists the available aliases.
+func (c *Config) ResolveServer(name string) (*ServerEntry, error) {
+	if name == "" {
+		if s, ok := c.DefaultServerEntry(); ok {
+			return s, nil
+		}
+		return nil, fmt.Errorf("no servers configured; run 'cix config set api.url ' or 'cix config set server..url '")
+	}
+	if s, ok := c.GetServer(name); ok {
+		return s, nil
+	}
+	return nil, fmt.Errorf("server %q not found; configured servers:\n - %s", name, strings.Join(c.serverNames(), "\n - "))
+}
+
+// serverNames returns the aliases of all configured servers.
+func (c *Config) serverNames() []string {
+	names := make([]string, 0, len(c.Servers))
+	for _, s := range c.Servers {
+		names = append(names, s.Name)
+	}
+	return names
+}
+
+// upsertServer finds or appends the named server, applies mut, and makes it
+// the default when no default is set yet.
+func upsertServer(cfg *Config, name string, mut func(*ServerEntry)) {
+	if s, ok := cfg.GetServer(name); ok {
+		mut(s)
+	} else {
+		cfg.Servers = append(cfg.Servers, ServerEntry{Name: name})
+		mut(&cfg.Servers[len(cfg.Servers)-1])
+	}
+	if cfg.DefaultServer == "" {
+		cfg.DefaultServer = name
+	}
+}
+
+// SetServerURL sets (or creates) the URL of the named server and persists.
+func SetServerURL(name, url string) error {
+	if err := validateServerName(name); err != nil {
+		return err
+	}
+	cfg, err := Load()
+	if err != nil {
+		return err
+	}
+	upsertServer(cfg, name, func(s *ServerEntry) { s.URL = url })
+	return Save(cfg)
+}
+
+// SetServerKey sets (or creates) the API key of the named server and persists.
+func SetServerKey(name, key string) error {
+	if err := validateServerName(name); err != nil {
+		return err
+	}
+	cfg, err := Load()
+	if err != nil {
+		return err
+	}
+	upsertServer(cfg, name, func(s *ServerEntry) { s.Key = key })
+	return Save(cfg)
+}
+
+// SetDefaultServer marks an existing server as the default and persists.
+func SetDefaultServer(name string) error {
+	cfg, err := Load()
+	if err != nil {
+		return err
+	}
+	if _, ok := cfg.GetServer(name); !ok {
+		return fmt.Errorf("server %q not found; configured servers:\n - %s", name, strings.Join(cfg.serverNames(), "\n - "))
+	}
+	cfg.DefaultServer = name
+	return Save(cfg)
+}
+
+// RemoveServer deletes the named server and persists. If the removed server
+// was the default and others remain, the default is reassigned to the first
+// remaining server and its name is returned in reassignedTo. Removing the
+// last server leaves none — the next Load() re-seeds the localhost default.
+func RemoveServer(name string) (reassignedTo string, err error) {
+	cfg, err := Load()
+	if err != nil {
+		return "", err
+	}
+	if _, ok := cfg.GetServer(name); !ok {
+		return "", fmt.Errorf("server %q not found", name)
+	}
+	kept := make([]ServerEntry, 0, len(cfg.Servers))
+	for _, s := range cfg.Servers {
+		if s.Name != name {
+			kept = append(kept, s)
+		}
+	}
+	cfg.Servers = kept
+	if cfg.DefaultServer == name {
+		if len(kept) > 0 {
+			cfg.DefaultServer = kept[0].Name
+			reassignedTo = kept[0].Name
+		} else {
+			cfg.DefaultServer = ""
+		}
+	}
+	return reassignedTo, Save(cfg)
+}
+
 // AddProject adds a project to the config.
 func AddProject(path string, autoWatch bool) error {
 	cfg, err := Load()
@@ -238,4 +420,4 @@ func GetPIDFile() (string, error) {
 	}
 	return filepath.Join(pidDir, "watcher.pid"), nil
-}
\ No newline at end of file
+}
diff --git a/cli/internal/config/config_test.go b/cli/internal/config/config_test.go
index 24409cf..dc86683 100644
--- a/cli/internal/config/config_test.go
+++ b/cli/internal/config/config_test.go
@@ -3,6 +3,7 @@ package config
 import (
 	"os"
 	"path/filepath"
+	"strings"
 	"testing"
 )
@@ -31,11 +32,18 @@ func TestLoad_Defaults(t *testing.T) {
 		t.Fatalf("Load() error = %v", err)
 	}
-	if cfg.API.URL != "http://localhost:21847" {
-		t.Errorf("API.URL = %q, want %q", cfg.API.URL, "http://localhost:21847")
+	// With no config file, the implicit localhost default server is seeded.
+	if len(cfg.Servers) != 1 {
+		t.Fatalf("Servers len = %d, want 1 (seeded default)", len(cfg.Servers))
 	}
-	if cfg.API.Key != "" {
-		t.Errorf("API.Key = %q, want empty", cfg.API.Key)
+	if cfg.DefaultServer != DefaultServerName {
+		t.Errorf("DefaultServer = %q, want %q", cfg.DefaultServer, DefaultServerName)
+	}
+	if cfg.Servers[0].Name != DefaultServerName || cfg.Servers[0].URL != "http://localhost:21847" {
+		t.Errorf("default server = %+v, want {default, localhost:21847, }", cfg.Servers[0])
+	}
+	if cfg.Servers[0].Key != "" {
+		t.Errorf("default server Key = %q, want empty", cfg.Servers[0].Key)
 	}
 	if !cfg.Watcher.Enabled {
 		t.Error("Watcher.Enabled = false, want true")
@@ -46,12 +54,6 @@ func TestLoad_Defaults(t *testing.T) {
 	if len(cfg.Watcher.ExcludePatterns) == 0 {
 		t.Error("Watcher.ExcludePatterns is empty, want default list")
 	}
-	if cfg.Server.Port != 8080 {
-		t.Errorf("Server.Port = %d, want 8080", cfg.Server.Port)
-	}
-	if cfg.Server.CacheTTL != 300 {
-		t.Errorf("Server.CacheTTL = %d, want 300", cfg.Server.CacheTTL)
-	}
 	if cfg.Indexing.BatchSize != 20 {
 		t.Errorf("Indexing.BatchSize = %d, want 20", cfg.Indexing.BatchSize)
 	}
@@ -90,11 +92,22 @@ indexing:
 		t.Fatalf("Load() error = %v", err)
 	}
-	if cfg.API.URL != "http://myserver:9000" {
-		t.Errorf("API.URL = %q, want %q", cfg.API.URL, "http://myserver:9000")
+	// Legacy api: block migrates to a single "default" server.
+	if len(cfg.Servers) != 1 {
+		t.Fatalf("Servers len = %d, want 1 (migrated from api:)", len(cfg.Servers))
+	}
+	if cfg.DefaultServer != DefaultServerName {
+		t.Errorf("DefaultServer = %q, want %q", cfg.DefaultServer, DefaultServerName)
+	}
+	if cfg.Servers[0].URL != "http://myserver:9000" {
+		t.Errorf("default server URL = %q, want %q", cfg.Servers[0].URL, "http://myserver:9000")
+	}
+	if cfg.Servers[0].Key != "secret-key-123" {
+		t.Errorf("default server Key = %q, want %q", cfg.Servers[0].Key, "secret-key-123")
 	}
-	if cfg.API.Key != "secret-key-123" {
-		t.Errorf("API.Key = %q, want %q", cfg.API.Key, "secret-key-123")
+	// The legacy api block must be cleared after migration.
+	if cfg.API.URL != "" || cfg.API.Key != "" {
+		t.Errorf("API block = %+v, want cleared after migration", cfg.API)
 	}
 	if cfg.Watcher.Enabled {
 		t.Error("Watcher.Enabled = true, want false")
@@ -105,12 +118,9 @@ indexing:
 	if cfg.Watcher.SyncIntervalMins != 10 {
 		t.Errorf("Watcher.SyncIntervalMins = %d, want 10", cfg.Watcher.SyncIntervalMins)
 	}
-	if cfg.Server.Port != 3000 {
-		t.Errorf("Server.Port = %d, want 3000", cfg.Server.Port)
-	}
-	if cfg.Server.CacheTTL != 60 {
-		t.Errorf("Server.CacheTTL = %d, want 60", cfg.Server.CacheTTL)
-	}
+	// server.port / server.cache_ttl removed (dead fields, step 13). The
+	// `server:` block in the input file is parsed-then-dropped by koanf —
+	// no assertion needed beyond "this load did not error".
 	if cfg.Indexing.BatchSize != 5 {
 		t.Errorf("Indexing.BatchSize = %d, want 5", cfg.Indexing.BatchSize)
 	}
@@ -139,15 +149,16 @@ api:
 		t.Fatalf("Load() error = %v", err)
 	}
-	if cfg.API.Key != "partial-key" {
-		t.Errorf("API.Key = %q, want %q", cfg.API.Key, "partial-key")
+	// api.key-only legacy file migrates to the default server, with the URL
+	// falling back to the historical localhost default.
+	if len(cfg.Servers) != 1 {
+		t.Fatalf("Servers len = %d, want 1", len(cfg.Servers))
 	}
-	// Default must still apply for the URL.
-	if cfg.API.URL != "http://localhost:21847" {
-		t.Errorf("API.URL = %q, want default http://localhost:21847", cfg.API.URL)
+	if cfg.Servers[0].Key != "partial-key" {
+		t.Errorf("default server Key = %q, want %q", cfg.Servers[0].Key, "partial-key")
 	}
-	if cfg.Server.Port != 8080 {
-		t.Errorf("Server.Port = %d, want default 8080", cfg.Server.Port)
+	if cfg.Servers[0].URL != "http://localhost:21847" {
+		t.Errorf("default server URL = %q, want default http://localhost:21847", cfg.Servers[0].URL)
 	}
 	if cfg.Indexing.BatchSize != 20 {
 		t.Errorf("Indexing.BatchSize = %d, want default 20", cfg.Indexing.BatchSize)
@@ -215,20 +226,16 @@ func TestSave_RoundTrip(t *testing.T) {
 	}
 	want := &Config{
-		API: APIConfig{
-			URL: "http://saved:8888",
-			Key: "saved-key",
+		Servers: []ServerEntry{
+			{Name: DefaultServerName, URL: "http://saved:8888", Key: "saved-key"},
 		},
+		DefaultServer: DefaultServerName,
 		Watcher: WatcherConfig{
 			Enabled:          false,
 			DebounceMS:       1234,
 			SyncIntervalMins: 15,
 			ExcludePatterns:  []string{".git", "vendor"},
 		},
-		Server: ServerConfig{
-			Port:     4444,
-			CacheTTL: 99,
-		},
 		Indexing: IndexingConfig{
 			BatchSize: 7,
 		},
@@ -246,11 +253,14 @@ func TestSave_RoundTrip(t *testing.T) {
 		t.Fatalf("Load() after Save() error = %v", err)
 	}
-	if got.API.URL != want.API.URL {
-		t.Errorf("API.URL = %q, want %q", got.API.URL, want.API.URL)
+	if len(got.Servers) != 1 {
+		t.Fatalf("Servers len = %d, want 1", len(got.Servers))
 	}
-	if got.API.Key != want.API.Key {
-		t.Errorf("API.Key = %q, want %q", got.API.Key, want.API.Key)
+	if got.Servers[0].URL != want.Servers[0].URL {
+		t.Errorf("server URL = %q, want %q", got.Servers[0].URL, want.Servers[0].URL)
+	}
+	if got.Servers[0].Key != want.Servers[0].Key {
+		t.Errorf("server Key = %q, want %q", got.Servers[0].Key, want.Servers[0].Key)
 	}
 	if got.Watcher.Enabled != want.Watcher.Enabled {
 		t.Errorf("Watcher.Enabled = %v, want %v", got.Watcher.Enabled, want.Watcher.Enabled)
@@ -261,9 +271,6 @@ func TestSave_RoundTrip(t *testing.T) {
 	if got.Watcher.SyncIntervalMins != want.Watcher.SyncIntervalMins {
 		t.Errorf("Watcher.SyncIntervalMins = %d, want %d", got.Watcher.SyncIntervalMins, want.Watcher.SyncIntervalMins)
 	}
-	if got.Server.Port != want.Server.Port {
-		t.Errorf("Server.Port = %d, want %d", got.Server.Port, want.Server.Port)
-	}
 	if got.Indexing.BatchSize != want.Indexing.BatchSize {
 		t.Errorf("Indexing.BatchSize = %d, want %d", got.Indexing.BatchSize, want.Indexing.BatchSize)
 	}
@@ -347,12 +354,9 @@ projects:
 	if len(cfg.Watcher.ExcludePatterns) != 2 {
 		t.Errorf("ExcludePatterns len = %d, want 2 (legacy key: excludepatterns)", len(cfg.Watcher.ExcludePatterns))
 	}
-	if cfg.Server.CacheTTL != 120 {
-		t.Errorf("CacheTTL = %d, want 120 (legacy key: cachettl)", cfg.Server.CacheTTL)
-	}
-	if cfg.Server.Port != 9090 {
-		t.Errorf("Port = %d, want 9090", cfg.Server.Port)
-	}
+	// server.port / server.cache_ttl removed (dead fields, step 13). The
+	// legacy `server:` block must still parse without error, but its
+	// values are dropped — koanf silently ignores them on unmarshal.
 	if len(cfg.Projects) != 1 || !cfg.Projects[0].AutoWatch {
 		t.Errorf("Projects[0].AutoWatch = false, want true (legacy key: autowatch)")
 	}
@@ -433,6 +437,190 @@ func TestRemoveProject(t *testing.T) {
 	}
 }
+// TestMigrate_ReSavesNewFormat verifies a legacy api: file is rewritten to the
+// servers: layout on disk (api: dropped) the first time it is loaded.
+func TestMigrate_ReSavesNewFormat(t *testing.T) {
+	home := isolateHome(t)
+
+	cfgDir := filepath.Join(home, ".cix")
+	if err := os.MkdirAll(cfgDir, 0755); err != nil {
+		t.Fatal(err)
+	}
+	content := "api:\n url: \"http://legacy:9000\"\n key: \"legacy-key\"\n"
+	path := filepath.Join(cfgDir, "config.yaml")
+	if err := os.WriteFile(path, []byte(content), 0644); err != nil {
+		t.Fatal(err)
+	}
+
+	if _, err := Load(); err != nil {
+		t.Fatalf("Load() error = %v", err)
+	}
+
+	raw, err := os.ReadFile(path)
+	if err != nil {
+		t.Fatal(err)
+	}
+	got := string(raw)
+	if !strings.Contains(got, "servers:") {
+		t.Errorf("re-saved config missing servers::\n%s", got)
+	}
+	if !strings.Contains(got, "default_server: default") {
+		t.Errorf("re-saved config missing default_server:\n%s", got)
+	}
+	if strings.Contains(got, "api:") {
+		t.Errorf("re-saved config should not contain api::\n%s", got)
+	}
+	if !strings.Contains(got, "http://legacy:9000") || !strings.Contains(got, "legacy-key") {
+		t.Errorf("re-saved config lost url/key:\n%s", got)
+	}
+}
+
+func TestResolveServer(t *testing.T) {
+	c := &Config{
+		Servers: []ServerEntry{
+			{Name: "default", URL: "http://local", Key: "k1"},
+			{Name: "corp", URL: "http://corp", Key: "k2"},
+		},
+		DefaultServer: "default",
+	}
+
+	// Empty name → default server.
+	s, err := c.ResolveServer("")
+	if err != nil || s.Name != "default" {
+		t.Errorf("ResolveServer(\"\") = %v, %v; want default", s, err)
+	}
+	// Named alias.
+	s, err = c.ResolveServer("corp")
+	if err != nil || s.URL != "http://corp" {
+		t.Errorf("ResolveServer(corp) = %v, %v; want corp URL", s, err)
+	}
+	// Unknown → error listing available names.
+	_, err = c.ResolveServer("nope")
+	if err == nil {
+		t.Fatal("expected error for unknown server")
+	}
+	for _, want := range []string{"nope", "default", "corp"} {
+		if !strings.Contains(err.Error(), want) {
+			t.Errorf("error %q missing %q", err.Error(), want)
+		}
+	}
+}
+
+// TestResolveServer_DanglingDefault falls back to the first server when
+// DefaultServer points at a missing entry.
+func TestResolveServer_DanglingDefault(t *testing.T) {
+	c := &Config{
+		Servers:       []ServerEntry{{Name: "only", URL: "http://only"}},
+		DefaultServer: "ghost",
+	}
+	s, err := c.ResolveServer("")
+	if err != nil || s.Name != "only" {
+		t.Errorf("ResolveServer(\"\") with dangling default = %v, %v; want first server", s, err)
+	}
+}
+
+func TestSetServer_UpsertAndDefault(t *testing.T) {
+	isolateHome(t)
+	if _, err := Load(); err != nil {
+		t.Fatal(err)
+	}
+
+	// Adding a server's URL creates the entry. The seeded localhost default
+	// already exists, so the new one does NOT become default.
+	if err := SetServerURL("corp", "http://corp"); err != nil {
+		t.Fatalf("SetServerURL: %v", err)
+	}
+	if err := SetServerKey("corp", "corp-key"); err != nil {
+		t.Fatalf("SetServerKey: %v", err)
+	}
+
+	cfg, _ := Load()
+	s, ok := cfg.GetServer("corp")
+	if !ok || s.URL != "http://corp" || s.Key != "corp-key" {
+		t.Fatalf("corp server = %+v, ok=%v", s, ok)
+	}
+	if cfg.DefaultServer != DefaultServerName {
+		t.Errorf("DefaultServer = %q, want %q (unchanged)", cfg.DefaultServer, DefaultServerName)
+	}
+
+	// Updating url again must not duplicate the entry.
+	if err := SetServerURL("corp", "http://corp2"); err != nil {
+		t.Fatal(err)
+	}
+	cfg, _ = Load()
+	if n := len(cfg.Servers); n != 2 {
+		t.Errorf("Servers len = %d, want 2 (default + corp)", n)
+	}
+}
+
+func TestSetDefaultServer(t *testing.T) {
+	isolateHome(t)
+	if _, err := Load(); err != nil {
+		t.Fatal(err)
+	}
+	if err := SetServerURL("corp", "http://corp"); err != nil {
+		t.Fatal(err)
+	}
+
+	// Unknown name rejected.
+	if err := SetDefaultServer("ghost"); err == nil {
+		t.Error("expected error setting default to unknown server")
+	}
+	// Known name switches default.
+	if err := SetDefaultServer("corp"); err != nil {
+		t.Fatalf("SetDefaultServer: %v", err)
+	}
+	cfg, _ := Load()
+	if cfg.DefaultServer != "corp" {
+		t.Errorf("DefaultServer = %q, want corp", cfg.DefaultServer)
+	}
+}
+
+func TestRemoveServer_ReassignsDefault(t *testing.T) {
+	isolateHome(t)
+	if _, err := Load(); err != nil {
+		t.Fatal(err)
+	}
+	if err := SetServerURL("corp", "http://corp"); err != nil {
+		t.Fatal(err)
+	}
+	// default is still "default"; remove it → reassign to remaining "corp".
+	reassigned, err := RemoveServer(DefaultServerName)
+	if err != nil {
+		t.Fatalf("RemoveServer: %v", err)
+	}
+	if reassigned != "corp" {
+		t.Errorf("reassignedTo = %q, want corp", reassigned)
+	}
+	cfg, _ := Load()
+	if cfg.DefaultServer != "corp" {
+		t.Errorf("DefaultServer = %q, want corp", cfg.DefaultServer)
+	}
+	if _, ok := cfg.GetServer(DefaultServerName); ok {
+		t.Error("default server should have been removed")
+	}
+
+	// Removing an unknown server errors.
+	if _, err := RemoveServer("ghost"); err == nil {
+		t.Error("expected error removing unknown server")
+	}
+}
+
+func TestValidateServerName(t *testing.T) {
+	isolateHome(t)
+	if _, err := Load(); err != nil {
+		t.Fatal(err)
+	}
+	for _, bad := range []string{"", "has.dot", "has space"} {
+		if err := SetServerURL(bad, "http://x"); err == nil {
+			t.Errorf("SetServerURL(%q) expected validation error", bad)
+		}
+	}
+	if err := SetServerURL("ok_name-1", "http://x"); err != nil {
+		t.Errorf("SetServerURL(valid) unexpected error: %v", err)
+	}
+}
+
 func TestGetConfigPath(t *testing.T) {
 	home := isolateHome(t)
@@ -444,4 +632,4 @@ func TestGetConfigPath(t *testing.T) {
 	if got := GetConfigPath(); got != want {
 		t.Errorf("GetConfigPath() = %q, want %q", got, want)
 	}
-}
\ No newline at end of file
+}
diff --git a/cli/internal/config/loader_koanf.go b/cli/internal/config/loader_koanf.go
new file mode 100644
index 0000000..4edacb0
--- /dev/null
+++ b/cli/internal/config/loader_koanf.go
@@ -0,0 +1,158 @@
+package config
+
+import (
+	"bytes"
+	"fmt"
+	"os"
+	"path/filepath"
+	"reflect"
+	"strconv"
+	"strings"
+
+	"github.com/knadh/koanf/parsers/yaml"
+	"github.com/knadh/koanf/providers/confmap"
+	"github.com/knadh/koanf/providers/rawbytes"
+	"github.com/knadh/koanf/v2"
+
+	"github.com/anthropics/code-index/cli/internal/config/schema"
+)
+
+// loadWithKoanf is the schema-driven config loader. It produces a *Config
+// from these layers:
+//
+//  1. Defaults derived from `default:"…"` struct tags via schema.Walk.
+//  2. The YAML file at ~/.cix/config.yaml (with legacy-key normalization
+//     applied to the raw bytes pre-parse).
+//
+// Post-unmarshal, migrateToServers seeds the implicit localhost server and
+// upgrades the legacy single-server `api:` block to the `servers:` list —
+// same logic the original loader used.
+//
+// The needsResave flag tells the caller whether the on-disk file should be
+// rewritten because the load process changed its representation
+// (normalization rewrote keys, or a legacy `api:` block was migrated). The
+// flag is never true when no file existed — the no-file case keeps the
+// in-memory defaults but does not materialize them to disk.
+func loadWithKoanf() (cfg *Config, needsResave bool, err error) {
+	configDir, path, err := configPaths()
+	if err != nil {
+		return nil, false, err
+	}
+	if err := os.MkdirAll(configDir, 0755); err != nil {
+		return nil, false, fmt.Errorf("create config dir: %w", err)
+	}
+
+	k := koanf.New(".")
+
+	// Layer 1: defaults from struct tags.
+	if defaults := defaultsFromTags(); len(defaults) > 0 {
+		if err := k.Load(confmap.Provider(defaults, "."), nil); err != nil {
+			return nil, false, fmt.Errorf("load defaults: %w", err)
+		}
+	}
+
+	// Layer 2: YAML file (if present), with legacy-key normalization.
+	var (
+		fileExists         bool
+		changedByNormalize bool
+	)
+	data, ferr := os.ReadFile(path)
+	if ferr != nil && !os.IsNotExist(ferr) {
+		return nil, false, fmt.Errorf("read config: %w", ferr)
+	}
+	if ferr == nil {
+		fileExists = true
+		normalized := normalizeLegacyKeys(data)
+		changedByNormalize = !bytes.Equal(data, normalized)
+		if err := k.Load(rawbytes.Provider(normalized), yaml.Parser()); err != nil {
+			return nil, false, fmt.Errorf("parse config: %w", err)
+		}
+	}
+
+	var out Config
+	if err := k.UnmarshalWithConf("", &out, koanf.UnmarshalConf{Tag: "yaml"}); err != nil {
+		return nil, false, fmt.Errorf("unmarshal config: %w", err)
+	}
+
+	migrated := migrateToServers(&out)
+	needsResave = fileExists && (changedByNormalize || migrated)
+	return &out, needsResave, nil
+}
+
+// configPaths returns the config directory and the config file path for the
+// current user. Extracted so both loaders share the resolution.
+func configPaths() (dir, path string, err error) {
+	home, err := os.UserHomeDir()
+	if err != nil {
+		return "", "", fmt.Errorf("get home dir: %w", err)
+	}
+	dir = filepath.Join(home, ".cix")
+	path = filepath.Join(dir, "config.yaml")
+	return dir, path, nil
+}
+
+// defaultsFromTags walks the Config schema and returns a flat
+// dotted-key → typed-value map suitable for koanf's confmap provider.
+//
+// Fields without a `default:` tag are omitted (they get Go's zero value
+// after unmarshal, which is the same behavior as the legacy defaults()
+// function for any field it didn't seed). Slices are parsed as
+// comma-separated lists with whitespace trimmed.
+func defaultsFromTags() map[string]any {
+	out := map[string]any{}
+	_ = schema.Walk(&Config{}, func(l schema.LeafField) {
+		raw := l.Tag("default")
+		if raw == "" {
+			return
+		}
+		val, ok := parseDefaultValue(raw, l.Field.Type)
+		if !ok {
+			return
+		}
+		out[l.Path] = val
+	})
+	return out
+}
+
+// parseDefaultValue converts a `default:"…"` tag string into a typed Go
+// value matching the field's reflect.Type. Returns ok=false for unsupported
+// kinds (e.g. nested structs), in which case the field falls back to its
+// zero value after unmarshal.
+func parseDefaultValue(raw string, t reflect.Type) (any, bool) {
+	switch t.Kind() {
+	case reflect.Bool:
+		v, err := strconv.ParseBool(raw)
+		if err != nil {
+			return nil, false
+		}
+		return v, true
+	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
+		// bitSize 0 parses into the platform int width, so the int(v)
+		// conversion below cannot truncate on 32-bit builds (passing 64
+		// here would let an int64 silently narrow — CodeQL
+		// go/incorrect-integer-conversion). default: tags are
+		// developer-controlled small values (ports, counts, timeouts),
+		// well within int range; an out-of-range tag now errors here
+		// (ok=false) rather than truncating.
+		v, err := strconv.ParseInt(raw, 10, 0)
+		if err != nil {
+			return nil, false
+		}
+		return int(v), true
+	case reflect.String:
+		return raw, true
+	case reflect.Slice:
+		if t.Elem().Kind() != reflect.String {
+			return nil, false
+		}
+		parts := strings.Split(raw, ",")
+		out := make([]string, 0, len(parts))
+		for _, p := range parts {
+			if p = strings.TrimSpace(p); p != "" {
+				out = append(out, p)
+			}
+		}
+		return out, true
+	}
+	return nil, false
+}
diff --git a/cli/internal/config/loader_koanf_test.go b/cli/internal/config/loader_koanf_test.go
new file mode 100644
index 0000000..e19c840
--- /dev/null
+++ b/cli/internal/config/loader_koanf_test.go
@@ -0,0 +1,161 @@
+package config
+
+import (
+	"fmt"
+	"os"
+	"path/filepath"
+	"reflect"
+	"testing"
+)
+
+// TestLoaderParity asserts loadWithKoanf produces an identical *Config to
+// the legacy Load() path for every scenario the legacy loader covers. Once
+// this is green, step 5 of the refactor (flipping Load to call koanf) is
+// purely mechanical.
+//
+// Each scenario runs in a fresh HOME so the legacy loader's "re-save on
+// migration" side effect does not leak into the koanf run, and vice versa.
+func TestLoaderParity(t *testing.T) {
+	scenarios := []struct {
+		name string
+		file string // empty = no file on disk
+	}{
+		{
+			name: "no_file",
+			file: "",
+		},
+		{
+			name: "partial_file_only_debounce",
+			file: "watcher:\n debounce_ms: 1234\n",
+		},
+		{
+			name: "full_file_new_format",
+			file: `servers:
+  - name: default
+    url: http://localhost:21847
+    key: cix_abc123
+  - name: corporate
+    url: https://cix.corp.internal
+    key: cix_xyz789
+default_server: corporate
+watcher:
+  enabled: false
+  debounce_ms: 2000
+  sync_interval_mins: 10
+  exclude:
+    - vendor
+    - tmp
+indexing:
+  batchsize: 50
+  streaming_idle_timeout_sec: 45
+projects:
+  - path: /home/user/proj
+    auto_watch: true
+`,
+		},
+		{
+			name: "legacy_api_only",
+			file: "api:\n url: http://legacy.example\n key: cix_legacy\n",
+		},
+		{
+			name: "legacy_lowercase_viper_keys",
+			file: `watcher:
+  debouncems: 9999
+  excludepatterns:
+    - foo
+    - bar
+  sync_interval_mins: 7
+server:
+  cachettl: 600
+projects:
+  - path: /p
+    autowatch: true
+`,
+		},
+		{
+			name: "servers_no_default",
+			file: `servers:
+  - name: alpha
+    url: http://alpha
+    key: a
+  - name: beta
+    url: http://beta
+    key: b
+`,
+		},
+	}
+
+	for _, sc := range scenarios {
+		t.Run(sc.name, func(t *testing.T) {
+			legacyCfg := runLoaderInTempHome(t, sc.file, false)
+			koanfCfg := runLoaderInTempHome(t, sc.file, true)
+			if !configsEqual(legacyCfg, koanfCfg) {
+				t.Errorf("legacy != koanf\n legacy: %s\n koanf: %s",
+					dumpConfig(legacyCfg), dumpConfig(koanfCfg))
+			}
+		})
+	}
+}
+
+// runLoaderInTempHome sets HOME to a fresh tempdir, optionally writes a
+// config file there, resets the legacy singleton, and runs the requested
+// loader. Both loaders use os.UserHomeDir() → $HOME on Unix.
+func runLoaderInTempHome(t *testing.T, file string, useKoanf bool) *Config {
+	t.Helper()
+	home := t.TempDir()
+	t.Setenv("HOME", home)
+
+	cixDir := filepath.Join(home, ".cix")
+	if err := os.MkdirAll(cixDir, 0755); err != nil {
+		t.Fatalf("mkdir: %v", err)
+	}
+	if file != "" {
+		if err := os.WriteFile(filepath.Join(cixDir, "config.yaml"), []byte(file), 0644); err != nil {
+			t.Fatalf("write config: %v", err)
+		}
+	}
+
+	ResetForTesting()
+
+	if useKoanf {
+		cfg, _, err := loadWithKoanf()
+		if err != nil {
+			t.Fatalf("loadWithKoanf: %v", err)
+		}
+		return cfg
+	}
+	cfg, err := Load()
+	if err != nil {
+		t.Fatalf("Load: %v", err)
+	}
+	return cfg
+}
+
+// configsEqual compares two *Config values after normalizing nil vs
+// empty-slice for fields the two loaders may zero-init differently
+// (Projects, ExcludePatterns, Servers — none should diverge but
+// reflect.DeepEqual treats nil != []T{}).
+func configsEqual(a, b *Config) bool {
+	if a == nil || b == nil {
+		return a == b
+	}
+	return reflect.DeepEqual(normalizeForCompare(*a), normalizeForCompare(*b))
+}
+
+func normalizeForCompare(c Config) Config {
+	if c.Projects == nil {
+		c.Projects = []ProjectEntry{}
+	}
+	if c.Servers == nil {
+		c.Servers = []ServerEntry{}
+	}
+	if c.Watcher.ExcludePatterns == nil {
+		c.Watcher.ExcludePatterns = []string{}
+	}
+	return c
+}
+
+func dumpConfig(c *Config) string {
+	return fmt.Sprintf("Servers=%+v DefaultServer=%q API=%+v Watcher=%+v Indexing=%+v Projects=%+v",
+		c.Servers, c.DefaultServer, c.API, c.Watcher, c.Indexing, c.Projects)
+}
diff --git a/cli/internal/config/schema/schema.go b/cli/internal/config/schema/schema.go
new file mode 100644
index 0000000..5712c02
--- /dev/null
+++ b/cli/internal/config/schema/schema.go
@@ -0,0 +1,103 @@
+// Package schema provides a tag-driven walker over the Config struct.
+//
+// The walker is the single source of truth for "what fields exist, where do
+// they live, what are their defaults/descriptions/validators". Every config
+// surface that needs that knowledge (show, set, keys, edit, init, defaults
+// seeding) calls Walk and acts on the LeafField it yields, instead of
+// hard-coding a switch.
+package schema
+
+import (
+	"fmt"
+	"reflect"
+)
+
+// LeafField is one annotated leaf in the Config struct tree.
+//
+// Path is the dotted key from the `key:` tag (e.g. "watcher.debounce_ms").
+// Field carries the original reflect.StructField so callers can read all
+// other tags (desc, default, env, validate, sensitive) without re-parsing.
+// Value is the live reflect.Value pointing at the field — callers can
+// read (Render) or mutate it (Set) via the standard reflect API.
+type LeafField struct {
+	Path  string
+	Field reflect.StructField
+	Value reflect.Value
+}
+
+// Tag returns the value of a struct tag on this leaf's field.
+// Convenience wrapper so callers don't need to import reflect.
+func (l LeafField) Tag(name string) string {
+	return l.Field.Tag.Get(name)
+}
+
+// Sensitive reports whether this leaf is marked `sensitive:"true"`.
+// CodeQL sensitive-name heuristics flag any read of a *Key/*Secret value
+// into a named variable, so renderers MUST use this gate instead of
+// inspecting the value itself.
+func (l LeafField) Sensitive() bool {
+	return l.Tag("sensitive") == "true"
+}
+
+// LeafVisitor is invoked once per annotated leaf in Walk order
+// (structural order of the source struct).
+type LeafVisitor func(leaf LeafField)
+
+// Walk traverses cfg (a struct or pointer to a struct) and invokes visit for
+// every field that has a `key:` struct tag.
+//
+// Rules:
+//   - A field WITH a `key:` tag is yielded as a leaf regardless of its Go
+//     kind. Slice fields (Servers, Projects) are yielded as a single leaf;
+//     callers render them with their own formatter.
+//   - A field WITHOUT a `key:` tag that is itself a struct is recursed into.
+//     Containers like `Watcher`, `Server`, `Indexing` carry no key tag
+//     themselves — their child fields each carry the full dotted key
+//     (`watcher.debounce_ms`, …).
+//   - A field WITHOUT a `key:` tag that is a scalar/slice is skipped
+//     entirely. This is how legacy fields (the auto-migrated `API` block)
+//     stay invisible to `config show` / `config set` without needing an
+//     allow-list.
+//   - Unexported fields are skipped.
+func Walk(cfg any, visit LeafVisitor) error {
+	v := reflect.ValueOf(cfg)
+	if v.Kind() == reflect.Pointer {
+		if v.IsNil() {
+			return fmt.Errorf("schema.Walk: nil pointer")
+		}
+		v = v.Elem()
+	}
+	if v.Kind() != reflect.Struct {
+		return fmt.Errorf("schema.Walk: expected struct or *struct, got %s", v.Kind())
+	}
+	walkStruct(v, visit)
+	return nil
+}
+
+func walkStruct(v reflect.Value, visit LeafVisitor) {
+	t := v.Type()
+	for i := 0; i < t.NumField(); i++ {
+		f := t.Field(i)
+		if !f.IsExported() {
+			continue
+		}
+		fv := v.Field(i)
+		if key := f.Tag.Get("key"); key != "" {
+			visit(LeafField{Path: key, Field: f, Value: fv})
+			continue
+		}
+		if fv.Kind() == reflect.Struct {
+			walkStruct(fv, visit)
+		}
+	}
+}
+
+// Keys is a convenience that returns the ordered list of dotted paths Walk
+// would yield. Used by tests and by `cix config keys`.
+func Keys(cfg any) ([]string, error) {
+	var keys []string
+	err := Walk(cfg, func(l LeafField) {
+		keys = append(keys, l.Path)
+	})
+	return keys, err
+}
diff --git a/cli/internal/config/schema/schema_test.go b/cli/internal/config/schema/schema_test.go
new file mode 100644
index 0000000..7f41821
--- /dev/null
+++ b/cli/internal/config/schema/schema_test.go
@@ -0,0 +1,117 @@
+package schema_test
+
+import (
+	"reflect"
+	"testing"
+
+	"github.com/anthropics/code-index/cli/internal/config"
+	"github.com/anthropics/code-index/cli/internal/config/schema"
+)
+
+// expectedKeys is the contract: the exact dotted-key set Walk yields over
+// the current Config struct. Any new annotated field MUST update this list.
+// Bare `expectedKeys` (not regex/contains) is intentional — silent drift in
+// the key surface is exactly what this snapshot is here to catch.
+var expectedKeys = []string{
+	"servers",
+	"default_server",
+	"watcher.enabled",
+	"watcher.debounce_ms",
+	"watcher.exclude",
+	"watcher.sync_interval_mins",
+	"indexing.batch_size",
+	"indexing.streaming_idle_timeout_sec",
+	"projects",
+}
+
+func TestKeys_Snapshot(t *testing.T) {
+	got, err := schema.Keys(&config.Config{})
+	if err != nil {
+		t.Fatalf("Keys: %v", err)
+	}
+	if !reflect.DeepEqual(got, expectedKeys) {
+		t.Errorf("key snapshot drift:\nwant: %v\ngot: %v", expectedKeys, got)
+	}
+}
+
+func TestWalk_YieldsTagMetadata(t *testing.T) {
+	// Spot-check that the LeafField carries enough metadata for downstream
+	// consumers (show, set, keys, TUI) — desc, default, validate, env.
+	want := map[string]map[string]string{
+		"watcher.debounce_ms": {
+			"desc":     "Debounce delay (ms)",
+			"default":  "5000",
+			"validate": "min=100,max=60000",
+		},
+		"default_server": {
+			"env":  "CIX_SERVER",
+			"desc": "Alias of the server used when --server is omitted",
+		},
+		"indexing.batch_size": {
+			"default":  "20",
+			"validate": "min=1",
+		},
+	}
+
+	seen := map[string]map[string]string{}
+	err := schema.Walk(&config.Config{}, func(l schema.LeafField) {
+		if _, target := want[l.Path]; !target {
+			return
+		}
+		seen[l.Path] = map[string]string{
+			"desc":     l.Tag("desc"),
+			"default":  l.Tag("default"),
+			"validate": l.Tag("validate"),
+			"env":      l.Tag("env"),
+		}
+	})
+	if err != nil {
+		t.Fatalf("Walk: %v", err)
+	}
+
+	for path, expect := range want {
+		got, ok := seen[path]
+		if !ok {
+			t.Errorf("%s: not yielded by Walk", path)
+			continue
+		}
+		for tag, val := range expect {
+			if got[tag] != val {
+				t.Errorf("%s tag %q: want %q, got %q", path, tag, val, got[tag])
+			}
+		}
+	}
+}
+
+func TestWalk_RejectsNonStruct(t *testing.T) {
+	cases := []struct {
+		name string
+		v    any
+	}{
+		{"int", 42},
+ {"nil-ptr", (*config.Config)(nil)}, + {"string", "hello"}, + } + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + err := schema.Walk(tc.v, func(schema.LeafField) {}) + if err == nil { + t.Errorf("expected error for %v, got nil", tc.v) + } + }) + } +} + +func TestWalk_SkipsLegacyAPIBlock(t *testing.T) { + // The legacy API field has no `key:` tag, so the walker must not yield + // `api.url` / `api.key` even though APIConfig has exported fields. + got, err := schema.Keys(&config.Config{}) + if err != nil { + t.Fatalf("Keys: %v", err) + } + for _, k := range got { + if k == "api.url" || k == "api.key" || k == "api" { + t.Errorf("legacy API field leaked into walker output: %q", k) + } + } +} diff --git a/cli/internal/config/set.go b/cli/internal/config/set.go new file mode 100644 index 0000000..3fdcb75 --- /dev/null +++ b/cli/internal/config/set.go @@ -0,0 +1,129 @@ +package config + +import ( + "errors" + "fmt" + "reflect" + "strconv" + "strings" + + "github.com/anthropics/code-index/cli/internal/config/schema" +) + +// ErrUnknownKey is returned by SetByPath when key does not match any +// schema-tagged leaf in the Config struct. Callers (notably runConfigSet) +// use errors.Is(err, ErrUnknownKey) to decide whether to fall through to a +// legacy handler. +var ErrUnknownKey = errors.New("unknown config key") + +// SetByPath looks up the schema leaf identified by key, parses value per +// the leaf's Go type, applies it, runs full-struct validation, and persists. +// +// Parsing rules: +// - bool strconv.ParseBool ("true"/"false"/"1"/"0"/etc.) 
+// - int* strconv.ParseInt(base=10) +// - string used verbatim +// - []string comma-separated, each entry trimmed; REPLACE semantics +// (the new value REPLACES the existing slice — there is no append form +// on `config set`) +// +// Server-management keys (`server..url|key`, `default_server` aliases +// for legacy `api.*`) are NOT handled here — they live in runConfigSet's +// dedicated branch because they have side effects (upsert into Servers, +// reassign DefaultServer) that don't fit the "parse-and-assign" model. +// +// Slices of structs (Servers, Projects) are deliberately rejected: there's +// no sensible string serialization for them and they have purpose-built +// CRUD helpers (SetServerURL, AddProject, …). +func SetByPath(key, value string) error { + cfg, err := Load() + if err != nil { + return err + } + + var ( + found bool + applyErr error + // On validation failure we restore the field to its prior value so + // the in-memory singleton stays consistent with the on-disk file + // (which was NOT written). Save the prior snapshot as a detached + // reflect.Value so a re-assignment via leaf.Value.Set() can undo + // the mutation without re-walking the schema. + leafRef schema.LeafField + priorVal reflect.Value + ) + walkErr := schema.Walk(cfg, func(l schema.LeafField) { + if found || l.Path != key { + return + } + found = true + leafRef = l + priorVal = reflect.New(l.Value.Type()).Elem() + priorVal.Set(l.Value) + applyErr = applyLeafValue(l, value) + }) + if walkErr != nil { + return walkErr + } + if !found { + return fmt.Errorf("%w: %s", ErrUnknownKey, key) + } + if applyErr != nil { + // Best-effort restore even on parse failure (applyLeafValue may + // have partially mutated for some kinds in the future). 
+ leafRef.Value.Set(priorVal) + return applyErr + } + + if err := Validate(cfg); err != nil { + leafRef.Value.Set(priorVal) + return err + } + return Save(cfg) +} + +func applyLeafValue(l schema.LeafField, raw string) error { + if !l.Value.CanSet() { + return fmt.Errorf("config key %q cannot be set", l.Path) + } + v := l.Value + switch v.Kind() { + case reflect.Bool: + b, err := strconv.ParseBool(raw) + if err != nil { + return fmt.Errorf("%s: invalid bool %q (use true/false)", l.Path, raw) + } + v.SetBool(b) + return nil + + case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64: + n, err := strconv.ParseInt(raw, 10, 64) + if err != nil { + return fmt.Errorf("%s: invalid integer %q", l.Path, raw) + } + if v.OverflowInt(n) { + return fmt.Errorf("%s: value %d out of range", l.Path, n) + } + v.SetInt(n) + return nil + + case reflect.String: + v.SetString(raw) + return nil + + case reflect.Slice: + if v.Type().Elem().Kind() != reflect.String { + return fmt.Errorf("%s: list keys with non-string elements are not settable via 'config set'", l.Path) + } + parts := strings.Split(raw, ",") + out := make([]string, 0, len(parts)) + for _, p := range parts { + if trimmed := strings.TrimSpace(p); trimmed != "" { + out = append(out, trimmed) + } + } + v.Set(reflect.ValueOf(out)) + return nil + } + return fmt.Errorf("%s: unsupported field kind %s", l.Path, v.Kind()) +} diff --git a/cli/internal/config/set_test.go b/cli/internal/config/set_test.go new file mode 100644 index 0000000..eb85795 --- /dev/null +++ b/cli/internal/config/set_test.go @@ -0,0 +1,148 @@ +package config + +import ( + "errors" + "os" + "path/filepath" + "reflect" + "testing" +) + +// withIsolatedHome points config.Load() at a throwaway HOME and resets the +// singleton. Mirrors cmd/multiserver_test.go's isolateConfig — duplicated +// because that helper is in the cmd package and import would be cyclic. 
+func withIsolatedHome(t *testing.T) { + t.Helper() + t.Setenv("HOME", t.TempDir()) + t.Setenv("XDG_CONFIG_HOME", "") + ResetForTesting() + t.Cleanup(ResetForTesting) +} + +func TestSetByPath_Bool(t *testing.T) { + withIsolatedHome(t) + if err := SetByPath("watcher.enabled", "false"); err != nil { + t.Fatalf("SetByPath: %v", err) + } + cfg, _ := Load() + if cfg.Watcher.Enabled { + t.Errorf("Enabled = true, want false") + } + // And back. + if err := SetByPath("watcher.enabled", "true"); err != nil { + t.Fatal(err) + } + cfg, _ = Load() + if !cfg.Watcher.Enabled { + t.Errorf("Enabled = false, want true") + } +} + +func TestSetByPath_Int(t *testing.T) { + withIsolatedHome(t) + if err := SetByPath("watcher.debounce_ms", "2500"); err != nil { + t.Fatalf("SetByPath: %v", err) + } + cfg, _ := Load() + if cfg.Watcher.DebounceMS != 2500 { + t.Errorf("DebounceMS = %d, want 2500", cfg.Watcher.DebounceMS) + } +} + +func TestSetByPath_Slice_ReplaceSemantics(t *testing.T) { + withIsolatedHome(t) + // Set then overwrite — replace, not append. + if err := SetByPath("watcher.exclude", "vendor, tmp ,build"); err != nil { + t.Fatalf("first set: %v", err) + } + cfg, _ := Load() + want := []string{"vendor", "tmp", "build"} + if !reflect.DeepEqual(cfg.Watcher.ExcludePatterns, want) { + t.Errorf("ExcludePatterns = %v, want %v", cfg.Watcher.ExcludePatterns, want) + } + + if err := SetByPath("watcher.exclude", "only"); err != nil { + t.Fatalf("second set: %v", err) + } + cfg, _ = Load() + if !reflect.DeepEqual(cfg.Watcher.ExcludePatterns, []string{"only"}) { + t.Errorf("after replace, ExcludePatterns = %v, want [only]", cfg.Watcher.ExcludePatterns) + } +} + +func TestSetByPath_StreamingIdleTimeout(t *testing.T) { + // Was NEVER settable via the legacy switch — exposing it is one of the + // concrete user-visible wins of the schema-driven setter. 
+ withIsolatedHome(t) + if err := SetByPath("indexing.streaming_idle_timeout_sec", "60"); err != nil { + t.Fatalf("SetByPath: %v", err) + } + cfg, _ := Load() + if cfg.Indexing.StreamingIdleTimeoutSec != 60 { + t.Errorf("StreamingIdleTimeoutSec = %d, want 60", cfg.Indexing.StreamingIdleTimeoutSec) + } +} + +func TestSetByPath_ValidationRejectsBadValue(t *testing.T) { + withIsolatedHome(t) + err := SetByPath("indexing.batch_size", "0") + if err == nil { + t.Fatal("expected validation error for batch_size=0") + } + // Bad value MUST NOT be persisted. + cfg, _ := Load() + if cfg.Indexing.BatchSize == 0 { + t.Errorf("BatchSize = 0 — validation should have rolled back the in-memory mutation before Save") + } +} + +func TestSetByPath_UnknownKey(t *testing.T) { + withIsolatedHome(t) + err := SetByPath("nope.does.not.exist", "v") + if !errors.Is(err, ErrUnknownKey) { + t.Errorf("err = %v, want errors.Is(_, ErrUnknownKey)", err) + } +} + +func TestSetByPath_BadIntFormat(t *testing.T) { + withIsolatedHome(t) + err := SetByPath("watcher.debounce_ms", "abc") + if err == nil { + t.Fatal("expected parse error") + } +} + +func TestSetByPath_BadBoolFormat(t *testing.T) { + withIsolatedHome(t) + err := SetByPath("watcher.enabled", "maybe") + if err == nil { + t.Fatal("expected parse error") + } +} + +// TestSetByPath_PersistsToDisk verifies the mutation actually reaches the +// YAML file (not just the in-memory singleton). Without this, a downstream +// reader on a fresh process would see the old value. 
+func TestSetByPath_PersistsToDisk(t *testing.T) { + withIsolatedHome(t) + if err := SetByPath("watcher.debounce_ms", "7777"); err != nil { + t.Fatal(err) + } + path := filepath.Join(os.Getenv("HOME"), ".cix", "config.yaml") + data, err := os.ReadFile(path) + if err != nil { + t.Fatalf("read file: %v", err) + } + if !contains(string(data), "debounce_ms: 7777") { + t.Errorf("config.yaml does not contain the new value:\n%s", string(data)) + } +} + +func contains(haystack, needle string) bool { + for i := 0; i+len(needle) <= len(haystack); i++ { + if haystack[i:i+len(needle)] == needle { + return true + } + } + return false +} diff --git a/cli/internal/config/tui/keys.go b/cli/internal/config/tui/keys.go new file mode 100644 index 0000000..d1a74bb --- /dev/null +++ b/cli/internal/config/tui/keys.go @@ -0,0 +1,100 @@ +package tui + +import "github.com/charmbracelet/bubbles/key" + +// keymap groups every binding the TUI responds to. Kept in one place so +// the help overlay and the Update switch agree on what's available. 
+type keymap struct { + Up key.Binding + Down key.Binding + Left key.Binding + Right key.Binding + NextPanel key.Binding + + Enter key.Binding + Toggle key.Binding + Save key.Binding + Test key.Binding + Add key.Binding + Delete key.Binding + MarkDef key.Binding + + Help key.Binding + Quit key.Binding +} + +func newKeymap() keymap { + return keymap{ + Up: key.NewBinding( + key.WithKeys("up", "k"), + key.WithHelp("↑/k", "up"), + ), + Down: key.NewBinding( + key.WithKeys("down", "j"), + key.WithHelp("↓/j", "down"), + ), + Left: key.NewBinding( + key.WithKeys("left", "h"), + key.WithHelp("←/h", "left panel"), + ), + Right: key.NewBinding( + key.WithKeys("right", "l"), + key.WithHelp("→/l", "right panel"), + ), + NextPanel: key.NewBinding( + key.WithKeys("tab"), + key.WithHelp("tab", "switch panel"), + ), + Enter: key.NewBinding( + key.WithKeys("enter"), + key.WithHelp("enter", "edit"), + ), + Toggle: key.NewBinding( + key.WithKeys(" ", "x"), + key.WithHelp("space/x", "toggle bool"), + ), + Save: key.NewBinding( + key.WithKeys("s"), + key.WithHelp("s", "save (no-op; edits are saved as they are made)"), + ), + Test: key.NewBinding( + key.WithKeys("t"), + key.WithHelp("t", "test connection"), + ), + Add: key.NewBinding( + key.WithKeys("a"), + key.WithHelp("a", "add server"), + ), + Delete: key.NewBinding( + key.WithKeys("d"), + key.WithHelp("d", "delete server"), + ), + MarkDef: key.NewBinding( + key.WithKeys("m"), + key.WithHelp("m", "mark as default"), + ), + Help: key.NewBinding( + key.WithKeys("?"), + key.WithHelp("?", "help"), + ), + Quit: key.NewBinding( + key.WithKeys("q", "esc", "ctrl+c"), + key.WithHelp("q/esc", "quit"), + ), + } +} + +// shortHelp returns the keys shown in the always-on status bar. +func (k keymap) shortHelp() []key.Binding { + return []key.Binding{k.Up, k.Down, k.NextPanel, k.Enter, k.Help, k.Quit} +} + +// fullHelp returns all keys, grouped by purpose, for the ? overlay. 
+func (k keymap) fullHelp() [][]key.Binding { + return [][]key.Binding{ + {k.Up, k.Down, k.Left, k.Right, k.NextPanel}, + {k.Enter, k.Toggle, k.Save}, + {k.Add, k.Delete, k.MarkDef, k.Test}, + {k.Help, k.Quit}, + } +} diff --git a/cli/internal/config/tui/model.go b/cli/internal/config/tui/model.go new file mode 100644 index 0000000..345cbda --- /dev/null +++ b/cli/internal/config/tui/model.go @@ -0,0 +1,163 @@ +package tui + +import ( + "github.com/charmbracelet/bubbles/textinput" + "github.com/anthropics/code-index/cli/internal/config" +) + +// sectionID enumerates the left-panel entries. Stable order — the View +// renders them in this sequence. +type sectionID int + +const ( + secServers sectionID = iota + secWatcher + secIndexing + secProjects + secMisc + numSections +) + +// Label maps a sectionID to its display name. +func (s sectionID) Label() string { + switch s { + case secServers: + return "Servers" + case secWatcher: + return "Watcher" + case secIndexing: + return "Indexing" + case secProjects: + return "Projects" + case secMisc: + return "Misc" + } + return "?" +} + +// panel marks which of the two columns has focus. Keyboard navigation +// (Tab, h/l) flips this; the styled border highlights the active one. +type panel int + +const ( + panelLeft panel = iota + panelRight +) + +// editPurpose distinguishes between editing an existing scalar (set by +// config.SetByPath) and editing a server entry's URL or key. +type editPurpose int + +const ( + editPurposeScalar editPurpose = iota + editPurposeServerURL + editPurposeServerKey + editPurposeServerName // only used during "add server" flow +) + +// Model is the full TUI state. bubbletea is Elm-style: every keypress +// goes through Update(model, msg) → (model, cmd). View() reads model and +// returns a string. +type Model struct { + cfg *config.Config + + // Navigation. 
+ active panel + sectionIdx int // 0..numSections-1 + rowIdx int // selected row within the right panel + + // Edit mode: when true, all keys go to editInput except esc/enter. + editing bool + editPurpose editPurpose + editKey string // schema key path being edited (scalar mode) + editServer int // index into cfg.Servers (server-edit modes) + editInput textinput.Model + editErr string + + // "Add server" flow. Three sequential prompts: name → URL → key. + addingServer bool + addStep int // 0=name, 1=url, 2=key + addName string + addURL string + + // Help overlay toggled with ?. + showHelp bool + + // Transient status line (shown below status bar; fades on next action). + statusMsg string + statusErr bool + + // Layout. + width, height int + styles styles + keys keymap + + quitting bool +} + +// NewModel builds the initial Model with cfg loaded. cfg is mutated in +// place when edits land; the caller is the only owner. +func NewModel(cfg *config.Config) Model { + ti := textinput.New() + ti.Prompt = "› " + ti.CharLimit = 200 + return Model{ + cfg: cfg, + active: panelLeft, + styles: newStyles(), + keys: newKeymap(), + editInput: ti, + } +} + +// numRowsRight returns how many rows the right panel renders for the +// current section. Edit-mode entry validates against this so the user +// can never select a non-existent row. +func (m Model) numRowsRight() int { + switch sectionID(m.sectionIdx) { + case secServers: + // Each server contributes one selectable row. With no servers there + // is nothing to select, so the count is 0 (serverRows returns nil). + if len(m.cfg.Servers) == 0 { + return 0 + } + return len(m.cfg.Servers) + 1 // +1 for "default_server" select + case secWatcher: + return 4 + case secIndexing: + return 2 + case secProjects: + return len(m.cfg.Projects) + case secMisc: + return 1 + } + return 0 +} + +// clampRow keeps rowIdx inside [0, numRowsRight). Called after any +// navigation or after a delete that shrinks the section. 
+func (m *Model) clampRow() { + n := m.numRowsRight() + if n == 0 { + m.rowIdx = 0 + return + } + if m.rowIdx < 0 { + m.rowIdx = 0 + } + if m.rowIdx >= n { + m.rowIdx = n - 1 + } +} + +// setStatus replaces the transient status line. ok=false renders in red. +// The status survives until the next keypress (Update clears it). +func (m *Model) setStatus(msg string, ok bool) { + m.statusMsg = msg + m.statusErr = !ok +} + +func (m *Model) clearStatus() { + m.statusMsg = "" + m.statusErr = false +} diff --git a/cli/internal/config/tui/model_test.go b/cli/internal/config/tui/model_test.go new file mode 100644 index 0000000..b59e431 --- /dev/null +++ b/cli/internal/config/tui/model_test.go @@ -0,0 +1,219 @@ +package tui + +import ( + "strings" + "testing" + + tea "github.com/charmbracelet/bubbletea" + + "github.com/anthropics/code-index/cli/internal/config" +) + +// freshCfg returns a minimal Config suitable for driving the TUI in tests +// — two servers, full watcher/indexing defaults. No HOME isolation, +// because we never call Load/Save here; the Model is exercised directly. +func freshCfg() *config.Config { + return &config.Config{ + Servers: []config.ServerEntry{ + {Name: "default", URL: "http://localhost:21847", Key: "k"}, + {Name: "corp", URL: "https://corp", Key: ""}, + }, + DefaultServer: "default", + Watcher: config.WatcherConfig{ + Enabled: true, + DebounceMS: 5000, + ExcludePatterns: []string{"node_modules", ".git"}, + SyncIntervalMins: 5, + }, + Indexing: config.IndexingConfig{ + BatchSize: 20, + StreamingIdleTimeoutSec: 30, + }, + } +} + +// makeKey produces a tea.KeyMsg that mimics a single-rune keypress. +// Workaround for bubbletea's typed messages — Update only switches on +// these, so synthesising them is enough to drive the Model. 
+func makeKey(s string) tea.KeyMsg { + switch s { + case "tab": + return tea.KeyMsg{Type: tea.KeyTab} + case "enter": + return tea.KeyMsg{Type: tea.KeyEnter} + case "esc": + return tea.KeyMsg{Type: tea.KeyEsc} + case "up": + return tea.KeyMsg{Type: tea.KeyUp} + case "down": + return tea.KeyMsg{Type: tea.KeyDown} + case " ": + return tea.KeyMsg{Type: tea.KeySpace} + } + return tea.KeyMsg{Type: tea.KeyRunes, Runes: []rune(s)} +} + +// send is the test driver — push msgs through Update and return the +// final Model. Test bodies stay short. +func send(m Model, msgs ...tea.Msg) Model { + for _, msg := range msgs { + updated, _ := m.Update(msg) + m = updated.(Model) + } + return m +} + +func TestNavigation_DownMovesSectionSelection(t *testing.T) { + m := NewModel(freshCfg()) + m.width, m.height = 100, 30 + + m = send(m, makeKey("down")) + if m.sectionIdx != int(secWatcher) { + t.Errorf("after down: sectionIdx = %d, want %d (Watcher)", m.sectionIdx, secWatcher) + } +} + +func TestNavigation_TabSwitchesPanels(t *testing.T) { + m := NewModel(freshCfg()) + m.width, m.height = 100, 30 + if m.active != panelLeft { + t.Fatalf("initial panel = %v, want left", m.active) + } + m = send(m, makeKey("tab")) + if m.active != panelRight { + t.Errorf("after tab: active = %v, want right", m.active) + } + m = send(m, makeKey("tab")) + if m.active != panelLeft { + t.Errorf("after second tab: active = %v, want left", m.active) + } +} + +func TestNavigation_LeftRightForcePanel(t *testing.T) { + m := NewModel(freshCfg()) + m.width, m.height = 100, 30 + + m = send(m, makeKey("l")) + if m.active != panelRight { + t.Errorf("after l: active = %v, want right", m.active) + } + m = send(m, makeKey("h")) + if m.active != panelLeft { + t.Errorf("after h: active = %v, want left", m.active) + } +} + +func TestEnterOnLeftPanelMovesFocusRight(t *testing.T) { + m := NewModel(freshCfg()) + m.width, m.height = 100, 30 + m = send(m, makeKey("enter")) + if m.active != panelRight { + t.Errorf("enter from left 
panel should focus right; got %v", m.active) + } +} + +func TestQuitSetsQuittingFlag(t *testing.T) { + m := NewModel(freshCfg()) + m.width, m.height = 100, 30 + m = send(m, makeKey("q")) + if !m.quitting { + t.Error("q should set quitting=true") + } +} + +func TestHelpToggle(t *testing.T) { + m := NewModel(freshCfg()) + m.width, m.height = 100, 30 + m = send(m, makeKey("?")) + if !m.showHelp { + t.Error("? should show help") + } + // Any key dismisses. + m = send(m, makeKey("x")) + if m.showHelp { + t.Error("any key should dismiss help") + } +} + +func TestRowsFor_ServersIncludesDefaultServerRow(t *testing.T) { + cfg := freshCfg() + rows := rowsFor(cfg, secServers) + if len(rows) != len(cfg.Servers)+1 { + t.Errorf("len(rows) = %d, want %d (servers + default_server)", len(rows), len(cfg.Servers)+1) + } + last := rows[len(rows)-1] + if last.label != "default_server" { + t.Errorf("last row label = %q, want default_server", last.label) + } +} + +func TestRowsFor_WatcherListsAllScalarLeaves(t *testing.T) { + cfg := freshCfg() + rows := rowsFor(cfg, secWatcher) + if len(rows) != 4 { + t.Errorf("watcher rows = %d, want 4 (enabled, debounce, exclude, sync)", len(rows)) + } + want := map[string]bool{ + "enabled": false, + "debounce_ms": false, + "exclude": false, + "sync_interval_mins": false, + } + for _, r := range rows { + want[r.label] = true + } + for label, seen := range want { + if !seen { + t.Errorf("missing row %q", label) + } + } +} + +func TestView_RendersBothPanels(t *testing.T) { + m := NewModel(freshCfg()) + m.width, m.height = 100, 30 + out := m.View() + for _, expect := range []string{"Servers", "Watcher", "Indexing", "Projects", "Misc", "default"} { + if !strings.Contains(out, expect) { + t.Errorf("View missing %q\noutput:\n%s", expect, out) + } + } +} + +func TestView_SensitiveKeyNeverLeaks(t *testing.T) { + cfg := freshCfg() + cfg.Servers[0].Key = "cix_super_secret_xyz" + m := NewModel(cfg) + m.width, m.height = 100, 30 + m.sectionIdx = int(secServers) + 
out := m.View() + if strings.Contains(out, "super_secret") { + t.Errorf("sensitive key leaked into View output") + } + if strings.Contains(out, "cix_super_secret_xyz") { + t.Errorf("sensitive key leaked into View output") + } +} + +func TestPingServer_RejectsBlank(t *testing.T) { + if err := PingServer("", "k"); err == nil { + t.Error("empty URL should fail") + } + if err := PingServer("http://x", ""); err == nil { + t.Error("empty key should fail") + } +} + +func TestClamp(t *testing.T) { + cases := []struct{ n, lo, hi, want int }{ + {5, 0, 10, 5}, + {-1, 0, 10, 0}, + {99, 0, 10, 10}, + {0, 0, 0, 0}, + } + for _, c := range cases { + if got := clamp(c.n, c.lo, c.hi); got != c.want { + t.Errorf("clamp(%d, %d, %d) = %d, want %d", c.n, c.lo, c.hi, got, c.want) + } + } +} diff --git a/cli/internal/config/tui/sections.go b/cli/internal/config/tui/sections.go new file mode 100644 index 0000000..52c4950 --- /dev/null +++ b/cli/internal/config/tui/sections.go @@ -0,0 +1,225 @@ +package tui + +import ( + "fmt" + "reflect" + "strconv" + "strings" + + "github.com/anthropics/code-index/cli/internal/config" + "github.com/anthropics/code-index/cli/internal/config/schema" +) + +// row is one entry rendered in the right panel. The TUI keeps every +// section's rows in this uniform shape so navigation, rendering, and +// edit-mode dispatch share one code path. 
+type row struct { + label string // shown in the "key" column + value string // shown in the "value" column (formatted for display) + + // dispatch on activation + kind rowKind + schemaKey string // for kindScalarEdit / kindBoolToggle: schema dotted key + serverIdx int // for kindServerEdit: index in cfg.Servers + + // metadata + sensitive bool +} + +type rowKind int + +const ( + rowKindInert rowKind = iota // no action on Enter + rowKindScalarEdit // Enter opens text input; SetByPath on save + rowKindBoolToggle // space/x flips bool; Enter does too + rowKindServerEdit // Enter opens server URL/key editor + rowKindDefaultPick // Enter cycles default_server alias +) + +// rowsFor returns the rendered rows for the currently selected section. +// Pure function of cfg — easy to unit-test without spinning up bubbletea. +func rowsFor(cfg *config.Config, sec sectionID) []row { + switch sec { + case secServers: + return serverRows(cfg) + case secWatcher: + return scalarRows(cfg, "watcher.") + case secIndexing: + return scalarRows(cfg, "indexing.") + case secProjects: + return projectRows(cfg) + case secMisc: + return miscRows(cfg) + } + return nil +} + +func serverRows(cfg *config.Config) []row { + if len(cfg.Servers) == 0 { + return nil + } + out := make([]row, 0, len(cfg.Servers)+1) + for i, s := range cfg.Servers { + // Marker shows which entry is the default. The key field is + // rendered via Sensitive — never the raw value. 
+ marker := " " + if s.Name == cfg.DefaultServer { + marker = "●" + } + keyStatus := "(not set)" + if s.Key != "" { + keyStatus = "(set)" + } + out = append(out, row{ + label: fmt.Sprintf("%s %s", marker, s.Name), + value: fmt.Sprintf("%s key %s", s.URL, keyStatus), + kind: rowKindServerEdit, + serverIdx: i, + }) + } + out = append(out, row{ + label: "default_server", + value: cfg.DefaultServer, + kind: rowKindDefaultPick, + schemaKey: "default_server", + }) + return out +} + +// scalarRows walks the schema and emits one row per leaf whose dotted +// key starts with prefix. Slice-of-struct leaves (servers, projects) +// are skipped — they have dedicated sections. +func scalarRows(cfg *config.Config, prefix string) []row { + var out []row + _ = schema.Walk(cfg, func(l schema.LeafField) { + if !strings.HasPrefix(l.Path, prefix) { + return + } + // Skip non-scalar leaves (slice-of-struct never has a path prefix + // in this set, but be defensive). + if l.Value.Kind() == reflect.Slice && l.Value.Type().Elem().Kind() == reflect.Struct { + return + } + out = append(out, row{ + label: strings.TrimPrefix(l.Path, prefix), + value: formatLeafForRow(l), + kind: kindForLeaf(l), + schemaKey: l.Path, + sensitive: l.Sensitive(), + }) + }) + return out +} + +func kindForLeaf(l schema.LeafField) rowKind { + if l.Value.Kind() == reflect.Bool { + return rowKindBoolToggle + } + return rowKindScalarEdit +} + +// formatLeafForRow turns the leaf's current value into a display string +// that fits in one terminal row. Sensitive leaves never expose the value. 
+func formatLeafForRow(l schema.LeafField) string { + if l.Sensitive() { + if l.Value.IsZero() { + return "(not set)" + } + return "(set)" + } + v := l.Value + switch v.Kind() { + case reflect.Bool: + if v.Bool() { + return "✓ enabled" + } + return "✗ disabled" + case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64: + return strconv.FormatInt(v.Int(), 10) + case reflect.String: + return v.String() + case reflect.Slice: + if v.Type().Elem().Kind() == reflect.String { + items := v.Interface().([]string) + return fmt.Sprintf("[%d items] %s", len(items), strings.Join(truncList(items, 3), ", ")) + } + } + return fmt.Sprintf("%v", v.Interface()) +} + +func truncList(xs []string, n int) []string { + if len(xs) <= n { + return xs + } + out := make([]string, 0, n+1) + out = append(out, xs[:n]...) + out = append(out, fmt.Sprintf("…+%d", len(xs)-n)) + return out +} + +func projectRows(cfg *config.Config) []row { + out := make([]row, 0, len(cfg.Projects)) + for _, p := range cfg.Projects { + wState := "✗" + if p.AutoWatch { + wState = "✓" + } + out = append(out, row{ + label: p.Path, + value: fmt.Sprintf("auto-watch %s", wState), + kind: rowKindInert, + }) + } + return out +} + +func miscRows(cfg *config.Config) []row { + return []row{ + { + label: "config file", + value: config.GetConfigPath(), + kind: rowKindInert, + }, + } +} + +// rawScalarValue returns the leaf's current value as the canonical string +// the user will see when entering edit mode. For ints/bools we round-trip +// through strconv so the user can edit the exact form SetByPath expects. +func rawScalarValue(l schema.LeafField) string { + if l.Sensitive() { + // Edit mode does not pre-fill secrets — the user types fresh. 
+ return "" + } + v := l.Value + switch v.Kind() { + case reflect.Bool: + return strconv.FormatBool(v.Bool()) + case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64: + return strconv.FormatInt(v.Int(), 10) + case reflect.String: + return v.String() + case reflect.Slice: + if v.Type().Elem().Kind() == reflect.String { + return strings.Join(v.Interface().([]string), ",") + } + } + return fmt.Sprintf("%v", v.Interface()) +} + +// findLeaf walks the schema and returns the leaf at path. Used to look +// up a leaf's reflect.Value when committing an edit (we cannot keep a +// stale leaf reference across an Update call because the underlying +// reflect.Value may move when cfg is mutated). +func findLeaf(cfg *config.Config, path string) (schema.LeafField, bool) { + var found schema.LeafField + var ok bool + _ = schema.Walk(cfg, func(l schema.LeafField) { + if ok || l.Path != path { + return + } + found = l + ok = true + }) + return found, ok +} diff --git a/cli/internal/config/tui/styles.go b/cli/internal/config/tui/styles.go new file mode 100644 index 0000000..db70d64 --- /dev/null +++ b/cli/internal/config/tui/styles.go @@ -0,0 +1,96 @@ +package tui + +import "github.com/charmbracelet/lipgloss" + +// Color palette — chosen so the TUI keeps decent contrast on both dark and +// light terminal themes. We avoid pure white/black and stick to mid-tones +// that get auto-adapted by lipgloss's terminal-color detection. 
+var ( + colAccent = lipgloss.AdaptiveColor{Light: "#005577", Dark: "#7dd3fc"} + colMuted = lipgloss.AdaptiveColor{Light: "#666666", Dark: "#888888"} + colBorder = lipgloss.AdaptiveColor{Light: "#cccccc", Dark: "#444444"} + colActiveBdr = lipgloss.AdaptiveColor{Light: "#005577", Dark: "#7dd3fc"} + colSel = lipgloss.AdaptiveColor{Light: "#000000", Dark: "#ffffff"} + colSelBg = lipgloss.AdaptiveColor{Light: "#cce5f0", Dark: "#1e3a8a"} + colOK = lipgloss.AdaptiveColor{Light: "#15803d", Dark: "#86efac"} + colWarn = lipgloss.AdaptiveColor{Light: "#a16207", Dark: "#fcd34d"} + colErr = lipgloss.AdaptiveColor{Light: "#b91c1c", Dark: "#fca5a5"} + colSensitive = lipgloss.AdaptiveColor{Light: "#7c3aed", Dark: "#c4b5fd"} +) + +// styles is a flat bundle of every reusable lipgloss.Style. Built once by +// newStyles() and stored on the Model so View() reuses it on every call. +type styles struct { + leftPanel lipgloss.Style + leftPanelActive lipgloss.Style + rightPanel lipgloss.Style + rightPanelActive lipgloss.Style + + sectionRow lipgloss.Style + sectionRowSel lipgloss.Style + sectionCount lipgloss.Style + + rowKey lipgloss.Style + rowValue lipgloss.Style + rowKeySel lipgloss.Style + rowValSel lipgloss.Style + rowMuted lipgloss.Style + + statusBar lipgloss.Style + statusOK lipgloss.Style + statusErr lipgloss.Style + statusKey lipgloss.Style + + header lipgloss.Style + headerActive lipgloss.Style + + editLabel lipgloss.Style + editError lipgloss.Style + + dot lipgloss.Style + dotDimmed lipgloss.Style + sensitiveTag lipgloss.Style +} + +func newStyles() styles { + border := lipgloss.RoundedBorder() + return styles{ + leftPanel: lipgloss.NewStyle(). + Border(border).BorderForeground(colBorder). + Padding(0, 1), + leftPanelActive: lipgloss.NewStyle(). + Border(border).BorderForeground(colActiveBdr). + Padding(0, 1), + rightPanel: lipgloss.NewStyle(). + Border(border).BorderForeground(colBorder). + Padding(0, 1), + rightPanelActive: lipgloss.NewStyle(). 
+ Border(border).BorderForeground(colActiveBdr). + Padding(0, 1), + + sectionRow: lipgloss.NewStyle().Padding(0, 1), + sectionRowSel: lipgloss.NewStyle().Padding(0, 1).Background(colSelBg).Foreground(colSel).Bold(true), + sectionCount: lipgloss.NewStyle().Foreground(colMuted), + + rowKey: lipgloss.NewStyle().Foreground(colAccent), + rowValue: lipgloss.NewStyle(), + rowKeySel: lipgloss.NewStyle().Foreground(colSel).Background(colSelBg).Bold(true), + rowValSel: lipgloss.NewStyle().Foreground(colSel).Background(colSelBg), + rowMuted: lipgloss.NewStyle().Foreground(colMuted), + + statusBar: lipgloss.NewStyle().Foreground(colMuted), + statusOK: lipgloss.NewStyle().Foreground(colOK).Bold(true), + statusErr: lipgloss.NewStyle().Foreground(colErr).Bold(true), + statusKey: lipgloss.NewStyle().Foreground(colAccent).Bold(true), + + header: lipgloss.NewStyle().Foreground(colMuted).Bold(true), + headerActive: lipgloss.NewStyle().Foreground(colAccent).Bold(true), + + editLabel: lipgloss.NewStyle().Foreground(colAccent).Bold(true), + editError: lipgloss.NewStyle().Foreground(colErr), + + dot: lipgloss.NewStyle().Foreground(colOK), + dotDimmed: lipgloss.NewStyle().Foreground(colMuted), + sensitiveTag: lipgloss.NewStyle().Foreground(colSensitive).Italic(true), + } +} diff --git a/cli/internal/config/tui/tui.go b/cli/internal/config/tui/tui.go new file mode 100644 index 0000000..70ddb31 --- /dev/null +++ b/cli/internal/config/tui/tui.go @@ -0,0 +1,57 @@ +// Package tui implements the interactive configuration editor used by +// `cix config edit` and `cix config init`. +// +// The editor is a full-screen bubbletea program — Elm-style state +// machine, lipgloss for styling, bubbles/textinput for inline edits. +// Layout is lazygit-inspired: section list on the left, content panel +// on the right, persistent help bar at the bottom. 
+// +// Every edit goes through the existing config.SetByPath / SetServerURL / +// SetServerKey paths — the TUI never touches the YAML directly, so +// validation, schema rules, and persistence are exactly what the CLI's +// `cix config set` would apply. +package tui + +import ( + "fmt" + + tea "github.com/charmbracelet/bubbletea" + + "github.com/anthropics/code-index/cli/internal/client" + "github.com/anthropics/code-index/cli/internal/config" +) + +// RunEdit boots the TUI against the current config. cfg is mutated via +// the config package's CRUD helpers — every successful keystroke writes +// straight to disk, so the function never needs an explicit Save step. +func RunEdit(cfg *config.Config) error { + model := NewModel(cfg) + p := tea.NewProgram(model, tea.WithAltScreen()) + _, err := p.Run() + return err +} + +// RunInit is the fresh-install variant. If no servers exist yet we seed +// the implicit localhost default so the user has something to edit on +// the first screen, then hand off to the standard editor. +func RunInit(cfg *config.Config) error { + if len(cfg.Servers) == 0 { + if err := config.SetServerURL(config.DefaultServerName, "http://localhost:21847"); err != nil { + return fmt.Errorf("seed default server: %w", err) + } + } + return RunEdit(cfg) +} + +// PingServer is a thin wrapper around client.Health used by the "test +// connection" action ('t' on a server row). Exported so the test file +// can exercise the error-mapping path without spinning up bubbletea. 
+func PingServer(url, key string) error { + if url == "" { + return fmt.Errorf("URL is required") + } + if key == "" { + return fmt.Errorf("API key is required") + } + return client.New(url, key).Health() +} diff --git a/cli/internal/config/tui/update.go b/cli/internal/config/tui/update.go new file mode 100644 index 0000000..1541032 --- /dev/null +++ b/cli/internal/config/tui/update.go @@ -0,0 +1,432 @@ +package tui + +import ( + "fmt" + "strconv" + "strings" + + "github.com/charmbracelet/bubbles/key" + "github.com/charmbracelet/bubbles/textinput" + tea "github.com/charmbracelet/bubbletea" + + "github.com/anthropics/code-index/cli/internal/config" +) + +// Init satisfies tea.Model — nothing to fire at startup. +func (m Model) Init() tea.Cmd { return nil } + +// Update is the central message handler. bubbletea calls it once per +// input event (key, resize, custom message). Every transition goes +// through here — the View() function is a pure read of the Model. +func (m Model) Update(msg tea.Msg) (tea.Model, tea.Cmd) { + switch msg := msg.(type) { + case tea.WindowSizeMsg: + m.width, m.height = msg.Width, msg.Height + return m, nil + case tea.KeyMsg: + return m.handleKey(msg) + } + return m, nil +} + +func (m Model) handleKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) { + // In edit mode (scalar or server), most keys go to the text input. + // Esc cancels, Enter commits. + if m.editing { + return m.handleEditKey(msg) + } + + // "Add server" flow uses the same text input but a different commit path. + if m.addingServer { + return m.handleAddKey(msg) + } + + // Help overlay traps every key — any input dismisses it. 
+ if m.showHelp { + m.showHelp = false + return m, nil + } + + m.clearStatus() + + switch { + case key.Matches(msg, m.keys.Quit): + m.quitting = true + return m, tea.Quit + case key.Matches(msg, m.keys.Help): + m.showHelp = true + return m, nil + case key.Matches(msg, m.keys.NextPanel): + if m.active == panelLeft { + m.active = panelRight + } else { + m.active = panelLeft + } + return m, nil + case key.Matches(msg, m.keys.Left): + m.active = panelLeft + return m, nil + case key.Matches(msg, m.keys.Right): + m.active = panelRight + return m, nil + case key.Matches(msg, m.keys.Up): + return m.moveSelection(-1), nil + case key.Matches(msg, m.keys.Down): + return m.moveSelection(+1), nil + case key.Matches(msg, m.keys.Enter): + return m.activateRow() + case key.Matches(msg, m.keys.Toggle): + return m.toggleRow() + case key.Matches(msg, m.keys.Add): + if sectionID(m.sectionIdx) == secServers { + return m.beginAddServer() + } + return m, nil + case key.Matches(msg, m.keys.Delete): + if sectionID(m.sectionIdx) == secServers && m.active == panelRight { + return m.deleteSelectedServer() + } + return m, nil + case key.Matches(msg, m.keys.MarkDef): + if sectionID(m.sectionIdx) == secServers && m.active == panelRight { + return m.markDefaultServer() + } + return m, nil + case key.Matches(msg, m.keys.Test): + if sectionID(m.sectionIdx) == secServers && m.active == panelRight { + return m.testSelectedServer() + } + return m, nil + } + return m, nil +} + +func (m Model) moveSelection(delta int) Model { + if m.active == panelLeft { + m.sectionIdx = clamp(m.sectionIdx+delta, 0, int(numSections)-1) + m.rowIdx = 0 + } else { + n := m.numRowsRight() + if n == 0 { + m.rowIdx = 0 + } else { + m.rowIdx = clamp(m.rowIdx+delta, 0, n-1) + } + } + return m +} + +func (m Model) activateRow() (tea.Model, tea.Cmd) { + // Enter on left panel == "focus right panel". 
+ if m.active == panelLeft { + m.active = panelRight + m.rowIdx = 0 + return m, nil + } + rows := rowsFor(m.cfg, sectionID(m.sectionIdx)) + if m.rowIdx >= len(rows) { + return m, nil + } + r := rows[m.rowIdx] + switch r.kind { + case rowKindBoolToggle: + return m.toggleBoolByPath(r.schemaKey) + case rowKindScalarEdit: + return m.beginEditScalar(r.schemaKey) + case rowKindServerEdit: + return m.beginEditServer(r.serverIdx, editPurposeServerURL) + case rowKindDefaultPick: + return m.cycleDefaultServer() + } + return m, nil +} + +func (m Model) toggleRow() (tea.Model, tea.Cmd) { + if m.active != panelRight { + return m, nil + } + rows := rowsFor(m.cfg, sectionID(m.sectionIdx)) + if m.rowIdx >= len(rows) { + return m, nil + } + if rows[m.rowIdx].kind == rowKindBoolToggle { + return m.toggleBoolByPath(rows[m.rowIdx].schemaKey) + } + return m, nil +} + +func (m Model) toggleBoolByPath(path string) (tea.Model, tea.Cmd) { + leaf, ok := findLeaf(m.cfg, path) + if !ok { + m.setStatus(fmt.Sprintf("unknown key %q", path), false) + return m, nil + } + newVal := !leaf.Value.Bool() + if err := config.SetByPath(path, strconv.FormatBool(newVal)); err != nil { + m.setStatus(err.Error(), false) + return m, nil + } + m.setStatus(fmt.Sprintf("✓ %s = %v", path, newVal), true) + return m, nil +} + +func (m Model) beginEditScalar(path string) (tea.Model, tea.Cmd) { + leaf, ok := findLeaf(m.cfg, path) + if !ok { + m.setStatus(fmt.Sprintf("unknown key %q", path), false) + return m, nil + } + m.editing = true + m.editPurpose = editPurposeScalar + m.editKey = path + m.editErr = "" + m.editInput.SetValue(rawScalarValue(leaf)) + m.editInput.CursorEnd() + if leaf.Sensitive() { + m.editInput.EchoMode = textinput.EchoPassword + } else { + m.editInput.EchoMode = textinput.EchoNormal + } + m.editInput.Focus() + return m, nil +} + +func (m Model) beginEditServer(idx int, purpose editPurpose) (tea.Model, tea.Cmd) { + if idx < 0 || idx >= len(m.cfg.Servers) { + return m, nil + } + m.editing = true + 
m.editPurpose = purpose + m.editServer = idx + m.editErr = "" + current := "" + if purpose == editPurposeServerURL { + current = m.cfg.Servers[idx].URL + m.editInput.EchoMode = textinput.EchoNormal + } else { + // Editing the key — never pre-fill, never echo. + current = "" + m.editInput.EchoMode = textinput.EchoPassword + } + m.editInput.SetValue(current) + m.editInput.CursorEnd() + m.editInput.Focus() + return m, nil +} + +func (m Model) cycleDefaultServer() (tea.Model, tea.Cmd) { + if len(m.cfg.Servers) == 0 { + return m, nil + } + // Find current and pick the next one. + curIdx := 0 + for i, s := range m.cfg.Servers { + if s.Name == m.cfg.DefaultServer { + curIdx = i + break + } + } + next := m.cfg.Servers[(curIdx+1)%len(m.cfg.Servers)].Name + if err := config.SetDefaultServer(next); err != nil { + m.setStatus(err.Error(), false) + return m, nil + } + m.setStatus(fmt.Sprintf("default_server = %s", next), true) + return m, nil +} + +func (m Model) markDefaultServer() (tea.Model, tea.Cmd) { + if m.rowIdx >= len(m.cfg.Servers) { + return m, nil + } + name := m.cfg.Servers[m.rowIdx].Name + if err := config.SetDefaultServer(name); err != nil { + m.setStatus(err.Error(), false) + return m, nil + } + m.setStatus(fmt.Sprintf("default_server = %s", name), true) + return m, nil +} + +func (m Model) deleteSelectedServer() (tea.Model, tea.Cmd) { + if m.rowIdx >= len(m.cfg.Servers) { + return m, nil + } + name := m.cfg.Servers[m.rowIdx].Name + reassigned, err := config.RemoveServer(name) + if err != nil { + m.setStatus(err.Error(), false) + return m, nil + } + msg := fmt.Sprintf("removed server %q", name) + if reassigned != "" { + msg += fmt.Sprintf("; default → %q", reassigned) + } + m.setStatus(msg, true) + m.clampRow() + return m, nil +} + +func (m Model) testSelectedServer() (tea.Model, tea.Cmd) { + if m.rowIdx >= len(m.cfg.Servers) { + return m, nil + } + s := m.cfg.Servers[m.rowIdx] + if err := PingServer(s.URL, s.Key); err != nil { + m.setStatus(fmt.Sprintf("✗ %s: 
%v", s.Name, err), false) + return m, nil + } + m.setStatus(fmt.Sprintf("✓ %s reachable", s.Name), true) + return m, nil +} + +func (m Model) beginAddServer() (tea.Model, tea.Cmd) { + m.addingServer = true + m.addStep = 0 + m.addName = "" + m.addURL = "" + m.editInput.SetValue("") + m.editInput.EchoMode = textinput.EchoNormal + m.editInput.Focus() + return m, nil +} + +func (m Model) handleEditKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) { + switch msg.String() { + case "esc": + m.editing = false + m.editErr = "" + m.editInput.Blur() + m.setStatus("edit cancelled", false) + return m, nil + case "enter": + return m.commitEdit() + } + var cmd tea.Cmd + m.editInput, cmd = m.editInput.Update(msg) + return m, cmd +} + +func (m Model) commitEdit() (tea.Model, tea.Cmd) { + value := strings.TrimRight(m.editInput.Value(), "\r\n") + switch m.editPurpose { + case editPurposeScalar: + if err := config.SetByPath(m.editKey, value); err != nil { + m.editErr = err.Error() + return m, nil + } + m.editing = false + m.editInput.Blur() + m.setStatus(fmt.Sprintf("✓ %s saved", m.editKey), true) + return m, nil + case editPurposeServerURL: + if m.editServer < 0 || m.editServer >= len(m.cfg.Servers) { + m.editing = false + return m, nil + } + name := m.cfg.Servers[m.editServer].Name + if err := config.SetServerURL(name, value); err != nil { + m.editErr = err.Error() + return m, nil + } + // Chain: after URL, prompt for key. + m.editPurpose = editPurposeServerKey + m.editInput.SetValue("") + m.editInput.EchoMode = textinput.EchoPassword + m.editErr = "" + return m, nil + case editPurposeServerKey: + if m.editServer < 0 || m.editServer >= len(m.cfg.Servers) { + m.editing = false + return m, nil + } + name := m.cfg.Servers[m.editServer].Name + if value == "" { + // Empty input on the key step means "skip — keep current". 
+ m.editing = false + m.editInput.Blur() + m.setStatus(fmt.Sprintf("✓ %s URL updated", name), true) + return m, nil + } + if err := config.SetServerKey(name, value); err != nil { + m.editErr = err.Error() + return m, nil + } + m.editing = false + m.editInput.Blur() + m.setStatus(fmt.Sprintf("✓ %s URL+key updated", name), true) + return m, nil + } + m.editing = false + return m, nil +} + +func (m Model) handleAddKey(msg tea.KeyMsg) (tea.Model, tea.Cmd) { + switch msg.String() { + case "esc": + m.addingServer = false + m.editInput.Blur() + m.setStatus("add cancelled", false) + return m, nil + case "enter": + return m.commitAddStep() + } + var cmd tea.Cmd + m.editInput, cmd = m.editInput.Update(msg) + return m, cmd +} + +func (m Model) commitAddStep() (tea.Model, tea.Cmd) { + value := strings.TrimSpace(m.editInput.Value()) + switch m.addStep { + case 0: // name + if value == "" { + m.editErr = "name must not be empty" + return m, nil + } + m.addName = value + m.addStep = 1 + m.editInput.SetValue("") + m.editInput.EchoMode = textinput.EchoNormal + m.editErr = "" + return m, nil + case 1: // url + if value == "" { + m.editErr = "URL must not be empty" + return m, nil + } + m.addURL = value + m.addStep = 2 + m.editInput.SetValue("") + m.editInput.EchoMode = textinput.EchoPassword + m.editErr = "" + return m, nil + case 2: // key — optional; empty means "add server with no key" + if err := config.SetServerURL(m.addName, m.addURL); err != nil { + m.editErr = err.Error() + return m, nil + } + if value != "" { + if err := config.SetServerKey(m.addName, value); err != nil { + m.editErr = err.Error() + return m, nil + } + } + m.addingServer = false + m.editInput.Blur() + m.setStatus(fmt.Sprintf("✓ added server %q", m.addName), true) + return m, nil + } + return m, nil +} + +// clamp keeps n inside [lo, hi]. Stdlib doesn't ship one yet. 
+func clamp(n, lo, hi int) int { + if n < lo { + return lo + } + if n > hi { + return hi + } + return n +} diff --git a/cli/internal/config/tui/view.go b/cli/internal/config/tui/view.go new file mode 100644 index 0000000..53b5cf3 --- /dev/null +++ b/cli/internal/config/tui/view.go @@ -0,0 +1,283 @@ +package tui + +import ( + "fmt" + "strings" + + "github.com/charmbracelet/lipgloss" + + "github.com/anthropics/code-index/cli/internal/config" +) + +// View renders the full screen. bubbletea calls this once per Update. +// +// Layout: +// +// ┌─ sections ─┬─ ─────────────┐ +// │ ▶ Servers │ ...rows... │ +// │ Watcher │ │ +// │ … │ │ +// └────────────┴───────────────────────────────────┘ +// status / edit prompt +// short-help bar (always visible) +// +// The help overlay (when shown) replaces the body but keeps the bars +// underneath for context. +func (m Model) View() string { + if m.quitting { + return "" + } + if m.width == 0 || m.height == 0 { + return "initializing…" + } + + bodyH := m.height - 3 // 1 status, 1 short-help, 1 spacing + if bodyH < 4 { + bodyH = 4 + } + + if m.showHelp { + return m.renderHelp(bodyH) + } + + leftW := 22 + rightW := m.width - leftW - 4 // borders + padding + if rightW < 20 { + rightW = 20 + } + + left := m.renderSections(leftW, bodyH) + right := m.renderRightPanel(rightW, bodyH) + + body := lipgloss.JoinHorizontal(lipgloss.Top, left, right) + + statusLine := m.renderStatusOrEdit(m.width) + helpLine := m.renderShortHelp(m.width) + + return lipgloss.JoinVertical(lipgloss.Left, body, statusLine, helpLine) +} + +// renderSections draws the left panel — the list of section names with +// the current selection highlighted. Width and height are the inner box +// size; borders add 2 each. 
+func (m Model) renderSections(w, h int) string { + var lines []string + for i := 0; i < int(numSections); i++ { + s := sectionID(i) + marker := " " + if i == m.sectionIdx { + if m.active == panelLeft { + marker = "▶ " + } else { + marker = "● " + } + } + count := sectionCount(m.cfg, s) + label := fmt.Sprintf("%s%-9s %s", marker, s.Label(), m.styles.sectionCount.Render(countLabel(count))) + if i == m.sectionIdx { + lines = append(lines, m.styles.sectionRowSel.Render(label)) + } else { + lines = append(lines, m.styles.sectionRow.Render(label)) + } + } + for len(lines) < h-2 { + lines = append(lines, "") + } + + style := m.styles.leftPanel + if m.active == panelLeft { + style = m.styles.leftPanelActive + } + return style.Width(w).Height(h).Render(strings.Join(lines, "\n")) +} + +// sectionCount returns the number that appears as a small "n" badge next +// to the section label — total items for lists, total tunable fields for +// scalar groups. Cheap to recompute on every View() call. +func sectionCount(cfg *config.Config, s sectionID) int { + switch s { + case secServers: + return len(cfg.Servers) + case secProjects: + return len(cfg.Projects) + case secWatcher: + return 4 + case secIndexing: + return 2 + } + return 0 +} + +func countLabel(n int) string { + if n <= 0 { + return "" + } + return fmt.Sprintf("%d", n) +} + +// renderRightPanel draws the content panel for the selected section. 
+func (m Model) renderRightPanel(w, h int) string { + rows := rowsFor(m.cfg, sectionID(m.sectionIdx)) + + title := m.styles.header.Render(sectionID(m.sectionIdx).Label()) + if m.active == panelRight { + title = m.styles.headerActive.Render(sectionID(m.sectionIdx).Label()) + } + + var lines []string + lines = append(lines, title, "") + + if len(rows) == 0 { + lines = append(lines, m.styles.rowMuted.Render(m.emptySectionHint())) + } + + keyW := keyColumnWidth(rows) + for i, r := range rows { + selected := i == m.rowIdx && m.active == panelRight + + keyText := r.label + valText := r.value + if r.sensitive && !selected { + valText = m.styles.sensitiveTag.Render(valText) + } + + var line string + switch { + case selected: + line = m.styles.rowKeySel.Render(padRight(keyText, keyW)) + + m.styles.rowValSel.Render(" "+valText) + default: + line = m.styles.rowKey.Render(padRight(keyText, keyW)) + + " " + m.styles.rowValue.Render(valText) + } + lines = append(lines, line) + } + + // Section-specific footer (action hints). + footer := m.sectionFooter() + if footer != "" { + // Pad up so the footer pins to the bottom. 
+ for len(lines) < h-3 { + lines = append(lines, "") + } + lines = append(lines, m.styles.rowMuted.Render(footer)) + } + + style := m.styles.rightPanel + if m.active == panelRight { + style = m.styles.rightPanelActive + } + return style.Width(w).Height(h).Render(strings.Join(lines, "\n")) +} + +func (m Model) emptySectionHint() string { + switch sectionID(m.sectionIdx) { + case secServers: + return "no servers configured — press 'a' to add one" + case secProjects: + return "no projects — register one with `cix init`" + } + return "" +} + +func (m Model) sectionFooter() string { + switch sectionID(m.sectionIdx) { + case secServers: + return "enter edit · a add · d delete · m mark default · t test" + case secWatcher, secIndexing: + return "enter edit · space toggle bool" + case secProjects: + return "managed via `cix init` / dashboard" + } + return "" +} + +func keyColumnWidth(rows []row) int { + w := 12 + for _, r := range rows { + if l := lipgloss.Width(r.label); l > w { + w = l + } + } + if w > 32 { + w = 32 + } + return w +} + +func padRight(s string, w int) string { + if lipgloss.Width(s) >= w { + return s + } + return s + strings.Repeat(" ", w-lipgloss.Width(s)) +} + +// renderStatusOrEdit shows either the active edit input or the last +// status message. Edit takes precedence — there's no confusion about +// where keystrokes are going. 
+func (m Model) renderStatusOrEdit(w int) string { + if m.editing { + label := m.editLabel() + line := m.styles.editLabel.Render(label+":") + " " + m.editInput.View() + if m.editErr != "" { + line += " " + m.styles.editError.Render(m.editErr) + } + return line + } + if m.addingServer { + label := []string{"name", "URL", "API key (optional)"}[m.addStep] + line := m.styles.editLabel.Render("add server "+label+":") + " " + m.editInput.View() + if m.editErr != "" { + line += " " + m.styles.editError.Render(m.editErr) + } + return line + } + if m.statusMsg != "" { + style := m.styles.statusOK + if m.statusErr { + style = m.styles.statusErr + } + return style.Render(m.statusMsg) + } + return "" +} + +func (m Model) editLabel() string { + switch m.editPurpose { + case editPurposeScalar: + return m.editKey + case editPurposeServerURL: + if m.editServer < len(m.cfg.Servers) { + return m.cfg.Servers[m.editServer].Name + " URL" + } + case editPurposeServerKey: + if m.editServer < len(m.cfg.Servers) { + return m.cfg.Servers[m.editServer].Name + " API key (empty=keep)" + } + } + return "edit" +} + +// renderShortHelp is the always-on bottom hint bar. +func (m Model) renderShortHelp(_ int) string { + parts := []string{} + for _, b := range m.keys.shortHelp() { + parts = append(parts, fmt.Sprintf("%s %s", b.Help().Key, b.Help().Desc)) + } + return m.styles.statusBar.Render(strings.Join(parts, " · ")) +} + +// renderHelp shows the full key table when ? is pressed. 
+func (m Model) renderHelp(bodyH int) string { + var b strings.Builder + b.WriteString(m.styles.headerActive.Render("Keybindings") + "\n\n") + for _, group := range m.keys.fullHelp() { + for _, k := range group { + b.WriteString(fmt.Sprintf(" %-12s %s\n", + m.styles.statusKey.Render(k.Help().Key), k.Help().Desc)) + } + b.WriteString("\n") + } + b.WriteString(m.styles.statusBar.Render("press any key to dismiss")) + return lipgloss.Place(m.width, bodyH, lipgloss.Center, lipgloss.Center, b.String()) +} diff --git a/cli/internal/config/validator.go b/cli/internal/config/validator.go new file mode 100644 index 0000000..259db80 --- /dev/null +++ b/cli/internal/config/validator.go @@ -0,0 +1,94 @@ +package config + +import ( + "fmt" + "strings" + + "github.com/go-playground/validator/v10" +) + +// validate is the singleton validator. Reused across calls so its tag cache +// is hot and so external code can register custom tags once. +var validate = validator.New(validator.WithRequiredStructEnabled()) + +// Validate runs struct-tag validation on cfg and returns a friendly, +// dotted-key error if any field fails. Returns nil when every constraint is +// satisfied. Slice elements are NOT validated transitively in this pass — +// add `dive` to the relevant slice field's tag if/when that's wanted. +// +// Validate is NOT called by Load() — a malformed value in an existing on- +// disk config should not brick the CLI. Validate is the gate for +// mutations: `cix config set`, the TUI's per-field error rendering, and +// any future `cix config doctor` will all call it. 
+func Validate(cfg *Config) error { + if cfg == nil { + return fmt.Errorf("nil config") + } + err := validate.Struct(cfg) + if err == nil { + return nil + } + var verrs validator.ValidationErrors + if !asValidationErrors(err, &verrs) { + return err + } + parts := make([]string, 0, len(verrs)) + for _, ve := range verrs { + parts = append(parts, formatFieldError(ve)) + } + return fmt.Errorf("config validation failed: %s", strings.Join(parts, "; ")) +} + +// asValidationErrors is a tiny helper that avoids importing errors.As at +// every call site and keeps the type assertion explicit. +func asValidationErrors(err error, out *validator.ValidationErrors) bool { + if ve, ok := err.(validator.ValidationErrors); ok { + *out = ve + return true + } + return false +} + +// formatFieldError turns a validator FieldError into a user-readable line. +// The validator reports paths in Go-struct form (e.g. Watcher.DebounceMS); +// callers usually want the dotted YAML/key form (e.g. watcher.debounce_ms). +// Translation is best-effort via a small hand-written map — keeping it +// hard-coded avoids paying for another reflect walk on every error. +func formatFieldError(ve validator.FieldError) string { + key := goPathToKey(ve.Namespace()) + switch ve.Tag() { + case "min": + return fmt.Sprintf("%s must be ≥ %s (got %v)", key, ve.Param(), ve.Value()) + case "max": + return fmt.Sprintf("%s must be ≤ %s (got %v)", key, ve.Param(), ve.Value()) + case "url": + return fmt.Sprintf("%s must be a valid URL (got %q)", key, ve.Value()) + case "required": + return fmt.Sprintf("%s is required", key) + default: + return fmt.Sprintf("%s failed %q (got %v)", key, ve.Tag(), ve.Value()) + } +} + +// goPathToKey maps validator's Namespace ("Config.Watcher.DebounceMS") onto +// the user-facing dotted key ("watcher.debounce_ms"). Best-effort: unknown +// paths are lowercased. 
+var goPathToKey = func(ns string) string { + switch ns { + case "Config.Watcher.Enabled": + return "watcher.enabled" + case "Config.Watcher.DebounceMS": + return "watcher.debounce_ms" + case "Config.Watcher.ExcludePatterns": + return "watcher.exclude" + case "Config.Watcher.SyncIntervalMins": + return "watcher.sync_interval_mins" + case "Config.Indexing.BatchSize": + return "indexing.batch_size" + case "Config.Indexing.StreamingIdleTimeoutSec": + return "indexing.streaming_idle_timeout_sec" + case "Config.DefaultServer": + return "default_server" + } + return strings.ToLower(ns) +} diff --git a/cli/internal/config/validator_test.go b/cli/internal/config/validator_test.go new file mode 100644 index 0000000..920f8be --- /dev/null +++ b/cli/internal/config/validator_test.go @@ -0,0 +1,107 @@ +package config + +import ( + "strings" + "testing" +) + +// TestValidate_OK exercises every default value to confirm the baseline +// config passes validation. Failure here means a `default:"…"` tag is +// outside its own `validate:"…"` range — an internal contradiction in the +// schema, not a user error. +func TestValidate_OK(t *testing.T) { + cfg := &Config{ + Servers: []ServerEntry{ + {Name: "default", URL: "http://localhost:21847", Key: "k"}, + }, + DefaultServer: "default", + Watcher: WatcherConfig{ + Enabled: true, + DebounceMS: 5000, + ExcludePatterns: []string{".git"}, + SyncIntervalMins: 5, + }, + Indexing: IndexingConfig{ + BatchSize: 20, + StreamingIdleTimeoutSec: 30, + }, + } + if err := Validate(cfg); err != nil { + t.Errorf("Validate on canonical defaults: %v", err) + } +} + +func TestValidate_Failures(t *testing.T) { + cases := []struct { + name string + mut func(*Config) + // expectKey is the dotted-key fragment we expect in the error + // message; if absent we know goPathToKey is wrong. 
+ expectKey string + }{ + { + name: "debounce too low", + mut: func(c *Config) { c.Watcher.DebounceMS = 50 }, + expectKey: "watcher.debounce_ms", + }, + { + name: "debounce too high", + mut: func(c *Config) { c.Watcher.DebounceMS = 999999 }, + expectKey: "watcher.debounce_ms", + }, + { + name: "sync interval zero", + mut: func(c *Config) { c.Watcher.SyncIntervalMins = 0 }, + expectKey: "watcher.sync_interval_mins", + }, + { + name: "batch size zero", + mut: func(c *Config) { c.Indexing.BatchSize = 0 }, + expectKey: "indexing.batch_size", + }, + { + name: "streaming idle negative", + mut: func(c *Config) { c.Indexing.StreamingIdleTimeoutSec = -1 }, + expectKey: "indexing.streaming_idle_timeout_sec", + }, + } + + for _, tc := range cases { + t.Run(tc.name, func(t *testing.T) { + cfg := validBaseConfig() + tc.mut(cfg) + err := Validate(cfg) + if err == nil { + t.Fatalf("expected validation error, got nil") + } + if !strings.Contains(err.Error(), tc.expectKey) { + t.Errorf("error %q does not reference key %q", err, tc.expectKey) + } + }) + } +} + +func TestValidate_NilConfig(t *testing.T) { + if err := Validate(nil); err == nil { + t.Errorf("expected error on nil config") + } +} + +func validBaseConfig() *Config { + return &Config{ + Servers: []ServerEntry{ + {Name: "default", URL: "http://localhost:21847"}, + }, + DefaultServer: "default", + Watcher: WatcherConfig{ + Enabled: true, + DebounceMS: 5000, + ExcludePatterns: []string{".git"}, + SyncIntervalMins: 5, + }, + Indexing: IndexingConfig{ + BatchSize: 20, + StreamingIdleTimeoutSec: 30, + }, + } +} diff --git a/doc/CLI_CONFIG.md b/doc/CLI_CONFIG.md new file mode 100644 index 0000000..eef97b9 --- /dev/null +++ b/doc/CLI_CONFIG.md @@ -0,0 +1,108 @@ +# `cix` CLI configuration — reference + +Comprehensive reference for everything the `cix` CLI lets you configure. +For a quick tour see [`cli/README.md`](../cli/README.md#run-against-a-server). 
+ +## File location + +`~/.cix/config.yaml` — created on first `cix config set …` / +`cix config init`. The CLI seeds an implicit `default` server pointing at +`http://localhost:21847` if no file exists yet, but does not materialise +that to disk until the user writes something. + +## Precedence + +``` +CLI flag (--server / --api-url / --api-key) + ↓ +Environment variable (CIX_SERVER / CIX_API_URL / CIX_API_KEY) + ↓ +~/.cix/config.yaml + ↓ +Built-in default (struct-tag default:"…") +``` + +Env overrides never write back to disk. The three env vars are the entire +env surface — knobs like `watcher.debounce_ms` are persistent +preferences and have no env binding. + +## Commands + +| Command | Purpose | +|----------------------------------------|---------| +| `cix config show` | Human-readable dump of the current configuration | +| `cix config keys` | List every settable key with default, env, description | +| `cix config set <key> <value>` | Set one key — supports scalars + comma-separated lists | +| `cix config unset server.<name>[.key]` | Remove a server entry or just clear its key | +| `cix config edit` | Interactive TUI form (huh) for the whole file | +| `cix config init` | First-run wizard — same form, pre-seeded for fresh installs | +| `cix config path` | Print the file path (useful in scripts) | + +## Keys + +### Server selection + +| Key | Type | Default | Description | +|---------------------------|--------|------------------------------|-------------| +| `servers` | list | `[{default → localhost}]` | Managed via `cix config set server.<name>.url|key` | +| `default_server` | string | `default` | Active alias when `--server`/`CIX_SERVER` are unset | +| `server.<name>.url` | string | — | URL of a named server (creates the entry on first set) | +| `server.<name>.key` | string | — | API key for the named server (sensitive — never printed) | +| `api.url` / `api.key` | string | — | Legacy aliases — operate on the default server | + +### File watcher + +| Key | Type | Default | Validation |
+|--------------------------------|----------|------------------------------------------------------------------------|------------| +| `watcher.enabled` | bool | `true` | — | +| `watcher.debounce_ms` | int | `5000` | 100 — 60000 | +| `watcher.sync_interval_mins` | int | `5` | ≥ 1 | +| `watcher.exclude` | []string | `node_modules,.git,.venv,__pycache__,dist,build,.next,.cache,.DS_Store` | comma-separated; REPLACE semantics on set | + +### Indexing + +| Key | Type | Default | Validation | Description | +|----------------------------------------|------|---------|------------|-------------| +| `indexing.batch_size` | int | `20` | ≥ 1 | Per-batch file count for the upload pipeline | +| `indexing.streaming_idle_timeout_sec` | int | `30` | ≥ 0 | Max silence on streaming `/index/files` before giving up; 0 disables | + +### Projects + +| Key | Type | Description | +|----------------------------|-------------|-------------| +| `projects` | list | Managed via `cix init` / dashboard — not editable via `config set` | + +## Env vars + +| Variable | Overrides | Notes | +|-----------------|---------------------------------|-------| +| `CIX_SERVER` | `default_server` | Used only when `--server` is empty | +| `CIX_API_URL` | resolved server's `url` | Local override; never persisted | +| `CIX_API_KEY` | resolved server's `key` | Designed for `secrets.CIX_API_KEY` in CI | + +## Implementation notes + +- **Single source of truth**: every key, default, validation rule, and + description lives on the corresponding Go struct field as a tag + (`yaml`, `key`, `default`, `validate`, `env`, `desc`, `sensitive`). + All five surfaces — load, save, show, set, TUI — read from this + schema via reflection. +- **Loader**: [`knadh/koanf v2`](https://github.com/knadh/koanf) layers + defaults (from tags) and the YAML file. Legacy lowercase keys + (`debouncems`, `excludepatterns`, `cachettl`, `autowatch`, `batchsize`) + are normalised in-place pre-parse so old files keep loading. 
The + `api:` block is migrated into the multi-server `servers:` list and + cleared on the next save. +- **Validation**: [`go-playground/validator/v10`](https://github.com/go-playground/validator) + validates the whole `Config` after every mutation via `cix config set` + and on TUI form submit. `Load()` itself does NOT validate — a + malformed value in an on-disk file must not brick the CLI; bad values + surface the next time the user tries to change something. +- **TUI**: [`charmbracelet/huh`](https://github.com/charmbracelet/huh) + builds the paged form. The Charm stack (`huh` + `bubbletea` + + `lipgloss`) is the visual layer for any future TUI screens too. +- **Sensitive fields**: `ServerEntry.Key` carries `sensitive:"true"`. + Renderers print `(set)` / `(not set)`, never the value. CodeQL's + `go/clear-text-logging` heuristic flags reads of `*Key`/`*Secret` + into named variables, so the tag is read off `reflect.StructField` + and the value goes through `reflect.Value.IsZero()` only. diff --git a/doc/openapi.yaml b/doc/openapi.yaml index 06fac13..e9943b0 100644 --- a/doc/openapi.yaml +++ b/doc/openapi.yaml @@ -515,6 +515,131 @@ paths: "403": $ref: "#/components/responses/Forbidden" + /api/v1/admin/embedding-providers: + get: + operationId: listEmbeddingProviders + tags: [admin] + summary: List registered embedding-provider kinds (admin only) + description: | + Returns one entry per registered provider kind with its config + schema and the names of the env vars it reads for credentials, + plus whether those env vars are currently set on the server. + The dashboard uses this to render the kind dropdown, the + per-kind form, and the "set CIX_VOYAGE_API_KEY before saving" + banner when a key is missing. 
+ responses: + "200": + description: List of registered providers + content: + application/json: + schema: + $ref: "#/components/schemas/EmbeddingProviderList" + "401": + $ref: "#/components/responses/Unauthorized" + "403": + $ref: "#/components/responses/Forbidden" + + /api/v1/admin/embedding-providers/active: + get: + operationId: getActiveEmbeddingProvider + tags: [admin] + summary: Get the currently active embedding provider (admin only) + description: | + Returns the persisted provider selection (kind + JSON config + blob) and the live `Provider.ID()` fingerprint. API keys are + never persisted, so the config blob carries env-var NAMES + only — safe to surface to admin clients verbatim. + responses: + "200": + description: Currently active provider + content: + application/json: + schema: + $ref: "#/components/schemas/ActiveEmbeddingProvider" + "401": + $ref: "#/components/responses/Unauthorized" + "403": + $ref: "#/components/responses/Forbidden" + "503": + description: Embeddings service not wired (e.g. CIX_EMBEDDINGS_ENABLED=false) + put: + operationId: switchEmbeddingProvider + tags: [admin] + summary: Switch to a different embedding provider (admin only) + description: | + Atomic switch. The server validates the submitted config, persists + it, then swaps the live Service over (drains the queue first). + On any error the existing provider stays untouched. + + Switching changes the active `Provider.ID()` fingerprint; every + project's `indexed_with_model` becomes stale and the next + clone job per project triggers a full reindex + (`mode=full, reason=model-change`). 
+ requestBody: + required: true + content: + application/json: + schema: + $ref: "#/components/schemas/SwitchEmbeddingProviderRequest" + responses: + "202": + description: Switch accepted; new provider is live + content: + application/json: + schema: + $ref: "#/components/schemas/ActiveEmbeddingProvider" + "400": + $ref: "#/components/responses/BadRequest" + "401": + $ref: "#/components/responses/Unauthorized" + "403": + $ref: "#/components/responses/Forbidden" + + /api/v1/admin/embedding-providers/{kind}/test: + post: + operationId: testEmbeddingProvider + tags: [admin] + summary: Validate an embedding-provider config without persisting (admin only) + description: | + Builds a throw-away provider from the submitted config, calls + Start (one short embed for HTTP providers; spawning a child for + ollama), then Stops it. Returns the detected dimension and an + ok flag. Use this from the dashboard before calling PUT + /embedding-providers/active so the admin sees an actionable + error (bad key, wrong URL, missing env var) before the swap. + parameters: + - name: kind + in: path + required: true + schema: + type: string + enum: [ollama, openai, voyage] + requestBody: + required: true + content: + application/json: + schema: + type: object + description: Provider-specific config blob (shape varies by kind). + additionalProperties: true + responses: + "200": + description: Connect test succeeded + content: + application/json: + schema: + $ref: "#/components/schemas/TestEmbeddingProviderResponse" + "400": + $ref: "#/components/responses/BadRequest" + "401": + $ref: "#/components/responses/Unauthorized" + "403": + $ref: "#/components/responses/Forbidden" + "502": + description: | + Connect test failed against the upstream service (auth + rejected, network unreachable, etc). 
+ /api/v1/api-keys: get: operationId: listApiKeys @@ -2723,6 +2848,12 @@ components: application/json: schema: $ref: "#/components/schemas/Error" + BadRequest: + description: Invalid request (missing fields, unknown kind, bad config) + content: + application/json: + schema: + $ref: "#/components/schemas/Error" InternalError: description: Unhandled server error content: @@ -3079,6 +3210,94 @@ components: type: string description: The CIX_GGUF_CACHE_DIR that was scanned. Empty list with non-empty cache_dir = no .gguf files found. + EmbeddingProviderList: + type: object + required: [providers] + properties: + providers: + type: array + items: + $ref: "#/components/schemas/EmbeddingProviderInfo" + + EmbeddingProviderInfo: + type: object + required: [kind, schema, secret_envs] + properties: + kind: + type: string + enum: [ollama, openai, voyage] + schema: + type: object + description: | + ConfigSchema as JSON — describes the form fields the + provider accepts. Shape is `{fields: [{name, label, kind, + required, default, enum, description}]}`. Hardcoded React + forms ignore this; it's exposed for external tooling. + additionalProperties: true + secret_envs: + type: array + description: | + Env-var names this provider reads for credentials, with a + flag telling whether each is currently set on the server. + Used by the dashboard to render the missing-key banner + before save. + items: + $ref: "#/components/schemas/EmbeddingProviderSecretEnv" + + EmbeddingProviderSecretEnv: + type: object + required: [name, set] + properties: + name: + type: string + description: Env-var name (e.g. `CIX_VOYAGE_API_KEY`). + set: + type: boolean + description: True when the env var is present (and non-empty) on the server. + + ActiveEmbeddingProvider: + type: object + required: [kind, id] + properties: + kind: + type: string + enum: [ollama, openai, voyage] + id: + type: string + description: | + `Provider.ID()` fingerprint, e.g. `voyage:voyage-code-3:1024:float`. 
+ Matches `embedding_model` on /status. + config: + type: object + description: | + Persisted provider config blob. Shape varies by kind; API + keys are NOT stored — only env-var names are. + additionalProperties: true + + SwitchEmbeddingProviderRequest: + type: object + required: [kind, config] + properties: + kind: + type: string + enum: [ollama, openai, voyage] + config: + type: object + additionalProperties: true + + TestEmbeddingProviderResponse: + type: object + required: [ok] + properties: + ok: + type: boolean + dimension: + type: integer + minimum: 0 + description: | + Embedding dimension as reported by the provider after + Start. 0 when the provider learns the dimension lazily. + Session: type: object required: [id, created_at, expires_at, last_seen_at, is_current] @@ -3226,7 +3445,26 @@ components: False when the sidecar is starting or has crashed. embedding_model: type: string - description: Hugging Face model id (e.g. `awhiteside/CodeRankEmbed-Q8_0-GGUF`). + description: | + Active provider fingerprint — formerly the HuggingFace repo id, + now `Provider.ID()` e.g. `ollama:CodeRankEmbed`, + `openai:text-embedding-3-small`, `voyage:voyage-code-3:1024:float`. + Used by the dashboard to compare against each project's + `indexed_with_model` and render the stale-model badge. + embedding_provider: + type: string + description: | + Active provider kind: `ollama`, `openai`, or `voyage`. Empty + when the embedding service is disabled or the fake fixtures + substitute a non-Service implementation. + embedding_provider_manages_process: + type: boolean + description: | + True when the active provider owns an in-process child + (currently only `ollama`). The footer renders a green/red + liveness dot when true, and a permanent green dot otherwise + (HTTP-only providers have no process to die — Ready failures + surface at request time, not on every footer poll). 
projects: type: integer minimum: 0 diff --git a/docker-compose.cuda.yml b/docker-compose.cuda.yml index a40032a..84268f9 100644 --- a/docker-compose.cuda.yml +++ b/docker-compose.cuda.yml @@ -45,6 +45,25 @@ services: # touches the source again. Subsequent boots find the file in cache # and ignore the env. See volumes block below for an example bind. - CIX_BOOTSTRAP_GGUF_PATH=${CIX_BOOTSTRAP_GGUF_PATH:-} + # ── Pluggable embedding providers (added in migration 12) ── + # The active provider is selected from the dashboard + # (/dashboard/server → Embedding provider). On first boot with an + # empty runtime_settings row, cix-server seeds the ollama provider + # using the CIX_EMBEDDING_MODEL + CIX_LLAMA_* vars above — so the + # default deployment is unchanged. + # + # To use a remote provider (OpenAI-compatible or Voyage AI) you + # MUST export the API-key env var below. The dashboard reads it on + # every embed call; cix-server NEVER persists API keys in the DB, + # only the env-var NAME a provider should look up. Switching + # providers triggers a full reindex per project on the next clone + # job (existing model-change pipeline). + # + # OpenAI-compatible (api.openai.com, vLLM, TEI, LocalAI, …): + # - CIX_OPENAI_API_KEY=${CIX_OPENAI_API_KEY:-} + # + # Voyage AI: + # - CIX_VOYAGE_API_KEY=${CIX_VOYAGE_API_KEY:-} - NVIDIA_VISIBLE_DEVICES=all volumes: # Operator-managed bind for sqlite + chroma so backups and inspection diff --git a/docker-compose.yml b/docker-compose.yml index e758c30..e22b5eb 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -45,6 +45,25 @@ services: # touches the source again. Subsequent boots find the file in cache # and ignore the env. See volumes block below for an example bind. - CIX_BOOTSTRAP_GGUF_PATH=${CIX_BOOTSTRAP_GGUF_PATH:-} + # ── Pluggable embedding providers (added in migration 12) ── + # The active provider is selected from the dashboard + # (/dashboard/server → Embedding provider). 
On first boot with an + # empty runtime_settings row, cix-server seeds the ollama provider + # using the CIX_EMBEDDING_MODEL + CIX_LLAMA_* vars above — so the + # default deployment is unchanged. + # + # To use a remote provider (OpenAI-compatible or Voyage AI) you + # MUST export the API-key env var below. The dashboard reads it on + # every embed call; cix-server NEVER persists API keys in the DB, + # only the env-var NAME a provider should look up. Switching + # providers triggers a full reindex per project on the next clone + # job (existing model-change pipeline). + # + # OpenAI-compatible (api.openai.com, vLLM, TEI, LocalAI, …): + # - CIX_OPENAI_API_KEY=${CIX_OPENAI_API_KEY:-} + # + # Voyage AI: + # - CIX_VOYAGE_API_KEY=${CIX_VOYAGE_API_KEY:-} volumes: # Operator-managed bind for sqlite + chroma so backups and inspection # are one `cd` away on the host. The CPU image runs as diff --git a/plugins/cix/README.md b/plugins/cix/README.md index 858abfd..b0b0ee8 100644 --- a/plugins/cix/README.md +++ b/plugins/cix/README.md @@ -116,6 +116,26 @@ docs. ## Configuration +### Targeting multiple cix servers + +The bundled CLI supports more than one **named server** (e.g. a local +box and a remote corporate server). One is the default; commands use it +unless `--server ` is passed: + +```bash +cix config set server.corporate.url https://cix.corp.internal +cix config set server.corporate.key +cix config set default_server corporate # optional: make it the default +cix --server corporate search "rate limiter" +cix config show # lists servers; * marks the default +``` + +Legacy single-server config (`api.url` / `api.key`, `--api-url` / +`--api-key`) still works and operates on the default server; old +`~/.cix/config.yaml` files are migrated automatically. The `cix` skill +(SKILL.md) documents this for the agent. Full reference: +[`cli/README.md`](https://github.com/dvcdsys/code-index/blob/main/cli/README.md#multiple-servers). 
+ ### Where the bundled CLI is installed The wrapper installs `cix` to `~/.local/bin/cix` by default. To override diff --git a/plugins/cix/skills/cix/SKILL.md b/plugins/cix/skills/cix/SKILL.md index 10f2a48..3da8ed2 100644 --- a/plugins/cix/skills/cix/SKILL.md +++ b/plugins/cix/skills/cix/SKILL.md @@ -157,6 +157,40 @@ The watcher auto-reindexes on file change — manual `reindex` is rarely needed. `cix status` shows whether the watcher is running and the last-sync timestamp. +### Servers — talk to more than one cix backend + +`cix` can be configured with several **named servers** (e.g. a local +box and a remote corporate server). One is the **default**; every +command targets the default unless you pass `--server `. + +```bash +cix config show # lists servers; * marks the default +cix --server corporate search "rate limiter" # run any command against a named server +cix search "rate limiter" --server corporate # --server is global; either position works +``` + +Servers are managed through `cix config` (persisted in +`~/.cix/config.yaml`): + +```bash +cix config set server.corporate.url https://cix.corp.internal +cix config set server.corporate.key +cix config set default_server corporate # change which server is the default +cix config unset server.corporate # remove a server +cix config unset server.corporate.key # clear just its key +``` + +The legacy single-server keys still work and operate on the **default** +server, so existing setups keep working unchanged: +`cix config set api.url ` / `cix config set api.key `. The +`--api-url` / `--api-key` flags override the selected server's URL/key +for a single invocation. + +**Agent rule:** use the default server (no flag) unless the user names a +specific server. Only add `--server ` when the task explicitly +targets that named backend; never guess an alias — run `cix config show` +to see the configured names if unsure. 
+ --- ## Search quality — what scores mean diff --git a/server/cmd/cix-server/main.go b/server/cmd/cix-server/main.go index c0b9697..0ac00c0 100644 --- a/server/cmd/cix-server/main.go +++ b/server/cmd/cix-server/main.go @@ -21,10 +21,12 @@ import ( "github.com/dvcdsys/code-index/server/internal/config" "github.com/dvcdsys/code-index/server/internal/db" "github.com/dvcdsys/code-index/server/internal/embeddings" + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" + "github.com/dvcdsys/code-index/server/internal/embeddingscfg" "github.com/dvcdsys/code-index/server/internal/githubapi" "github.com/dvcdsys/code-index/server/internal/githubtokens" - "github.com/dvcdsys/code-index/server/internal/groups" "github.com/dvcdsys/code-index/server/internal/gitrepos" + "github.com/dvcdsys/code-index/server/internal/groups" "github.com/dvcdsys/code-index/server/internal/httpapi" "github.com/dvcdsys/code-index/server/internal/indexer" "github.com/dvcdsys/code-index/server/internal/jobs" @@ -33,6 +35,7 @@ import ( "github.com/dvcdsys/code-index/server/internal/runtimecfg" "github.com/dvcdsys/code-index/server/internal/secrets" "github.com/dvcdsys/code-index/server/internal/sessions" + "github.com/dvcdsys/code-index/server/internal/storage" "github.com/dvcdsys/code-index/server/internal/tunnelcfg" "github.com/dvcdsys/code-index/server/internal/tunnels" "github.com/dvcdsys/code-index/server/internal/users" @@ -114,7 +117,16 @@ func run() error { chunker.Configure(cfg.Languages) logger.Info("chunker languages configured", "active", chunker.SupportedLanguages()) - dbPath := cfg.DynamicSQLitePath() + // The system DB is model-INDEPENDENT (one permanent file at + // cfg.SQLitePath holding accounts + catalog + parsed code). Older + // builds suffixed the model name onto the path; adopt any such legacy + // per-model file as the canonical system DB (idempotent, one-time). 
+ // LEGACY-MIGRATION (remove next release): drop this adoption call once + // all deployments have booted on the unified layout. + if err := storage.AdoptLegacyModelDB(cfg.SQLitePath, cfg.LegacyDynamicSQLitePath(), logger); err != nil { + return fmt.Errorf("adopt legacy system db: %w", err) + } + dbPath := cfg.SQLitePath logger.Info("opening database", "path", dbPath) database, err := db.OpenWith(db.OpenOptions{ Path: dbPath, @@ -154,14 +166,10 @@ func run() error { "batch", cfg.LlamaBatchSize, "sources", snap.Source, ) - // DynamicSQLitePath embeds ModelSafeName(); if the dashboard switched the - // model, the storage path resolved a moment ago is for the OLD model. The - // already-opened DB is still correct (it's the OLD model's state) but the - // chroma vectorstore opened below needs to honour the NEW model. Recompute - // dbPath only matters if we want to re-open under the new model — for PR-E - // we deliberately keep the old DB so historical projects keep their - // indexed_with_model and the dashboard can show the drift. Sidecar + - // vectorstore use the new model. + // The system DB is model-independent (opened above at cfg.SQLitePath). + // Only the chroma vector store is namespaced per embedding identity — + // it is opened below once the active provider is known, using + // provider.StorageSlug(Provider.ID()). // Embeddings service. When disabled we still build the value so router // wiring stays consistent — Service methods return ErrDisabled in that case. @@ -169,7 +177,62 @@ func run() error { // window for the HF download path on cold cache. startupCtx, startupCancel := context.WithTimeout(context.Background(), time.Duration(cfg.LlamaStartupSec)*time.Second+30*time.Second) - embedSvc, err := embeddings.New(startupCtx, cfg, logger) + + // Bootstrap pluggable-provider selection. If the runtime_settings row + // has no embedding_provider yet (fresh install or pre-migration DB), + // seed it with the env-derived ollama config and use that for boot. 
+ // Otherwise the persisted blob is authoritative — env vars (other + // than API-key envs that providers read live) are ignored. + embedCfgStore := embeddingscfg.New(database) + persistedProv, hasProv, err := embedCfgStore.Get(context.Background()) + if err != nil { + startupCancel() + return fmt.Errorf("load embedding provider config: %w", err) + } + if !hasProv && cfg.EmbeddingsEnabled { + seed, serr := embeddings.BuildOllamaConfigFromEnv(cfg) + if serr != nil { + startupCancel() + return fmt.Errorf("seed embedding provider config: %w", serr) + } + if serr := embedCfgStore.Save(context.Background(), + embeddingscfg.Snapshot{Kind: "ollama", Config: seed}, ""); serr != nil { + logger.Warn("could not seed embedding provider config (continuing on env)", "err", serr) + } + persistedProv = embeddingscfg.Snapshot{Kind: "ollama", Config: seed} + hasProv = true + } + + var embedSvc *embeddings.Service + if !cfg.EmbeddingsEnabled || !hasProv { + // Legacy / disabled path — fall back to env-only ollama wiring. + embedSvc, err = embeddings.New(startupCtx, cfg, logger) + } else { + prov, perr := provider.Build(startupCtx, persistedProv.Kind, persistedProv.Config, embeddings.EnvSecrets(), logger) + if perr != nil { + // The persisted provider config is malformed (bad JSON / + // unknown kind). Don't brick the whole server over it: fall + // back to env-only ollama so the dashboard stays reachable to + // fix the config or re-switch the provider. + logger.Error("could not build persisted embedding provider; falling back to env ollama (fix or re-switch via dashboard)", + "kind", persistedProv.Kind, "err", perr) + embedSvc, err = embeddings.New(startupCtx, cfg, logger) + } else { + if perr := prov.Start(startupCtx); perr != nil { + // A remote provider's boot connect-test (or the ollama + // sidecar spawn) failed — a transient upstream blip, a + // revoked key, or a missing API-key env after a redeploy. 
+ // Attach the provider anyway instead of failing the whole + // process: the HTTP server (dashboard, auth, every project + // API) must stay up so an operator can recover, and remote + // providers self-heal once the upstream/key is back. Embeds + // return a clear error until then and Status() reports it. + logger.Error("persisted embedding provider failed to start; continuing in degraded state — embeddings unavailable until it recovers or is switched", + "kind", persistedProv.Kind, "err", perr) + } + embedSvc = embeddings.NewWithProvider(cfg, prov, logger) + } + } startupCancel() if err != nil { return fmt.Errorf("embeddings: %w", err) @@ -191,25 +254,67 @@ func run() error { } }() - // Detect and back up a legacy ChromaDB layout left by the Python server. - if backed, bErr := vectorstore.DetectLegacyAndBackup(cfg.DynamicChromaPersistDir()); bErr != nil { - logger.Warn("could not back up legacy chroma dir", "err", bErr) + // Relocate a legacy Python ChromaDB store occupying the container path + // itself, so chromaBase is free to become the nested container below. + if backed, bErr := vectorstore.DetectLegacyAndBackup(cfg.ChromaPersistDir); bErr != nil { + return fmt.Errorf("back up legacy python chroma store: %w", bErr) } else if backed { - logger.Warn("legacy chroma layout detected — backed up; re-run cix init to reindex") + logger.Warn("legacy python chroma layout detected at container path — backed up; re-run cix init to reindex") } - // Vector store (chromem-go). Lives under the dynamic chroma persist dir so - // the path includes the model-safe name, matching Python parity. - vs, err := vectorstore.Open(cfg.DynamicChromaPersistDir()) + // Migrate pre-unification FLAT ollama dirs (_) into the + // unified NESTED layout (/ollama//) so existing + // vectors are reused without a reindex. Unambiguous: every flat sibling + // is a legacy ollama dir (the provider kind is now its own path level, + // not guessed from a name prefix). Idempotent. 
+ // LEGACY-MIGRATION (remove next release): drop once every deployment has + // booted on the nested layout. + if err := storage.MigrateFlatChromaToNested(cfg.ChromaPersistDir, logger); err != nil { + // Fail closed. A half-completed move would leave the server opening + // a fresh empty namespace while existing vectors sit under the + // un-migrated legacy dir — search would silently return nothing on + // a "healthy" server. Surface it so the operator fixes the cause + // (e.g. dir perms under prod uid 1001) rather than losing the index. + return fmt.Errorf("migrate legacy chroma dirs: %w", err) + } + + // The vector store is namespaced by the ACTIVE provider's identity path + // components (/[/]) so vectors of different + // dimensions never share a collection. + components := embedSvc.StoragePath() + if len(components) == 0 { + // Embeddings disabled / provider not built: deterministic ollama- + // shaped fallback so toggling embeddings on/off doesn't move dirs. + components = []string{provider.KindOllama, provider.StorageSlug(cfg.EmbeddingModel)} + } + chromaDir := cfg.ChromaDirFor(components) + + vs, err := vectorstore.Open(chromaDir) if err != nil { return fmt.Errorf("open vectorstore: %w", err) } + // Wrap in a swappable Holder shared by indexer / repojobs / httpapi so + // a runtime provider switch can reopen the store under a new namespace. + vsHolder := vectorstore.NewHolder(vs) + // Wire the live-reopen path used by SwitchProvider. + embedSvc.AttachVectorStore( + vsHolder, + cfg.ChromaDirFor, + vectorstore.Open, + func() error { return storage.MigrateFlatChromaToNested(cfg.ChromaPersistDir, logger) }, + ) - idx := indexer.New(database, vs, embedSvc, logger) + idx := indexer.New(database, vsHolder, embedSvc, logger) idx.SetEmbedIncludePath(cfg.EmbedIncludePath) - // PR-E — record the active embedding model on every indexed project so the - // dashboard can highlight stale vectors when the runtime model changes. 
- idx.SetEmbeddingModel(cfg.EmbeddingModel) + // Record the active embedding model on every indexed project so the + // dashboard can highlight stale vectors when the runtime provider / + // model changes. Wire it as a live lookup so a runtime provider + // switch (PUT /admin/embedding-providers/active) is reflected in the + // next FinishIndexing without a process restart — the indexer reads + // embedSvc.EmbeddingModel() at write time and that returns the active + // Provider.ID() ("ollama:" / "voyage:..."), matching what the + // drift-detector and dashboard compare against. + idx.SetEmbeddingModelLookup(embedSvc.EmbeddingModel) if cfg.EmbedIncludePath { logger.Info("embedding format: path-aware preamble enabled (CIX_EMBED_INCLUDE_PATH=true) — full reindex required if upgrading") } @@ -283,7 +388,7 @@ func run() error { GitRepos: grSvc, GithubTokens: ghSvc, Indexer: idx, - VectorStore: vs, + VectorStore: vsHolder, DataDir: cfg.WorkspacesDataDir, Logger: logger, DefaultPollIntervalSeconds: int(cfg.DefaultPollInterval.Seconds()), @@ -405,9 +510,10 @@ func run() error { Sessions: sessSvc, APIKeys: akSvc, EmbeddingSvc: embedSvc, - VectorStore: vs, + VectorStore: vsHolder, Indexer: idx, RuntimeCfg: rcfg, + EmbeddingsCfg: embedCfgStore, VersionCheck: vcSvc, Workspaces: wsSvc, GithubTokens: ghSvc, diff --git a/server/dashboard/src/api/types.ts b/server/dashboard/src/api/types.ts index d9b4b6a..a1d8707 100644 --- a/server/dashboard/src/api/types.ts +++ b/server/dashboard/src/api/types.ts @@ -74,3 +74,13 @@ export type SidecarStatus = components['schemas']['SidecarStatus']; export type ModelEntry = components['schemas']['ModelEntry']; export type ModelList = components['schemas']['ModelList']; export type RestartAccepted = components['schemas']['RestartAccepted']; + +export type EmbeddingProviderInfo = components['schemas']['EmbeddingProviderInfo']; +export type EmbeddingProviderSecretEnv = components['schemas']['EmbeddingProviderSecretEnv']; +export type 
EmbeddingProviderList = components['schemas']['EmbeddingProviderList']; +export type ActiveEmbeddingProvider = components['schemas']['ActiveEmbeddingProvider']; +export type SwitchEmbeddingProviderRequest = components['schemas']['SwitchEmbeddingProviderRequest']; +export type TestEmbeddingProviderResponse = components['schemas']['TestEmbeddingProviderResponse']; + +// Provider kind union — the dashboard uses this in form-state discriminants. +export type EmbeddingProviderKind = 'ollama' | 'openai' | 'voyage'; diff --git a/server/dashboard/src/app/Footer.tsx b/server/dashboard/src/app/Footer.tsx index f772808..10d812b 100644 --- a/server/dashboard/src/app/Footer.tsx +++ b/server/dashboard/src/app/Footer.tsx @@ -5,31 +5,51 @@ import { cn } from '@/lib/cn'; // Footer spans the full width below the sidebar + main pane. Reads // from the shared /status query (polled every 30 s) — server version -// on the left, llama sidecar liveness dot on the right. The "llama" -// label links to /server (admin-only page); viewers see plain text -// since the route isn't mounted for them. +// on the left, embedding-provider indicator on the right. +// +// The label is the active provider kind ("ollama" / "openai" / +// "voyage") and the dot logic depends on whether the provider +// manages an in-process child: +// ollama (manages_process=true): green when /health alive, red +// otherwise — real liveness signal. +// openai / voyage (manages_process=false): permanently green. +// We don't ping remote APIs on every footer poll; failures +// surface at search/embed time with diagnostics. +// +// The provider name links to /server (admin-only page); viewers see +// plain text since the route isn't mounted for them. export function Footer() { const { data, isLoading } = useServerStatus(); const { user } = useAuth(); const version = data?.server_version ?? 'dev'; + const providerKind = data?.embedding_provider ?? 
''; + const managesProcess = data?.embedding_provider_manages_process === true; const alive = data?.model_loaded === true; const isAdmin = user?.role === 'admin'; const dotClass = isLoading ? 'bg-muted-foreground/40' - : alive - ? 'bg-emerald-500' - : 'bg-red-500'; + : managesProcess + ? alive + ? 'bg-emerald-500' + : 'bg-red-500' + : 'bg-emerald-500'; const dotTitle = isLoading - ? 'Checking sidecar status…' - : alive - ? 'Sidecar is alive' - : 'Sidecar is not responding'; + ? 'Checking embedding provider status…' + : managesProcess + ? alive + ? 'Ollama sidecar is alive' + : 'Ollama sidecar is not responding' + : providerKind + ? `${providerKind} backend (no managed process)` + : 'Embedding backend'; + + const label = providerKind || 'embeddings'; const indicator = ( <> - llama + {label} ); diff --git a/server/dashboard/src/lib/useServerStatus.ts b/server/dashboard/src/lib/useServerStatus.ts index 049bbce..78f5480 100644 --- a/server/dashboard/src/lib/useServerStatus.ts +++ b/server/dashboard/src/lib/useServerStatus.ts @@ -5,6 +5,11 @@ interface StatusPayload { server_version: string; embedding_model: string; model_loaded: boolean; + // Pluggable-provider fields (server >= migration 12). Present on + // every fresh-built server; older clients may see them as + // undefined while a rolling upgrade is in progress. + embedding_provider?: string; + embedding_provider_manages_process?: boolean; // Version-check fields are present only when the server has the // versioncheck service wired (see CIX_VERSION_CHECK_ENABLED). 
update_available?: boolean; diff --git a/server/dashboard/src/modules/server/ServerPage.tsx b/server/dashboard/src/modules/server/ServerPage.tsx index 8481ae7..8346cb5 100644 --- a/server/dashboard/src/modules/server/ServerPage.tsx +++ b/server/dashboard/src/modules/server/ServerPage.tsx @@ -6,11 +6,18 @@ import type { RuntimeConfig, RuntimeConfigUpdate } from '@/api/types'; import { Alert, AlertDescription, AlertTitle } from '@/ui/alert'; import { Button } from '@/ui/button'; import { Skeleton } from '@/ui/skeleton'; -import { useRestartSidecar, useRuntimeConfig, useSidecarStatus, useUpdateRuntimeConfig } from './hooks'; +import { useServerStatus } from '@/lib/useServerStatus'; +import { + useRestartSidecar, + useRuntimeConfig, + useSidecarStatus, + useUpdateRuntimeConfig, +} from './hooks'; import { EmbeddingModelSection } from './sections/EmbeddingModelSection'; import { RuntimeParamsSection } from './sections/RuntimeParamsSection'; import { SidecarSection } from './sections/SidecarSection'; import { AdvancedSection } from './sections/AdvancedSection'; +import { EmbeddingProviderSection } from './sections/EmbeddingProviderSection'; import { SaveAndRestartDialog } from './components/SaveAndRestartDialog'; interface Draft { @@ -62,6 +69,14 @@ export default function ServerPage() { const status = useSidecarStatus(); const update = useUpdateRuntimeConfig(); const restart = useRestartSidecar(); + // /status is shared with the footer (already polled every 30s) and + // its embedding_provider field reflects the LIVE active provider — + // the right signal for "should we show ollama sections?". We default + // to true while it loads so the page doesn't flash empty between + // mount and the first /status response. + const serverStatus = useServerStatus(); + const activeKind = serverStatus.data?.embedding_provider ?? 
'ollama'; + const showOllamaSections = activeKind === 'ollama'; const [draft, setDraft] = useState(null); const [confirmOpen, setConfirmOpen] = useState(false); @@ -132,9 +147,9 @@ export default function ServerPage() {

Server

- Embedding model, indexing parameters, sidecar lifecycle. Saved - overrides land in the database and are reapplied on the next - sidecar restart — env vars stay as bootstrap defaults. + {showOllamaSections + ? 'Embedding provider + model, indexing parameters, sidecar lifecycle, throughput. Saved overrides land in the database and are reapplied on the next sidecar restart — env vars stay as bootstrap defaults.' + : 'Embedding provider + concurrency. For remote providers (OpenAI-compatible, Voyage) the per-provider form above is the main edit surface; this page also exposes the server-wide concurrency cap that all providers honour.'}

@@ -158,30 +173,56 @@ export default function ServerPage() { ) : null} - setDraft({ ...draft, embedding_model: v })} - /> - - setDraft({ ...draft, llama_ctx_size: n })} - onDraftGpuLayers={(n) => setDraft({ ...draft, llama_n_gpu_layers: n })} - onDraftThreads={(n) => setDraft({ ...draft, llama_n_threads: n })} - /> - - + + + {/* + Ollama-specific cards — rendered only when the active provider + is ollama. For openai/voyage these sections do not apply: + there's no GGUF, no llama-server child to restart, no GPU + layers / threads, no batch size knob. The provider form above + is the only edit surface in that case. + + Concurrency lives inside AdvancedSection together with the + ollama-only batch size — for v1 we hide the whole card on + remote providers. A follow-up may split concurrency into a + provider-agnostic card if operators ask for it. + */} + {showOllamaSections ? ( + <> + setDraft({ ...draft, embedding_model: v })} + /> + + setDraft({ ...draft, llama_ctx_size: n })} + onDraftGpuLayers={(n) => setDraft({ ...draft, llama_n_gpu_layers: n })} + onDraftThreads={(n) => setDraft({ ...draft, llama_n_threads: n })} + /> + + + + ) : null} + {/* + Throughput / concurrency — always visible. The queue concurrency + is the Service-level cap on parallel /v1/embeddings POSTs and + applies to every provider (ollama, openai, voyage all accept + concurrent requests). The llama batch field inside the card + is gated on isOllama. + */} setDraft({ ...draft, max_embedding_concurrency: n })} onDraftBatch={(n) => setDraft({ ...draft, llama_batch_size: n })} + isOllama={showOllamaSections} /> + api.get('/admin/embedding-providers', { signal }), + staleTime: 30_000, + }); +} + +// useActiveProvider returns the persisted active provider + config. +// Invalidated by useSwitchProvider on success. 
+export function useActiveProvider() { + return useQuery({ + queryKey: serverKeys.activeProvider, + queryFn: ({ signal }) => + api.get('/admin/embedding-providers/active', { signal }), + }); +} + +// useTestProvider calls /test for a given kind+config. Doesn't +// touch the active state on the server. +export function useTestProvider(kind: string) { + return useMutation({ + mutationFn: (config: Record) => + api.post( + `/admin/embedding-providers/${encodeURIComponent(kind)}/test`, + config + ), + }); +} + +// useSwitchProvider PUTs the new selection. On success: invalidate +// the active-provider cache, the /status cache (footer indicator), +// and the sidecar-status cache (the latter goes to "n/a" for +// non-ollama providers). +export function useSwitchProvider() { + const qc = useQueryClient(); + return useMutation({ + mutationFn: (req: SwitchEmbeddingProviderRequest) => + api.put('/admin/embedding-providers/active', req), + onSuccess: () => { + qc.invalidateQueries({ queryKey: serverKeys.activeProvider }); + qc.invalidateQueries({ queryKey: serverKeys.sidecarStatus }); + qc.invalidateQueries({ queryKey: ['runtime-model'] }); + }, + }); +} diff --git a/server/dashboard/src/modules/server/sections/AdvancedSection.tsx b/server/dashboard/src/modules/server/sections/AdvancedSection.tsx index 1854ea7..4bbb908 100644 --- a/server/dashboard/src/modules/server/sections/AdvancedSection.tsx +++ b/server/dashboard/src/modules/server/sections/AdvancedSection.tsx @@ -11,6 +11,12 @@ interface Props { draftBatch: number; onDraftConcurrency: (n: number) => void; onDraftBatch: (n: number) => void; + // isOllama controls whether the llama-only batch-size field is + // rendered. Concurrency (the Service-level queue depth) applies to + // every provider — caps how many parallel /v1/embeddings POSTs go + // out, which OpenAI / Voyage both honour natively — and is shown + // regardless. + isOllama: boolean; } // AdvancedSection: throughput-tuning fields most operators won't touch. 
@@ -22,6 +28,7 @@ export function AdvancedSection({ draftBatch, onDraftConcurrency, onDraftBatch, + isOllama, }: Props) { const concId = useId(); const batchId = useId(); @@ -31,13 +38,18 @@ export function AdvancedSection({ return ( - Advanced - Throughput tuning. Leave at recommended unless you have a specific reason. + Throughput + + The indexer sends all chunks of one file in a single batched POST + ({'{"input": [chunk1, chunk2, ...]}'}). Concurrency + here caps how many such batched POSTs run in parallel — applies + to every backend. Llama batch (below) is sidecar-only. + -
+
- Show advanced tunables + Show throughput tunables
@@ -60,35 +72,46 @@ export function AdvancedSection({ className="max-w-xs" />

- Concurrent /v1/embeddings calls allowed against the sidecar. 1 = strictly sequential. - Recommended: {rec?.max_embedding_concurrency ?? 1}. + Maximum batched /v1/embeddings POSTs in flight + across the whole server (each POST already carries one + file's chunks as a batch). 1 = strictly sequential. OpenAI + and Voyage both accept concurrent requests, but their + account-level rate limits still apply — start low (e.g. 2) + and raise it if you don't see 429s. Per-request limits + (Voyage voyage-code-*: 128 inputs or + ~100K tokens; OpenAI: 2048 inputs) are split server-side + under one queue slot, so oversized files are safe. + Recommended:{' '} + {rec?.max_embedding_concurrency ?? 1}.

-
-
- - + {isOllama ? ( +
+
+ + +
+ { + const n = parseInt(e.target.value, 10); + onDraftBatch(Number.isFinite(n) ? n : 0); + }} + className="max-w-xs" + /> +

+ Logical batch passed to llama-server (-b). 0 = match context window. + Recommended: {rec?.llama_batch_size ?? 'ctx'}. +

- { - const n = parseInt(e.target.value, 10); - onDraftBatch(Number.isFinite(n) ? n : 0); - }} - className="max-w-xs" - /> -

- Logical batch passed to llama-server (-b). 0 = match context window. - Recommended: {rec?.llama_batch_size ?? 'ctx'}. -

-
+ ) : null}
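The concurrency help text in this hunk states that per-request input limits (for example 128 inputs for Voyage `voyage-code-*`, 2048 for OpenAI) are handled by splitting an oversized file's chunks server-side under one queue slot. A minimal sketch of that splitting step follows. It assumes a hypothetical helper named `splitByMaxInputs` that does not appear in this diff; the actual logic lives in the Go server and also enforces a per-request token budget, which this sketch leaves out.

```typescript
// Hypothetical helper, for illustration only. It is NOT part of this PR:
// the real splitting happens in the Go server and also honours a
// per-request token cap, which is omitted here.
function splitByMaxInputs<T>(inputs: T[], maxInputs: number): T[][] {
  if (!Number.isInteger(maxInputs) || maxInputs < 1) {
    throw new RangeError('maxInputs must be a positive integer');
  }
  const batches: T[][] = [];
  // Walk the input in strides of maxInputs; the last batch may be shorter.
  for (let i = 0; i < inputs.length; i += maxInputs) {
    batches.push(inputs.slice(i, i + maxInputs));
  }
  return batches;
}

// One file that produced 300 chunks, sent against a 128-input cap.
const chunks: string[] = Array.from({ length: 300 }, (_, i) => `chunk-${i}`);
const batches: string[][] = splitByMaxInputs(chunks, 128);
console.log(batches.map((b) => b.length));
```

Under this sketch, all resulting batches would be sent while holding a single concurrency slot, which is how the help text describes oversized files staying safe regardless of the concurrency setting.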
diff --git a/server/dashboard/src/modules/server/sections/EmbeddingProviderSection.tsx b/server/dashboard/src/modules/server/sections/EmbeddingProviderSection.tsx new file mode 100644 index 0000000..5469fef --- /dev/null +++ b/server/dashboard/src/modules/server/sections/EmbeddingProviderSection.tsx @@ -0,0 +1,304 @@ +import { useEffect, useMemo, useState } from 'react'; +import { AlertCircle, CheckCircle2, Info, Loader2, Save } from 'lucide-react'; +import { toast } from 'sonner'; +import { ApiError } from '@/api/client'; +import { Alert, AlertDescription, AlertTitle } from '@/ui/alert'; +import { Button } from '@/ui/button'; +import { Card, CardContent, CardDescription, CardHeader, CardTitle } from '@/ui/card'; +import { Label } from '@/ui/label'; +import type { EmbeddingProviderKind, EmbeddingProviderSecretEnv } from '@/api/types'; +import { + useActiveProvider, + useEmbeddingProviders, + useSwitchProvider, + useTestProvider, +} from '../hooks'; +import { OpenAIProviderForm, type OpenAIConfig, defaultOpenAIConfig } from './providers/OpenAIProviderForm'; +import { VoyageProviderForm, type VoyageConfig, defaultVoyageConfig } from './providers/VoyageProviderForm'; + +// EmbeddingProviderSection wraps the provider-kind dropdown + the +// per-kind form. The ollama-specific sections (EmbeddingModelSection, +// RuntimeParamsSection, SidecarSection) stay rendered by the parent +// ServerPage when the active kind is "ollama" — switching to a remote +// provider hides them in ServerPage by checking activeProvider.kind. +// +// Save flow: +// 1. POST /admin/embedding-providers/{kind}/test with the draft. +// 2. On success → PUT /admin/embedding-providers/active. +// 3. Surface toast + invalidate caches so the footer / sidecar +// cards update immediately. +// +// API keys are never stored on the server: configs only carry the +// NAME of the env var that holds the key. 
When the relevant env var +// is missing the form renders a red banner and the Save button is +// disabled. +export function EmbeddingProviderSection() { + const providers = useEmbeddingProviders(); + const active = useActiveProvider(); + const switchMut = useSwitchProvider(); + + const [draftKind, setDraftKind] = useState('ollama'); + const [openAIDraft, setOpenAIDraft] = useState(defaultOpenAIConfig); + const [voyageDraft, setVoyageDraft] = useState(defaultVoyageConfig); + + // When the persisted active provider loads / changes (e.g. after a + // successful switch), reset the drafts so the form mirrors what is + // live. Selecting a different kind in the dropdown only changes the + // form being rendered — it does NOT mutate the underlying drafts + // until the admin clicks Save. + useEffect(() => { + const data = active.data; + if (!data?.kind) return; + setDraftKind(data.kind as EmbeddingProviderKind); + const cfg = (data.config ?? {}) as Record; + if (data.kind === 'openai') { + setOpenAIDraft({ + base_url: String(cfg.base_url ?? defaultOpenAIConfig.base_url), + model: String(cfg.model ?? defaultOpenAIConfig.model), + api_key_env: String(cfg.api_key_env ?? defaultOpenAIConfig.api_key_env), + dimensions: typeof cfg.dimensions === 'number' ? cfg.dimensions : undefined, + }); + } + if (data.kind === 'voyage') { + setVoyageDraft({ + model: String(cfg.model ?? defaultVoyageConfig.model), + api_key_env: String(cfg.api_key_env ?? defaultVoyageConfig.api_key_env), + output_dimension: Number(cfg.output_dimension ?? defaultVoyageConfig.output_dimension), + output_dtype: + (cfg.output_dtype as 'float' | 'int8') ?? defaultVoyageConfig.output_dtype, + truncation: cfg.truncation !== false, + rate_limit_rpm: typeof cfg.rate_limit_rpm === 'number' ? cfg.rate_limit_rpm : undefined, + rate_limit_tpm: typeof cfg.rate_limit_tpm === 'number' ? cfg.rate_limit_tpm : undefined, + max_inputs_per_request: + typeof cfg.max_inputs_per_request === 'number' ? 
cfg.max_inputs_per_request : undefined, + max_tokens_per_request: + typeof cfg.max_tokens_per_request === 'number' ? cfg.max_tokens_per_request : undefined, + }); + } + }, [active.data]); + + // Lookup the env-key readiness for the currently selected kind so + // the relevant form can render a "set CIX_VOYAGE_API_KEY before + // saving" banner without each form duplicating the query. + const envsForKind = useMemo(() => { + if (!providers.data) return []; + return providers.data.providers.find((p) => p.kind === draftKind)?.secret_envs ?? []; + }, [providers.data, draftKind]); + + const test = useTestProvider(draftKind); + + if (providers.isLoading || active.isLoading) { + return ( + + + Embedding provider + + +
Loading providers…
+
+
+ ); + } + if (providers.error || !providers.data || active.error) { + return ( + + + Could not load embedding providers + + {String(providers.error ?? active.error ?? 'unknown error')} + + + ); + } + + // Build the current draft config blob for the selected kind. + // For ollama we always send an empty object — the backend's + // SwitchEmbeddingProvider handler synthesizes a complete ollama + // config from runtime-cfg + env on receipt, because the + // ollama-specific tuning fields (GGUF model, ctx, GPU layers, + // sidecar paths) are not part of this card's form. + const draftConfig: Record = (() => { + switch (draftKind) { + case 'openai': + return { ...openAIDraft }; + case 'voyage': + return { ...voyageDraft }; + case 'ollama': + return {}; + } + })(); + + // Validation: we let the backend's /test endpoint be the source of + // truth, but disable the Save button locally when an obviously + // required field is empty or a referenced env var is missing. + const allEnvsSet = envsForKind.every((e) => e.set); + const localValid = (() => { + if (draftKind === 'openai') { + return !!openAIDraft.base_url && !!openAIDraft.model && !!openAIDraft.api_key_env; + } + if (draftKind === 'voyage') { + return !!voyageDraft.model && !!voyageDraft.api_key_env; + } + return true; // ollama is edited via the lower sections + })(); + + const canSave = localValid && allEnvsSet && !switchMut.isPending && !test.isPending; + // Dirty when the kind has changed; for remote providers also dirty + // when the per-kind form differs from what's persisted. Ollama- + // is-ollama is never dirty (form has no editable fields here — + // those live in the sections below). + const kindChanged = draftKind !== active.data?.kind; + const dirty = kindChanged || (() => { + if (draftKind === 'ollama') return false; + const a = JSON.stringify(active.data?.config ?? 
{}); + const b = JSON.stringify(draftConfig); + return a !== b; + })(); + + async function onSave() { + try { + // Skip the /test pre-check when switching to ollama — the + // backend builds the full config from runtime-cfg + env on + // receipt, so the client's empty {} can't be tested as-is + // (would fail factory validation: model is required). + // Ollama config correctness will be exercised by Start() + // inside SwitchProvider anyway. + if (draftKind !== 'ollama') { + await test.mutateAsync(draftConfig); + } + await switchMut.mutateAsync({ kind: draftKind, config: draftConfig }); + toast.success(`Switched to ${draftKind}`, { + description: 'Every project will get a Stale-model badge until reindex.', + }); + } catch (e) { + const detail = e instanceof ApiError ? e.detail : String(e); + toast.error('Provider switch failed', { description: detail }); + } + } + + return ( + + + + Embedding provider + {active.data?.kind ? ( + + {active.data.kind} + + ) : null} + + + Choose where embeddings are computed. Switching providers triggers a + full reindex per project on the next clone job — every project's + stored model fingerprint becomes stale. + + + + + + Cost & rate limits — read before picking + +
    +
  • + Ollama — free, runs the llama-server sidecar + locally on this machine's CPU/GPU. No external API, no rate + limits, no API keys. +
  • +
  • + OpenAI-compatible — pay-as-you-go on{' '} + + api.openai.com + {' '} + (account billing required) or free against your own + self-hosted vLLM / TEI / LocalAI instance. +
  • +
  • + Voyage AI — paid plan strongly recommended. + The{' '} + + free tier + {' '} + is capped at 3 RPM / 10K TPM — fine for a smoke test, not + usable for indexing a real repo. Add a payment method + before pointing the indexer at it. +
  • +
+
+
+ +
+ + +
+ + {draftKind === 'openai' ? ( + + ) : null} + {draftKind === 'voyage' ? ( + + ) : null} + {draftKind === 'ollama' ? ( +
+ {kindChanged ? ( + <> + Switching back to Ollama will restart the llama-server + sidecar with the current model + tuning from the runtime + config (see the sections below). After the switch, every + project will need to be reindexed (full reindex on the + next clone job). + + ) : ( + <> + Ollama tuning (model picker, ctx, GPU layers, sidecar + status) is configured in the sections below. + + )} +
+ ) : null} + +
+ + {test.isSuccess && !switchMut.isPending && draftKind !== 'ollama' ? ( + + Last test ok + + ) : null} +
+
+
+ ); +} diff --git a/server/dashboard/src/modules/server/sections/providers/OpenAIProviderForm.tsx b/server/dashboard/src/modules/server/sections/providers/OpenAIProviderForm.tsx new file mode 100644 index 0000000..12ff44a --- /dev/null +++ b/server/dashboard/src/modules/server/sections/providers/OpenAIProviderForm.tsx @@ -0,0 +1,132 @@ +import { AlertTriangle } from 'lucide-react'; +import { Alert, AlertDescription, AlertTitle } from '@/ui/alert'; +import { Input } from '@/ui/input'; +import { Label } from '@/ui/label'; +import type { EmbeddingProviderSecretEnv } from '@/api/types'; + +// OpenAIConfig mirrors the openai provider's persisted config blob +// shape (see server/internal/embeddings/provider/openai/openai.go). +export interface OpenAIConfig { + base_url: string; + model: string; + api_key_env: string; + dimensions?: number; +} + +export const defaultOpenAIConfig: OpenAIConfig = { + base_url: 'https://api.openai.com', + model: 'text-embedding-3-small', + api_key_env: 'CIX_OPENAI_API_KEY', +}; + +interface Props { + value: OpenAIConfig; + onChange: (next: OpenAIConfig) => void; + secretEnvs: EmbeddingProviderSecretEnv[]; +} + +// Common OpenAI-compatible model picker entries. Free-text input +// stays the source of truth (any string is valid for self-hosted +// servers); these are just suggestions. +const SUGGESTED_MODELS = [ + 'text-embedding-3-small', + 'text-embedding-3-large', + 'text-embedding-ada-002', +]; + +export function OpenAIProviderForm({ value, onChange, secretEnvs }: Props) { + const apiKeyEnv = secretEnvs.find((e) => e.name === value.api_key_env); + const apiKeyMissing = apiKeyEnv != null && !apiKeyEnv.set; + + return ( +
+
+ + onChange({ ...value, base_url: e.target.value })} + placeholder="https://api.openai.com" + /> +

+ Server origin without the trailing /v1. Works for OpenAI + proper, vLLM, TEI, LocalAI, Ollama's OpenAI-compatible endpoint, and any other + OpenAI-compatible /v1/embeddings server.

+
+ +
+ + onChange({ ...value, model: e.target.value })} + /> + + {SUGGESTED_MODELS.map((m) => ( + +

+ For self-hosted servers use whichever model name the server expects. +

+
+ +
+ + + onChange({ + ...value, + dimensions: e.target.value === '' ? undefined : Number(e.target.value), + }) + } + placeholder="(server default)" + /> +

+ Matryoshka shrink for text-embedding-3-*. Leave empty + to use the model's native dimension. +

+
+ +
+ + onChange({ ...value, api_key_env: e.target.value })} + placeholder="CIX_OPENAI_API_KEY" + /> +

+ The dashboard never stores the key itself. The server reads this + env var live on every embed call. +

+
+ + {apiKeyMissing ? ( + + + API key env var is not set + + Export {value.api_key_env} on the server (compose, + portainer, systemd, …) and restart the container before saving. + Calls would fail until the key becomes available. + + + ) : null} + +

+ Rate-limit handling: the server forwards the upstream HTTP status + as-is — there is no retry-with-backoff yet. If you hit 429s, lower + the concurrency in the Throughput card below or pick an account + tier with higher RPM. Self-hosted servers (vLLM, TEI, LocalAI) + typically don't rate-limit at all. +

+
+ ); +} diff --git a/server/dashboard/src/modules/server/sections/providers/VoyageProviderForm.tsx b/server/dashboard/src/modules/server/sections/providers/VoyageProviderForm.tsx new file mode 100644 index 0000000..38ab3c4 --- /dev/null +++ b/server/dashboard/src/modules/server/sections/providers/VoyageProviderForm.tsx @@ -0,0 +1,262 @@ +import { AlertTriangle, ExternalLink, Info } from 'lucide-react'; +import { Alert, AlertDescription, AlertTitle } from '@/ui/alert'; +import { Input } from '@/ui/input'; +import { Label } from '@/ui/label'; +import { Switch } from '@/ui/switch'; +import type { EmbeddingProviderSecretEnv } from '@/api/types'; + +// VoyageConfig mirrors the voyage provider's persisted config blob +// shape (see server/internal/embeddings/provider/voyage/voyage.go). +export interface VoyageConfig { + model: string; + api_key_env: string; + output_dimension: number; + output_dtype: 'float' | 'int8'; + truncation: boolean; + + // Operator-supplied rate-limit caps. 0 = no client-side throttling + // (the server will only react to upstream 429/400). Sourced from + // the operator's Voyage dashboard Rate Limits page; we can't fetch + // them programmatically (Voyage has no API for limits). + rate_limit_rpm?: number; + rate_limit_tpm?: number; + max_inputs_per_request?: number; + max_tokens_per_request?: number; +} + +export const defaultVoyageConfig: VoyageConfig = { + model: 'voyage-code-3', + api_key_env: 'CIX_VOYAGE_API_KEY', + output_dimension: 1024, + output_dtype: 'float', + truncation: true, +}; + +interface Props { + value: VoyageConfig; + onChange: (next: VoyageConfig) => void; + secretEnvs: EmbeddingProviderSecretEnv[]; +} + +const MODELS = [ + 'voyage-code-3', + 'voyage-3-large', + 'voyage-3', + 'voyage-3-lite', + 'voyage-code-2', +]; + +const DIMENSIONS = [256, 512, 1024, 2048]; + +// numberOrUndef parses a number input; empty / NaN / negative → undefined +// so the field round-trips to "unset" (no client-side enforcement). 
+function numberOrUndef(v: string): number | undefined { + if (v.trim() === '') return undefined; + const n = Number(v); + if (!Number.isFinite(n) || n < 0) return undefined; + return n; +} + +export function VoyageProviderForm({ value, onChange, secretEnvs }: Props) { + const apiKeyEnv = secretEnvs.find((e) => e.name === value.api_key_env); + const apiKeyMissing = apiKeyEnv != null && !apiKeyEnv.set; + + return ( +
+ + + Rate limits — fill in from your Voyage dashboard + + Voyage doesn't expose per-account rate limits via API, so the + server can't fetch yours automatically. Open the{' '} + + Voyage dashboard + {' '} + → Rate Limits, copy your tier's numbers into the fields below, + and the indexer will throttle itself accordingly via a + token-bucket. Leave all four blank to disable client-side + throttling (the server will only react to upstream 429/400). + + + +
+ + +
+ +
+
+ + +

+ Changing this triggers a full reindex per project. +

+
+ +
+ + +

+ binary / ubinary are not supported: the + vector store has no Hamming-distance search.

+
+
+ + {/* Rate-limit fields. All four optional. Defaults in the comment + below mirror the public docs; the operator should override + per their actual tier on the Voyage dashboard. */} +
+ Rate limits (from your Voyage dashboard) +

+ Public-docs Tier 1 baseline (×2 / ×3 for Tier 2 / Tier + 3 spend):{' '} + voyage-code-* = 2000 RPM / 3M TPM / 128 inputs / + 120K tokens per request.{' '} + voyage-3* = 2000 RPM / 3–16M TPM / 1000 inputs / + 120K tokens per request. Free tier = 3 RPM / 10K TPM regardless + of model.

+
+
+ + onChange({ ...value, rate_limit_rpm: numberOrUndef(e.target.value) })} + /> +
+
+ + onChange({ ...value, rate_limit_tpm: numberOrUndef(e.target.value) })} + /> +
+
+ + onChange({ ...value, max_inputs_per_request: numberOrUndef(e.target.value) })} + /> +
+
+ + onChange({ ...value, max_tokens_per_request: numberOrUndef(e.target.value) })} + /> +
+
+

+ Empty RPM/TPM = no client-side throttling: the indexer doesn't + pace itself and you'll see 429s on overflow. Empty per-request + fields are still enforced, falling back to safe defaults + (128 inputs / ~100K tokens).

+
+ +
+
+ onChange({ ...value, truncation: c === true })} + /> + +
+

+ Controls Voyage's per-input behaviour when a single chunk + exceeds the model's context window (e.g. 32K tokens for + voyage-code-3). ON (default): Voyage silently truncates the + chunk and embeds the truncated version — you always get a + vector, but content past the cap is lost from the + embedding. OFF: Voyage returns 400 on over-long inputs so + the operator can shorten the indexer's chunk size or pick + a model with a larger context. Unrelated to the + 120K-tokens-per-batch cap (which our adaptive bisect + handles separately). +

+
+ +
+ + onChange({ ...value, api_key_env: e.target.value })} + placeholder="CIX_VOYAGE_API_KEY" + /> +

+ The dashboard never stores the key — only this env-var name. +

+
+ + {apiKeyMissing ? ( + + + API key env var is not set + + Export {value.api_key_env} on the server and restart + the container before saving. Voyage API calls would fail until + the key becomes available. + + + ) : null} +
+ ); +} diff --git a/server/go.mod b/server/go.mod index 0913a64..d3cf3be 100644 --- a/server/go.mod +++ b/server/go.mod @@ -12,6 +12,7 @@ require ( github.com/philippgille/chromem-go v0.7.0 golang.org/x/crypto v0.52.0 golang.org/x/sync v0.20.0 + golang.org/x/time v0.15.0 modernc.org/sqlite v1.34.1 ) diff --git a/server/go.sum b/server/go.sum index 57f0c75..8b5046d 100644 --- a/server/go.sum +++ b/server/go.sum @@ -235,6 +235,8 @@ golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ= golang.org/x/text v0.3.7/go.mod h1:u+2+/6zg+i71rQMx5EYifcz6MCKuco9NR6JIITiCfzQ= golang.org/x/text v0.37.0 h1:Cqjiwd9eSg8e0QAkyCaQTNHFIIzWtidPahFWR83rTrc= golang.org/x/text v0.37.0/go.mod h1:a5sjxXGs9hsn/AJVwuElvCAo9v8QYLzvavO5z2PiM38= +golang.org/x/time v0.15.0 h1:bbrp8t3bGUeFOx08pvsMYRTCVSMk89u4tKbNOZbp88U= +golang.org/x/time v0.15.0/go.mod h1:Y4YMaQmXwGQZoFaVFk4YpCt4FLQMYKZe9oeV/f4MSno= golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ= golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e/go.mod h1:b+2E5dAYhXwXZwtnZ6UAqBI28+e2cm9otk0dWdXHAEo= golang.org/x/tools v0.0.0-20201224043029-2b0845dc783e/go.mod h1:emZCQorbCU4vsT4fOWvOPXz4eW1wZW4PmDk9uLelYpA= diff --git a/server/internal/config/config.go b/server/internal/config/config.go index 3a938de..84e9240 100644 --- a/server/internal/config/config.go +++ b/server/internal/config/config.go @@ -19,7 +19,7 @@ import ( // default was a Python-FastAPI parallel-rollout carry-over; the Python // backend was archived 2026-04 and the parity is no longer meaningful. type Config struct { - APIKey string + APIKey string // AuthDisabled, when true, makes the server skip the API-key check on // every endpoint. Off by default — must be turned on EXPLICITLY via // CIX_AUTH_DISABLED=true (and also requires CIX_API_KEY to be empty). 
@@ -176,24 +176,43 @@ type Config struct { } // ModelSafeName returns the embedding model name normalised for use inside -// filesystem paths. Matches Settings.model_safe_name in api/app/config.py. +// filesystem paths. Originally mirrored Settings.model_safe_name in the +// archived Python backend; now used ONLY to reconstruct the legacy +// per-model SQLite filename during the one-time DB adoption migration +// (internal/storage.AdoptLegacyModelDB). It no longer drives any live +// storage path — the system DB is model-independent and the vector store +// is namespaced by provider.StorageSlug(Provider.ID()). +// +// LEGACY-MIGRATION (remove next release): this and LegacyDynamicSQLitePath +// exist solely for the one-time storage-unification adoption. Once every +// deployment has booted on the unified layout, delete both along with +// storage.AdoptLegacyModelDB and its call in cmd/cix-server/main.go. func (c *Config) ModelSafeName() string { s := strings.ReplaceAll(c.EmbeddingModel, "/", "_") s = strings.ReplaceAll(s, "-", "_") return strings.ToLower(s) } -// DynamicSQLitePath returns the SQLite path with the model-safe name suffixed -// before the extension. Matches Settings.dynamic_sqlite_path in Python. -func (c *Config) DynamicSQLitePath() string { +// LegacyDynamicSQLitePath reconstructs the OLD per-model SQLite filename +// (_.db) that pre-unification builds wrote to. It is +// used solely by the boot-time DB adoption migration to locate the file +// to adopt as the new model-independent system DB; no live code path +// should depend on it. +func (c *Config) LegacyDynamicSQLitePath() string { ext := filepath.Ext(c.SQLitePath) base := strings.TrimSuffix(c.SQLitePath, ext) return fmt.Sprintf("%s_%s%s", base, c.ModelSafeName(), ext) } -// DynamicChromaPersistDir matches Settings.dynamic_chroma_persist_dir. 
-func (c *Config) DynamicChromaPersistDir() string { - return fmt.Sprintf("%s_%s", c.ChromaPersistDir, c.ModelSafeName()) +// ChromaDirFor returns the on-disk vector-store directory for an embedding +// identity expressed as nested path components (see +// provider.Provider.StorageComponents): {kind, model-slug[, variant]}. +// ChromaPersistDir is the container; each component is its own directory +// level, so the provider kind can never collide with a model name that +// normalises to a kind-looking slug. Empty components → the bare +// container (callers guard against that). +func (c *Config) ChromaDirFor(components []string) string { + return filepath.Join(append([]string{c.ChromaPersistDir}, components...)...) } // Load reads CIX_* environment variables and returns a populated Config. @@ -345,7 +364,6 @@ func Load() (*Config, error) { c.VersionCheckRepo = getenv("CIX_VERSION_CHECK_REPO", "dvcdsys/code-index") - c.SecretKey = getenv("CIX_SECRET_KEY", "") c.SecretKeyFile = getenv("CIX_SECRET_KEYFILE", "") c.SecretsDataDir = getenv("CIX_SECRETS_DATA_DIR", filepath.Dir(c.SQLitePath)) @@ -463,8 +481,8 @@ func defaultDataDir() string { } // defaultSQLitePath resolves the local SQLite database path under the -// platform data dir. The `_` suffix from DynamicSQLitePath is appended at -// query time, not here. +// platform data dir. This is the literal, model-independent system DB +// path the server opens (no model suffix is appended any more). 
func defaultSQLitePath() string { return filepath.Join(defaultDataDir(), "sqlite", "projects.db") } diff --git a/server/internal/config/config_test.go b/server/internal/config/config_test.go index a09ffda..58415a8 100644 --- a/server/internal/config/config_test.go +++ b/server/internal/config/config_test.go @@ -1,6 +1,7 @@ package config import ( + "path/filepath" "strings" "testing" ) @@ -107,8 +108,15 @@ func TestLoadOverrides(t *testing.T) { if got := c.ModelSafeName(); got != "test_model_name" { t.Errorf("ModelSafeName = %q", got) } - if got := c.DynamicSQLitePath(); got != "/tmp/test_test_model_name.db" { - t.Errorf("DynamicSQLitePath = %q", got) + // LegacyDynamicSQLitePath still reconstructs the OLD per-model filename + // (used only by the boot-time adoption migration). + if got := c.LegacyDynamicSQLitePath(); got != "/tmp/test_test_model_name.db" { + t.Errorf("LegacyDynamicSQLitePath = %q", got) + } + // ChromaDirFor joins the identity path components under the chroma base. + comps := []string{"voyage", "voyage_code_3", "2048", "float"} + if got, want := c.ChromaDirFor(comps), filepath.Join(append([]string{c.ChromaPersistDir}, comps...)...); got != want { + t.Errorf("ChromaDirFor = %q, want %q", got, want) } } diff --git a/server/internal/db/db.go b/server/internal/db/db.go index 3e2bd8a..7332bfc 100644 --- a/server/internal/db/db.go +++ b/server/internal/db/db.go @@ -64,6 +64,8 @@ var registeredMigrations = []migration{ {9, "git_repos_polling", func(db *sql.DB, _ OpenOptions) error { return migrateGitReposPolling(db) }}, {10, "auth_groups_ownership", func(db *sql.DB, _ OpenOptions) error { return migrateAuthGroupsOwnership(db) }}, {11, "project_machine_identity", func(db *sql.DB, _ OpenOptions) error { return migrateProjectMachineIdentity(db) }}, + {12, "embedding_provider", func(db *sql.DB, _ OpenOptions) error { return migrateEmbeddingProvider(db) }}, + {13, "indexed_with_model_provider_prefix", func(db *sql.DB, _ OpenOptions) error { return 
migrateIndexedWithModelProviderPrefix(db) }}, } // DriverName is the registered database/sql driver name for modernc.org/sqlite. @@ -138,6 +140,49 @@ func OpenWith(opts OpenOptions) (*sql.DB, error) { return db, nil } +// HasTables reports whether the SQLite database at path contains ALL of +// the named tables. It opens the file read-write (so any pending WAL is +// recovered cleanly) with a busy timeout, runs NO migrations, and closes +// before returning. A missing file is not an error — it returns +// (false, nil). Used by the boot-time DB adoption migration +// (internal/storage) to tell a real unified system DB (has both +// schema_migrations and users) apart from a pre-auth fossil that merely +// happens to occupy the target path. +func HasTables(path string, names ...string) (bool, error) { + if path == "" || len(names) == 0 { + return false, nil + } + if _, err := os.Stat(path); err != nil { + if errors.Is(err, os.ErrNotExist) { + return false, nil + } + return false, fmt.Errorf("stat %s: %w", path, err) + } + dsn, err := buildDSN(path) + if err != nil { + return false, err + } + sdb, err := sql.Open(DriverName, dsn) + if err != nil { + return false, fmt.Errorf("open %s: %w", path, err) + } + defer sdb.Close() + sdb.SetMaxOpenConns(1) + for _, name := range names { + var got string + err := sdb.QueryRow( + `SELECT name FROM sqlite_master WHERE type='table' AND name = ?`, name, + ).Scan(&got) + if errors.Is(err, sql.ErrNoRows) { + return false, nil + } + if err != nil { + return false, fmt.Errorf("check table %q in %s: %w", name, path, err) + } + } + return true, nil +} + // applyMigrations runs every entry in registeredMigrations whose version is // greater than the current high-water mark in schema_migrations. 
Each // successful migration records a (version, name, applied_at) row so the @@ -668,10 +713,10 @@ func migratePathHash(db *sql.DB) error { haveColumn := false for rows.Next() { var ( - cid int - name, typ string - notnull, pk int - dflt sql.NullString + cid int + name, typ string + notnull, pk int + dflt sql.NullString ) if err := rows.Scan(&cid, &name, &typ, ¬null, &dflt, &pk); err != nil { rows.Close() @@ -720,6 +765,87 @@ func migratePathHash(db *sql.DB) error { return nil } +// migrateEmbeddingProvider adds the pluggable-provider columns to +// runtime_settings: +// - embedding_provider TEXT — kind selector ("ollama"/"openai"/"voyage") +// - embedding_provider_config TEXT — provider-specific JSON blob +// +// Rows stay NULL until the admin first persists a non-default provider; +// boot logic in main.go then falls through to the env-derived ollama +// defaults exactly as before. Idempotent — checked via PRAGMA +// table_info, ALTER only on absence. +func migrateEmbeddingProvider(db *sql.DB) error { + rows, err := db.Query(`PRAGMA table_info(runtime_settings)`) + if err != nil { + return fmt.Errorf("table_info runtime_settings: %w", err) + } + have := map[string]bool{} + for rows.Next() { + var ( + cid int + name, typ string + notnull, pk int + dflt sql.NullString + ) + if err := rows.Scan(&cid, &name, &typ, ¬null, &dflt, &pk); err != nil { + rows.Close() + return err + } + have[name] = true + } + rows.Close() + if !have["embedding_provider"] { + if _, err := db.Exec(`ALTER TABLE runtime_settings ADD COLUMN embedding_provider TEXT`); err != nil { + return fmt.Errorf("add embedding_provider column: %w", err) + } + } + if !have["embedding_provider_config"] { + if _, err := db.Exec(`ALTER TABLE runtime_settings ADD COLUMN embedding_provider_config TEXT`); err != nil { + return fmt.Errorf("add embedding_provider_config column: %w", err) + } + } + return nil +} + +// migrateIndexedWithModelProviderPrefix backfills projects indexed +// before the pluggable-provider 
refactor (migration 12). Pre-refactor +// the indexer wrote a bare model name like +// "awhiteside/CodeRankEmbed-Q8_0-GGUF"; post-refactor it writes the +// fully-qualified Provider.ID() of the form "ollama:". Without +// this migration every legacy project would show a "stale model" +// badge forever because the bare string never matches the live +// "ollama:" and a reindex would *still* write the new prefixed +// form — leaving every UN-reindexed project flagged falsely. +// +// Heuristic: rows that don't already start with a known provider-kind +// prefix predate the prefix convention. Prepend "ollama:" — safe +// because pre-refactor there was no other embedding backend; every +// legacy row was produced by the in-process llama-server sidecar. +// (Testing for the kind prefix rather than for the mere presence of a +// ":" matters: a legacy Ollama-style model name like +// "nomic-embed-text:latest" contains a colon but is NOT yet prefixed, +// so a presence-of-colon test would wrongly skip it and leave it +// flagged stale forever.) +// +// Idempotent: rows already starting with ollama:/openai:/voyage: are +// left alone, so re-running this migration (or running it against a DB +// that was already partially upgraded) is a no-op. +func migrateIndexedWithModelProviderPrefix(db *sql.DB) error { + _, err := db.Exec(` + UPDATE projects + SET indexed_with_model = 'ollama:' || indexed_with_model + WHERE indexed_with_model IS NOT NULL + AND indexed_with_model != '' + AND indexed_with_model NOT LIKE 'ollama:%' + AND indexed_with_model NOT LIKE 'openai:%' + AND indexed_with_model NOT LIKE 'voyage:%' + `) + if err != nil { + return fmt.Errorf("backfill indexed_with_model prefix: %w", err) + } + return nil +} + // migrateIndexedWithModel adds projects.indexed_with_model to pre-PR-E // databases. Idempotent: PRAGMA table_info first; ALTER only if absent. 
Rows // stay NULL — the dashboard treats NULL as "indexed before drift tracking diff --git a/server/internal/db/db_test.go b/server/internal/db/db_test.go index 4b484a1..f86c210 100644 --- a/server/internal/db/db_test.go +++ b/server/internal/db/db_test.go @@ -210,6 +210,96 @@ func TestOpenMigratesPreEDB(t *testing.T) { } } +// TestMigrate_IndexedWithModelProviderPrefix covers the backfill that +// the pluggable-provider PR adds: legacy rows whose indexed_with_model +// is a bare model name ("awhiteside/CodeRankEmbed-Q8_0-GGUF") must +// be rewritten to the prefixed form ("ollama:awhiteside/...") so the +// drift-detector and dashboard see a match with the live Provider.ID(). +// Rows that already start with a known kind prefix (ollama:/openai:/ +// voyage:) must be left untouched — important for idempotency and for +// DBs partially upgraded before this migration shipped. A legacy +// Ollama-style name that merely contains a colon (e.g. +// "nomic-embed-text:latest") is NOT yet prefixed and MUST still get the +// "ollama:" prefix. +func TestMigrate_IndexedWithModelProviderPrefix(t *testing.T) { + tmp := filepath.Join(t.TempDir(), "indexed-prefix.db") + + // Stage a minimal projects table at the migration-12 layout (i.e. + // indexed_with_model column already exists) so we exercise just + // the prefix backfill without crossing other migrations' concerns. 
+ seed, err := sql.Open(DriverName, "file:"+tmp) + if err != nil { + t.Fatalf("seed Open: %v", err) + } + if _, err := seed.Exec(`CREATE TABLE projects ( + host_path TEXT PRIMARY KEY, + container_path TEXT NOT NULL, + languages TEXT DEFAULT '[]', + settings TEXT DEFAULT '{}', + stats TEXT DEFAULT '{}', + status TEXT DEFAULT 'created', + created_at TEXT NOT NULL, + updated_at TEXT NOT NULL, + last_indexed_at TEXT, + path_hash TEXT, + indexed_with_model TEXT + )`); err != nil { + t.Fatalf("seed CREATE TABLE: %v", err) + } + rows := []struct { + host, model string + }{ + {"/legacy/bare", "awhiteside/CodeRankEmbed-Q8_0-GGUF"}, // should get "ollama:" prefix + {"/legacy/colon", "nomic-embed-text:latest"}, // colon, but no kind prefix → should get "ollama:" + {"/already/prefixed", "ollama:awhiteside/CodeRankEmbed-Q8_0-GGUF"}, // untouched + {"/already/voyage", "voyage:voyage-code-3:1024:float"}, // untouched + } + for _, r := range rows { + if _, err := seed.Exec( + `INSERT INTO projects (host_path, container_path, created_at, updated_at, path_hash, indexed_with_model) + VALUES (?, ?, '2024-01-01', '2024-01-01', ?, ?)`, + r.host, r.host, r.host, r.model, + ); err != nil { + t.Fatalf("seed INSERT %s: %v", r.host, err) + } + } + // Row with NULL model should also be left alone (legacy pre-PR-E projects). 
+ if _, err := seed.Exec( + `INSERT INTO projects (host_path, container_path, created_at, updated_at, path_hash) + VALUES ('/legacy/null', '/legacy/null', '2024-01-01', '2024-01-01', 'null')`, + ); err != nil { + t.Fatalf("seed INSERT null: %v", err) + } + seed.Close() + + database, err := Open(tmp) + if err != nil { + t.Fatalf("Open migrates DB: %v", err) + } + defer database.Close() + defer os.Remove(tmp) + + expectations := map[string]sql.NullString{ + "/legacy/bare": {String: "ollama:awhiteside/CodeRankEmbed-Q8_0-GGUF", Valid: true}, + "/legacy/colon": {String: "ollama:nomic-embed-text:latest", Valid: true}, + "/already/prefixed": {String: "ollama:awhiteside/CodeRankEmbed-Q8_0-GGUF", Valid: true}, + "/already/voyage": {String: "voyage:voyage-code-3:1024:float", Valid: true}, + "/legacy/null": {Valid: false}, + } + for host, want := range expectations { + var got sql.NullString + if err := database.QueryRow( + `SELECT indexed_with_model FROM projects WHERE host_path = ?`, host, + ).Scan(&got); err != nil { + t.Fatalf("select %s: %v", host, err) + } + if got.Valid != want.Valid || got.String != want.String { + t.Errorf("%s: indexed_with_model = %v (valid=%v), want %v (valid=%v)", + host, got.String, got.Valid, want.String, want.Valid) + } + } +} + // TestOpenMigratesPreM9DB simulates a pre-m9 database (git_repos without the // polling columns — i.e. pre git_repos_polling migration) and verifies Open // adds them + the scheduler index without crashing, and that an existing row diff --git a/server/internal/db/schema.go b/server/internal/db/schema.go index cbe6646..791fd64 100644 --- a/server/internal/db/schema.go +++ b/server/internal/db/schema.go @@ -167,7 +167,16 @@ CREATE TABLE IF NOT EXISTS runtime_settings ( max_embedding_concurrency INTEGER, llama_batch_size INTEGER, updated_at TEXT NOT NULL, - updated_by TEXT + updated_by TEXT, + -- Pluggable embedding provider (added in migration 12). 
+ -- embedding_provider selects the active backend kind; if NULL the + -- server falls back to the env/recommended ollama defaults so old + -- installs stay on llama-server until the admin picks otherwise. + embedding_provider TEXT, + -- embedding_provider_config holds the provider-specific config as + -- a JSON blob (shape varies by provider). API keys are NEVER stored + -- here — providers read them live from env vars named in this blob. + embedding_provider_config TEXT ); -- Workspaces group indexed projects (rows in the projects table, diff --git a/server/internal/embeddings/parity_test.go b/server/internal/embeddings/parity_test.go index 6db1417..4a5d21e 100644 --- a/server/internal/embeddings/parity_test.go +++ b/server/internal/embeddings/parity_test.go @@ -83,7 +83,7 @@ func TestEmbeddingParity(t *testing.T) { embedCtx, embedCancel := context.WithTimeout(ctx, 90*time.Second) defer embedCancel() - got, err := svc.embedRaw(embedCtx, texts) + got, err := embedRawForParity(embedCtx, svc, texts) if err != nil { t.Fatalf("embedRaw: %v", err) } @@ -131,6 +131,31 @@ func TestEmbeddingParity(t *testing.T) { // --- helpers --- +// rawEmbedder is the subset of the active provider used by the parity +// gate: embed verbatim, no asymmetric-retrieval prefix. The ollama +// provider satisfies it via Provider.EmbedRaw; this matches the +// reference inputs (which already carry their prefixes) 1:1. +type rawEmbedder interface { + EmbedRaw(context.Context, []string) ([][]float32, error) +} + +// embedRawForParity reaches the live provider behind the Service and +// embeds texts verbatim, bypassing the queue and prefix logic — the +// behaviour the deleted Service.embedRaw helper used to provide. 
+func embedRawForParity(ctx context.Context, s *Service, texts []string) ([][]float32, error) { + s.mu.RLock() + cur := s.current + s.mu.RUnlock() + if cur == nil { + return nil, ErrSupervisor + } + re, ok := cur.(rawEmbedder) + if !ok { + return nil, fmt.Errorf("active provider %T does not support raw embedding", cur) + } + return re.EmbedRaw(ctx, texts) +} + type refItem struct { Phrase string `json:"phrase"` IsQuery bool `json:"is_query"` diff --git a/server/internal/embeddings/provider/factory.go b/server/internal/embeddings/provider/factory.go new file mode 100644 index 0000000..4f2e179 --- /dev/null +++ b/server/internal/embeddings/provider/factory.go @@ -0,0 +1,133 @@ +package provider + +import ( + "context" + "fmt" + "log/slog" + "sort" + "sync" +) + +// Factory constructs Providers of a given kind from a typed Config. +// One Factory per kind; registered at init time by each provider +// sub-package via Register. The Service builds the active provider +// by calling Build on the factory matching the persisted kind. +type Factory interface { + // Kind matches one of the Kind* constants in provider.go. + Kind() string + + // SchemaJSON returns a JSON-encoded ConfigSchema describing the + // fields this provider accepts. The dashboard reads this to know + // which input controls to render in the provider-specific form. + // Returned bytes must be deterministic across calls so the API + // response is cacheable. + SchemaJSON() []byte + + // SecretEnvVars lists the env-var names this provider consults + // for credentials. Order is irrelevant. Used by the admin + // GET /embedding-providers endpoint to tell the dashboard which + // keys are set in the environment so it can render a "set + // CIX_VOYAGE_API_KEY before saving" banner when the operator + // configures a provider whose key is missing. + SecretEnvVars() []string + + // Build constructs a concrete Provider from a JSON-encoded config + // blob (shape matches the provider's ConfigSchema). 
secrets is + // used by HTTP-only providers to resolve API-key env-var names; + // ollama's factory ignores it. logger may be nil; implementations + // fall back to slog.Default(). + // + // Build does NOT start the provider — call Provider.Start + // separately so the caller can decide whether a failed Start + // rolls back to the previous active provider. + Build(cfg []byte, secrets SecretLookup, logger *slog.Logger) (Provider, error) +} + +// ConfigSchema is the dashboard-facing description of a provider's +// config form. Hardcoded React components in the dashboard ignore +// this in favour of typed forms, but the JSON is still exposed via +// /admin/embedding-providers so external tooling (curl, ad-hoc +// admin scripts) has a contract to read. +type ConfigSchema struct { + Fields []ConfigField `json:"fields"` +} + +// ConfigField describes one input control in a provider config form. +// Kind drives the widget; the dashboard's hardcoded forms map field +// Name → input. +type ConfigField struct { + Name string `json:"name"` + Label string `json:"label"` + Kind string `json:"kind"` // "string" | "int" | "bool" | "enum" | "secret-env" + Required bool `json:"required,omitempty"` + Default any `json:"default,omitempty"` + Enum []string `json:"enum,omitempty"` + Description string `json:"description,omitempty"` +} + +var ( + registryMu sync.RWMutex + registry = map[string]Factory{} +) + +// Register adds a Factory to the global registry. Called from each +// provider sub-package's init(). Panics if the kind is already +// registered — that indicates a programmer error (duplicate +// init) rather than something an operator can recover from. 
+func Register(f Factory) { + if f == nil { + panic("provider.Register: nil factory") + } + kind := f.Kind() + if kind == "" { + panic("provider.Register: factory returned empty Kind") + } + registryMu.Lock() + defer registryMu.Unlock() + if _, exists := registry[kind]; exists { + panic(fmt.Sprintf("provider.Register: kind %q already registered", kind)) + } + registry[kind] = f +} + +// Lookup returns the Factory registered for kind, or (nil, false). +func Lookup(kind string) (Factory, bool) { + registryMu.RLock() + defer registryMu.RUnlock() + f, ok := registry[kind] + return f, ok +} + +// Kinds returns the registered kinds in deterministic order, useful +// for the admin /embedding-providers list endpoint. +func Kinds() []string { + registryMu.RLock() + defer registryMu.RUnlock() + kinds := make([]string, 0, len(registry)) + for k := range registry { + kinds = append(kinds, k) + } + sort.Strings(kinds) + return kinds +} + +// Build constructs a provider by kind. Convenience wrapper around +// Lookup + Factory.Build that returns a clear error when the kind +// is not registered (so callers don't have to fmt.Errorf at each +// site). +func Build(ctx context.Context, kind string, cfg []byte, secrets SecretLookup, logger *slog.Logger) (Provider, error) { + _ = ctx // reserved for future per-build cancellation; Build itself is fast + f, ok := Lookup(kind) + if !ok { + return nil, fmt.Errorf("provider %q is not registered", kind) + } + return f.Build(cfg, secrets, logger) +} + +// EnvSecrets returns a SecretLookup that reads from os.LookupEnv. +// This is the production wiring; tests pass their own SecretLookup +// returning fixed values so the test does not have to set env vars +// in the process. 
+func EnvSecrets(lookup func(key string) (string, bool)) SecretLookup { + return SecretLookup(lookup) +} diff --git a/server/internal/embeddings/provider/httpremote.go b/server/internal/embeddings/provider/httpremote.go new file mode 100644 index 0000000..555d2aa --- /dev/null +++ b/server/internal/embeddings/provider/httpremote.go @@ -0,0 +1,61 @@ +package provider + +import ( + "fmt" + "strings" +) + +// This file holds the plumbing every HTTP-only embedding provider +// (openai, voyage, and any future REST backend) shares: base-URL +// normalization, live API-key resolution, and the Ready()/Status() +// implementations. Keeping it in one place stops the providers from +// drifting apart (they previously copy-pasted all four). + +// NormalizeBaseURL trims trailing slashes from an HTTP base URL so that +// joining "/v1/embeddings" never yields a double slash — which stricter +// OpenAI-compatible servers behind a proxy (vLLM/TEI) 404 on. +func NormalizeBaseURL(raw string) string { + return strings.TrimRight(raw, "/") +} + +// ResolveAPIKey looks up apiKeyEnv via secrets, returning (value, true) +// only when the env var is set to a non-empty value. HTTP providers read +// their bearer token live through this — the raw value never lives in the +// provider's persisted config. Returns ("", false) when secrets is nil, +// apiKeyEnv is empty, or the var is unset/blank. +func ResolveAPIKey(secrets SecretLookup, apiKeyEnv string) (string, bool) { + if secrets == nil || apiKeyEnv == "" { + return "", false + } + v, ok := secrets(apiKeyEnv) + if !ok || v == "" { + return "", false + } + return v, true +} + +// RemoteReady is the shared Ready() for HTTP-only providers: nil when the +// API key is present, ErrMissingAPIKey otherwise. We deliberately do NOT +// ping the upstream on every call (the /status footer polls Ready every +// 30s) — real outages surface as embed failures with diagnostics. 
+func RemoteReady(secrets SecretLookup, apiKeyEnv string) error { + if _, ok := ResolveAPIKey(secrets, apiKeyEnv); !ok { + return fmt.Errorf("%w: %s", ErrMissingAPIKey, apiKeyEnv) + } + return nil +} + +// RemoteStatus is the shared Status() for HTTP-only providers: StateRemote +// when the API key is present, StateFailed with a diagnostic otherwise. +func RemoteStatus(model, apiKeyEnv string, secrets SecretLookup) Status { + st := Status{ + State: StateRemote, + ManagesProcess: false, + Model: model, + } + if _, ok := ResolveAPIKey(secrets, apiKeyEnv); !ok { + st.State = StateFailed + st.LastError = "API key env var " + apiKeyEnv + " is not set" + } + return st +} diff --git a/server/internal/embeddings/bootstrap_test.go b/server/internal/embeddings/provider/ollama/bootstrap_test.go similarity index 99% rename from server/internal/embeddings/bootstrap_test.go rename to server/internal/embeddings/provider/ollama/bootstrap_test.go index 455df72..46db148 100644 --- a/server/internal/embeddings/bootstrap_test.go +++ b/server/internal/embeddings/provider/ollama/bootstrap_test.go @@ -1,4 +1,4 @@ -package embeddings +package ollama import ( "io/fs" diff --git a/server/internal/embeddings/client.go b/server/internal/embeddings/provider/ollama/client.go similarity index 99% rename from server/internal/embeddings/client.go rename to server/internal/embeddings/provider/ollama/client.go index bbcf9b8..4097ddb 100644 --- a/server/internal/embeddings/client.go +++ b/server/internal/embeddings/provider/ollama/client.go @@ -1,4 +1,4 @@ -package embeddings +package ollama import ( "bytes" diff --git a/server/internal/embeddings/provider/ollama/factory.go b/server/internal/embeddings/provider/ollama/factory.go new file mode 100644 index 0000000..9e15a6d --- /dev/null +++ b/server/internal/embeddings/provider/ollama/factory.go @@ -0,0 +1,104 @@ +package ollama + +import ( + "encoding/json" + "fmt" + "log/slog" + + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" 
+) + +// factory is the provider.Factory implementation for ollama. The +// init() block below registers it in the global registry so the +// Service can build an ollama provider purely by kind string. +type factory struct{} + +func (factory) Kind() string { return provider.KindOllama } + +// SchemaJSON describes the editable ollama fields for the dashboard. +// Sidecar-internal knobs (BinDir, SocketPath, Transport, CacheDir, +// BootstrapPath) are populated from env at bootstrap and are not +// surfaced for runtime edit — they are deployment-level concerns. +func (factory) SchemaJSON() []byte { + schema := provider.ConfigSchema{ + Fields: []provider.ConfigField{ + { + Name: "model", + Label: "Embedding model", + Kind: "string", + Required: true, + Description: "HuggingFace repo id (owner/repo) or an absolute path to a .gguf file.", + }, + { + Name: "gguf_path", + Label: "GGUF path override", + Kind: "string", + Description: "Optional absolute path that takes precedence over Model (CIX_GGUF_PATH).", + }, + { + Name: "ctx_size", + Label: "Context size", + Kind: "int", + Default: 2048, + }, + { + Name: "n_gpu_layers", + Label: "GPU layers", + Kind: "int", + Description: "-1 = all on Metal, 0 = CPU only.", + }, + { + Name: "n_threads", + Label: "Threads", + Kind: "int", + Description: "0 = let llama-server auto-detect.", + }, + { + Name: "batch_size", + Label: "Batch size", + Kind: "int", + Description: "0 = match context size.", + }, + }, + } + // Schema is small and stable; build once per call so the registry + // doesn't have to hold long-lived buffers. + b, _ := json.Marshal(schema) + return b +} + +// SecretEnvVars: ollama has no remote API key. +func (factory) SecretEnvVars() []string { return nil } + +// Build unmarshals cfg into Config and constructs a Provider. Does +// not call Start; the caller (Service.SwitchProvider) sequences the +// start so it can roll back to the previous provider on failure. 
+func (factory) Build(cfg []byte, _ provider.SecretLookup, logger *slog.Logger) (provider.Provider, error) { + if logger == nil { + logger = slog.Default() + } + if len(cfg) == 0 { + return nil, fmt.Errorf("ollama: empty config") + } + var c Config + if err := json.Unmarshal(cfg, &c); err != nil { + return nil, fmt.Errorf("ollama: unmarshal config: %w", err) + } + if c.Model == "" { + return nil, fmt.Errorf("ollama: model is required") + } + if c.Transport == "" { + c.Transport = "unix" + } + if c.CtxSize == 0 { + c.CtxSize = 2048 + } + if c.StartupSec == 0 { + c.StartupSec = 60 + } + return New(c, logger), nil +} + +func init() { + provider.Register(factory{}) +} diff --git a/server/internal/embeddings/provider/ollama/gguf.go b/server/internal/embeddings/provider/ollama/gguf.go new file mode 100644 index 0000000..4767c8c --- /dev/null +++ b/server/internal/embeddings/provider/ollama/gguf.go @@ -0,0 +1,164 @@ +package ollama + +import ( + "context" + "fmt" + "io" + "log/slog" + "os" + "path/filepath" + "strings" +) + +// GGUFInputs bundles the env/config values needed to locate (or +// download) the GGUF weights for the llama-server child. Service +// extracts them from *config.Config so the ollama package stays free +// of the config dependency. +type GGUFInputs struct { + GGUFPath string // CIX_GGUF_PATH absolute override + Model string // HF repo id ("owner/repo") or absolute path + CacheDir string // base dir under which downloaded GGUFs live + BootstrapPath string // CIX_BOOTSTRAP_GGUF_PATH one-shot import source +} + +// ResolveGGUFPath walks the precedence chain: +// 1. in.GGUFPath (absolute path env override, validated by Stat). +// 2. in.Model as absolute path — when the dashboard's "Local path" +// mode wrote a filesystem path through the runtime_settings row. +// 3. Cached file under in.CacheDir//*.gguf when in.Model +// is an HF repo ID. +// 4. 
in.BootstrapPath one-shot import — copies the file into the +// cache layout, then behaves like step 3 forever after. +// 5. HuggingFace download into the same cache (only step that +// actually writes to disk). +func ResolveGGUFPath(ctx context.Context, in GGUFInputs, logger *slog.Logger) (string, error) { + if in.GGUFPath != "" { + if _, err := os.Stat(in.GGUFPath); err != nil { + return "", fmt.Errorf("CIX_GGUF_PATH=%s: %w", in.GGUFPath, err) + } + return in.GGUFPath, nil + } + if filepath.IsAbs(in.Model) { + if _, err := os.Stat(in.Model); err != nil { + return "", fmt.Errorf("embedding model path %s: %w", in.Model, err) + } + return in.Model, nil + } + if !strings.Contains(in.Model, "/") { + return "", fmt.Errorf("embedding model %q is neither an absolute path nor an HF repo id (owner/repo)", in.Model) + } + + if cached := findCachedGGUF(in.CacheDir, in.Model); cached != "" { + logger.Info("using cached gguf", "path", cached) + return cached, nil + } + + // CIX_BOOTSTRAP_GGUF_PATH — one-time import. Idempotent across + // boots: subsequent boots find the imported file via findCachedGGUF + // above and skip this branch entirely. + if in.BootstrapPath != "" { + imported, err := importBootstrapGGUF(in.CacheDir, in.Model, in.BootstrapPath, logger) + if err != nil { + logger.Warn("bootstrap gguf import failed; falling through to HF download", + "src", in.BootstrapPath, "err", err) + } else if imported != "" { + return imported, nil + } + } + + return DownloadGGUF(ctx, in.Model, in.CacheDir, logger) +} + +// importBootstrapGGUF copies srcPath into // +// atomically (write to .partial, fsync, rename). Returns the final path +// on success, "" if the source is missing (caller falls through to HF +// download), or an error for IO problems we should surface to the operator. 
+// +// safe_repo derived from the HF repo id (`owner/repo` → `owner__repo`) +// to match DownloadGGUF's layout exactly — so subsequent boots' cache +// scan finds the imported file under the same name HF would have used. +func importBootstrapGGUF(cacheDir, repo, srcPath string, logger *slog.Logger) (string, error) { + if cacheDir == "" || repo == "" { + return "", nil + } + srcInfo, err := os.Stat(srcPath) + if err != nil { + if os.IsNotExist(err) { + return "", nil + } + return "", fmt.Errorf("stat bootstrap gguf %s: %w", srcPath, err) + } + if srcInfo.IsDir() { + return "", fmt.Errorf("bootstrap gguf %s is a directory, expected file", srcPath) + } + + safeRepo := strings.ReplaceAll(repo, "/", "__") + targetDir := filepath.Join(cacheDir, safeRepo) + if err := os.MkdirAll(targetDir, 0o755); err != nil { + return "", fmt.Errorf("mkdir cache dir: %w", err) + } + finalPath := filepath.Join(targetDir, filepath.Base(srcPath)) + + if _, err := os.Stat(finalPath); err == nil { + return finalPath, nil + } + + logger.Info("importing bootstrap gguf into cache", + "src", srcPath, "dst", finalPath, "size", srcInfo.Size()) + + src, err := os.Open(srcPath) + if err != nil { + return "", fmt.Errorf("open bootstrap gguf: %w", err) + } + defer src.Close() + + partial := finalPath + ".partial" + dst, err := os.OpenFile(partial, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o644) + if err != nil { + return "", fmt.Errorf("create cache target: %w", err) + } + + if _, err := io.Copy(dst, src); err != nil { + _ = dst.Close() + _ = os.Remove(partial) + return "", fmt.Errorf("copy bootstrap gguf: %w", err) + } + if err := dst.Sync(); err != nil { + _ = dst.Close() + _ = os.Remove(partial) + return "", fmt.Errorf("fsync bootstrap gguf: %w", err) + } + if err := dst.Close(); err != nil { + _ = os.Remove(partial) + return "", fmt.Errorf("close bootstrap gguf: %w", err) + } + if err := os.Rename(partial, finalPath); err != nil { + _ = os.Remove(partial) + return "", fmt.Errorf("atomic rename 
bootstrap gguf: %w", err) + } + logger.Info("bootstrap gguf imported", "path", finalPath) + return finalPath, nil +} + +// findCachedGGUF looks for a previously-downloaded .gguf under the +// standard cache layout produced by DownloadGGUF. Returns "" on any +// miss (including IO errors) so the caller proceeds to the download +// path. +func findCachedGGUF(cacheDir, repo string) string { + safeRepo := strings.ReplaceAll(repo, "/", "__") + dir := cacheDir + string(os.PathSeparator) + safeRepo + entries, err := os.ReadDir(dir) + if err != nil { + return "" + } + for _, e := range entries { + if e.IsDir() { + continue + } + name := e.Name() + if len(name) > 5 && strings.EqualFold(name[len(name)-5:], ".gguf") { + return dir + string(os.PathSeparator) + name + } + } + return "" +} diff --git a/server/internal/embeddings/hfdownload.go b/server/internal/embeddings/provider/ollama/hfdownload.go similarity index 99% rename from server/internal/embeddings/hfdownload.go rename to server/internal/embeddings/provider/ollama/hfdownload.go index fc812f7..9510443 100644 --- a/server/internal/embeddings/hfdownload.go +++ b/server/internal/embeddings/provider/ollama/hfdownload.go @@ -1,4 +1,4 @@ -package embeddings +package ollama import ( "context" diff --git a/server/internal/embeddings/logwriter.go b/server/internal/embeddings/provider/ollama/logwriter.go similarity index 98% rename from server/internal/embeddings/logwriter.go rename to server/internal/embeddings/provider/ollama/logwriter.go index 0b469d0..7c9ba61 100644 --- a/server/internal/embeddings/logwriter.go +++ b/server/internal/embeddings/provider/ollama/logwriter.go @@ -1,4 +1,4 @@ -package embeddings +package ollama import ( "bytes" diff --git a/server/internal/embeddings/prefix.go b/server/internal/embeddings/provider/ollama/prefix.go similarity index 74% rename from server/internal/embeddings/prefix.go rename to server/internal/embeddings/provider/ollama/prefix.go index 76248d1..4ee577c 100644 --- 
a/server/internal/embeddings/prefix.go +++ b/server/internal/embeddings/provider/ollama/prefix.go @@ -1,4 +1,4 @@ -package embeddings +package ollama import "strings" @@ -9,11 +9,11 @@ import "strings" // Keep this map string-for-string identical to the Python dict. The parity gate // depends on the prefix being literally the same bytes sent to the model. var QueryPrefixes = map[string]string{ - "nomic-ai/CodeRankEmbed": "Represent this query for searching relevant code: ", - "nomic-ai/nomic-embed-text-v1.5": "search_query: ", - "BAAI/bge-base-en-v1.5": "Represent this sentence for searching relevant passages: ", - "BAAI/bge-large-en-v1.5": "Represent this sentence for searching relevant passages: ", - "awhiteside/CodeRankEmbed-Q8_0-GGUF": "Represent this query for searching relevant code: ", + "nomic-ai/CodeRankEmbed": "Represent this query for searching relevant code: ", + "nomic-ai/nomic-embed-text-v1.5": "search_query: ", + "BAAI/bge-base-en-v1.5": "Represent this sentence for searching relevant passages: ", + "BAAI/bge-large-en-v1.5": "Represent this sentence for searching relevant passages: ", + "awhiteside/CodeRankEmbed-Q8_0-GGUF": "Represent this query for searching relevant code: ", } // ResolveQueryPrefix returns the prefix string to prepend to queries for the diff --git a/server/internal/embeddings/prefix_test.go b/server/internal/embeddings/provider/ollama/prefix_test.go similarity index 99% rename from server/internal/embeddings/prefix_test.go rename to server/internal/embeddings/provider/ollama/prefix_test.go index 9f3e707..7b960e8 100644 --- a/server/internal/embeddings/prefix_test.go +++ b/server/internal/embeddings/provider/ollama/prefix_test.go @@ -1,4 +1,4 @@ -package embeddings +package ollama import "testing" diff --git a/server/internal/embeddings/provider/ollama/provider.go b/server/internal/embeddings/provider/ollama/provider.go new file mode 100644 index 0000000..9c7173a --- /dev/null +++ 
b/server/internal/embeddings/provider/ollama/provider.go @@ -0,0 +1,395 @@ +// Package ollama implements provider.Provider on top of an in-process +// llama-server child. It owns the supervisor (fork+exec, crash-restart +// budget, readiness probe), the unix/TCP llamaClient, the asymmetric +// retrieval prefix, the GGUF resolver, and the token-aware multi-pass +// embedding pipeline. +// +// The Service layer (embeddings.Service) brackets all calls with a +// Queue for backpressure; this package does not impose its own +// concurrency limit. +package ollama + +import ( + "context" + "fmt" + "log/slog" + "sync" + "sync/atomic" + "time" + + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" +) + +// Config is the typed shape of the ollama provider's persisted config +// (the JSON-blob in runtime_settings.embedding_provider_config when +// embedding_provider="ollama"). The Factory's Build method +// unmarshals JSON into this struct and validates. +type Config struct { + // Model is the HF repo id ("owner/repo") or an absolute path to + // a .gguf file. Resolved via ResolveGGUFPath at Start. + Model string `json:"model"` + + // GGUFPath is an optional absolute-path override (CIX_GGUF_PATH + // equivalent). When set it takes precedence over Model and the + // cache lookup. + GGUFPath string `json:"gguf_path,omitempty"` + + // CacheDir is where downloaded GGUFs live and where the + // bootstrap-import drops files (CIX_GGUF_CACHE_DIR). + CacheDir string `json:"cache_dir,omitempty"` + + // BootstrapPath is the one-shot import source + // (CIX_BOOTSTRAP_GGUF_PATH). Idempotent across boots. + BootstrapPath string `json:"bootstrap_path,omitempty"` + + // BinDir is the directory containing llama-server + its + // dylibs (CIX_LLAMA_BIN_DIR). + BinDir string `json:"bin_dir,omitempty"` + + // SocketPath is the unix socket for IPC with llama-server. + // Auto-falls-back to TCP when this exceeds the platform limit + // (104 bytes on darwin). 
+ SocketPath string `json:"socket_path,omitempty"` + + // Transport is "unix" or "tcp". + Transport string `json:"transport,omitempty"` + + // CtxSize is the context window in tokens. + CtxSize int `json:"ctx_size,omitempty"` + + // NGpuLayers: -1 = all on Metal, 0 = CPU only. + NGpuLayers int `json:"n_gpu_layers"` + + // NThreads: 0 = let llama-server pick. + NThreads int `json:"n_threads,omitempty"` + + // BatchSize: 0 = match CtxSize. + BatchSize int `json:"batch_size,omitempty"` + + // StartupSec bounds the readiness probe. + StartupSec int `json:"startup_sec,omitempty"` +} + +// Provider is the ollama-backed provider.Provider implementation. +// One per active config; rebuilt (Stop+new+Start) when the admin +// changes any config field. +type Provider struct { + cfg Config + logger *slog.Logger + + mu sync.Mutex + sup *supervisor + started atomic.Bool + ggufPath string // resolved at Start; empty before +} + +// New constructs an ollama Provider. Does not start it — call Start +// to spawn llama-server. Provider methods that need the running +// child return provider.ErrNotReady until Start succeeds. +func New(cfg Config, logger *slog.Logger) *Provider { + if logger == nil { + logger = slog.Default() + } + return &Provider{cfg: cfg, logger: logger} +} + +// Kind reports the registered factory kind for this provider. +func (p *Provider) Kind() string { return provider.KindOllama } + +// ID is the fingerprint stored in projects.indexed_with_model. Tied +// to the model name only — different GGUF tunings (ctx, threads, +// batch) do not change embedding output, so they must NOT change +// the ID (would force unnecessary reindexes). +func (p *Provider) ID() string { + return "ollama:" + p.cfg.Model +} + +// Dimension returns 0 — the vector store infers dimension from the +// first upsert (chromem-go behaviour) and CodeRankEmbed-Q8 reports +// it on first call. 
+func (p *Provider) Dimension() int { return 0 } + +// SupportsTokenize is true: llama-server exposes /tokenize. +func (p *Provider) SupportsTokenize() bool { return true } + +// StorageComponents namespaces the vector store as ollama/. +// Dimension is not part of the path: it is unknown at config time +// (Dimension() == 0) and the model name already pins the GGUF/quant. +func (p *Provider) StorageComponents() []string { + return []string{provider.KindOllama, provider.StorageSlug(p.cfg.Model)} +} + +// Start resolves the GGUF path then spawns the supervisor. Blocks +// until the readiness probe succeeds or ctx expires. +func (p *Provider) Start(ctx context.Context) error { + p.mu.Lock() + defer p.mu.Unlock() + if p.sup != nil { + // Idempotent — re-Start on an already-running provider is a + // no-op (Service uses Stop+new+Start for restarts, never + // Start twice on the same instance, but be defensive). + return nil + } + + gguf, err := ResolveGGUFPath(ctx, GGUFInputs{ + GGUFPath: p.cfg.GGUFPath, + Model: p.cfg.Model, + CacheDir: p.cfg.CacheDir, + BootstrapPath: p.cfg.BootstrapPath, + }, p.logger) + if err != nil { + return fmt.Errorf("resolve gguf: %w", err) + } + p.ggufPath = gguf + + supCfg := supervisorConfig{ + BinDir: p.cfg.BinDir, + GGUFPath: gguf, + SocketPath: p.cfg.SocketPath, + Transport: p.cfg.Transport, + CtxSize: p.cfg.CtxSize, + NGpuLayers: p.cfg.NGpuLayers, + NThreads: p.cfg.NThreads, + BatchSize: p.cfg.BatchSize, + StartupSec: p.cfg.StartupSec, + Model: p.cfg.Model, + } + sup, err := newSupervisor(ctx, supCfg, p.logger) + if err != nil { + return err + } + p.sup = sup + p.started.Store(true) + return nil +} + +// Stop tears the supervisor down within ctx. Idempotent. 
+func (p *Provider) Stop(ctx context.Context) error { + p.mu.Lock() + sup := p.sup + p.sup = nil + p.started.Store(false) + p.mu.Unlock() + if sup == nil { + return nil + } + return sup.Stop(ctx) +} + +// Ready reports nil when llama-server is alive and responding to +// /health, provider.ErrUnrecoverable when the restart budget is +// exhausted, or provider.ErrNotReady while warming up. +func (p *Provider) Ready(ctx context.Context) error { + p.mu.Lock() + sup := p.sup + p.mu.Unlock() + if sup == nil { + return provider.ErrNotReady + } + if sup.dead.Load() { + return provider.ErrUnrecoverable + } + return sup.Ready(ctx) +} + +// Status returns the dashboard snapshot. +func (p *Provider) Status() provider.Status { + p.mu.Lock() + sup := p.sup + p.mu.Unlock() + st := provider.Status{ + ManagesProcess: true, + Model: p.cfg.Model, + } + if sup == nil { + st.State = provider.StateDisabled + return st + } + src := sup.Status() + st.PID = src.PID + st.UptimeSeconds = int64(src.Uptime.Seconds()) + st.LastError = src.LastError + switch src.State { + case "running": + st.State = provider.StateRunning + case "failed": + st.State = provider.StateFailed + case "starting": + st.State = provider.StateStarting + default: + st.State = src.State + } + return st +} + +// EmbedQuery prepends the asymmetric-retrieval prefix and embeds a +// single query. +func (p *Provider) EmbedQuery(ctx context.Context, query string) ([]float32, error) { + p.mu.Lock() + sup := p.sup + p.mu.Unlock() + if sup == nil { + return nil, provider.ErrNotReady + } + if sup.dead.Load() { + return nil, provider.ErrUnrecoverable + } + if err := p.waitReady(ctx, sup); err != nil { + return nil, err + } + text := ResolveQueryPrefix(p.cfg.Model) + query + vecs, err := sup.client.Embeddings(ctx, []string{text}) + if err != nil { + return nil, err + } + return vecs[0], nil +} + +// EmbedDocuments embeds passages unchanged. 
+func (p *Provider) EmbedDocuments(ctx context.Context, texts []string) ([][]float32, error) { + if len(texts) == 0 { + return nil, nil + } + p.mu.Lock() + sup := p.sup + p.mu.Unlock() + if sup == nil { + return nil, provider.ErrNotReady + } + if sup.dead.Load() { + return nil, provider.ErrUnrecoverable + } + if err := p.waitReady(ctx, sup); err != nil { + return nil, err + } + return sup.client.Embeddings(ctx, texts) +} + +// TokenizeAndEmbed is the token-aware embedding pipeline. For each +// text it: +// 1. Calls /tokenize to get token IDs (CLS + content + SEP). +// 2. Splits sequences longer than CtxSize at token boundaries. +// 3. Embeds all sequences in one /v1/embeddings call. +// 4. Averages sub-window vectors back to one vector per text. +func (p *Provider) TokenizeAndEmbed(ctx context.Context, texts []string) ([][]float32, error) { + if len(texts) == 0 { + return nil, nil + } + p.mu.Lock() + sup := p.sup + maxTokens := p.cfg.CtxSize + p.mu.Unlock() + if sup == nil { + return nil, provider.ErrNotReady + } + if sup.dead.Load() { + return nil, provider.ErrUnrecoverable + } + if err := p.waitReady(ctx, sup); err != nil { + return nil, err + } + + type span struct{ start, length int } + spans := make([]span, len(texts)) + var sequences [][]int + + for i, text := range texts { + ids, err := sup.client.Tokenize(ctx, text) + if err != nil { + return nil, fmt.Errorf("tokenize text[%d]: %w", i, err) + } + + if len(ids) == 0 { + spans[i] = span{start: len(sequences), length: 1} + sequences = append(sequences, []int{}) + continue + } + + if len(ids) <= maxTokens { + spans[i] = span{start: len(sequences), length: 1} + sequences = append(sequences, ids) + continue + } + + cls := ids[0] + sep := ids[len(ids)-1] + content := ids[1 : len(ids)-1] + // Reserve two slots for the CLS/SEP tokens. 
Guard against a + // pathologically small CtxSize (<= 2): windowSize must be >= 1 + // or the split loop below never advances `start` and spins + // forever while holding a queue slot. + windowSize := max(maxTokens-2, 1) + + spanStart := len(sequences) + for start := 0; start < len(content); start += windowSize { + end := start + windowSize + if end > len(content) { + end = len(content) + } + window := make([]int, 0, end-start+2) + window = append(window, cls) + window = append(window, content[start:end]...) + window = append(window, sep) + sequences = append(sequences, window) + } + spans[i] = span{start: spanStart, length: len(sequences) - spanStart} + } + + allVecs, err := sup.client.EmbedBatchTokenIDs(ctx, sequences) + if err != nil { + return nil, err + } + + result := make([][]float32, len(texts)) + for i, sp := range spans { + if sp.length == 1 { + result[i] = allVecs[sp.start] + continue + } + dim := len(allVecs[sp.start]) + avg := make([]float32, dim) + for k := 0; k < sp.length; k++ { + v := allVecs[sp.start+k] + for d := range avg { + avg[d] += v[d] + } + } + n := float32(sp.length) + for d := range avg { + avg[d] /= n + } + result[i] = avg + } + return result, nil +} + +// EmbedRaw is a parity-test helper: skip prefix, embed texts verbatim. +// Lower-case in production paths to discourage misuse; the parity +// test file in the embeddings package needs cross-package access so +// it lives upper-cased here. +func (p *Provider) EmbedRaw(ctx context.Context, texts []string) ([][]float32, error) { + if len(texts) == 0 { + return nil, nil + } + p.mu.Lock() + sup := p.sup + p.mu.Unlock() + if sup == nil { + return nil, provider.ErrNotReady + } + return sup.client.Embeddings(ctx, texts) +} + +// CacheDir returns the configured GGUF cache directory. Used by the +// admin /models endpoint to enumerate cached files. +func (p *Provider) CacheDir() string { return p.cfg.CacheDir } + +// waitReady waits up to 5 seconds for the supervisor's child to be +// ready. 
The 5s window is short — a healthy steady-state Service is +// always ready in <1ms; during restarts the queue has already been +// drained / blocked, so callers waiting here are by design rare. +func (p *Provider) waitReady(ctx context.Context, sup *supervisor) error { + readyCtx, cancel := context.WithTimeout(ctx, 5*time.Second) + defer cancel() + return sup.Ready(readyCtx) +} diff --git a/server/internal/embeddings/provider/ollama/storagecomponents_test.go b/server/internal/embeddings/provider/ollama/storagecomponents_test.go new file mode 100644 index 0000000..44e280d --- /dev/null +++ b/server/internal/embeddings/provider/ollama/storagecomponents_test.go @@ -0,0 +1,29 @@ +package ollama + +import ( + "reflect" + "testing" +) + +func TestStorageComponents(t *testing.T) { + got := New(Config{Model: "nomic-embed-text"}, nil).StorageComponents() + if want := []string{"ollama", "nomic_embed_text"}; !reflect.DeepEqual(got, want) { + t.Errorf("StorageComponents = %v, want %v", got, want) + } +} + +// TestStorageComponents_KindNotGluedToModel is the #8 anti-collision guard: +// the provider kind is its own path segment, so a model literally named +// "ollama-foo" (slug "ollama_foo") never shares a namespace with model +// "foo". Under the old flat single-slug scheme both collapsed to +// "chroma_ollama_foo". 
+func TestStorageComponents_KindNotGluedToModel(t *testing.T) { + foo := New(Config{Model: "foo"}, nil).StorageComponents() + ollamaFoo := New(Config{Model: "ollama-foo"}, nil).StorageComponents() + if reflect.DeepEqual(foo, ollamaFoo) { + t.Fatalf("distinct models must not share a path: %v == %v", foo, ollamaFoo) + } + if got, want := ollamaFoo[len(ollamaFoo)-1], "ollama_foo"; got != want { + t.Errorf("model slug = %q, want %q", got, want) + } +} diff --git a/server/internal/embeddings/supervisor.go b/server/internal/embeddings/provider/ollama/supervisor.go similarity index 97% rename from server/internal/embeddings/supervisor.go rename to server/internal/embeddings/provider/ollama/supervisor.go index 70b27e0..bd17111 100644 --- a/server/internal/embeddings/supervisor.go +++ b/server/internal/embeddings/provider/ollama/supervisor.go @@ -1,4 +1,4 @@ -package embeddings +package ollama import ( "context" @@ -15,6 +15,8 @@ import ( "sync/atomic" "syscall" "time" + + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" ) // darwinSunPathMax is the platform limit for unix socket paths on macOS. @@ -270,7 +272,7 @@ func (s *supervisor) spawn(ctx context.Context) error { s.lastSpawnErr.Store(err.Error()) s.killGroup() <-s.waiterDone - return fmt.Errorf("%w: %v", ErrNotReady, err) + return fmt.Errorf("%w: %v", provider.ErrNotReady, err) } close(s.readySignal) s.lastSpawnErr.Store("") // clear any stale error from a prior failed start @@ -395,15 +397,20 @@ func pruneRestarts(ts []time.Time, now time.Time, window time.Duration) []time.T // the graceful path failed. The caller's context controls the deadline — // main.go already uses a 10s shutdown context. func (s *supervisor) Stop(ctx context.Context) error { + // Snapshot cmd + waiterDone together under the lock: a crash-driven + // spawn() reassigns s.waiterDone under s.mu, so reading the field bare + // races that write. Use the local for every wait below. 
+ s.mu.RLock() + cmd := s.cmd + waiterDone := s.waiterDone + s.mu.RUnlock() + if !s.stopping.CompareAndSwap(false, true) { // Already stopping; just wait for the existing teardown. - <-s.waiterDone + <-waiterDone return nil } - s.mu.RLock() - cmd := s.cmd - s.mu.RUnlock() if cmd == nil || cmd.Process == nil { return nil } @@ -416,7 +423,7 @@ func (s *supervisor) Stop(ctx context.Context) error { _ = syscall.Kill(-pgid, syscall.SIGTERM) select { - case <-s.waiterDone: + case <-waiterDone: // Also clean up the socket file so a subsequent run does not trip on it. if s.cfg.Transport == "unix" { _ = os.Remove(s.cfg.SocketPath) @@ -425,7 +432,7 @@ func (s *supervisor) Stop(ctx context.Context) error { case <-ctx.Done(): s.logger.Warn("SIGTERM timed out, sending SIGKILL", "pgid", pgid) _ = syscall.Kill(-pgid, syscall.SIGKILL) - <-s.waiterDone + <-waiterDone if s.cfg.Transport == "unix" { _ = os.Remove(s.cfg.SocketPath) } @@ -573,7 +580,7 @@ func (s *supervisor) Status() SupervisorStatus { // Ready blocks until the current child is ready or ctx expires. 
func (s *supervisor) Ready(ctx context.Context) error { if s.dead.Load() { - return ErrSupervisor + return provider.ErrUnrecoverable } s.mu.RLock() ch := s.readySignal diff --git a/server/internal/embeddings/provider/openai/factory.go b/server/internal/embeddings/provider/openai/factory.go new file mode 100644 index 0000000..5d69c91 --- /dev/null +++ b/server/internal/embeddings/provider/openai/factory.go @@ -0,0 +1,57 @@ +package openai + +import ( + "encoding/json" + "fmt" + "log/slog" + + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" +) + +const defaultAPIKeyEnv = "CIX_OPENAI_API_KEY" + +type factory struct{} + +func (factory) Kind() string { return provider.KindOpenAI } + +func (factory) SchemaJSON() []byte { + s := provider.ConfigSchema{ + Fields: []provider.ConfigField{ + {Name: "base_url", Label: "Base URL", Kind: "string", Required: true, Default: "https://api.openai.com", Description: "Server origin without /v1 suffix."}, + {Name: "model", Label: "Model", Kind: "string", Required: true, Default: "text-embedding-3-small"}, + {Name: "api_key_env", Label: "API key env var", Kind: "secret-env", Required: true, Default: defaultAPIKeyEnv, Description: "Server-side env var name that holds the API key."}, + {Name: "dimensions", Label: "Dimensions", Kind: "int", Description: "Optional Matryoshka shrink (text-embedding-3*)."}, + }, + } + b, _ := json.Marshal(s) + return b +} + +func (factory) SecretEnvVars() []string { return []string{defaultAPIKeyEnv} } + +func (factory) Build(cfg []byte, secrets provider.SecretLookup, logger *slog.Logger) (provider.Provider, error) { + if logger == nil { + logger = slog.Default() + } + if len(cfg) == 0 { + return nil, fmt.Errorf("openai: empty config") + } + var c Config + if err := json.Unmarshal(cfg, &c); err != nil { + return nil, fmt.Errorf("openai: unmarshal config: %w", err) + } + if c.Model == "" { + return nil, fmt.Errorf("openai: model is required") + } + if c.BaseURL == "" { + c.BaseURL = 
"https://api.openai.com" + } + if c.APIKeyEnv == "" { + c.APIKeyEnv = defaultAPIKeyEnv + } + return New(c, secrets, logger), nil +} + +func init() { + provider.Register(factory{}) +} diff --git a/server/internal/embeddings/provider/openai/openai.go b/server/internal/embeddings/provider/openai/openai.go new file mode 100644 index 0000000..d152c2c --- /dev/null +++ b/server/internal/embeddings/provider/openai/openai.go @@ -0,0 +1,244 @@ +// Package openai implements provider.Provider against any +// OpenAI-compatible /v1/embeddings endpoint (OpenAI proper, vLLM, +// TEI, LocalAI, Ollama's own openai-compat endpoint, …). All +// providers share the same request/response shape; the differences +// are only the base URL and which API key env var to read. +package openai + +import ( + "bytes" + "context" + "encoding/json" + "errors" + "fmt" + "io" + "log/slog" + "net/http" + "strconv" + "time" + + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" +) + +// Config is the persisted shape of the openai provider's config blob. +type Config struct { + BaseURL string `json:"base_url"` + Model string `json:"model"` + APIKeyEnv string `json:"api_key_env"` + Dimensions int `json:"dimensions,omitempty"` +} + +// maxBatchSize caps how many inputs we send in a single +// /v1/embeddings POST. OpenAI proper accepts up to 2048 inputs +// per request for text-embedding-3-*; self-hosted clones (vLLM, +// TEI, LocalAI) may be tighter but rarely lower than that. The +// split is transparent to callers — same queue slot, sequential +// sub-batches. +const maxBatchSize = 2048 + +// Provider is the openai-compatible HTTP client wrapped behind the +// provider.Provider interface. +type Provider struct { + cfg Config + logger *slog.Logger + secrets provider.SecretLookup + http *http.Client +} + +// New constructs the Provider. Does not contact the endpoint — call +// Start to perform a one-shot connect test. 
+func New(cfg Config, secrets provider.SecretLookup, logger *slog.Logger) *Provider { + if logger == nil { + logger = slog.Default() + } + // Normalise away a trailing slash so url building (BaseURL + + // "/v1/embeddings") never produces a double slash, which stricter + // OpenAI-compatible servers (vLLM/TEI behind a proxy) can 404 on. + cfg.BaseURL = provider.NormalizeBaseURL(cfg.BaseURL) + return &Provider{ + cfg: cfg, + logger: logger, + secrets: secrets, + http: &http.Client{Timeout: 60 * time.Second}, + } +} + +func (p *Provider) Kind() string { return provider.KindOpenAI } + +// ID is "openai:{model}[:{dim}]". The dimension is part of the ID only +// when explicitly configured (Matryoshka shrink via the `dimensions` +// param) — otherwise different model versions are distinguished by +// the model name alone. +func (p *Provider) ID() string { + if p.cfg.Dimensions > 0 { + return "openai:" + p.cfg.Model + ":" + strconv.Itoa(p.cfg.Dimensions) + } + return "openai:" + p.cfg.Model +} + +func (p *Provider) Dimension() int { return p.cfg.Dimensions } +func (p *Provider) SupportsTokenize() bool { return false } + +// StorageComponents namespaces the vector store as +// openai/[/]. The dimension is a path segment only when +// explicitly configured (Matryoshka shrink), mirroring ID(). +func (p *Provider) StorageComponents() []string { + comps := []string{provider.KindOpenAI, provider.StorageSlug(p.cfg.Model)} + if p.cfg.Dimensions > 0 { + comps = append(comps, strconv.Itoa(p.cfg.Dimensions)) + } + return comps +} + +// Start runs a one-shot connect test: embed a single short string. +// Surfaces auth / network errors before the provider is wired into +// the request path. 
+func (p *Provider) Start(ctx context.Context) error { + if p.cfg.BaseURL == "" { + return errors.New("openai: base_url is required") + } + if p.cfg.Model == "" { + return errors.New("openai: model is required") + } + if _, ok := p.apiKey(); !ok { + return fmt.Errorf("%w: %s", provider.ErrMissingAPIKey, p.cfg.APIKeyEnv) + } + testCtx, cancel := context.WithTimeout(ctx, 30*time.Second) + defer cancel() + _, err := p.embed(testCtx, []string{"ping"}) + if err != nil { + return fmt.Errorf("openai: connect test failed: %w", err) + } + return nil +} + +// Stop is a no-op — the provider holds no managed process or +// long-lived connection. +func (p *Provider) Stop(_ context.Context) error { return nil } + +// Ready returns nil if the API key is set. We do NOT ping the remote +// on every Ready call (which the /status footer polls every 30s) — +// remote outages surface as real embed failures with diagnostics; an +// always-green footer dot for HTTP-only providers matches the +// dashboard's documented behaviour. +func (p *Provider) Ready(_ context.Context) error { + return provider.RemoteReady(p.secrets, p.cfg.APIKeyEnv) +} + +func (p *Provider) Status() provider.Status { + return provider.RemoteStatus(p.cfg.Model, p.cfg.APIKeyEnv, p.secrets) +} + +// EmbedQuery is a pass-through to EmbedDocuments — generic +// OpenAI-compatible servers have no query/document differentiation. 
+func (p *Provider) EmbedQuery(ctx context.Context, query string) ([]float32, error) { + vecs, err := p.embed(ctx, []string{query}) + if err != nil { + return nil, err + } + return vecs[0], nil +} + +func (p *Provider) EmbedDocuments(ctx context.Context, texts []string) ([][]float32, error) { + if len(texts) == 0 { + return nil, nil + } + if len(texts) <= maxBatchSize { + return p.embed(ctx, texts) + } + out := make([][]float32, 0, len(texts)) + for i := 0; i < len(texts); i += maxBatchSize { + end := i + maxBatchSize + if end > len(texts) { + end = len(texts) + } + part, err := p.embed(ctx, texts[i:end]) + if err != nil { + return nil, fmt.Errorf("openai: sub-batch [%d:%d]: %w", i, end, err) + } + out = append(out, part...) + } + return out, nil +} + +// TokenizeAndEmbed falls back to EmbedDocuments — generic openai-style +// servers expose neither /tokenize nor reliable input token counts. +// The indexer's chunking step pre-truncates inputs to a conservative +// limit when SupportsTokenize() is false. +func (p *Provider) TokenizeAndEmbed(ctx context.Context, texts []string) ([][]float32, error) { + return p.EmbedDocuments(ctx, texts) +} + +type embedRequest struct { + Input []string `json:"input"` + Model string `json:"model"` + Dimensions int `json:"dimensions,omitempty"` +} + +type embedResponseItem struct { + Embedding []float32 `json:"embedding"` + Index int `json:"index"` +} + +type embedResponse struct { + Data []embedResponseItem `json:"data"` +} + +// embed POSTs /v1/embeddings and returns vectors in input order. 
+func (p *Provider) embed(ctx context.Context, texts []string) ([][]float32, error) { + key, ok := p.apiKey() + if !ok { + return nil, fmt.Errorf("%w: %s", provider.ErrMissingAPIKey, p.cfg.APIKeyEnv) + } + body, err := json.Marshal(embedRequest{ + Input: texts, + Model: p.cfg.Model, + Dimensions: p.cfg.Dimensions, + }) + if err != nil { + return nil, fmt.Errorf("openai: marshal: %w", err) + } + url := p.cfg.BaseURL + "/v1/embeddings" + req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body)) + if err != nil { + return nil, fmt.Errorf("openai: build request: %w", err) + } + req.Header.Set("Content-Type", "application/json") + req.Header.Set("Authorization", "Bearer "+key) + + resp, err := p.http.Do(req) + if err != nil { + return nil, fmt.Errorf("openai: %w", err) + } + defer resp.Body.Close() + + if resp.StatusCode != http.StatusOK { + snippet, _ := io.ReadAll(io.LimitReader(resp.Body, 2048)) + return nil, fmt.Errorf("openai: status %d: %s", resp.StatusCode, string(snippet)) + } + + var er embedResponse + if err := json.NewDecoder(resp.Body).Decode(&er); err != nil { + return nil, fmt.Errorf("openai: decode: %w", err) + } + if len(er.Data) != len(texts) { + return nil, fmt.Errorf("openai: got %d vectors for %d inputs", len(er.Data), len(texts)) + } + out := make([][]float32, len(er.Data)) + for _, item := range er.Data { + if item.Index < 0 || item.Index >= len(out) { + return nil, fmt.Errorf("openai: out-of-range index %d", item.Index) + } + out[item.Index] = item.Embedding + } + for i, v := range out { + if v == nil { + return nil, fmt.Errorf("openai: missing vector at index %d", i) + } + } + return out, nil +} + +func (p *Provider) apiKey() (string, bool) { + return provider.ResolveAPIKey(p.secrets, p.cfg.APIKeyEnv) +} diff --git a/server/internal/embeddings/provider/openai/openai_test.go b/server/internal/embeddings/provider/openai/openai_test.go new file mode 100644 index 0000000..bf4f58c --- /dev/null +++ 
b/server/internal/embeddings/provider/openai/openai_test.go @@ -0,0 +1,224 @@ +package openai + +import ( + "context" + "encoding/json" + "errors" + "io" + "net/http" + "net/http/httptest" + "strings" + "testing" + + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" +) + +// fixedSecrets returns a SecretLookup that resolves a single +// (key, value) pair. Used by tests that don't want to touch +// os.Setenv (which would race other parallel tests). +func fixedSecrets(key, value string) provider.SecretLookup { + return func(name string) (string, bool) { + if name == key { + return value, true + } + return "", false + } +} + +// stubServer returns an httptest.Server that responds to one POST +// /v1/embeddings hit. The recorded request body is sent on the +// returned channel so the test can assert on it. +func stubServer(t *testing.T, status int, body string) (*httptest.Server, <-chan []byte) { + t.Helper() + gotBody := make(chan []byte, 1) + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + if r.Method != http.MethodPost || !strings.HasSuffix(r.URL.Path, "/v1/embeddings") { + http.NotFound(w, r) + return + } + if got := r.Header.Get("Authorization"); !strings.HasPrefix(got, "Bearer ") { + t.Errorf("missing Bearer auth header; got %q", got) + } + raw, _ := io.ReadAll(r.Body) + select { + case gotBody <- raw: + default: + } + w.WriteHeader(status) + _, _ = io.WriteString(w, body) + })) + t.Cleanup(srv.Close) + return srv, gotBody +} + +func TestEmbedDocumentsBatch(t *testing.T) { + srv, gotBody := stubServer(t, http.StatusOK, `{ + "data": [ + {"index": 1, "embedding": [0.2, 0.3]}, + {"index": 0, "embedding": [0.1, 0.4]} + ] + }`) + p := New(Config{ + BaseURL: srv.URL, + Model: "text-embedding-3-small", + APIKeyEnv: "TEST_KEY", + }, fixedSecrets("TEST_KEY", "sk-test"), nil) + + vecs, err := p.EmbedDocuments(context.Background(), []string{"first", "second"}) + if err != nil { + t.Fatalf("EmbedDocuments: %v", 
err) + } + // Result must be in input order even though the server returned them swapped. + if vecs[0][0] != 0.1 || vecs[1][0] != 0.2 { + t.Fatalf("ordering wrong: got %v", vecs) + } + + var req embedRequest + if err := json.Unmarshal(<-gotBody, &req); err != nil { + t.Fatalf("decode request: %v", err) + } + if req.Model != "text-embedding-3-small" { + t.Errorf("model %q", req.Model) + } + if len(req.Input) != 2 || req.Input[0] != "first" { + t.Errorf("input %v", req.Input) + } +} + +// TestBaseURLTrailingSlashNormalized guards L4: a base_url with a +// trailing slash must not produce a double-slash request path, which +// stricter OpenAI-compatible servers can 404 on. +func TestBaseURLTrailingSlashNormalized(t *testing.T) { + var gotPath string + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + gotPath = r.URL.Path + _, _ = io.WriteString(w, `{"data":[{"index":0,"embedding":[0.1]}]}`) + })) + t.Cleanup(srv.Close) + p := New(Config{ + BaseURL: srv.URL + "/", // trailing slash + Model: "m", + APIKeyEnv: "K", + }, fixedSecrets("K", "v"), nil) + if _, err := p.EmbedDocuments(context.Background(), []string{"x"}); err != nil { + t.Fatalf("EmbedDocuments: %v", err) + } + if gotPath != "/v1/embeddings" { + t.Errorf("request path = %q, want /v1/embeddings (no double slash)", gotPath) + } +} + +func TestEmbedDocumentsHTTPError(t *testing.T) { + srv, _ := stubServer(t, http.StatusUnauthorized, `{"error":"bad key"}`) + p := New(Config{ + BaseURL: srv.URL, + Model: "m", + APIKeyEnv: "K", + }, fixedSecrets("K", "v"), nil) + + _, err := p.EmbedDocuments(context.Background(), []string{"x"}) + if err == nil { + t.Fatal("expected error") + } + if !strings.Contains(err.Error(), "401") { + t.Errorf("error should surface status code: %v", err) + } +} + +func TestMissingAPIKey(t *testing.T) { + p := New(Config{ + BaseURL: "http://unused", + Model: "m", + APIKeyEnv: "MISSING_VAR", + }, fixedSecrets("OTHER", "v"), nil) + + _, err := 
p.EmbedDocuments(context.Background(), []string{"x"}) + if !errors.Is(err, provider.ErrMissingAPIKey) { + t.Fatalf("expected ErrMissingAPIKey, got %v", err) + } + + st := p.Status() + if st.State != provider.StateFailed { + t.Errorf("Status state %q, expected failed", st.State) + } +} + +func TestIDFingerprint(t *testing.T) { + cases := []struct { + cfg Config + want string + }{ + {Config{Model: "m"}, "openai:m"}, + {Config{Model: "m", Dimensions: 512}, "openai:m:512"}, + } + for _, tc := range cases { + p := New(tc.cfg, fixedSecrets("", ""), nil) + if got := p.ID(); got != tc.want { + t.Errorf("ID() = %q, want %q", got, tc.want) + } + } +} + +// TestEmbedDocumentsSplitsOversizeBatch covers the transparent +// per-provider split: OpenAI proper accepts up to 2048 inputs per +// /v1/embeddings POST. A 3000-item EmbedDocuments call must produce +// TWO POSTs (2048 + 952) and return all 3000 vectors in input order. +func TestEmbedDocumentsSplitsOversizeBatch(t *testing.T) { + posts := 0 + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + posts++ + raw, _ := io.ReadAll(r.Body) + var req embedRequest + _ = json.Unmarshal(raw, &req) + if len(req.Input) > 2048 { + t.Errorf("POST #%d carried %d inputs, expected <= 2048", posts, len(req.Input)) + } + items := make([]map[string]any, len(req.Input)) + for i := range req.Input { + items[i] = map[string]any{"index": i, "embedding": []float32{float32(i)}} + } + body, _ := json.Marshal(map[string]any{"data": items}) + w.WriteHeader(http.StatusOK) + _, _ = w.Write(body) + })) + t.Cleanup(srv.Close) + + p := New(Config{ + BaseURL: srv.URL, Model: "text-embedding-3-small", APIKeyEnv: "K", + }, fixedSecrets("K", "v"), nil) + + texts := make([]string, 3000) + for i := range texts { + texts[i] = "chunk" + } + vecs, err := p.EmbedDocuments(context.Background(), texts) + if err != nil { + t.Fatalf("EmbedDocuments: %v", err) + } + if got := len(vecs); got != 3000 { + t.Fatalf("got %d vectors, want 
3000", got) + } + if posts != 2 { + t.Errorf("expected 2 POSTs (2048 + 952), got %d", posts) + } +} + +func TestEmbedDocumentsSendsDimensions(t *testing.T) { + srv, gotBody := stubServer(t, http.StatusOK, `{ + "data": [{"index": 0, "embedding": [0.1]}] + }`) + p := New(Config{ + BaseURL: srv.URL, + Model: "m", + APIKeyEnv: "K", + Dimensions: 256, + }, fixedSecrets("K", "v"), nil) + if _, err := p.EmbedDocuments(context.Background(), []string{"x"}); err != nil { + t.Fatalf("EmbedDocuments: %v", err) + } + var req embedRequest + _ = json.Unmarshal(<-gotBody, &req) + if req.Dimensions != 256 { + t.Errorf("dimensions should be 256, got %d", req.Dimensions) + } +} diff --git a/server/internal/embeddings/provider/openai/storagecomponents_test.go b/server/internal/embeddings/provider/openai/storagecomponents_test.go new file mode 100644 index 0000000..295cee9 --- /dev/null +++ b/server/internal/embeddings/provider/openai/storagecomponents_test.go @@ -0,0 +1,18 @@ +package openai + +import ( + "reflect" + "testing" +) + +func TestStorageComponents(t *testing.T) { + base := New(Config{Model: "text-embedding-3-large"}, nil, nil).StorageComponents() + if want := []string{"openai", "text_embedding_3_large"}; !reflect.DeepEqual(base, want) { + t.Errorf("StorageComponents = %v, want %v", base, want) + } + // Explicit Matryoshka dimension becomes a trailing path segment. + dim := New(Config{Model: "text-embedding-3-large", Dimensions: 256}, nil, nil).StorageComponents() + if want := []string{"openai", "text_embedding_3_large", "256"}; !reflect.DeepEqual(dim, want) { + t.Errorf("StorageComponents (dim) = %v, want %v", dim, want) + } +} diff --git a/server/internal/embeddings/provider/provider.go b/server/internal/embeddings/provider/provider.go new file mode 100644 index 0000000..9a251e6 --- /dev/null +++ b/server/internal/embeddings/provider/provider.go @@ -0,0 +1,231 @@ +// Package provider defines the pluggable embedding-backend abstraction +// used by embeddings.Service. 
Implementations live in sub-packages +// (provider/ollama, provider/openai, provider/voyage). The Service holds +// exactly one active Provider; switching provider at runtime is what the +// admin /api/v1/admin/embedding-providers/active endpoint does. +// +// Identity & reindex. Every Provider exposes ID() — a stable fingerprint +// string written to projects.indexed_with_model at index time. When the +// active provider's ID() changes (different kind, model, or dimension), +// every project's stored fingerprint is stale; the existing per-project +// drift check in internal/repojobs detects it on the next clone job and +// forces a full reindex. Provider switching therefore reuses the model- +// change pipeline unchanged. +// +// Lifecycle. Start is called once after construction (ollama spawns the +// child process; HTTP-only providers do a connect-test). Stop is called +// before switching to a different provider and on server shutdown. +package provider + +import ( + "context" + "errors" + "strings" +) + +// Kind enumerates the built-in provider kinds. New kinds are added by +// registering a Factory with that kind string. +const ( + KindOllama = "ollama" + KindOpenAI = "openai" + KindVoyage = "voyage" +) + +// Provider is the embedding backend abstraction. Implementations cover +// one upstream service each (the in-process llama-server sidecar, the +// OpenAI-compatible /v1/embeddings REST API, the Voyage AI REST API, …). +// +// Concurrency. All methods on a Provider must be safe for concurrent +// use; Service brackets them with a Queue (rate-limit / backpressure) +// but does not serialise them. +// +// Errors. Implementations should wrap upstream HTTP / process failures +// with enough context for an operator to diagnose. Use ErrNotReady when +// the backend is alive but not yet able to serve (e.g. ollama warming +// up); the Service layer treats it as a retriable busy signal. 
+type Provider interface { + // Kind returns the registered factory kind for this provider, e.g. + // "ollama", "openai", "voyage". + Kind() string + + // ID returns the fingerprint that uniquely identifies this provider + // configuration for the purposes of index invalidation. Format: + // "{kind}:{model}[:{dim}][:{dtype}]". The string is opaque to + // callers — they only compare it for equality. + ID() string + + // Dimension reports the embedding vector dimension this provider + // will produce. Used by the vector store to dimension the Chroma + // collection when it is first created. May be 0 if the dimension + // is only known after the first embed call; the vectorstore then + // infers it from the first upsert as before. + Dimension() int + + // SupportsTokenize reports whether the provider implements + // TokenizeAndEmbed natively. The Service uses this to decide + // whether to call TokenizeAndEmbed or fall back to EmbedDocuments + // in the indexer's chunking path. + SupportsTokenize() bool + + // Start prepares the provider for serving requests. For ollama + // this spawns the llama-server child process and blocks until the + // readiness probe succeeds. For HTTP-only providers it performs + // a one-shot connect-test against the configured endpoint with + // any provided API key. ctx bounds the startup; on failure the + // caller may try a different config without calling Stop first. + Start(ctx context.Context) error + + // Stop tears the provider down within the ctx deadline. Idempotent + // and safe to call on a provider that never Start()-ed. + Stop(ctx context.Context) error + + // Ready reports whether the provider is currently able to serve an + // embedding request. nil = ready. Returning ErrNotReady is the + // recommended busy signal during startup / restart windows. + Ready(ctx context.Context) error + + // Status returns a snapshot for the dashboard. Fields that are + // not meaningful for a given provider (e.g. 
PID for HTTP-only + // providers) should be zero-valued. + Status() Status + + // EmbedQuery embeds a single query string. Providers that support + // asymmetric retrieval apply their model-specific transform here + // (ollama prepends the model's query prefix; voyage sends + // input_type=query; openai applies nothing). + EmbedQuery(ctx context.Context, query string) ([]float32, error) + + // EmbedDocuments embeds a batch of passages / chunks. Returned + // vectors follow input order. Empty input is a no-op returning + // (nil, nil). + EmbedDocuments(ctx context.Context, texts []string) ([][]float32, error) + + // TokenizeAndEmbed is the token-aware embedding pipeline used by + // the indexer for chunks that may exceed the model's context + // window. Providers without native tokenization (SupportsTokenize + // returns false) may implement this as a pass-through to + // EmbedDocuments — callers must use SupportsTokenize() to decide + // whether to chunk inputs themselves. + TokenizeAndEmbed(ctx context.Context, texts []string) ([][]float32, error) + + // StorageComponents returns the on-disk path components that + // namespace this provider's vector store, MOST-significant first: + // {kind, model-slug[, variant]}. Each component is already + // filesystem-safe (run through StorageSlug at the source). The + // vector-store dir is filepath.Join(ChromaPersistDir, components...). + // + // Crucially these are STRUCTURED fields the provider knows directly + // — never derived by re-parsing the flattened ID() — so the kind + // (always its own path segment) can never collide with a model name + // that happens to normalise to "ollama_…"/"voyage_…". That collision + // is what the flat single-slug scheme suffered from. + StorageComponents() []string +} + +// State enumerates the dashboard-facing provider states surfaced via +// Status. Implementations should pick the closest match. 
+const ( + StateStarting = "starting" + StateRunning = "running" + StateFailed = "failed" + StateDisabled = "disabled" + // StateRemote marks an HTTP-only provider that has no managed + // process: it cannot fail to "start" beyond a config-time connect + // test, and uptime / pid / restart concepts do not apply. Footer + // indicator stays permanently green for this state. + StateRemote = "remote" +) + +// Status is the dashboard-facing snapshot of a provider's runtime +// state. Sidecar-specific fields (PID, Uptime, restart counts) are +// zero / empty for HTTP-only providers. +type Status struct { + // State is one of the State* constants above. + State string `json:"state"` + + // ManagesProcess reports whether this provider manages an + // in-process child (true for ollama, false for HTTP providers). + // The footer uses this to decide whether to render an + // alive/red-dot indicator or a permanent green dot. + ManagesProcess bool `json:"manages_process"` + + // Model is the human-readable model identifier (HF repo id, + // OpenAI model name, Voyage model name). + Model string `json:"model"` + + // PID is the child process id when ManagesProcess. Zero otherwise. + PID int `json:"pid,omitempty"` + + // UptimeSeconds is the time the current child has been alive. + // Zero for remote providers. + UptimeSeconds int64 `json:"uptime_seconds,omitempty"` + + // LastError surfaces the most recent spawn / health-probe / HTTP + // error so the dashboard can render it without grepping logs. + // Empty when healthy. + LastError string `json:"last_error,omitempty"` + + // InFlight reports queue depth at the Service layer. The Service + // fills this in after the provider returns its Status; providers + // should leave it at 0. + InFlight int `json:"in_flight"` +} + +// ErrNotReady signals that the provider is alive but not yet able to +// serve a request (e.g. ollama warming up after Start). The Service +// layer translates this to a busy-style 503 with a Retry-After hint. 
+var ErrNotReady = errors.New("provider: not ready") + +// ErrMissingAPIKey signals that the provider was constructed against +// a config naming an env-var that is not set at the moment of the +// call. The admin /test endpoint reports it verbatim so the dashboard +// can guide the operator to set the env var before saving. +var ErrMissingAPIKey = errors.New("provider: required API key env var is not set") + +// ErrUnrecoverable signals a terminal provider failure — e.g. the +// ollama sidecar exceeded its crash-restart budget. Subsequent calls +// return this until the provider is replaced (admin perspective) or +// the process is restarted. Caller maps to HTTP 503 without retry. +var ErrUnrecoverable = errors.New("provider: unrecoverable failure") + +// StorageSlug turns a Provider.ID() fingerprint into a filesystem-safe +// slug used to namespace the on-disk vector store directory, so each +// distinct embedding identity (kind + model + dim + dtype) gets its own +// chroma collection space. Switching providers therefore never mixes +// vectors of different dimensions in one collection, and switching back +// reuses the prior namespace without a reindex. +// +// Rules: lowercase, then replace every rune outside [a-z0-9_] (including +// '/', '-', ':') with '_'. Deliberately a pure per-rune map — no +// run-collapsing or trimming — so the transform is deterministic and +// idempotent. (It is not strictly injective: e.g. "a:b" and "a-b" both +// map to "a_b". That is harmless here because real Provider.ID() strings +// for a given kind never differ only in a separator — model names carry +// no ':' and dims/dtypes are fixed tokens.) Examples: +// +// "voyage:voyage-code-3:2048:float" → "voyage_voyage_code_3_2048_float" +// "ollama:awhiteside/CodeRankEmbed-Q8_0-GGUF" → "ollama_awhiteside_coderankembed_q8_0_gguf" +// "openai:text-embedding-3-large:256" → "openai_text_embedding_3_large_256" +// +// An empty ID yields an empty slug; callers guard against that. 
+func StorageSlug(id string) string { + lower := strings.ToLower(id) + var b strings.Builder + b.Grow(len(lower)) + for _, r := range lower { + switch { + case r >= 'a' && r <= 'z', r >= '0' && r <= '9', r == '_': + b.WriteRune(r) + default: + b.WriteByte('_') + } + } + return b.String() +} + +// SecretLookup resolves an env-var name to its current value at the +// moment of the call. Implementations must return (value, true) when +// the env var is set (even if empty), and ("", false) when it is +// unset. This is the only surface providers see for secrets; the +// raw value never lives in the Provider's config struct. +type SecretLookup func(envVarName string) (value string, ok bool) diff --git a/server/internal/embeddings/provider/storageslug_test.go b/server/internal/embeddings/provider/storageslug_test.go new file mode 100644 index 0000000..803ed61 --- /dev/null +++ b/server/internal/embeddings/provider/storageslug_test.go @@ -0,0 +1,38 @@ +package provider + +import "testing" + +func TestStorageSlug(t *testing.T) { + cases := []struct { + in, want string + }{ + {"voyage:voyage-code-3:2048:float", "voyage_voyage_code_3_2048_float"}, + {"ollama:awhiteside/CodeRankEmbed-Q8_0-GGUF", "ollama_awhiteside_coderankembed_q8_0_gguf"}, + {"openai:text-embedding-3-large:256", "openai_text_embedding_3_large_256"}, + {"OpenAI:Foo.Bar", "openai_foo_bar"}, // mixed case + dot + {"a b", "a_b"}, // space + {"", ""}, // empty + {"already_safe_123", "already_safe_123"}, // identity for safe chars + } + for _, tc := range cases { + if got := StorageSlug(tc.in); got != tc.want { + t.Errorf("StorageSlug(%q) = %q, want %q", tc.in, got, tc.want) + } + } +} + +// TestStorageSlugIdempotent ensures slugging an already-slugged string is +// a no-op (the chroma migration relies on this so re-running never double- +// transforms a name). 
+func TestStorageSlugIdempotent(t *testing.T) { + for _, in := range []string{ + "voyage:voyage-code-3:2048:float", + "ollama:awhiteside/CodeRankEmbed-Q8_0-GGUF", + } { + once := StorageSlug(in) + twice := StorageSlug(once) + if once != twice { + t.Errorf("StorageSlug not idempotent: %q -> %q -> %q", in, once, twice) + } + } +} diff --git a/server/internal/embeddings/provider/voyage/factory.go b/server/internal/embeddings/provider/voyage/factory.go new file mode 100644 index 0000000..64d12c5 --- /dev/null +++ b/server/internal/embeddings/provider/voyage/factory.go @@ -0,0 +1,70 @@ +package voyage + +import ( + "encoding/json" + "fmt" + "log/slog" + + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" +) + +const defaultAPIKeyEnv = "CIX_VOYAGE_API_KEY" + +type factory struct{} + +func (factory) Kind() string { return provider.KindVoyage } + +func (factory) SchemaJSON() []byte { + s := provider.ConfigSchema{ + Fields: []provider.ConfigField{ + { + Name: "model", Label: "Model", Kind: "enum", Required: true, + Enum: []string{"voyage-code-3", "voyage-3-large", "voyage-3", "voyage-3-lite", "voyage-code-2"}, + Default: "voyage-code-3", + }, + { + Name: "output_dimension", Label: "Output dimension", Kind: "enum", + Enum: []string{"256", "512", "1024", "2048"}, Default: "1024", + Description: "Matryoshka shrink. 
Changing this triggers a full reindex.", + }, + { + Name: "output_dtype", Label: "Output dtype", Kind: "enum", + Enum: []string{DtypeFloat, DtypeInt8}, Default: DtypeFloat, + Description: "int8 is dequantized to float32 on the server side.", + }, + {Name: "truncation", Label: "Truncate over-length input", Kind: "bool", Default: true}, + {Name: "api_key_env", Label: "API key env var", Kind: "secret-env", Required: true, Default: defaultAPIKeyEnv}, + }, + } + b, _ := json.Marshal(s) + return b +} + +func (factory) SecretEnvVars() []string { return []string{defaultAPIKeyEnv} } + +func (factory) Build(cfg []byte, secrets provider.SecretLookup, logger *slog.Logger) (provider.Provider, error) { + if logger == nil { + logger = slog.Default() + } + if len(cfg) == 0 { + return nil, fmt.Errorf("voyage: empty config") + } + var c Config + if err := json.Unmarshal(cfg, &c); err != nil { + return nil, fmt.Errorf("voyage: unmarshal config: %w", err) + } + if c.Model == "" { + return nil, fmt.Errorf("voyage: model is required") + } + if c.APIKeyEnv == "" { + c.APIKeyEnv = defaultAPIKeyEnv + } + if c.OutputDtype == "" { + c.OutputDtype = DtypeFloat + } + return New(c, secrets, logger), nil +} + +func init() { + provider.Register(factory{}) +} diff --git a/server/internal/embeddings/provider/voyage/storagecomponents_test.go b/server/internal/embeddings/provider/voyage/storagecomponents_test.go new file mode 100644 index 0000000..cfa82fe --- /dev/null +++ b/server/internal/embeddings/provider/voyage/storagecomponents_test.go @@ -0,0 +1,18 @@ +package voyage + +import ( + "reflect" + "testing" +) + +func TestStorageComponents(t *testing.T) { + got := New(Config{Model: "voyage-code-3", OutputDimension: 2048, OutputDtype: "int8"}, nil, nil).StorageComponents() + if want := []string{"voyage", "voyage_code_3", "2048_int8"}; !reflect.DeepEqual(got, want) { + t.Errorf("StorageComponents = %v, want %v", got, want) + } + // Unset dimension → "auto" variant prefix (mirrors ID()). 
+ auto := New(Config{Model: "voyage-3", OutputDtype: "float"}, nil, nil).StorageComponents() + if want := []string{"voyage", "voyage_3", "auto_float"}; !reflect.DeepEqual(auto, want) { + t.Errorf("StorageComponents (auto) = %v, want %v", auto, want) + } +} diff --git a/server/internal/embeddings/provider/voyage/voyage.go b/server/internal/embeddings/provider/voyage/voyage.go new file mode 100644 index 0000000..c43ea96 --- /dev/null +++ b/server/internal/embeddings/provider/voyage/voyage.go @@ -0,0 +1,776 @@ +// Package voyage implements provider.Provider against the Voyage AI +// embeddings API (https://api.voyageai.com/v1/embeddings). +// +// Voyage diverges from the OpenAI shape in three ways we care about: +// - input_type: "query" vs "document" — required for retrieval +// quality. EmbedQuery sends "query"; EmbedDocuments sends +// "document". +// - output_dimension: Matryoshka shrink, configured by the admin +// (256/512/1024/2048). Part of Provider.ID() because changing it +// invalidates the existing index. +// - output_dtype: float|int8 (binary/ubinary are out of scope — +// chromem-go has no hamming search). For int8 the server returns +// a list of integers per dimension; we dequantize to float32 in +// this package before returning vectors to the vector store. +// +// usage. Voyage omits prompt_tokens from the usage object — only +// total_tokens is present. The response struct therefore has its own +// shape distinct from OpenAI's. +package voyage + +import ( + "bytes" + "context" + "encoding/base64" + "encoding/json" + "errors" + "fmt" + "io" + "log/slog" + "net/http" + "regexp" + "strconv" + "time" + "unicode/utf8" + + "golang.org/x/time/rate" + + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" +) + +// voyageBatchTooLargeRegex matches Voyage's per-batch token-limit +// 400 response so the caller can react adaptively. Voyage's message +// is fairly stable: +// +// "The max allowed tokens per submitted batch is 120000. 
+// Your batch has 187609 tokens after truncation." +// +// We capture both numbers; the actual count drives how aggressively +// the caller bisects. +var voyageBatchTooLargeRegex = regexp.MustCompile( + `max allowed tokens per submitted batch is (\d+).*Your batch has (\d+) tokens`, +) + +// parseBatchTooLarge tries to extract (cap, actual) token counts +// from a Voyage 400 message. Returns (0, 0, false) when the message +// doesn't match — e.g. a different 400 like "model not found". +func parseBatchTooLarge(errMsg string) (cap, actual int, ok bool) { + m := voyageBatchTooLargeRegex.FindStringSubmatch(errMsg) + if len(m) < 3 { + return 0, 0, false + } + c, err1 := strconv.Atoi(m[1]) + a, err2 := strconv.Atoi(m[2]) + if err1 != nil || err2 != nil { + return 0, 0, false + } + return c, a, true +} + +// DefaultBaseURL is the public Voyage AI embeddings endpoint origin. +const DefaultBaseURL = "https://api.voyageai.com" + +// Supported dtypes. binary/ubinary intentionally absent for v1. +const ( + DtypeFloat = "float" + DtypeInt8 = "int8" +) + +// defaultMaxBatchSize is the static safe default for inputs per POST +// when the operator has not configured an explicit MaxInputsPerRequest +// in the provider config. Voyage's voyage-code-* models cap at 128; +// voyage-3* accept up to 1000. We pick the lower bound so a single +// default works across all models without 422s. +const defaultMaxBatchSize = 128 + +// defaultMaxTokensPerBatch is the static safe default for total +// estimated tokens per POST when the operator has not configured an +// explicit MaxTokensPerRequest. Voyage's hard limit (observed in 400 +// responses) is 120K; we target 100K to leave 17% headroom for the +// byte→token estimation error. +const defaultMaxTokensPerBatch = 100_000 + +// defaultMaxInputBytes caps the byte-length of any SINGLE input +// (one chunk) before it goes to Voyage. 
When a chunk exceeds this +// the provider splits it into non-overlapping byte windows and +// averages the resulting per-window vectors — same pattern as the +// ollama provider's TokenizeAndEmbed, but byte-based here because +// Voyage doesn't expose a tokenize endpoint. +// +// Sized for voyage-code-3's 32K-token per-input context window +// with headroom: worst-case dense code can hit ~1 byte / token, +// so 30K bytes ≈ 30K tokens, leaving the model 2K tokens of +// margin. Prose typically has ~4 bytes / token, so a 30K-byte +// English passage is ~7.5K tokens — well under the cap. +const defaultMaxInputBytes = 30_000 + +// bytesPerToken is the chars-per-token heuristic used to estimate +// token cost without a real tokenizer. Voyage's own docs at +// https://docs.voyageai.com/docs/tokenization recommend "dividing +// character count by 5" for English prose, but cix is primarily a +// CODE-indexing workload and dense source code runs much hotter: +// production logs against voyage-code-3 show 1 token ≈ 1.4 bytes +// for tight Go/Rust files, so a /5 estimate under-counts by +// ~3.6×. That's exactly the kind of error that ships an +// estimated-51K-token batch into a real 187K-token POST and +// triggers Voyage's 120K hard-cap 400. +// +// 2 is the safer baseline: matches code reality within a small +// margin, and over-estimates prose by ~2.5× (fewer inputs packed +// per batch — more round-trips, never a 400). The +// embedWithAdaptiveSplit bisect remains as a residual safety net +// for outliers (Voyage publishes real HF tokenizers on +// huggingface.co/voyageai/voyage-* but pulling one in here would +// require a CGO Rust dep — deferred to a follow-up). +// +// len() in Go returns BYTE length, not rune count, so multi-byte +// UTF-8 input (Cyrillic comments, CJK) gets over-counted relative +// to Voyage's character-based heuristic — safe direction (more +// splits, never fewer). 
+const bytesPerToken = 2 + +// Config is the persisted shape of the voyage provider's config blob. +type Config struct { + BaseURL string `json:"base_url,omitempty"` + APIKeyEnv string `json:"api_key_env"` + Model string `json:"model"` + OutputDimension int `json:"output_dimension,omitempty"` + OutputDtype string `json:"output_dtype,omitempty"` + Truncation bool `json:"truncation,omitempty"` + + // RateLimitRPM caps requests-per-minute the provider will emit. + // 0 = no client-side throttling (rely on Voyage to 429 us). When + // >0, a token-bucket waits before each POST so we don't exceed + // the configured rate. The operator sets this from the Voyage + // dashboard's "Rate Limits" page to match their account tier. + RateLimitRPM int `json:"rate_limit_rpm,omitempty"` + + // RateLimitTPM caps tokens-per-minute (estimated, summed across + // all in-flight + recent requests). 0 = no throttling. + RateLimitTPM int `json:"rate_limit_tpm,omitempty"` + + // MaxInputsPerRequest overrides defaultMaxBatchSize. 0 = use + // the default (128, safe for voyage-code-*). Operators running + // only voyage-3* may bump this to 1000 for fewer round-trips. + MaxInputsPerRequest int `json:"max_inputs_per_request,omitempty"` + + // MaxTokensPerRequest overrides defaultMaxTokensPerBatch. 0 = + // use the default (100K with 20K headroom from Voyage's 120K + // hard cap). + MaxTokensPerRequest int `json:"max_tokens_per_request,omitempty"` + + // MaxInputBytes caps the byte-length of any SINGLE input before + // the provider splits it into byte-aligned windows + averages + // the resulting vectors. 0 → use defaultMaxInputBytes (sized + // for voyage-code-3's 32K-token per-input context window with + // margin). The operator only needs to override when running a + // model with a substantially larger context (e.g. future + // voyage-* with 64K context) or a different bytes-per-token + // regime (heavily non-ASCII content). 
+ MaxInputBytes int `json:"max_input_bytes,omitempty"` +} + +// maxBatchSize returns the effective per-POST input cap: explicit +// config override, falling back to the static default. +func (c *Config) maxBatchSize() int { + if c.MaxInputsPerRequest > 0 { + return c.MaxInputsPerRequest + } + return defaultMaxBatchSize +} + +// maxTokensPerBatch returns the effective per-POST token cap. +func (c *Config) maxTokensPerBatch() int { + if c.MaxTokensPerRequest > 0 { + return c.MaxTokensPerRequest + } + return defaultMaxTokensPerBatch +} + +// maxInputBytes returns the effective per-input byte cap (defines +// when splitOversizeInput kicks in). +func (c *Config) maxInputBytes() int { + if c.MaxInputBytes > 0 { + return c.MaxInputBytes + } + return defaultMaxInputBytes +} + +// splitOversizeInput slices text into non-overlapping byte windows +// no larger than maxBytes each, aligned to UTF-8 rune boundaries so +// we never cut a multi-byte character mid-sequence. Returns the +// original text in a single-element slice when it's already small +// enough — common case is zero allocations beyond the slice header. +// +// Why byte-based rather than token-aligned: Voyage doesn't expose a +// /tokenize endpoint we can call client-side. The ollama provider +// gets to split at exact token boundaries (CLS + content_window + +// SEP) because llama-server tokenises for us. Voyage's real +// tokenizer is opaque, so we approximate with bytes. The adaptive +// bisect on 400 (see embedWithAdaptiveSplit) is the safety net +// when this approximation under-counts. +func splitOversizeInput(text string, maxBytes int) []string { + if maxBytes <= 0 || len(text) <= maxBytes { + return []string{text} + } + var windows []string + start := 0 + for start < len(text) { + end := start + maxBytes + if end >= len(text) { + windows = append(windows, text[start:]) + break + } + // Walk backward to the nearest rune-start byte so we never + // split a multi-byte UTF-8 character in the middle. 
+ // utf8.RuneStart returns true for ASCII (single-byte) and + // for the leading byte of a multi-byte sequence. + for end > start && !utf8.RuneStart(text[end]) { + end-- + } + if end == start { + // Degenerate: maxBytes < the length of the next rune. + // Cut at the original boundary to make progress; the + // resulting partial codepoint is still bytes Voyage + // can tokenise (just less meaningfully). In practice + // maxBytes is in the tens of thousands so this branch + // is unreachable on real input. + end = start + maxBytes + } + windows = append(windows, text[start:end]) + start = end + } + return windows +} + +// Provider is the Voyage HTTP client. +type Provider struct { + cfg Config + logger *slog.Logger + secrets provider.SecretLookup + http *http.Client + + // reqLimiter caps requests-per-minute when cfg.RateLimitRPM > 0. + // nil when no throttling is configured. Token-bucket with burst + // = 1 — we don't allow client-side bursts, since the upstream + // budget is a sliding minute and bursting saves nothing. + reqLimiter *rate.Limiter + + // tokenLimiter caps tokens-per-minute when cfg.RateLimitTPM > 0. + // Burst is set to maxTokensPerBatch so a single full-budget POST + // can pass even when the bucket is otherwise empty (we'd just + // wait longer afterward). nil when no throttling. + tokenLimiter *rate.Limiter +} + +// New constructs the Provider. Does not contact the endpoint. +func New(cfg Config, secrets provider.SecretLookup, logger *slog.Logger) *Provider { + if logger == nil { + logger = slog.Default() + } + if cfg.BaseURL == "" { + cfg.BaseURL = DefaultBaseURL + } + // Normalise away a trailing slash so url building (BaseURL + + // "/v1/embeddings") never produces a double slash, which stricter + // OpenAI-compatible proxies in front of Voyage can 404 on. 
+ cfg.BaseURL = provider.NormalizeBaseURL(cfg.BaseURL) + if cfg.OutputDtype == "" { + cfg.OutputDtype = DtypeFloat + } + p := &Provider{ + cfg: cfg, + logger: logger, + secrets: secrets, + http: &http.Client{Timeout: 60 * time.Second}, + } + // Convert RPM/TPM to per-second token-bucket rates. Burst on the + // request bucket is 1 (one request worth of "credit"); burst on + // the token bucket equals one full POST so we don't deadlock a + // legitimate big batch. + if cfg.RateLimitRPM > 0 { + p.reqLimiter = rate.NewLimiter(rate.Limit(float64(cfg.RateLimitRPM)/60.0), 1) + } + if cfg.RateLimitTPM > 0 { + p.tokenLimiter = rate.NewLimiter(rate.Limit(float64(cfg.RateLimitTPM)/60.0), cfg.maxTokensPerBatch()) + } + return p +} + +func (p *Provider) Kind() string { return provider.KindVoyage } + +// ID is "voyage:{model}:{dim}:{dtype}". All three parts contribute to +// embedding identity — switching any of them invalidates the index. +func (p *Provider) ID() string { + dim := p.cfg.OutputDimension + dimStr := "auto" + if dim > 0 { + dimStr = strconv.Itoa(dim) + } + return "voyage:" + p.cfg.Model + ":" + dimStr + ":" + p.cfg.OutputDtype +} + +func (p *Provider) Dimension() int { return p.cfg.OutputDimension } +func (p *Provider) SupportsTokenize() bool { return false } + +// StorageComponents namespaces the vector store as +// voyage/{model-slug}/{dim}_{dtype}. dim+dtype share one variant segment +// because both change vector identity and are known at config time; +// mirrors the dim/dtype parts of ID() ("auto" when dimension is unset).
+func (p *Provider) StorageComponents() []string { + dimStr := "auto" + if p.cfg.OutputDimension > 0 { + dimStr = strconv.Itoa(p.cfg.OutputDimension) + } + variant := provider.StorageSlug(dimStr + "_" + p.cfg.OutputDtype) + return []string{provider.KindVoyage, provider.StorageSlug(p.cfg.Model), variant} +} + +func (p *Provider) Start(ctx context.Context) error { + if p.cfg.Model == "" { + return errors.New("voyage: model is required") + } + switch p.cfg.OutputDtype { + case DtypeFloat, DtypeInt8: + default: + return fmt.Errorf("voyage: unsupported output_dtype %q (use float or int8)", p.cfg.OutputDtype) + } + if _, ok := p.apiKey(); !ok { + return fmt.Errorf("%w: %s", provider.ErrMissingAPIKey, p.cfg.APIKeyEnv) + } + testCtx, cancel := context.WithTimeout(ctx, 30*time.Second) + defer cancel() + _, err := p.embed(testCtx, []string{"ping"}, "document") + if err != nil { + return fmt.Errorf("voyage: connect test failed: %w", err) + } + return nil +} + +func (p *Provider) Stop(_ context.Context) error { return nil } + +func (p *Provider) Ready(_ context.Context) error { + return provider.RemoteReady(p.secrets, p.cfg.APIKeyEnv) +} + +func (p *Provider) Status() provider.Status { + return provider.RemoteStatus(p.cfg.Model, p.cfg.APIKeyEnv, p.secrets) +} + +func (p *Provider) EmbedQuery(ctx context.Context, query string) ([]float32, error) { + vecs, err := p.embedAndAverage(ctx, []string{query}, "query") + if err != nil { + return nil, err + } + return vecs[0], nil +} + +func (p *Provider) EmbedDocuments(ctx context.Context, texts []string) ([][]float32, error) { + if len(texts) == 0 { + return nil, nil + } + return p.embedAndAverage(ctx, texts, "document") +} + +// embedAndAverage is the per-input sliding-window pipeline (mirrors +// ollama.Provider.TokenizeAndEmbed, but byte-based since Voyage has +// no tokenize endpoint we can hit): +// +// 1. Walk every input. 
If its byte-length exceeds maxInputBytes, +// split at rune-aligned boundaries into N non-overlapping +// windows. Remember which original each window belongs to via +// a (start, length) span table. +// 2. Send the expanded slice through planBatches → POST chains +// with the same adaptive-bisect-on-400 behaviour as before. +// 3. Reassemble: for inputs that produced multiple windows, +// average the per-window vectors back to a single vector. +// For inputs that fit unchanged, the vector passes through. +// +// The averaging step is what keeps tail content of an over-long +// chunk from being dropped (which is what truncation:true would do +// upstream). The trade-off is N times more POST/token cost per +// such chunk, but oversize chunks are rare on well-chunked +// indexes — the indexer should already be cutting at function / +// class boundaries. +func (p *Provider) embedAndAverage(ctx context.Context, texts []string, inputType string) ([][]float32, error) { + maxIn := p.cfg.maxInputBytes() + + // Phase 1: expand oversize inputs into windows; track spans. + type span struct{ start, length int } + spans := make([]span, len(texts)) + var expanded []string + totalSplits := 0 + for i, t := range texts { + windows := splitOversizeInput(t, maxIn) + spans[i] = span{start: len(expanded), length: len(windows)} + expanded = append(expanded, windows...) + if len(windows) > 1 { + totalSplits += len(windows) + } + } + if totalSplits > 0 { + p.logger.Info("voyage: oversize inputs split into byte-windows", + "original_inputs", len(texts), + "total_windows", len(expanded), + "split_windows", totalSplits, + "max_input_bytes", maxIn, + ) + } + + // Phase 2: batch + POST as before, on the expanded slice. 
+ batches := planBatches(expanded, p.cfg.maxBatchSize(), p.cfg.maxTokensPerBatch()) + if len(batches) > 1 { + p.logger.Info("voyage: splitting batch", + "model", p.cfg.Model, + "total_inputs", len(expanded), + "sub_batches", len(batches), + "limit_inputs", p.cfg.maxBatchSize(), + "limit_tokens", p.cfg.maxTokensPerBatch(), + ) + } + allVecs := make([][]float32, 0, len(expanded)) + offset := 0 + for i, batch := range batches { + p.logger.Debug("voyage: sub-batch POST", + "index", i+1, + "of", len(batches), + "inputs", len(batch), + "est_tokens", sumEstimateTokens(batch), + ) + part, err := p.embedWithAdaptiveSplit(ctx, batch, inputType) + if err != nil { + return nil, fmt.Errorf("voyage: sub-batch %d/%d (offset=%d, inputs=%d, ~%d tokens): %w", + i+1, len(batches), offset, len(batch), sumEstimateTokens(batch), err) + } + allVecs = append(allVecs, part...) + offset += len(batch) + } + + // Phase 3: reassemble — average sub-window vectors back to one + // vector per original input. Vectors that came through alone + // (one-window inputs) pass through unchanged. + if totalSplits == 0 { + // Fast path: no oversize inputs, allVecs already maps 1:1 + // to texts. + return allVecs, nil + } + result := make([][]float32, len(texts)) + for i, sp := range spans { + if sp.length == 1 { + result[i] = allVecs[sp.start] + continue + } + dim := len(allVecs[sp.start]) + avg := make([]float32, dim) + for k := 0; k < sp.length; k++ { + v := allVecs[sp.start+k] + // All windows of one input must share a width; otherwise the + // avg[d] += v[d] below would panic with index-out-of-range. + // Surface a clean error instead (only reachable if the API + // returns inconsistent dims across a split input). 
+ if len(v) != dim { + return nil, fmt.Errorf("voyage: inconsistent window dims for input %d: window %d has %d, want %d", + i, k, len(v), dim) + } + for d := range avg { + avg[d] += v[d] + } + } + n := float32(sp.length) + for d := range avg { + avg[d] /= n + } + result[i] = avg + } + return result, nil +} + +// embedWithAdaptiveSplit wraps embed() with a defensive bisect-on-400 +// loop. Our byte→token estimator (see bytesPerToken) cannot match +// Voyage's real tokenizer exactly; pathological inputs may still +// overflow the per-batch cap. On a "batch too large" 400 we split +// the batch in half and retry both halves recursively. When the +// batch is already a single input and STILL too large there's +// nothing to bisect — we return a clear error pointing the operator +// at the chunker upstream (each chunk should fit voyage's per-input +// limits; if it doesn't, the chunker let through an over-long unit). +func (p *Provider) embedWithAdaptiveSplit(ctx context.Context, texts []string, inputType string) ([][]float32, error) { + vecs, err := p.embed(ctx, texts, inputType) + if err == nil { + return vecs, nil + } + cap, actual, ok := parseBatchTooLarge(err.Error()) + if !ok { + // Different error class (auth, network, rate-limit, …) — + // surface as-is, retry would not help. + return nil, err + } + if len(texts) <= 1 { + // A single chunk on its own exceeds Voyage's hard cap. + // We CAN'T split the text — that would corrupt the + // semantic unit the indexer chose. Tell the operator + // where to fix it instead. + return nil, fmt.Errorf( + "voyage: a single chunk produced %d tokens (cap %d). Reduce the indexer's max chunk size or switch to a model with a higher per-request cap: %w", + actual, cap, err, + ) + } + // Bisect — the caller's batch had multiple inputs whose real + // token sum exceeded the cap. Logging deliberately captures + // the cap+actual so the operator can spot a pattern (e.g. 
+ // estimator consistently off by 1.5x) without grepping the + // raw error. + mid := len(texts) / 2 + p.logger.Warn("voyage: batch too large — bisecting and retrying", + "inputs", len(texts), + "est_tokens", sumEstimateTokens(texts), + "voyage_actual_tokens", actual, + "voyage_cap", cap, + "left_half", mid, + "right_half", len(texts)-mid, + ) + left, err := p.embedWithAdaptiveSplit(ctx, texts[:mid], inputType) + if err != nil { + return nil, err + } + right, err := p.embedWithAdaptiveSplit(ctx, texts[mid:], inputType) + if err != nil { + return nil, err + } + return append(left, right...), nil +} + +// planBatches groups texts into sub-batches that each respect BOTH +// the input-count cap and the token-budget cap. A single text that +// on its own exceeds the token budget is placed in its own batch — +// Voyage will then 400 with a clear "tokens after truncation" +// message and the caller surfaces that to the operator (indicates +// the chunker upstream let through an over-long chunk). +// +// maxInputs and maxTokens come from the live Provider.cfg so the +// operator can override them via the admin form when their tier or +// chosen model allows a higher cap (e.g. voyage-3-large at 1000 +// inputs/POST instead of 128). +func planBatches(texts []string, maxInputs, maxTokens int) [][]string { + if len(texts) == 0 { + return nil + } + var batches [][]string + var current []string + currentTokens := 0 + for _, t := range texts { + est := estimateTokens(t) + // Close the current batch when adding this text would exceed + // either limit (and the batch already has something to send). 
+ if len(current) > 0 && (len(current) >= maxInputs || currentTokens+est > maxTokens) { + batches = append(batches, current) + current = nil + currentTokens = 0 + } + current = append(current, t) + currentTokens += est + } + if len(current) > 0 { + batches = append(batches, current) + } + return batches +} + +// estimateTokens returns a conservative estimate of the token cost +// of one text under Voyage's tokenizer: byte length divided by +// bytesPerToken (integer division). It is a heuristic, not a strict +// upper bound; see the bytesPerToken doc for rationale. +func estimateTokens(s string) int { + return len(s) / bytesPerToken +} + +// sumEstimateTokens sums estimateTokens over a slice. Cheap; used in +// log lines and rate-limiter reservations so an operator can see the +// per-batch cost. +func sumEstimateTokens(texts []string) int { + n := 0 + for _, t := range texts { + n += estimateTokens(t) + } + return n +} + +func (p *Provider) TokenizeAndEmbed(ctx context.Context, texts []string) ([][]float32, error) { + return p.EmbedDocuments(ctx, texts) +} + +type embedRequest struct { + Input []string `json:"input"` + Model string `json:"model"` + InputType string `json:"input_type,omitempty"` + OutputDimension int `json:"output_dimension,omitempty"` + OutputDtype string `json:"output_dtype,omitempty"` + Truncation bool `json:"truncation,omitempty"` +} + +// embedResponseItem.Embedding is decoded as json.RawMessage because +// the shape depends on output_dtype: []float for float, []int for +// int8. dequantize() handles both branches. 
+type embedResponseItem struct { + Embedding json.RawMessage `json:"embedding"` + Index int `json:"index"` +} + +type embedResponseUsage struct { + TotalTokens int `json:"total_tokens"` +} + +type embedResponse struct { + Data []embedResponseItem `json:"data"` + Model string `json:"model"` + Usage embedResponseUsage `json:"usage"` +} + +func (p *Provider) embed(ctx context.Context, texts []string, inputType string) ([][]float32, error) { + key, ok := p.apiKey() + if !ok { + return nil, fmt.Errorf("%w: %s", provider.ErrMissingAPIKey, p.cfg.APIKeyEnv) + } + + // Wait on the operator-configured rate-limit token-buckets before + // hitting the wire. Both reservations honour ctx cancellation so + // a server shutdown / drain doesn't strand callers in Wait(). + if p.reqLimiter != nil { + if err := p.reqLimiter.Wait(ctx); err != nil { + return nil, fmt.Errorf("voyage: request-rate wait: %w", err) + } + } + if p.tokenLimiter != nil { + est := sumEstimateTokens(texts) + if est > p.tokenLimiter.Burst() { + est = p.tokenLimiter.Burst() + } + if est > 0 { + if err := p.tokenLimiter.WaitN(ctx, est); err != nil { + return nil, fmt.Errorf("voyage: token-rate wait (~%d tokens): %w", est, err) + } + } + } + + body, err := json.Marshal(embedRequest{ + Input: texts, + Model: p.cfg.Model, + InputType: inputType, + OutputDimension: p.cfg.OutputDimension, + OutputDtype: p.cfg.OutputDtype, + Truncation: p.cfg.Truncation, + }) + if err != nil { + return nil, fmt.Errorf("voyage: marshal: %w", err) + } + url := p.cfg.BaseURL + "/v1/embeddings" + req, err := http.NewRequestWithContext(ctx, http.MethodPost, url, bytes.NewReader(body)) + if err != nil { + return nil, fmt.Errorf("voyage: build request: %w", err) + } + req.Header.Set("Content-Type", "application/json") + req.Header.Set("Authorization", "Bearer "+key) + + resp, err := p.http.Do(req) + if err != nil { + return nil, fmt.Errorf("voyage: %w", err) + } + defer resp.Body.Close() + + if resp.StatusCode != http.StatusOK { + snippet, _ := 
io.ReadAll(io.LimitReader(resp.Body, 2048)) + return nil, fmt.Errorf("voyage: status %d: %s", resp.StatusCode, string(snippet)) + } + + var er embedResponse + if err := json.NewDecoder(resp.Body).Decode(&er); err != nil { + return nil, fmt.Errorf("voyage: decode: %w", err) + } + if len(er.Data) != len(texts) { + return nil, fmt.Errorf("voyage: got %d vectors for %d inputs", len(er.Data), len(texts)) + } + out := make([][]float32, len(er.Data)) + for _, item := range er.Data { + if item.Index < 0 || item.Index >= len(out) { + return nil, fmt.Errorf("voyage: out-of-range index %d", item.Index) + } + vec, err := dequantize(item.Embedding, p.cfg.OutputDtype) + if err != nil { + return nil, fmt.Errorf("voyage: decode embedding[%d]: %w", item.Index, err) + } + // Guard against a model silently ignoring output_dimension (e.g. + // a model that doesn't support Matryoshka shrink, or a typo'd + // model name): writing the wrong-width vector into the store + // corrupts the collection deep in the upsert path with no + // attribution back to here. Only enforced when a dimension was + // explicitly requested (0 = model's native default, unknown). + if want := p.cfg.OutputDimension; want > 0 && len(vec) != want { + return nil, fmt.Errorf("voyage: embedding[%d] has %d dims, want %d (model ignored output_dimension?)", + item.Index, len(vec), want) + } + out[item.Index] = vec + } + for i, v := range out { + if v == nil { + return nil, fmt.Errorf("voyage: missing vector at index %d", i) + } + } + return out, nil +} + +// dequantize converts the raw JSON embedding to []float32 per dtype. +// +// For dtype=float: passthrough — Voyage returns IEEE 754 floats. +// For dtype=int8: each component is a signed 8-bit integer in +// [-128, 127]; Voyage's docs prescribe dividing by 127.0 to recover +// the approximate unit-norm float representation. 
+// +// This is the only place in the codebase that handles int8 quantized +// embeddings; chromem-go and the search path both work exclusively +// in float32. +func dequantize(raw json.RawMessage, dtype string) ([]float32, error) { + switch dtype { + case DtypeInt8: + // Voyage returns int8 either as a JSON array of integers (the + // default, which is what cix gets since it never sets + // encoding_format) or, in some configurations / behind an + // OpenAI-compatible proxy, as a base64-packed byte string. + // Handle the string form defensively so a proxy swap doesn't + // fail the whole batch with an opaque "int8 decode" error. + if len(raw) > 0 && raw[0] == '"' { + var b64 string + if err := json.Unmarshal(raw, &b64); err != nil { + return nil, fmt.Errorf("int8 base64 string decode: %w", err) + } + bs, err := base64.StdEncoding.DecodeString(b64) + if err != nil { + return nil, fmt.Errorf("int8 base64 decode: %w", err) + } + out := make([]float32, len(bs)) + for i, b := range bs { + out[i] = float32(int8(b)) / 127.0 + } + return out, nil + } + var ints []int8 + if err := json.Unmarshal(raw, &ints); err != nil { + return nil, fmt.Errorf("int8 decode: %w", err) + } + out := make([]float32, len(ints)) + for i, v := range ints { + out[i] = float32(v) / 127.0 + } + return out, nil + default: + // "float" (and empty as a defensive default — Voyage's docs + // say float is the implicit choice when output_dtype is + // omitted from the request). 
+ var floats []float32 + if err := json.Unmarshal(raw, &floats); err != nil { + return nil, fmt.Errorf("float decode: %w", err) + } + return floats, nil + } +} + +func (p *Provider) apiKey() (string, bool) { + return provider.ResolveAPIKey(p.secrets, p.cfg.APIKeyEnv) +} diff --git a/server/internal/embeddings/provider/voyage/voyage_test.go b/server/internal/embeddings/provider/voyage/voyage_test.go new file mode 100644 index 0000000..3472146 --- /dev/null +++ b/server/internal/embeddings/provider/voyage/voyage_test.go @@ -0,0 +1,654 @@ +package voyage + +import ( + "context" + "encoding/base64" + "encoding/json" + "fmt" + "io" + "net/http" + "net/http/httptest" + "strings" + "sync/atomic" + "testing" + "time" + "unicode/utf8" + + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" +) + +func fixedSecrets(key, value string) provider.SecretLookup { + return func(name string) (string, bool) { + if name == key { + return value, true + } + return "", false + } +} + +func stubServer(t *testing.T, status int, body string) (*httptest.Server, <-chan []byte) { + t.Helper() + got := make(chan []byte, 1) + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + if !strings.HasSuffix(r.URL.Path, "/v1/embeddings") { + http.NotFound(w, r) + return + } + raw, _ := io.ReadAll(r.Body) + select { + case got <- raw: + default: + } + w.WriteHeader(status) + _, _ = io.WriteString(w, body) + })) + t.Cleanup(srv.Close) + return srv, got +} + +func TestEmbedQuerySendsInputTypeQuery(t *testing.T) { + srv, gotBody := stubServer(t, http.StatusOK, `{ + "data": [{"index": 0, "embedding": [0.1, 0.2]}], + "model": "voyage-code-3", + "usage": {"total_tokens": 3} + }`) + // OutputDimension matches the 2-dim stub response so the per-vector + // dimension guard (H2) is satisfied; the assertion below still + // proves the configured dimension is forwarded in the request. 
+ p := New(Config{ + BaseURL: srv.URL, + APIKeyEnv: "K", + Model: "voyage-code-3", + OutputDimension: 2, + OutputDtype: DtypeFloat, + }, fixedSecrets("K", "v"), nil) + + if _, err := p.EmbedQuery(context.Background(), "where is X"); err != nil { + t.Fatalf("EmbedQuery: %v", err) + } + var req embedRequest + _ = json.Unmarshal(<-gotBody, &req) + if req.InputType != "query" { + t.Errorf("input_type %q; expected query", req.InputType) + } + if req.OutputDimension != 2 { + t.Errorf("output_dimension %d", req.OutputDimension) + } +} + +func TestEmbedDocumentsSendsInputTypeDocument(t *testing.T) { + srv, gotBody := stubServer(t, http.StatusOK, `{ + "data": [{"index": 0, "embedding": [0.1]}], + "usage": {"total_tokens": 1} + }`) + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "m", OutputDtype: DtypeFloat, + }, fixedSecrets("K", "v"), nil) + _, _ = p.EmbedDocuments(context.Background(), []string{"x"}) + var req embedRequest + _ = json.Unmarshal(<-gotBody, &req) + if req.InputType != "document" { + t.Errorf("input_type %q; expected document", req.InputType) + } +} + +func TestInt8Dequantize(t *testing.T) { + // int8 vector [127, -127, 0, 64] dequantized to float ~ [1.0, -1.0, 0.0, ~0.504] + srv, _ := stubServer(t, http.StatusOK, `{ + "data": [{"index": 0, "embedding": [127, -127, 0, 64]}], + "usage": {"total_tokens": 1} + }`) + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "m", OutputDtype: DtypeInt8, + }, fixedSecrets("K", "v"), nil) + vecs, err := p.EmbedDocuments(context.Background(), []string{"x"}) + if err != nil { + t.Fatalf("EmbedDocuments: %v", err) + } + if len(vecs) != 1 || len(vecs[0]) != 4 { + t.Fatalf("shape wrong: %v", vecs) + } + v := vecs[0] + if v[0] < 0.999 || v[1] > -0.999 || v[2] != 0 || v[3] < 0.50 || v[3] > 0.51 { + t.Errorf("dequantized values out of range: %v", v) + } +} + +func TestIDFingerprintIncludesAll(t *testing.T) { + p := New(Config{ + Model: "voyage-code-3", APIKeyEnv: "K", + OutputDimension: 1024, OutputDtype: 
DtypeInt8, + }, fixedSecrets("K", "v"), nil) + want := "voyage:voyage-code-3:1024:int8" + if got := p.ID(); got != want { + t.Errorf("ID() = %q, want %q", got, want) + } +} + +// TestEmbedDocumentsSplitsOversizeBatch covers the transparent +// per-provider split: Voyage's voyage-code-* models cap at 128 +// inputs/request, so a 200-item EmbedDocuments call must produce +// TWO POSTs (128 + 72) and return all 200 vectors in input order. +func TestEmbedDocumentsSplitsOversizeBatch(t *testing.T) { + posts := 0 + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + posts++ + // Echo back as many embeddings as the request contained so + // the caller's input ↔ vector mapping is verifiable. + raw, _ := io.ReadAll(r.Body) + var req embedRequest + _ = json.Unmarshal(raw, &req) + if len(req.Input) > 128 { + t.Errorf("POST #%d carried %d inputs, expected <= 128", posts, len(req.Input)) + } + items := make([]map[string]any, len(req.Input)) + for i := range req.Input { + items[i] = map[string]any{"index": i, "embedding": []float32{float32(i)}} + } + body, _ := json.Marshal(map[string]any{ + "data": items, + "model": req.Model, + "usage": map[string]int{"total_tokens": 1}, + }) + w.WriteHeader(http.StatusOK) + _, _ = w.Write(body) + })) + t.Cleanup(srv.Close) + + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "voyage-code-3", + OutputDimension: 0, OutputDtype: DtypeFloat, + }, fixedSecrets("K", "v"), nil) + + texts := make([]string, 200) + for i := range texts { + texts[i] = "chunk" + } + vecs, err := p.EmbedDocuments(context.Background(), texts) + if err != nil { + t.Fatalf("EmbedDocuments: %v", err) + } + if got := len(vecs); got != 200 { + t.Fatalf("got %d vectors, want 200", got) + } + if posts != 2 { + t.Errorf("expected 2 POSTs (128 + 72), got %d", posts) + } +} + +// TestPlanBatches_SplitsByTokenBudget covers the second cap on per- +// request batch size: even when input count is under maxBatchSize, +// Voyage 
hard-limits the request to 120K tokens. Our estimator uses +// bytesPerToken=2 so a 240_000-byte text estimates to 120_000 tokens, +// strictly above the 100K budget. Mixing one huge text with several +// smaller ones should produce multiple batches. +func TestPlanBatches_SplitsByTokenBudget(t *testing.T) { + big := strings.Repeat("x", 240_000) // ~120_000 est tokens at bytesPerToken=2 + small := "tiny" + texts := []string{big, small, small, small, small, small} + + batches := planBatches(texts, defaultMaxBatchSize, defaultMaxTokensPerBatch) + if len(batches) < 2 { + t.Fatalf("expected at least 2 batches, got %d", len(batches)) + } + + got := 0 + for _, b := range batches { + got += len(b) + if est := sumEstimateTokens(b); est > defaultMaxTokensPerBatch && len(b) > 1 { + t.Errorf("batch with %d inputs exceeds token budget: ~%d tokens > %d", + len(b), est, defaultMaxTokensPerBatch) + } + } + if got != len(texts) { + t.Errorf("inputs lost across batches: got %d, want %d", got, len(texts)) + } +} + +// TestPlanBatches_RespectsCountCap verifies the legacy 128-input +// cap is still enforced when token estimates wouldn't trigger a +// split. 200 small texts → exactly 2 batches (128 + 72). +func TestPlanBatches_RespectsCountCap(t *testing.T) { + texts := make([]string, 200) + for i := range texts { + texts[i] = "chunk" + } + batches := planBatches(texts, defaultMaxBatchSize, defaultMaxTokensPerBatch) + if len(batches) != 2 { + t.Fatalf("expected 2 batches (128 + 72), got %d", len(batches)) + } + if len(batches[0]) != defaultMaxBatchSize { + t.Errorf("first batch has %d inputs, want %d", len(batches[0]), defaultMaxBatchSize) + } + if len(batches[1]) != 72 { + t.Errorf("second batch has %d inputs, want 72", len(batches[1])) + } +} + +// TestEmbedDocumentsSplitsByTokenBudget exercises the end-to-end +// flow: an oversize batch should turn into multiple POSTs to the +// upstream server, even when the input count alone wouldn't trigger +// the count-based split. 
+func TestEmbedDocumentsSplitsByTokenBudget(t *testing.T) { + var hits int32 + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + atomic.AddInt32(&hits, 1) + raw, _ := io.ReadAll(r.Body) + var req embedRequest + _ = json.Unmarshal(raw, &req) + items := make([]map[string]any, len(req.Input)) + for i := range req.Input { + items[i] = map[string]any{"index": i, "embedding": []float32{0.1}} + } + body, _ := json.Marshal(map[string]any{ + "data": items, + "model": req.Model, + "usage": map[string]int{"total_tokens": 1}, + }) + w.WriteHeader(http.StatusOK) + _, _ = w.Write(body) + })) + t.Cleanup(srv.Close) + + // Two big texts ~100K est tokens each (at bytesPerToken=2) → + // should produce >= 2 POSTs. MaxInputBytes set high so the + // per-input sliding-window split doesn't trigger; we want to + // exercise the batch-level token cap specifically. + big := strings.Repeat("x", 200_000) + texts := []string{big, big} + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "voyage-code-3", OutputDtype: DtypeFloat, + MaxInputBytes: 1_000_000, + }, fixedSecrets("K", "v"), nil) + vecs, err := p.EmbedDocuments(context.Background(), texts) + if err != nil { + t.Fatalf("EmbedDocuments: %v", err) + } + if got := len(vecs); got != 2 { + t.Fatalf("got %d vectors, want 2", got) + } + // Read the counter once via atomic.LoadInt32 so the value checked + // and the value reported are the same. + if got := atomic.LoadInt32(&hits); got < 2 { + t.Errorf("expected at least 2 POSTs due to token-budget split, got %d", got) + } +} + +// TestSplitOversizeInput exercises the byte-window splitter +// directly. Covers: small inputs pass through as a singleton +// slice; oversize ASCII split into N windows; UTF-8 boundary +// respected (no codepoint cut in half). 
+func TestSplitOversizeInput(t *testing.T) { + t.Run("under cap returns singleton", func(t *testing.T) { + got := splitOversizeInput("hello world", 100) + if len(got) != 1 || got[0] != "hello world" { + t.Errorf("got %v, want [hello world]", got) + } + }) + + t.Run("ASCII over cap splits to N windows", func(t *testing.T) { + text := strings.Repeat("x", 250) + got := splitOversizeInput(text, 100) + if len(got) != 3 { + t.Fatalf("got %d windows, want 3", len(got)) + } + joined := strings.Join(got, "") + if joined != text { + t.Errorf("rejoin mismatch: got len=%d, want len=%d", len(joined), len(text)) + } + for i, w := range got { + if len(w) > 100 { + t.Errorf("window %d has %d bytes, want <= 100", i, len(w)) + } + } + }) + + t.Run("UTF-8 multi-byte boundary respected", func(t *testing.T) { + // 100 Cyrillic letters (2 bytes each in UTF-8) = 200 bytes. + // Split cap = 50 bytes. If we split naively at byte 50 we'd + // cut a 2-byte rune in half — utf8.ValidString would fail. + text := strings.Repeat("щ", 100) + got := splitOversizeInput(text, 50) + for i, w := range got { + if !utf8.ValidString(w) { + t.Errorf("window %d is not valid UTF-8: bytes=%v", i, []byte(w)) + } + } + // Rejoin must reproduce the original byte-for-byte. + if joined := strings.Join(got, ""); joined != text { + t.Errorf("rejoin mismatch: lost %d bytes", len(text)-len(joined)) + } + }) +} + +// TestEmbedDocuments_AveragesOversizeInputWindows covers the +// end-to-end sliding-window behaviour: a 250-byte input with +// MaxInputBytes=100 must be POSTed as 3 separate windows; the +// returned vector must be the element-wise mean of the 3 window +// vectors. Mirrors the ollama TokenizeAndEmbed averaging logic. 
+func TestEmbedDocuments_AveragesOversizeInputWindows(t *testing.T) { + var postCount int32 + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + atomic.AddInt32(&postCount, 1) + raw, _ := io.ReadAll(r.Body) + var req embedRequest + _ = json.Unmarshal(raw, &req) + // Return a different constant vector per window position so + // the average is easy to assert. + items := make([]map[string]any, len(req.Input)) + for i := range req.Input { + // Each window's vector = [i+1.0] so 3 windows give + // [1.0, 2.0, 3.0] → average 2.0. + items[i] = map[string]any{"index": i, "embedding": []float32{float32(i + 1)}} + } + body, _ := json.Marshal(map[string]any{ + "data": items, + "model": req.Model, + "usage": map[string]int{"total_tokens": 1}, + }) + w.WriteHeader(http.StatusOK) + _, _ = w.Write(body) + })) + t.Cleanup(srv.Close) + + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "voyage-code-3", OutputDtype: DtypeFloat, + MaxInputBytes: 100, + }, fixedSecrets("K", "v"), nil) + + text := strings.Repeat("x", 250) // → 3 windows of 100/100/50 bytes + vecs, err := p.EmbedDocuments(context.Background(), []string{text}) + if err != nil { + t.Fatalf("EmbedDocuments: %v", err) + } + if len(vecs) != 1 { + t.Fatalf("got %d vectors, want 1 (averaged)", len(vecs)) + } + // Stub returned [1.0, 2.0, 3.0] for the three windows → mean is 2.0. + if got := vecs[0][0]; got < 1.99 || got > 2.01 { + t.Errorf("averaged vector[0] = %v, want ~2.0", got) + } +} + +// TestEmbedDocuments_BisectsOnBatchTooLarge covers the adaptive +// recovery path: when Voyage returns 400 "max allowed tokens per +// submitted batch is 120000. Your batch has N tokens" the provider +// splits the batch in half and retries. The stub here rejects any +// POST with more than 1 input on the first hit; the provider must +// bisect down to single-input POSTs and succeed. 
+func TestEmbedDocuments_BisectsOnBatchTooLarge(t *testing.T) { + var posts int32 + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + atomic.AddInt32(&posts, 1) + raw, _ := io.ReadAll(r.Body) + var req embedRequest + _ = json.Unmarshal(raw, &req) + // Reject any batch with > 1 input — forces the provider to + // bisect all the way to singletons. + if len(req.Input) > 1 { + w.WriteHeader(http.StatusBadRequest) + _, _ = w.Write([]byte(`{"detail":"Request failed. The max allowed tokens per submitted batch is 120000. Your batch has 200000 tokens after truncation."}`)) + return + } + w.WriteHeader(http.StatusOK) + _, _ = w.Write([]byte(`{"data":[{"index":0,"embedding":[0.1]}],"usage":{"total_tokens":1}}`)) + })) + t.Cleanup(srv.Close) + + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "voyage-code-3", OutputDtype: DtypeFloat, + // Pretend our config-time cap is high enough to NOT split via + // planBatches; we want to exercise the runtime 400 bisect path. + MaxInputsPerRequest: 1000, MaxTokensPerRequest: 10_000_000, + }, fixedSecrets("K", "v"), nil) + + vecs, err := p.EmbedDocuments(context.Background(), []string{"a", "b", "c", "d"}) + if err != nil { + t.Fatalf("EmbedDocuments: %v", err) + } + if len(vecs) != 4 { + t.Fatalf("got %d vectors, want 4", len(vecs)) + } + // One initial rejected POST (4 inputs), two more halves rejected + // (2 + 2), four singleton POSTs that succeed. Total = 7. + if got := atomic.LoadInt32(&posts); got < 7 { + t.Errorf("expected at least 7 POSTs (rejected + bisected + singletons), got %d", got) + } +} + +// TestEmbedDocuments_SingleInputTooLargeFailsClean covers the +// "no split possible" case: when a SINGLE chunk produces more +// tokens than the upstream cap, we cannot bisect further (would +// corrupt the chunk). The provider returns a clear error that +// points the operator at the chunker rather than retrying forever. 
+func TestEmbedDocuments_SingleInputTooLargeFailsClean(t *testing.T) { + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.WriteHeader(http.StatusBadRequest) + _, _ = w.Write([]byte(`{"detail":"Request failed. The max allowed tokens per submitted batch is 120000. Your batch has 187609 tokens after truncation."}`)) + })) + t.Cleanup(srv.Close) + + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "voyage-code-3", OutputDtype: DtypeFloat, + MaxInputsPerRequest: 1000, MaxTokensPerRequest: 10_000_000, + }, fixedSecrets("K", "v"), nil) + + _, err := p.EmbedDocuments(context.Background(), []string{"one big chunk"}) + if err == nil { + t.Fatal("expected error, got nil") + } + msg := err.Error() + if !strings.Contains(msg, "single chunk") || !strings.Contains(msg, "187609 tokens") { + t.Errorf("error should mention single-chunk + actual token count: %q", msg) + } + if !strings.Contains(msg, "Reduce the indexer's max chunk size") { + t.Errorf("error should hint at upstream chunker fix: %q", msg) + } +} + +func TestParseBatchTooLarge(t *testing.T) { + cases := []struct { + msg string + wantCap, wantAct int + wantOK bool + }{ + { + "voyage: status 400: {\"detail\":\"Request failed. The max allowed tokens per submitted batch is 120000. Your batch has 187609 tokens after truncation.\"}", + 120000, 187609, true, + }, + { + "voyage: status 429: rate limited", + 0, 0, false, + }, + { + "voyage: status 400: model not found", + 0, 0, false, + }, + } + for _, tc := range cases { + gotCap, gotAct, ok := parseBatchTooLarge(tc.msg) + if ok != tc.wantOK || gotCap != tc.wantCap || gotAct != tc.wantAct { + t.Errorf("parseBatchTooLarge(%q) = (%d, %d, %v), want (%d, %d, %v)", + tc.msg, gotCap, gotAct, ok, tc.wantCap, tc.wantAct, tc.wantOK) + } + } +} + +// TestRateLimitRPMThrottlesRequests verifies that when the operator +// configures RateLimitRPM, the provider actually waits between +// requests. 
120 RPM = 1 request per 500ms (burst of 1), so 2 +// sequential requests on a fresh limiter take ~500ms total. +func TestRateLimitRPMThrottlesRequests(t *testing.T) { + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.WriteHeader(http.StatusOK) + _, _ = w.Write([]byte(`{"data":[{"index":0,"embedding":[0.1]}],"usage":{"total_tokens":1}}`)) + })) + t.Cleanup(srv.Close) + + // 120 RPM → 2 req/s → second call must wait ~500ms (after the + // burst-1 bucket drained on the first call). + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "voyage-3", OutputDtype: DtypeFloat, + RateLimitRPM: 120, + }, fixedSecrets("K", "v"), nil) + + ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) + defer cancel() + + // First call: instant (burst available). + if _, err := p.EmbedQuery(ctx, "first"); err != nil { + t.Fatalf("first EmbedQuery: %v", err) + } + // Second call: must wait. We don't measure precisely because rate + // limiter has sub-millisecond timing variance; ≥ 300ms is enough + // to know the throttle fired (well above any test-runtime noise). + start := time.Now() + if _, err := p.EmbedQuery(ctx, "second"); err != nil { + t.Fatalf("second EmbedQuery: %v", err) + } + elapsed := time.Since(start) + if elapsed < 300*time.Millisecond { + t.Errorf("expected second call to wait for RPM limiter (>= 300ms); elapsed=%s", elapsed) + } +} + +// TestRateLimitTPMThrottlesTokens verifies the token-budget bucket +// also forces a wait when consumption exceeds the per-minute rate. +// 600K TPM = 10K tokens/s, burst = maxTokensPerBatch (100K). Sending +// two batches of 60K tokens each should make the second wait while +// the bucket refills. 
+func TestRateLimitTPMThrottlesTokens(t *testing.T) { + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.WriteHeader(http.StatusOK) + _, _ = w.Write([]byte(`{"data":[{"index":0,"embedding":[0.1]}],"usage":{"total_tokens":1}}`)) + })) + t.Cleanup(srv.Close) + + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "voyage-3", OutputDtype: DtypeFloat, + // burst = maxTokensPerBatch (100K), refill rate 600K/min = 10K/s. + RateLimitTPM: 600_000, + // Disable the per-input byte-window split for this test — + // we want to send the full 120K-byte input in one POST so + // the token-budget bucket actually sees ~60K tokens at once. + MaxInputBytes: 1_000_000, + }, fixedSecrets("K", "v"), nil) + + ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second) + defer cancel() + + // 120K bytes ≈ 60K est tokens (bytesPerToken=2), 60% of the 100K burst budget. + big := strings.Repeat("x", 120_000) + + // First call drains 60K of the 100K-burst bucket: instant. + if _, err := p.EmbedQuery(ctx, big); err != nil { + t.Fatalf("first: %v", err) + } + // Second call wants another 60K but only 40K is left; needs to + // wait for 20K to refill at 10K/s = ~2s. + start := time.Now() + if _, err := p.EmbedQuery(ctx, big); err != nil { + t.Fatalf("second: %v", err) + } + elapsed := time.Since(start) + // Lower bound 1.5s — leaves margin for rate-limiter clock granularity. + if elapsed < 1500*time.Millisecond { + t.Errorf("expected second call to wait for TPM limiter (>= 1.5s); elapsed=%s", elapsed) + } +} + +func TestUsageDecodesWithoutPromptTokens(t *testing.T) { + // Voyage's usage object lacks prompt_tokens — make sure decode doesn't error. 
+ srv, _ := stubServer(t, http.StatusOK, `{ + "data": [{"index": 0, "embedding": [0.1]}], + "model": "voyage-3", + "usage": {"total_tokens": 7} + }`) + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "m", OutputDtype: DtypeFloat, + }, fixedSecrets("K", "v"), nil) + if _, err := p.EmbedDocuments(context.Background(), []string{"x"}); err != nil { + t.Fatalf("decode: %v", err) + } +} + +// TestEmbed_RejectsWrongDimension guards H2: a configured +// output_dimension that the model silently ignores must be rejected +// loudly rather than writing a wrong-width vector into the store. +func TestEmbed_RejectsWrongDimension(t *testing.T) { + srv, _ := stubServer(t, http.StatusOK, `{ + "data": [{"index": 0, "embedding": [0.1, 0.2]}], + "usage": {"total_tokens": 1} + }`) + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "m", + OutputDimension: 1024, OutputDtype: DtypeFloat, + }, fixedSecrets("K", "v"), nil) + _, err := p.EmbedDocuments(context.Background(), []string{"x"}) + if err == nil { + t.Fatal("expected error on dimension mismatch, got nil") + } + if !strings.Contains(err.Error(), "want 1024") { + t.Errorf("error %q should mention the expected dimension", err) + } +} + +// TestEmbed_RejectsInconsistentWindowDims guards H2's averaging path: +// when an oversize input is split into windows and the API returns +// windows of differing width, the reassembly must error rather than +// panic with index-out-of-range. OutputDimension=0 so the per-vector +// check is skipped and the averaging guard is what catches it. +func TestEmbed_RejectsInconsistentWindowDims(t *testing.T) { + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + // One oversize input expands to 2 windows → 2 inputs in this + // POST; return vectors of different lengths for each. 
+ _, _ = io.WriteString(w, `{ + "data": [ + {"index": 0, "embedding": [0.1, 0.2]}, + {"index": 1, "embedding": [0.1, 0.2, 0.3]} + ], + "usage": {"total_tokens": 2} + }`) + })) + t.Cleanup(srv.Close) + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "m", + OutputDtype: DtypeFloat, MaxInputBytes: 10, + }, fixedSecrets("K", "v"), nil) + // 20 bytes > MaxInputBytes(10) → splits into 2 windows. + _, err := p.EmbedDocuments(context.Background(), []string{strings.Repeat("a", 20)}) + if err == nil { + t.Fatal("expected error on inconsistent window dims, got nil") + } + if !strings.Contains(err.Error(), "inconsistent window dims") { + t.Errorf("error %q should mention inconsistent window dims", err) + } +} + +// TestInt8Dequantize_Base64 guards M4: int8 returned as a base64-packed +// byte string (rather than a JSON int array) is dequantized correctly. +func TestInt8Dequantize_Base64(t *testing.T) { + // int8 [127, -127, 0, 64] packed as raw signed bytes → base64. + ints := []int8{127, -127, 0, 64} + packed := make([]byte, len(ints)) + for i, v := range ints { + packed[i] = byte(v) + } + b64 := base64.StdEncoding.EncodeToString(packed) + srv, _ := stubServer(t, http.StatusOK, fmt.Sprintf(`{ + "data": [{"index": 0, "embedding": %q}], + "usage": {"total_tokens": 1} + }`, b64)) + p := New(Config{ + BaseURL: srv.URL, APIKeyEnv: "K", Model: "m", OutputDtype: DtypeInt8, + }, fixedSecrets("K", "v"), nil) + vecs, err := p.EmbedDocuments(context.Background(), []string{"x"}) + if err != nil { + t.Fatalf("EmbedDocuments: %v", err) + } + if len(vecs) != 1 || len(vecs[0]) != 4 { + t.Fatalf("shape wrong: %v", vecs) + } + v := vecs[0] + if v[0] < 0.999 || v[1] > -0.999 || v[2] != 0 || v[3] < 0.50 || v[3] > 0.51 { + t.Errorf("base64 int8 dequantized values out of range: %v", v) + } +} diff --git a/server/internal/embeddings/service.go b/server/internal/embeddings/service.go index ee3a7d2..581b0dc 100644 --- a/server/internal/embeddings/service.go +++ 
b/server/internal/embeddings/service.go @@ -2,45 +2,127 @@ package embeddings import ( "context" + "encoding/json" "errors" "fmt" - "io" "log/slog" "os" - "path/filepath" - "strings" + "sync" "time" "github.com/dvcdsys/code-index/server/internal/config" + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" + "github.com/dvcdsys/code-index/server/internal/embeddings/provider/ollama" + "github.com/dvcdsys/code-index/server/internal/vectorstore" + + // Blank imports trigger each provider package's init() which + // registers a Factory in the registry. Service builds the active + // provider purely by kind string — these imports are the wiring. + _ "github.com/dvcdsys/code-index/server/internal/embeddings/provider/openai" + _ "github.com/dvcdsys/code-index/server/internal/embeddings/provider/voyage" ) -// Service is the public embeddings API used by handlers. It composes the -// llama-server supervisor, the unix-socket client, the concurrency queue, and -// the per-model query-prefix policy. Handlers should call EmbedQuery for -// search inputs (applies prefix for asymmetric retrieval) and EmbedTexts for -// passages/chunks. +// Service is the public embeddings API used by handlers and the indexer. +// It composes: +// - the active embedding provider (ollama sidecar / OpenAI / Voyage / …) +// - a concurrency queue for backpressure // -// A Service with Disabled == true is a legal no-op used in tests; every -// method returns ErrDisabled. main.go constructs it via New when +// Concurrency. Embed* methods are safe under concurrent callers — they +// each acquire a slot from the queue and release it on return. Provider +// swaps (SwitchProvider) drain the queue first to avoid stranding +// in-flight requests on a torn-down child process. +// +// A Service with disabled == true is a legal no-op used in tests; every +// method returns ErrDisabled. main.go constructs it that way when // cfg.EmbeddingsEnabled is false. 
type Service struct { cfg *config.Config logger *slog.Logger - sup *supervisor queue *Queue - prefix string disabled bool + + // lifecycleMu serializes the two provider-lifecycle operations + // (SwitchProvider and Restart) against each other so they never + // interleave their s.current / s.queue mutations or both tear down a + // provider. Embed* methods do NOT take it — they run concurrently via + // the queue + an s.mu snapshot of current/queue. + lifecycleMu sync.Mutex + + // mu guards current AND the queue pointer — both are swapped at + // runtime (current by SwitchProvider/Restart, queue by Restart when + // the concurrency cap changes), so every read must snapshot them + // under the read lock rather than touching the fields directly. + mu sync.RWMutex + current provider.Provider + + // Vector-store reopen hooks, wired by main.go via AttachVectorStore. + // When set, SwitchProvider reopens the vector store under the new + // provider's identity slug and atomically swaps it into vsHolder, so + // a runtime provider switch moves to a dimension-isolated namespace + // without a process restart. All four are nil in tests that don't + // exercise the reopen path (SwitchProvider then only swaps the + // provider, matching the pre-unification behaviour). + vsHolder *vectorstore.Holder + vsDirFor func(components []string) string // cfg.ChromaDirFor + vsOpener func(dir string) (*vectorstore.Store, error) // vectorstore.Open + vsMigrate func() error // legacy flat-chroma → nested migration (idempotent) +} + +// AttachVectorStore wires the live vector-store reopen path used by +// SwitchProvider. 
main.go calls it once after constructing the Service +// and the shared Holder: +// +// dirFor — cfg.ChromaDirFor (maps identity path components to a dir) +// opener — vectorstore.Open +// migrate — optional idempotent legacy-dir migration run before each +// reopen (lets a switch back to ollama on a pre-unification +// box adopt its migrated dir without a restart); may be nil +// +// Passing the formula (dirFor) and opener as funcs keeps embeddings free +// of a hard dependency on config path layout and avoids an +// embeddings→storage import for the migration hook. +func (s *Service) AttachVectorStore( + holder *vectorstore.Holder, + dirFor func(components []string) string, + opener func(dir string) (*vectorstore.Store, error), + migrate func() error, +) { + if s == nil { + return + } + s.mu.Lock() + s.vsHolder = holder + s.vsDirFor = dirFor + s.vsOpener = opener + s.vsMigrate = migrate + s.mu.Unlock() +} + +// StoragePath returns the ACTIVE provider's vector-store path components +// (provider.Provider.StorageComponents), or nil when disabled / not yet +// built. Callers join them under ChromaPersistDir via cfg.ChromaDirFor. +// The dashboard's project-detail handler uses it to show the live chroma +// directory. +func (s *Service) StoragePath() []string { + if s == nil || s.disabled { + return nil + } + s.mu.RLock() + cur := s.current + s.mu.RUnlock() + if cur == nil { + return nil + } + return cur.StorageComponents() } -// New constructs a Service. If cfg.EmbeddingsEnabled is false it returns a -// disabled Service that reports ErrDisabled on every Embed* call but can -// still be Stop()-ed cleanly. Otherwise it resolves the GGUF path (env → -// cache → HF download), then starts the llama-server supervisor and blocks -// until the readiness probe succeeds. +// New constructs a Service from the env-derived config. The legacy +// entry point: builds an ollama provider with the env-supplied +// defaults and blocks until Start succeeds. 
main.go uses NewWithBoot +// to layer the DB-persisted provider selection on top of this. // -// ctx governs startup only. It is NOT stored on the Service — Stop has its -// own context so shutdown can be bounded independently of startup. +// ctx governs startup only; Stop has its own context. func New(ctx context.Context, cfg *config.Config, logger *slog.Logger) (*Service, error) { if logger == nil { logger = slog.Default() @@ -50,42 +132,221 @@ func New(ctx context.Context, cfg *config.Config, logger *slog.Logger) (*Service return &Service{cfg: cfg, logger: logger, disabled: true}, nil } - ggufPath, err := resolveGGUFPath(ctx, cfg, logger) + prov, err := buildOllamaFromConfig(cfg, logger) if err != nil { - return nil, fmt.Errorf("resolve gguf: %w", err) + return nil, fmt.Errorf("build ollama provider: %w", err) + } + if err := prov.Start(ctx); err != nil { + return nil, err } - supCfg := supervisorConfig{ - BinDir: cfg.LlamaBinDir, - GGUFPath: ggufPath, - SocketPath: cfg.LlamaSocketPath, - Transport: cfg.LlamaTransport, - CtxSize: cfg.LlamaCtxSize, - NGpuLayers: cfg.LlamaNGpuLayers, - NThreads: cfg.LlamaNThreads, - BatchSize: cfg.LlamaBatchSize, - StartupSec: cfg.LlamaStartupSec, - Model: cfg.EmbeddingModel, + return &Service{ + cfg: cfg, + logger: logger, + queue: NewQueue(cfg.MaxEmbeddingConcurrency, time.Duration(cfg.EmbeddingQueueTimeout)*time.Second), + current: prov, + }, nil +} + +// NewWithProvider constructs a Service around an already-built +// Provider. Used by main.go's boot path: it reads the persisted +// provider snapshot, calls provider.Build, then hands the result to +// this constructor. The Provider must already be Start()-ed. 
+func NewWithProvider(cfg *config.Config, prov provider.Provider, logger *slog.Logger) *Service { + if logger == nil { + logger = slog.Default() } + if !cfg.EmbeddingsEnabled { + return &Service{cfg: cfg, logger: logger, disabled: true} + } + return &Service{ + cfg: cfg, + logger: logger, + queue: NewQueue(cfg.MaxEmbeddingConcurrency, time.Duration(cfg.EmbeddingQueueTimeout)*time.Second), + current: prov, + } +} - sup, err := newSupervisor(ctx, supCfg, logger) +// BuildOllamaConfigFromEnv produces the ollama provider config blob +// derived from env (used by main.go to seed the persisted row on +// first boot and by tests that want a "live env-default" snapshot). +func BuildOllamaConfigFromEnv(cfg *config.Config) ([]byte, error) { + c := ollama.Config{ + Model: cfg.EmbeddingModel, + GGUFPath: cfg.GGUFPath, + CacheDir: cfg.GGUFCacheDir, + BootstrapPath: cfg.BootstrapGGUFPath, + BinDir: cfg.LlamaBinDir, + SocketPath: cfg.LlamaSocketPath, + Transport: cfg.LlamaTransport, + CtxSize: cfg.LlamaCtxSize, + NGpuLayers: cfg.LlamaNGpuLayers, + NThreads: cfg.LlamaNThreads, + BatchSize: cfg.LlamaBatchSize, + StartupSec: cfg.LlamaStartupSec, + } + return json.Marshal(c) +} + +// EnvSecrets returns the production SecretLookup: os.LookupEnv. main.go +// and the admin handlers pass it to provider.Build / Service.SwitchProvider. +func EnvSecrets() provider.SecretLookup { return envSecrets } + +// SwitchProvider replaces the active provider. Steps: +// 1. Build the new provider from kind + cfg. +// 2. Start it (validates config / connectivity). +// 3. Drain the queue (block new acquires, wait up to 30s). +// 4. Swap current to new under the mutex, then reopen the attached vector store (rolled back if the reopen fails). +// 5. Stop the old provider on a separate goroutine so a slow SIGTERM +// does not hold the admin request. +// +// If step 2 fails, the old provider stays active and the error is +// returned to the caller.
If step 3 times out we proceed anyway, +// favouring availability — in-flight calls finish on the old +// provider and the new takes over for everything subsequent. +func (s *Service) SwitchProvider(ctx context.Context, kind string, cfgBytes []byte) error { + if s == nil || s.disabled { + return ErrDisabled + } + // Serialize against Restart so the two lifecycle ops never interleave + // their s.current / s.queue mutations. + s.lifecycleMu.Lock() + defer s.lifecycleMu.Unlock() + + newProv, err := provider.Build(ctx, kind, cfgBytes, envSecrets, s.logger) if err != nil { - return nil, err + return fmt.Errorf("build %s provider: %w", kind, err) + } + if err := newProv.Start(ctx); err != nil { + return fmt.Errorf("start %s provider: %w", kind, err) + } + + q := s.currentQueue() + q.BlockNew() + // Keep the queue blocked across BOTH the provider swap and the + // vector-store reopen below; resume only on the way out (all return + // paths). A blocked queue fails Acquire fast with a 503, so no embed + // runs in the window where s.current is the NEW provider while the + // Holder still points at the OLD store — that window is exactly what + // would write new-dimension vectors into the previous provider's + // collection. + defer q.Resume() + drainCtx, drainCancel := context.WithTimeout(ctx, 30*time.Second) + if derr := q.WaitDrain(drainCtx); derr != nil { + s.logger.Warn("embeddings: drain timed out during switch; proceeding anyway", + "in_flight", q.InFlight(), "err", derr, + ) } + drainCancel() + + s.mu.Lock() + old := s.current + s.current = newProv + s.mu.Unlock() + + // Reopen the vector store under the new provider's identity slug so + // its (possibly different-dimension) vectors land in their own + // namespace instead of colliding with the previous provider's + // collection. 
If the reopen fails we roll the provider swap BACK to + // the old provider (whose store the Holder still points at) and stop + // the new provider we started: fail closed, never leave a + // new-provider / old-store pairing that corrupts the old collection. + if err := s.reopenVectorStore(newProv); err != nil { + s.mu.Lock() + s.current = old + s.mu.Unlock() + go stopProviderAsync(s.logger, newProv) + s.logger.Error("embeddings: provider switch rolled back — vector store reopen failed", + "kind", kind, "err", err) + return err + } + + if old != nil { + go stopProviderAsync(s.logger, old) + } + s.logger.Info("embeddings: switched provider", "kind", kind, "id", newProv.ID()) + return nil +} - return &Service{ - cfg: cfg, - logger: logger, - sup: sup, - queue: NewQueue(cfg.MaxEmbeddingConcurrency, time.Duration(cfg.EmbeddingQueueTimeout)*time.Second), - prefix: ResolveQueryPrefix(cfg.EmbeddingModel), - }, nil +// stopProviderAsync stops a provider in the background with a bounded +// timeout, logging (not failing) on error. Used to release the provider +// displaced by a switch — or the half-started new provider on a switch +// rollback — without blocking the caller. +func stopProviderAsync(logger *slog.Logger, p provider.Provider) { + if p == nil { + return + } + stopCtx, cancel := context.WithTimeout(context.Background(), 30*time.Second) + defer cancel() + if err := p.Stop(stopCtx); err != nil { + logger.Warn("embeddings: provider Stop returned error", "kind", p.Kind(), "err", err) + } } -// Config returns the *config.Config the service was constructed with. The -// pointer is shared; callers that mutate it in place must understand they -// are racing the supervisor — only the dashboard restart path is supposed -// to do this, and it does so behind queue.BlockNew + sup.Restart. +// reopenVectorStore opens a fresh *vectorstore.Store under the directory +// derived from prov's identity path components and atomically swaps it +// into the shared Holder. 
No-op when AttachVectorStore was never called +// (tests). +func (s *Service) reopenVectorStore(prov provider.Provider) error { + s.mu.RLock() + holder, dirFor, opener, migrate := s.vsHolder, s.vsDirFor, s.vsOpener, s.vsMigrate + s.mu.RUnlock() + if holder == nil || dirFor == nil || opener == nil { + return nil // reopen path not wired (e.g. unit tests) + } + if migrate != nil { + // Idempotent legacy flat→nested migration: lets a switch back to + // ollama on a pre-unification box adopt its migrated dir without + // a restart. + if err := migrate(); err != nil { + s.logger.Warn("embeddings: chroma legacy-dir migration failed during switch (continuing)", "err", err) + } + } + dir := dirFor(prov.StorageComponents()) + newStore, err := opener(dir) + if err != nil { + s.logger.Error("embeddings: vector store reopen failed; Holder keeps the previous store", + "dir", dir, "err", err) + return fmt.Errorf("reopen vector store at %s: %w", dir, err) + } + holder.Swap(newStore) + s.logger.Info("embeddings: vector store reopened under new provider namespace", "dir", dir) + return nil +} + +// buildOllamaFromConfig assembles an ollama.Provider out of the env- +// derived *config.Config. Bridges the legacy bootstrap path until the +// provider config persists into runtime_settings (Phase 6).
+func buildOllamaFromConfig(cfg *config.Config, logger *slog.Logger) (provider.Provider, error) { + c := ollama.Config{ + Model: cfg.EmbeddingModel, + GGUFPath: cfg.GGUFPath, + CacheDir: cfg.GGUFCacheDir, + BootstrapPath: cfg.BootstrapGGUFPath, + BinDir: cfg.LlamaBinDir, + SocketPath: cfg.LlamaSocketPath, + Transport: cfg.LlamaTransport, + CtxSize: cfg.LlamaCtxSize, + NGpuLayers: cfg.LlamaNGpuLayers, + NThreads: cfg.LlamaNThreads, + BatchSize: cfg.LlamaBatchSize, + StartupSec: cfg.LlamaStartupSec, + } + b, err := json.Marshal(c) + if err != nil { + return nil, fmt.Errorf("marshal ollama config: %w", err) + } + return provider.Build(context.Background(), provider.KindOllama, b, envSecrets, logger) +} + +// envSecrets resolves env-var names via os.LookupEnv. Production +// SecretLookup; tests pass their own to avoid touching the process +// environment. +func envSecrets(name string) (string, bool) { + return os.LookupEnv(name) +} + +// Config returns the *config.Config the service was constructed with. func (s *Service) Config() *config.Config { if s == nil { return nil @@ -94,483 +355,305 @@ func (s *Service) Config() *config.Config { } // CacheDirFromService returns the GGUF cache directory the dashboard's -// /admin/models handler should walk. Returns "" when the EmbeddingsQuerier -// isn't a *Service (test fakes) or when the service is disabled. +// /admin/models handler should walk. Returns "" when the +// EmbeddingsQuerier isn't a *Service whose active provider is ollama +// (e.g. test fakes, openai/voyage active). func CacheDirFromService(q any) string { s, ok := q.(*Service) - if !ok || s == nil || s.cfg == nil { + if !ok || s == nil { return "" } - return s.cfg.GGUFCacheDir + s.mu.RLock() + cur := s.current + s.mu.RUnlock() + if cur == nil { + return "" + } + ol, ok := cur.(*ollama.Provider) + if !ok { + return "" + } + return ol.CacheDir() } -// Stop tears the supervisor down within the ctx deadline. 
Safe to call on a -// disabled or partially-initialised Service. +// Stop tears the current provider down within ctx. Safe on a disabled +// or never-started Service. func (s *Service) Stop(ctx context.Context) error { - if s == nil || s.disabled || s.sup == nil { + if s == nil || s.disabled { return nil } - return s.sup.Stop(ctx) + s.mu.RLock() + cur := s.current + s.mu.RUnlock() + if cur == nil { + return nil + } + return cur.Stop(ctx) } -// Status returns a snapshot of the sidecar process state for the dashboard. -// Returns SupervisorStatus{State: "disabled"} when the service was started -// with embeddings turned off — the dashboard renders a banner in that case -// and disables the runtime-config save buttons. -func (s *Service) Status() SupervisorStatus { +// Status returns a snapshot for the dashboard. State="disabled" when +// embeddings were turned off at boot. +func (s *Service) Status() provider.Status { if s == nil || s.disabled { - return SupervisorStatus{State: "disabled"} + return provider.Status{State: provider.StateDisabled} } - if s.sup == nil { - return SupervisorStatus{State: "failed", LastError: "supervisor not initialised"} + s.mu.RLock() + cur := s.current + s.mu.RUnlock() + if cur == nil { + return provider.Status{State: provider.StateFailed, LastError: "provider not initialised"} } - st := s.sup.Status() - if s.queue != nil { - // Annotate with in-flight count so the UI can show "draining (N)" - // during a restart cycle. - st.InFlight = s.queue.InFlight() + st := cur.Status() + if q := s.currentQueue(); q != nil { + st.InFlight = q.InFlight() } return st } -// Restart drains the embedding queue, stops the current sidecar child, and -// spawns a new one with the new config. cfg is the freshly-resolved -// runtimecfg-on-top-of-env Config snapshot — Restart does not consult any -// stored boot config. +// CurrentKind reports the kind of the active provider, or "" when +// disabled / not yet built. Used by /status and admin endpoints. 
+func (s *Service) CurrentKind() string { + if s == nil || s.disabled { + return "" + } + s.mu.RLock() + cur := s.current + s.mu.RUnlock() + if cur == nil { + return "" + } + return cur.Kind() +} + +// EmbeddingModel returns the active provider's fingerprint ID(). Used +// by repojobs to detect drift against projects.indexed_with_model. +func (s *Service) EmbeddingModel() string { + if s == nil || s.disabled { + return "" + } + s.mu.RLock() + cur := s.current + s.mu.RUnlock() + if cur == nil { + return "" + } + return cur.ID() +} + +// Restart applies runtime-config changes to the live Service. // -// On success, the new sidecar is ready to serve embeddings before this -// returns. On failure, the supervisor enters the "failed" state and the -// queue is reopened (so callers get the existing ErrSupervisor / ErrBusy -// rather than a permanent block). +// Provider-aware: +// - When the active provider is ollama, this is the legacy +// "respawn the sidecar with new flags" path: drain queue, build +// a new ollama provider with cfg, stop the old child, start the +// new one. +// - When the active provider is HTTP-only (openai / voyage), there +// is no sidecar to respawn. The only runtime-config field that +// still applies is max_embedding_concurrency (the Service-level +// queue depth). We rebuild the queue if it changed and leave the +// provider untouched. +// +// SwitchProvider is the right call for swapping the active provider +// itself; Restart only touches knobs that the runtime_config row +// owns. func (s *Service) Restart(ctx context.Context, cfg *config.Config) error { if s == nil || s.disabled { return ErrDisabled } - if s.sup == nil { - return ErrSupervisor - } - - // Drain: refuse new acquires, then wait for in-flight to settle. 30s - // matches the documented restart UX in the dashboard plan; longer values - // would let a stuck embedding call block the operator's intentional - // restart indefinitely. 
- s.queue.BlockNew() - defer s.queue.Resume() + // Serialize against SwitchProvider so the two lifecycle ops never + // interleave their s.current / s.queue mutations. + s.lifecycleMu.Lock() + defer s.lifecycleMu.Unlock() + + // Snapshot the live queue, block + drain it. The queue stays blocked + // through the respawn below and is resumed via the deferred + // activeQ.Resume() once s.current is valid again (activeQ is oldQ, or + // the new queue when the concurrency cap changed). + oldQ := s.currentQueue() + oldQ.BlockNew() drainCtx, drainCancel := context.WithTimeout(ctx, 30*time.Second) - if err := s.queue.WaitDrain(drainCtx); err != nil { + if err := oldQ.WaitDrain(drainCtx); err != nil { drainCancel() s.logger.Warn("embeddings: drain timed out, proceeding with restart anyway", - "in_flight", s.queue.InFlight(), "err", err, + "in_flight", oldQ.InFlight(), "err", err, ) } else { drainCancel() } - // Resolve the (possibly new) GGUF path before tearing down the current - // child — if resolution fails, we stay on the running sidecar instead of - // crashing it for a config we can't honour. - ggufPath, err := resolveGGUFPath(ctx, cfg, s.logger) + // activeQ is the queue that stays live through the respawn below and + // must be resumed on the way out. When the concurrency cap changes we + // install a NEW queue — it must also start blocked, otherwise embed + // callers acquire a slot on it during the sidecar respawn (when + // s.current is briefly nil) and get ErrSupervisor instead of a clean + // drain-style 503. Resuming oldQ would be a no-op (it's discarded). 
+ activeQ := oldQ + if cfg.MaxEmbeddingConcurrency != cap(oldQ.slots) { + newQ := NewQueue(cfg.MaxEmbeddingConcurrency, time.Duration(cfg.EmbeddingQueueTimeout)*time.Second) + newQ.BlockNew() + s.mu.Lock() + s.queue = newQ + s.mu.Unlock() + activeQ = newQ + } + defer activeQ.Resume() + + // Snapshot the live provider's kind under the read lock — we don't + // want to swap an ollama for a voyage just because the runtime- + // config form was submitted. + s.mu.RLock() + curKind := "" + if s.current != nil { + curKind = s.current.Kind() + } + s.mu.RUnlock() + + if curKind != provider.KindOllama { + // HTTP-only provider: queue (re)built above is all there is to + // do. The cfg blob is persisted by the caller; we just stash + // the new *config.Config snapshot so subsequent /status / cfg + // reads return the live values. + s.mu.Lock() + s.cfg = cfg + s.mu.Unlock() + s.logger.Info("embeddings: restart applied to remote provider (queue only)", + "kind", curKind, "concurrency", cfg.MaxEmbeddingConcurrency, + ) + return nil + } + + // Ollama path: rebuild + respawn the sidecar with the supplied + // llama tuning fields. + newProv, err := buildOllamaFromConfig(cfg, s.logger) if err != nil { - return fmt.Errorf("resolve gguf for restart: %w", err) - } - - // Update queue concurrency / prefix to match the new model. The buffered - // slot channel can't be resized in place; we swap the queue, but only - // AFTER drain so no caller is mid-Acquire/Release on the old channel. - if cfg.MaxEmbeddingConcurrency != cap(s.queue.slots) { - s.queue = NewQueue(cfg.MaxEmbeddingConcurrency, time.Duration(cfg.EmbeddingQueueTimeout)*time.Second) - // New queue starts unblocked; that's fine because we hold the - // *previous* queue's blocked state via deferred Resume. The previous - // queue is now garbage and won't see any callers. 
- } - s.prefix = ResolveQueryPrefix(cfg.EmbeddingModel) - - supCfg := supervisorConfig{ - BinDir: cfg.LlamaBinDir, - GGUFPath: ggufPath, - SocketPath: cfg.LlamaSocketPath, - Transport: cfg.LlamaTransport, - CtxSize: cfg.LlamaCtxSize, - NGpuLayers: cfg.LlamaNGpuLayers, - NThreads: cfg.LlamaNThreads, - BatchSize: cfg.LlamaBatchSize, - StartupSec: cfg.LlamaStartupSec, - Model: cfg.EmbeddingModel, - } - return s.sup.Restart(ctx, supCfg) + return fmt.Errorf("rebuild ollama provider: %w", err) + } + s.mu.Lock() + old := s.current + s.current = nil + s.mu.Unlock() + if old != nil { + stopCtx, stopCancel := context.WithTimeout(ctx, 30*time.Second) + _ = old.Stop(stopCtx) + stopCancel() + } + if err := newProv.Start(ctx); err != nil { + s.logger.Error("embeddings: restart Start failed; provider remains down", "err", err) + return err + } + s.mu.Lock() + s.current = newProv + s.cfg = cfg + s.mu.Unlock() + return nil } -// Ready reports whether the embeddings pipeline is currently able to serve a -// request. Returns nil when the model is loaded and the supervisor is healthy, -// ErrDisabled when embeddings are turned off, or ErrSupervisor/ErrNotReady -// when the sidecar has died or is still warming up. m5 — /api/v1/status uses -// this to populate model_loaded rather than hard-coding `true`. +// Ready reports whether the embeddings pipeline can serve a request. func (s *Service) Ready(ctx context.Context) error { if s == nil || s.disabled { return ErrDisabled } - if s.sup == nil { + s.mu.RLock() + cur := s.current + s.mu.RUnlock() + if cur == nil { return ErrSupervisor } - if s.sup.dead.Load() { + err := cur.Ready(ctx) + if errors.Is(err, provider.ErrUnrecoverable) { return ErrSupervisor } - return s.sup.Ready(ctx) + if errors.Is(err, provider.ErrNotReady) { + return ErrNotReady + } + return err } -// EmbedQuery prepends the model's asymmetric-retrieval prefix and returns a -// single vector. Mirrors Python `embed_query`. 
+// EmbedQuery delegates to the active provider after acquiring a queue +// slot. The provider applies its own query-side transform (ollama +// prefix, voyage input_type=query, openai pass-through). func (s *Service) EmbedQuery(ctx context.Context, query string) ([]float32, error) { - if s.disabled { + if s == nil || s.disabled { return nil, ErrDisabled } - text := s.prefix + query - vecs, err := s.embedBatch(ctx, []string{text}) + cur, q, err := s.acquireProvider(ctx) if err != nil { return nil, err } - return vecs[0], nil + slotStart := time.Now() + defer q.Release(slotStart) + return cur.EmbedQuery(ctx, query) } -// EmbedTexts embeds passages unchanged (no prefix). Mirrors Python -// `embed_texts`. Returned vectors follow input order. +// EmbedTexts embeds passages unchanged. func (s *Service) EmbedTexts(ctx context.Context, texts []string) ([][]float32, error) { - if s.disabled { + if s == nil || s.disabled { return nil, ErrDisabled } - return s.embedBatch(ctx, texts) -} - -// embedBatch is the shared path used by both EmbedQuery and EmbedTexts. It -// acquires a queue slot, waits for the supervisor to be ready, and issues the -// HTTP call. Prefix logic stays in the callers so the queue accounting is -// identical regardless of whether the caller was a query or a passage batch. -func (s *Service) embedBatch(ctx context.Context, texts []string) ([][]float32, error) { - if s.sup.dead.Load() { - return nil, ErrSupervisor - } if len(texts) == 0 { return nil, nil } - - // Block on queue slot first — this is the backpressure surface that maps - // to HTTP 503 + Retry-After. - slotStart := time.Now() - if err := s.queue.Acquire(ctx); err != nil { - return nil, err - } - defer s.queue.Release(slotStart) - - // Make sure the child process finished its (re)start before issuing the - // call. For a healthy steady-state Service this is a no-op. 
- readyCtx, cancel := context.WithTimeout(ctx, 5*time.Second) - err := s.sup.Ready(readyCtx) - cancel() + cur, q, err := s.acquireProvider(ctx) if err != nil { - if errors.Is(err, ErrSupervisor) { - return nil, ErrSupervisor - } - return nil, fmt.Errorf("wait ready: %w", err) + return nil, err } - - return s.sup.client.Embeddings(ctx, texts) + slotStart := time.Now() + defer q.Release(slotStart) + return cur.EmbedDocuments(ctx, texts) } -// TokenizeAndEmbed is the token-aware embedding pipeline. For each text it: -// 1. Calls /tokenize to get token IDs (CLS + content + SEP). -// 2. Splits sequences longer than cfg.LlamaCtxSize at token boundaries, -// preserving CLS/SEP on each window. -// 3. Embeds all sequences in a single /v1/embeddings call using pre-tokenized -// IDs — no re-tokenization happens inside the model server. -// 4. Averages sub-window vectors back to one vector per original text. -// -// The entire operation holds one queue slot so back-pressure accounting matches -// EmbedTexts. Returns ErrDisabled / ErrSupervisor / ErrBusy on the same -// conditions as EmbedTexts. +// TokenizeAndEmbed runs the token-aware embedding pipeline. For +// providers that don't support native tokenization +// (SupportsTokenize() == false) this is identical to EmbedTexts — +// callers must chunk inputs themselves before reaching here. 
func (s *Service) TokenizeAndEmbed(ctx context.Context, texts []string) ([][]float32, error) { - if s.disabled { + if s == nil || s.disabled { return nil, ErrDisabled } - if s.sup.dead.Load() { - return nil, ErrSupervisor - } if len(texts) == 0 { return nil, nil } - - slotStart := time.Now() - if err := s.queue.Acquire(ctx); err != nil { - return nil, err - } - defer s.queue.Release(slotStart) - - readyCtx, cancel := context.WithTimeout(ctx, 5*time.Second) - err := s.sup.Ready(readyCtx) - cancel() - if err != nil { - if errors.Is(err, ErrSupervisor) { - return nil, ErrSupervisor - } - return nil, fmt.Errorf("wait ready: %w", err) - } - - maxTokens := s.cfg.LlamaCtxSize - - // Phase 1: tokenize each text. Accumulate flat sequences slice and a - // span table that records which flat sequences belong to each text. - type span struct{ start, length int } - spans := make([]span, len(texts)) - var sequences [][]int - - for i, text := range texts { - ids, err := s.sup.client.Tokenize(ctx, text) - if err != nil { - return nil, fmt.Errorf("tokenize text[%d]: %w", i, err) - } - - if len(ids) == 0 { - // Empty result: placeholder — embed will return a zero vector. - spans[i] = span{start: len(sequences), length: 1} - sequences = append(sequences, []int{}) - continue - } - - if len(ids) <= maxTokens { - spans[i] = span{start: len(sequences), length: 1} - sequences = append(sequences, ids) - continue - } - - // Sequence exceeds context window — split at token boundaries. - // ids[0] is CLS, ids[len-1] is SEP (add_special=true). - cls := ids[0] - sep := ids[len(ids)-1] - content := ids[1 : len(ids)-1] - windowSize := maxTokens - 2 // reserve 2 slots for CLS + SEP - - spanStart := len(sequences) - for start := 0; start < len(content); start += windowSize { - end := start + windowSize - if end > len(content) { - end = len(content) - } - window := make([]int, 0, end-start+2) - window = append(window, cls) - window = append(window, content[start:end]...) 
- window = append(window, sep) - sequences = append(sequences, window) - } - spans[i] = span{start: spanStart, length: len(sequences) - spanStart} - } - - // Phase 2: single batch embed call with all pre-tokenized sequences. - allVecs, err := s.sup.client.EmbedBatchTokenIDs(ctx, sequences) + cur, q, err := s.acquireProvider(ctx) if err != nil { return nil, err } - - // Phase 3: re-assemble — average sub-window vectors for split texts. - result := make([][]float32, len(texts)) - for i, sp := range spans { - if sp.length == 1 { - result[i] = allVecs[sp.start] - continue - } - // Average sp.length vectors element-wise. - dim := len(allVecs[sp.start]) - avg := make([]float32, dim) - for k := 0; k < sp.length; k++ { - v := allVecs[sp.start+k] - for d := range avg { - avg[d] += v[d] - } - } - n := float32(sp.length) - for d := range avg { - avg[d] /= n - } - result[i] = avg - } - return result, nil -} - -// embedRaw skips the queue *and* the prefix logic. It exists as a test helper -// for the parity gate: the reference file stores the exact text that was fed -// to the model, so the gate must not re-apply the prefix. This method is -// deliberately lowercase (package-private) — production handlers must go -// through EmbedQuery / EmbedTexts. -func (s *Service) embedRaw(ctx context.Context, texts []string) ([][]float32, error) { - if s.disabled { - return nil, ErrDisabled - } - if s.sup.dead.Load() { - return nil, ErrSupervisor - } - if len(texts) == 0 { - return nil, nil - } - return s.sup.client.Embeddings(ctx, texts) -} - -// resolveGGUFPath walks the precedence chain: -// 1. CIX_GGUF_PATH (absolute path env override, validated by Stat). -// 2. cfg.EmbeddingModel as absolute path — when the dashboard's "Local -// path" mode wrote it through to the runtime_settings row. -// 3. Cached file under cfg.GGUFCacheDir//*.gguf when -// cfg.EmbeddingModel is an HF repo ID. -// 4. 
CIX_BOOTSTRAP_GGUF_PATH one-shot import — copies the file into -// the cache layout, then behaves like step 3 forever after. -// 5. HuggingFace download into the same cix cache (this is the path -// that actually writes to disk). -// -// PR-E removed the implicit `bench/results/reference_gguf_path.txt` dev -// fallback that used to short-circuit step 2 — operators must now make -// the choice explicitly via env or the dashboard. Only step 5 is -// expensive; all others are stat-only or one-time copies. -func resolveGGUFPath(ctx context.Context, cfg *config.Config, logger *slog.Logger) (string, error) { - if cfg.GGUFPath != "" { - if _, err := os.Stat(cfg.GGUFPath); err != nil { - return "", fmt.Errorf("CIX_GGUF_PATH=%s: %w", cfg.GGUFPath, err) - } - return cfg.GGUFPath, nil - } - // PR-E — the dashboard's "Local path" mode writes an absolute path into - // embedding_model. Treat it as such instead of trying to interpret it - // as an HF repo id (which would fail the slash check or, worse, send - // the path to api.huggingface.co). - if filepath.IsAbs(cfg.EmbeddingModel) { - if _, err := os.Stat(cfg.EmbeddingModel); err != nil { - return "", fmt.Errorf("embedding model path %s: %w", cfg.EmbeddingModel, err) - } - return cfg.EmbeddingModel, nil - } - // HF repo ids look like "/" — exactly one slash, no leading "/". - if !strings.Contains(cfg.EmbeddingModel, "/") { - return "", fmt.Errorf("embedding model %q is neither an absolute path nor an HF repo id (owner/repo)", cfg.EmbeddingModel) - } - - // Cache-hit short-circuit: if we already downloaded a .gguf from this repo - // into the cache, use it — HF downloader would do the same stat first, - // but doing it here keeps the service silent in the happy path. - if cached := findCachedGGUF(cfg.GGUFCacheDir, cfg.EmbeddingModel); cached != "" { - logger.Info("using cached gguf", "path", cached) - return cached, nil - } - - // CIX_BOOTSTRAP_GGUF_PATH — one-time import path. 
Used so a fresh - // container with a freshly-mounted cache volume doesn't have to - // re-download a 280 MB GGUF the operator already has on disk. Once - // the file lands in the cache layout, the next boot satisfies the - // findCachedGGUF branch above and the bootstrap path is never read - // again (idempotent — repeated boots with the same env are no-ops). - if cfg.BootstrapGGUFPath != "" { - imported, err := importBootstrapGGUF(cfg.GGUFCacheDir, cfg.EmbeddingModel, cfg.BootstrapGGUFPath, logger) - if err != nil { - logger.Warn("bootstrap gguf import failed; falling through to HF download", - "src", cfg.BootstrapGGUFPath, "err", err) - } else if imported != "" { - return imported, nil - } + slotStart := time.Now() + defer q.Release(slotStart) + if cur.SupportsTokenize() { + return cur.TokenizeAndEmbed(ctx, texts) } - - return DownloadGGUF(ctx, cfg.EmbeddingModel, cfg.GGUFCacheDir, logger) + return cur.EmbedDocuments(ctx, texts) } -// importBootstrapGGUF copies srcPath into // -// atomically (write to .partial, fsync, rename). Returns the final path -// on success, "" if the source is missing (caller falls through to HF -// download), or an error for IO problems we should surface to the operator. -// -// safe_repo derived from the HF repo id (`owner/repo` → `owner__repo`) -// to match DownloadGGUF's layout exactly — so subsequent boots' cache -// scan finds the imported file under the same name HF would have used. -func importBootstrapGGUF(cacheDir, repo, srcPath string, logger *slog.Logger) (string, error) { - if cacheDir == "" || repo == "" { - return "", nil - } - srcInfo, err := os.Stat(srcPath) - if err != nil { - // Missing file is not a hard error — the operator may have set - // the env optimistically with a path that lives on a host they - // haven't mounted yet. Let the caller fall through to download. 
- if os.IsNotExist(err) { - return "", nil - } - return "", fmt.Errorf("stat bootstrap gguf %s: %w", srcPath, err) - } - if srcInfo.IsDir() { - return "", fmt.Errorf("bootstrap gguf %s is a directory, expected file", srcPath) - } - - safeRepo := strings.ReplaceAll(repo, "/", "__") - targetDir := filepath.Join(cacheDir, safeRepo) - if err := os.MkdirAll(targetDir, 0o755); err != nil { - return "", fmt.Errorf("mkdir cache dir: %w", err) - } - finalPath := filepath.Join(targetDir, filepath.Base(srcPath)) - - // Idempotency: if a previous boot already imported the same file, - // trust it — re-importing would be wasted IO and could race with a - // concurrent boot of a sibling container against a shared volume. - if _, err := os.Stat(finalPath); err == nil { - return finalPath, nil - } - - logger.Info("importing bootstrap gguf into cache", - "src", srcPath, "dst", finalPath, "size", srcInfo.Size()) - - src, err := os.Open(srcPath) - if err != nil { - return "", fmt.Errorf("open bootstrap gguf: %w", err) - } - defer src.Close() - - partial := finalPath + ".partial" - dst, err := os.OpenFile(partial, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, 0o644) - if err != nil { - return "", fmt.Errorf("create cache target: %w", err) - } - - if _, err := io.Copy(dst, src); err != nil { - _ = dst.Close() - _ = os.Remove(partial) - return "", fmt.Errorf("copy bootstrap gguf: %w", err) - } - if err := dst.Sync(); err != nil { - _ = dst.Close() - _ = os.Remove(partial) - return "", fmt.Errorf("fsync bootstrap gguf: %w", err) - } - if err := dst.Close(); err != nil { - _ = os.Remove(partial) - return "", fmt.Errorf("close bootstrap gguf: %w", err) - } - if err := os.Rename(partial, finalPath); err != nil { - _ = os.Remove(partial) - return "", fmt.Errorf("atomic rename bootstrap gguf: %w", err) - } - logger.Info("bootstrap gguf imported", "path", finalPath) - return finalPath, nil +// currentQueue returns the active queue under the read lock. 
Restart +// swaps the queue pointer when the concurrency cap changes, so callers +// must snapshot it rather than reading s.queue directly (otherwise the +// read races the swap). +func (s *Service) currentQueue() *Queue { + s.mu.RLock() + q := s.queue + s.mu.RUnlock() + return q } -// findCachedGGUF looks for a previously-downloaded .gguf under the standard -// cache layout produced by DownloadGGUF. Returns "" on any miss (including -// IO errors) so the caller proceeds to the download path. -func findCachedGGUF(cacheDir, repo string) string { - safeRepo := strings.ReplaceAll(repo, "/", "__") - dir := cacheDir + string(os.PathSeparator) + safeRepo - entries, err := os.ReadDir(dir) - if err != nil { - return "" - } - for _, e := range entries { - if e.IsDir() { - continue - } - name := e.Name() - if len(name) > 5 && strings.EqualFold(name[len(name)-5:], ".gguf") { - return dir + string(os.PathSeparator) + name - } - } - return "" +// acquireProvider acquires a queue slot and returns the active provider +// snapshot AND the queue the slot was taken from. The caller must +// Release on the RETURNED queue (not s.queue, which Restart may have +// swapped meanwhile) — deferred at the call site so the slot is released +// even on provider error. +func (s *Service) acquireProvider(ctx context.Context) (provider.Provider, *Queue, error) { + q := s.currentQueue() + if err := q.Acquire(ctx); err != nil { + return nil, nil, err + } + s.mu.RLock() + cur := s.current + s.mu.RUnlock() + if cur == nil { + // We hold the slot but have nothing to call — release it before + // returning the error so subsequent callers aren't starved. 
+ q.Release(time.Now()) + return nil, nil, ErrSupervisor + } + return cur, q, nil } diff --git a/server/internal/embeddings/switch_provider_test.go b/server/internal/embeddings/switch_provider_test.go new file mode 100644 index 0000000..969cd01 --- /dev/null +++ b/server/internal/embeddings/switch_provider_test.go @@ -0,0 +1,317 @@ +package embeddings + +import ( + "context" + "errors" + "io" + "log/slog" + "os" + "path/filepath" + "strings" + "sync" + "testing" + "time" + + "github.com/dvcdsys/code-index/server/internal/config" + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" + "github.com/dvcdsys/code-index/server/internal/vectorstore" +) + +// fakeProv is a minimal provider.Provider for exercising the vector-store +// reopen path. StorageComponents derives the nested namespace by splitting +// the id on ":" and slugging each field — enough to give distinct +// providers distinct on-disk paths. +type fakeProv struct{ id string } + +func (f fakeProv) Kind() string { return "fake" } +func (f fakeProv) ID() string { return f.id } +func (f fakeProv) Dimension() int { return 0 } +func (f fakeProv) SupportsTokenize() bool { return false } +func (f fakeProv) Start(context.Context) error { return nil } +func (f fakeProv) Stop(context.Context) error { return nil } +func (f fakeProv) Ready(context.Context) error { return nil } +func (f fakeProv) Status() provider.Status { return provider.Status{} } +func (f fakeProv) EmbedQuery(context.Context, string) ([]float32, error) { return nil, nil } +func (f fakeProv) EmbedDocuments(context.Context, []string) ([][]float32, error) { + return nil, nil +} +func (f fakeProv) TokenizeAndEmbed(context.Context, []string) ([][]float32, error) { + return nil, nil +} +func (f fakeProv) StorageComponents() []string { + parts := strings.Split(f.id, ":") + for i := range parts { + parts[i] = provider.StorageSlug(parts[i]) + } + return parts +} + +func quiet() *slog.Logger { return slog.New(slog.NewTextHandler(io.Discard, 
nil)) } + +// nestedDirFor mirrors cfg.ChromaDirFor: join the identity components under +// a chroma container dir. +func nestedDirFor(base string) func([]string) string { + return func(comps []string) string { + return filepath.Join(append([]string{base}, comps...)...) + } +} + +// fakeFactory registers a test-only provider kind so SwitchProvider can be +// driven end-to-end (it builds the new provider through the registry). +// Build echoes the config bytes into the provider ID so the derived +// storage path is deterministic per test. +type fakeFactory struct{} + +func (fakeFactory) Kind() string { return "fake-switch" } +func (fakeFactory) SchemaJSON() []byte { return []byte("{}") } +func (fakeFactory) SecretEnvVars() []string { return nil } +func (fakeFactory) Build(cfg []byte, _ provider.SecretLookup, _ *slog.Logger) (provider.Provider, error) { + return fakeProv{id: "fake:" + string(cfg)}, nil +} + +func init() { provider.Register(fakeFactory{}) } + +// TestSwitchProvider_RollbackOnReopenFailure guards the switch-atomicity +// fix: when the vector-store reopen fails, the provider swap is rolled +// back to the old provider (never a new-provider / old-store pairing that +// would write wrong-dimension vectors into the old collection), and the +// queue is resumed so the service keeps serving. 
+func TestSwitchProvider_RollbackOnReopenFailure(t *testing.T) { + dir := t.TempDir() + base := filepath.Join(dir, "chroma") + const project = "/proj" + initial, err := vectorstore.Open(filepath.Join(base, "ollama", "m")) + if err != nil { + t.Fatal(err) + } + if err := initial.UpsertChunks(context.Background(), project, + []vectorstore.Chunk{{Content: "x", FilePath: "a.go", StartLine: 1, EndLine: 1, Language: "go"}}, + [][]float32{{1, 0}}); err != nil { + t.Fatal(err) + } + holder := vectorstore.NewHolder(initial) + + oldProv := fakeProv{id: "ollama:m"} + s := &Service{logger: quiet(), queue: NewQueue(2, time.Second), current: oldProv} + s.AttachVectorStore( + holder, + nestedDirFor(base), + func(string) (*vectorstore.Store, error) { return nil, errors.New("boom") }, // reopen always fails + nil, + ) + + if err := s.SwitchProvider(context.Background(), "fake-switch", []byte("newid")); err == nil { + t.Fatal("expected switch to fail on reopen error") + } + + // Rolled back to the OLD provider — not the half-applied new one. + s.mu.RLock() + cur := s.current + s.mu.RUnlock() + if cur.ID() != oldProv.ID() { + t.Errorf("after rollback current.ID() = %q, want %q", cur.ID(), oldProv.ID()) + } + // Holder still serves the old store (reopen never Swapped). + if got := holder.Count(project); got != 1 { + t.Errorf("holder count = %d, want 1 (old store retained)", got) + } + // Queue must be resumed, not left blocked, so the service keeps serving. + if err := s.queue.Acquire(context.Background()); err != nil { + t.Errorf("queue should be resumed after rollback, Acquire = %v", err) + } else { + s.queue.Release(time.Now()) + } +} + +// TestSwitchProvider_SuccessSwapsProviderAndStore is the happy path: the +// live provider becomes the new one, the holder reopens into the new +// (empty) namespace, and the queue ends unblocked. 
+func TestSwitchProvider_SuccessSwapsProviderAndStore(t *testing.T) { + dir := t.TempDir() + base := filepath.Join(dir, "chroma") + const project = "/proj" + initial, err := vectorstore.Open(filepath.Join(base, "ollama", "m")) + if err != nil { + t.Fatal(err) + } + if err := initial.UpsertChunks(context.Background(), project, + []vectorstore.Chunk{{Content: "x", FilePath: "a.go", StartLine: 1, EndLine: 1, Language: "go"}}, + [][]float32{{1, 0}}); err != nil { + t.Fatal(err) + } + holder := vectorstore.NewHolder(initial) + + s := &Service{logger: quiet(), queue: NewQueue(2, time.Second), current: fakeProv{id: "ollama:m"}} + s.AttachVectorStore(holder, nestedDirFor(base), vectorstore.Open, nil) + + if err := s.SwitchProvider(context.Background(), "fake-switch", []byte("newid")); err != nil { + t.Fatalf("switch: %v", err) + } + s.mu.RLock() + cur := s.current + s.mu.RUnlock() + if cur.ID() != "fake:newid" { + t.Errorf("current.ID() = %q, want %q", cur.ID(), "fake:newid") + } + if got := holder.Count(project); got != 0 { + t.Errorf("after switch holder Count = %d, want 0 (new empty namespace)", got) + } + // New provider's nested namespace was created on disk. + if !dirExists(filepath.Join(base, "fake", "newid")) { + t.Errorf("new nested chroma dir chroma/fake/newid should exist") + } + if err := s.queue.Acquire(context.Background()); err != nil { + t.Errorf("queue should be unblocked after switch, Acquire = %v", err) + } else { + s.queue.Release(time.Now()) + } +} + +func TestServiceStoragePath(t *testing.T) { + s := &Service{logger: quiet(), current: fakeProv{id: "voyage:voyage-code-3:2048:float"}} + if got := strings.Join(s.StoragePath(), "/"); got != "voyage/voyage_code_3/2048/float" { + t.Errorf("StoragePath = %q", got) + } + // Disabled / no provider → empty. 
+ if got := (&Service{logger: quiet(), disabled: true}).StoragePath(); len(got) != 0 { + t.Errorf("disabled StoragePath = %v, want empty", got) + } + if got := (&Service{logger: quiet()}).StoragePath(); len(got) != 0 { + t.Errorf("nil-provider StoragePath = %v, want empty", got) + } +} + +func TestReopenVectorStore_SwapsToNewNamespace(t *testing.T) { + dir := t.TempDir() + base := filepath.Join(dir, "chroma") + const project = "/proj" + + // Initial store has one chunk for the project, under ollama's namespace. + oldDir := filepath.Join(base, "ollama", "m") + initial, err := vectorstore.Open(oldDir) + if err != nil { + t.Fatal(err) + } + if err := initial.UpsertChunks(context.Background(), project, + []vectorstore.Chunk{{Content: "x", FilePath: "a.go", StartLine: 1, EndLine: 1, Language: "go"}}, + [][]float32{{1, 0}}); err != nil { + t.Fatal(err) + } + holder := vectorstore.NewHolder(initial) + if holder.Count(project) != 1 { + t.Fatalf("precondition: initial holder count != 1") + } + + s := &Service{logger: quiet()} + s.AttachVectorStore(holder, nestedDirFor(base), vectorstore.Open, nil) + + // Switch to a new identity → reopen into a fresh, empty namespace. + if err := s.reopenVectorStore(fakeProv{id: "voyage:voyage-code-3:2048:float"}); err != nil { + t.Fatalf("reopen: %v", err) + } + if got := holder.Count(project); got != 0 { + t.Errorf("after reopen Count = %d, want 0 (new empty namespace)", got) + } + // New nested dir created on disk; old dir still present (reuse on switch back). 
+ if !dirExists(filepath.Join(base, "voyage", "voyage_code_3", "2048", "float")) { + t.Errorf("new nested chroma dir should exist") + } + if !dirExists(oldDir) { + t.Errorf("old chroma dir should be preserved") + } +} + +func TestReopenVectorStore_OpenerFailureKeepsOldStore(t *testing.T) { + dir := t.TempDir() + base := filepath.Join(dir, "chroma") + const project = "/proj" + initial, err := vectorstore.Open(filepath.Join(base, "ollama", "m")) + if err != nil { + t.Fatal(err) + } + if err := initial.UpsertChunks(context.Background(), project, + []vectorstore.Chunk{{Content: "x", FilePath: "a.go", StartLine: 1, EndLine: 1, Language: "go"}}, + [][]float32{{1, 0}}); err != nil { + t.Fatal(err) + } + holder := vectorstore.NewHolder(initial) + + s := &Service{logger: quiet()} + s.AttachVectorStore( + holder, + nestedDirFor(base), + func(string) (*vectorstore.Store, error) { return nil, errors.New("boom") }, + nil, + ) + + err = s.reopenVectorStore(fakeProv{id: "voyage:m:2048:float"}) + if err == nil { + t.Fatal("expected reopen error") + } + // Holder must still serve the OLD store (no Swap on failure). + if got := holder.Count(project); got != 1 { + t.Errorf("after failed reopen Count = %d, want 1 (old store retained)", got) + } +} + +func TestReopenVectorStore_NoopWhenUnwired(t *testing.T) { + // A Service without AttachVectorStore must not panic / error. + s := &Service{logger: quiet()} + if err := s.reopenVectorStore(fakeProv{id: "voyage:m:2048:float"}); err != nil { + t.Errorf("unwired reopen should be a no-op, got %v", err) + } +} + +func dirExists(path string) bool { + st, err := os.Stat(path) + return err == nil && st.IsDir() +} + +// TestRestart_ConcurrentWithEmbeds_NoRace guards H1: Restart swaps the +// s.queue pointer when the concurrency cap changes, while Embed* callers +// read it to acquire/release slots. Run under -race with many embedders +// hammering the queue while a restarter repeatedly swaps it. 
A remote +// (non-ollama) fake provider keeps Restart on the queue-only path with no +// sidecar to manage. +func TestRestart_ConcurrentWithEmbeds_NoRace(t *testing.T) { + s := &Service{ + logger: quiet(), + queue: NewQueue(2, time.Second), + current: fakeProv{id: "fake:m"}, + } + + var wg sync.WaitGroup + stop := make(chan struct{}) + + for i := 0; i < 6; i++ { + wg.Add(1) + go func() { + defer wg.Done() + for { + select { + case <-stop: + return + default: + _, _ = s.EmbedTexts(context.Background(), []string{"x"}) + } + } + }() + } + + wg.Add(1) + go func() { + defer wg.Done() + for i := 0; i < 300; i++ { + // Alternate the cap so every other Restart actually swaps the + // queue pointer (the racy write H1 fixes). + n := 2 + (i % 3) // 2, 3, 4 + _ = s.Restart(context.Background(), &config.Config{ + MaxEmbeddingConcurrency: n, + EmbeddingQueueTimeout: 1, + }) + } + close(stop) + }() + + wg.Wait() +} diff --git a/server/internal/embeddingscfg/embeddingscfg.go b/server/internal/embeddingscfg/embeddingscfg.go new file mode 100644 index 0000000..0e869ef --- /dev/null +++ b/server/internal/embeddingscfg/embeddingscfg.go @@ -0,0 +1,118 @@ +// Package embeddingscfg persists the pluggable-provider selection + +// per-provider config blob in runtime_settings. It sits alongside +// runtimecfg (which still owns the ollama-tuning subset for backward +// compat) but exposes a separate Service so the admin endpoints have +// a single clean surface for provider switching. +// +// Resolution flow at boot: +// +// row.embedding_provider IS NULL → seed from CIX_EMBEDDING_PROVIDER +// (default "ollama") + env-derived +// ollama config blob, write to DB. +// row.embedding_provider IS NOT NULL → load from DB verbatim. Env +// vars (other than API-key envs +// read live by providers) are +// ignored — DB is authoritative. +package embeddingscfg + +import ( + "context" + "database/sql" + "encoding/json" + "errors" + "fmt" + "time" +) + +// Snapshot is what's currently persisted. 
Returned by Get; supplied +// to Service.Save for atomic updates. +type Snapshot struct { + Kind string // "ollama" | "openai" | "voyage" + Config json.RawMessage // provider-specific blob; opaque to this package +} + +// Service is the persistence-only layer. Stateless aside from the +// *sql.DB it wraps. +type Service struct { + db *sql.DB +} + +// New constructs a Service over the given *sql.DB. +func New(db *sql.DB) *Service { return &Service{db: db} } + +// Get returns the persisted provider selection. (Snapshot{}, false, nil) +// means the runtime_settings row has no provider yet — the caller +// should seed it from env. +func (s *Service) Get(ctx context.Context) (Snapshot, bool, error) { + var ( + kind sql.NullString + cfg sql.NullString + ) + err := s.db.QueryRowContext(ctx, ` + SELECT embedding_provider, embedding_provider_config + FROM runtime_settings WHERE id = 1 + `).Scan(&kind, &cfg) + if errors.Is(err, sql.ErrNoRows) { + return Snapshot{}, false, nil + } + if err != nil { + return Snapshot{}, false, fmt.Errorf("select embedding_provider: %w", err) + } + if !kind.Valid || kind.String == "" { + return Snapshot{}, false, nil + } + snap := Snapshot{Kind: kind.String} + if cfg.Valid && cfg.String != "" { + snap.Config = json.RawMessage(cfg.String) + } + return snap, true, nil +} + +// Save persists kind + config bytes. Inserts the row when absent +// (CHECK(id=1) keeps it single-row). updated_by labels the actor for +// audit; pass "" when the change is server-internal (bootstrap seed). +func (s *Service) Save(ctx context.Context, snap Snapshot, updatedBy string) error { + if snap.Kind == "" { + return errors.New("embeddingscfg: kind is required") + } + if len(snap.Config) > 0 { + if !json.Valid(snap.Config) { + return errors.New("embeddingscfg: config is not valid JSON") + } + } + now := time.Now().UTC().Format(time.RFC3339Nano) + + // Try UPDATE first; if no rows affected the row doesn't exist yet + // and we INSERT with the required CHECK(id=1). 
+ res, err := s.db.ExecContext(ctx, ` + UPDATE runtime_settings + SET embedding_provider = ?, embedding_provider_config = ?, + updated_at = ?, updated_by = ? + WHERE id = 1 + `, snap.Kind, string(snap.Config), now, nullableUpdater(updatedBy)) + if err != nil { + return fmt.Errorf("update embedding_provider: %w", err) + } + n, _ := res.RowsAffected() + if n > 0 { + return nil + } + + _, err = s.db.ExecContext(ctx, ` + INSERT INTO runtime_settings ( + id, embedding_provider, embedding_provider_config, + updated_at, updated_by + ) VALUES (1, ?, ?, ?, ?) + `, snap.Kind, string(snap.Config), now, nullableUpdater(updatedBy)) + if err != nil { + return fmt.Errorf("insert embedding_provider: %w", err) + } + return nil +} + +func nullableUpdater(v string) any { + if v == "" { + return nil + } + return v +} diff --git a/server/internal/embeddingscfg/embeddingscfg_test.go b/server/internal/embeddingscfg/embeddingscfg_test.go new file mode 100644 index 0000000..fcde989 --- /dev/null +++ b/server/internal/embeddingscfg/embeddingscfg_test.go @@ -0,0 +1,122 @@ +package embeddingscfg + +import ( + "context" + "encoding/json" + "testing" + + "github.com/dvcdsys/code-index/server/internal/db" +) + +// openTestService opens an in-memory DB (all migrations applied, including +// migration 12 which adds the embedding_provider columns) and wraps it. +func openTestService(t *testing.T) *Service { + t.Helper() + d, err := db.Open(":memory:") + if err != nil { + t.Fatalf("db.Open: %v", err) + } + t.Cleanup(func() { d.Close() }) + return New(d) +} + +// TestGet_FreshDB_NoRow covers the "no provider persisted yet" path: the +// runtime_settings row has no embedding_provider, so Get reports has=false +// and the caller is expected to seed from env. 
+func TestGet_FreshDB_NoRow(t *testing.T) { + s := openTestService(t) + snap, has, err := s.Get(context.Background()) + if err != nil { + t.Fatalf("Get: %v", err) + } + if has { + t.Fatalf("has = true on a fresh DB, want false (snap=%+v)", snap) + } +} + +// TestSave_Insert_ThenGet covers the INSERT branch (no row yet) and a +// round-trip read of kind + config. +func TestSave_Insert_ThenGet(t *testing.T) { + s := openTestService(t) + cfg := json.RawMessage(`{"model":"voyage-code-3","output_dimension":2048}`) + if err := s.Save(context.Background(), Snapshot{Kind: "voyage", Config: cfg}, "admin-1"); err != nil { + t.Fatalf("Save: %v", err) + } + snap, has, err := s.Get(context.Background()) + if err != nil { + t.Fatalf("Get: %v", err) + } + if !has { + t.Fatalf("has = false after Save, want true") + } + if snap.Kind != "voyage" { + t.Errorf("Kind = %q, want voyage", snap.Kind) + } + if string(snap.Config) != string(cfg) { + t.Errorf("Config = %q, want %q", snap.Config, cfg) + } +} + +// TestSave_Update_Overwrites covers the UPDATE branch (row already exists): +// a second Save must overwrite the persisted kind + config rather than +// inserting a duplicate or being ignored. 
+func TestSave_Update_Overwrites(t *testing.T) { + s := openTestService(t) + ctx := context.Background() + if err := s.Save(ctx, Snapshot{Kind: "ollama", Config: json.RawMessage(`{"model":"a"}`)}, ""); err != nil { + t.Fatalf("first Save: %v", err) + } + newCfg := json.RawMessage(`{"model":"text-embedding-3-large","dimensions":256}`) + if err := s.Save(ctx, Snapshot{Kind: "openai", Config: newCfg}, "admin-2"); err != nil { + t.Fatalf("second Save: %v", err) + } + snap, has, err := s.Get(ctx) + if err != nil { + t.Fatalf("Get: %v", err) + } + if !has || snap.Kind != "openai" { + t.Fatalf("after overwrite: has=%v kind=%q, want has=true kind=openai", has, snap.Kind) + } + if string(snap.Config) != string(newCfg) { + t.Errorf("Config = %q, want %q", snap.Config, newCfg) + } +} + +// TestSave_EmptyConfig covers a provider persisted with no config blob: +// json.Valid is skipped (len 0) and Get reports has=true with a nil Config. +func TestSave_EmptyConfig(t *testing.T) { + s := openTestService(t) + ctx := context.Background() + if err := s.Save(ctx, Snapshot{Kind: "ollama"}, ""); err != nil { + t.Fatalf("Save: %v", err) + } + snap, has, err := s.Get(ctx) + if err != nil { + t.Fatalf("Get: %v", err) + } + if !has || snap.Kind != "ollama" { + t.Fatalf("has=%v kind=%q, want has=true kind=ollama", has, snap.Kind) + } + if len(snap.Config) != 0 { + t.Errorf("Config = %q, want empty", snap.Config) + } +} + +// TestSave_RejectsEmptyKind: a Snapshot with no kind is a programming error +// and must be refused before touching the DB. +func TestSave_RejectsEmptyKind(t *testing.T) { + s := openTestService(t) + if err := s.Save(context.Background(), Snapshot{Config: json.RawMessage(`{}`)}, ""); err == nil { + t.Fatal("Save with empty kind succeeded, want error") + } +} + +// TestSave_RejectsInvalidJSON: a non-empty config that isn't valid JSON must +// be refused so a malformed blob never lands in the DB for boot to choke on. 
+func TestSave_RejectsInvalidJSON(t *testing.T) { + s := openTestService(t) + err := s.Save(context.Background(), Snapshot{Kind: "voyage", Config: json.RawMessage(`{not json`)}, "") + if err == nil { + t.Fatal("Save with invalid JSON config succeeded, want error") + } +} diff --git a/server/internal/httpapi/admin_embeddings.go b/server/internal/httpapi/admin_embeddings.go new file mode 100644 index 0000000..c4529e0 --- /dev/null +++ b/server/internal/httpapi/admin_embeddings.go @@ -0,0 +1,306 @@ +// admin_embeddings.go implements the pluggable-embedding-provider +// admin surface: +// +// GET /api/v1/admin/embedding-providers — registered kinds + schemas + env-key readiness +// GET /api/v1/admin/embedding-providers/active — currently active kind + persisted config +// PUT /api/v1/admin/embedding-providers/active — atomic switch (validate → persist → swap) +// POST /api/v1/admin/embedding-providers/{kind}/test — pre-save sanity check using submitted config +// +// All routes are admin-only (mustBeAdmin). The handlers below are +// mounted directly onto the chi router in router.go — they are not +// part of the OpenAPI-generated handler set so the openapi.yaml / +// regenerated openapi.gen.go can be updated independently. +package httpapi + +import ( + "encoding/json" + "errors" + "io" + "net/http" + "os" + + "github.com/dvcdsys/code-index/server/internal/embeddings" + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" + "github.com/dvcdsys/code-index/server/internal/embeddingscfg" + "github.com/go-chi/chi/v5" +) + +// providerInfoPayload is the per-kind entry in GET /embedding-providers. +type providerInfoPayload struct { + Kind string `json:"kind"` + Schema json.RawMessage `json:"schema"` + SecretEnvs []secretEnvPayload `json:"secret_envs"` +} + +// secretEnvPayload tells the dashboard which env-var names a provider +// reads and whether they are currently set on the server. Used to +// render the "set CIX_VOYAGE_API_KEY before saving" banner. 
+type secretEnvPayload struct { + Name string `json:"name"` + Set bool `json:"set"` +} + +// activeProviderPayload is the GET /embedding-providers/active body. +// Config is the persisted JSON blob, returned verbatim — providers +// store env-key NAMES rather than values, so this is safe to render +// to admin clients. +type activeProviderPayload struct { + Kind string `json:"kind"` + Config json.RawMessage `json:"config"` + // ID is Provider.ID() — surfaced so the UI can compare against + // each project's indexed_with_model and render the stale-model + // badge without going through /status. + ID string `json:"id"` +} + +// switchProviderRequest is the PUT /embedding-providers/active body. +type switchProviderRequest struct { + Kind string `json:"kind"` + Config json.RawMessage `json:"config"` +} + +// testProviderResponse is what /test returns on success. +type testProviderResponse struct { + OK bool `json:"ok"` + Dimension int `json:"dimension,omitempty"` +} + +// ListEmbeddingProviders — GET /api/v1/admin/embedding-providers. +func (s *Server) ListEmbeddingProviders(w http.ResponseWriter, r *http.Request) { + if _, ok := s.mustBeAdmin(w, r); !ok { + return + } + kinds := provider.Kinds() + out := make([]providerInfoPayload, 0, len(kinds)) + for _, kind := range kinds { + f, ok := provider.Lookup(kind) + if !ok { + continue + } + envs := f.SecretEnvVars() + envPayload := make([]secretEnvPayload, 0, len(envs)) + for _, name := range envs { + _, present := os.LookupEnv(name) + envPayload = append(envPayload, secretEnvPayload{Name: name, Set: present}) + } + out = append(out, providerInfoPayload{ + Kind: kind, + Schema: f.SchemaJSON(), + SecretEnvs: envPayload, + }) + } + writeJSON(w, http.StatusOK, map[string]any{"providers": out}) +} + +// GetActiveEmbeddingProvider — GET /api/v1/admin/embedding-providers/active. 
+func (s *Server) GetActiveEmbeddingProvider(w http.ResponseWriter, r *http.Request) { + if _, ok := s.mustBeAdmin(w, r); !ok { + return + } + embedSvc, ok := s.Deps.EmbeddingSvc.(*embeddings.Service) + if !ok || embedSvc == nil { + writeError(w, http.StatusServiceUnavailable, "embeddings service not available") + return + } + if s.Deps.EmbeddingsCfg == nil { + writeError(w, http.StatusServiceUnavailable, "embedding provider store not available") + return + } + snap, has, err := s.Deps.EmbeddingsCfg.Get(r.Context()) + if err != nil { + writeError(w, http.StatusInternalServerError, "load active provider: "+err.Error()) + return + } + if !has { + // No persisted row → fall back to the live provider's kind and + // the env-derived config. This handles fresh installs where + // the DB hasn't been seeded yet. + snap = embeddingscfg.Snapshot{Kind: embedSvc.CurrentKind()} + } + writeJSON(w, http.StatusOK, activeProviderPayload{ + Kind: snap.Kind, + Config: snap.Config, + ID: embedSvc.EmbeddingModel(), + }) +} + +// SwitchEmbeddingProvider — PUT /api/v1/admin/embedding-providers/active. +// +// Validate-then-persist: SwitchProvider builds + Start()s the new +// provider and swaps the live Service over (failing without side effects +// if the config is bad), and only on success is the new provider +// persisted. On any failure the previously-active provider stays the +// live AND persisted one. 
+func (s *Server) SwitchEmbeddingProvider(w http.ResponseWriter, r *http.Request) { + user, ok := s.mustBeAdmin(w, r) + if !ok { + return + } + embedSvc, ok := s.Deps.EmbeddingSvc.(*embeddings.Service) + if !ok || embedSvc == nil { + writeError(w, http.StatusServiceUnavailable, "embeddings service not available") + return + } + if s.Deps.EmbeddingsCfg == nil { + writeError(w, http.StatusServiceUnavailable, "embedding provider store not available") + return + } + + body, err := io.ReadAll(io.LimitReader(r.Body, 64*1024)) + if err != nil { + writeError(w, http.StatusBadRequest, "read body: "+err.Error()) + return + } + var req switchProviderRequest + if err := json.Unmarshal(body, &req); err != nil { + writeError(w, http.StatusBadRequest, "decode body: "+err.Error()) + return + } + if req.Kind == "" { + writeError(w, http.StatusBadRequest, "kind is required") + return + } + if _, ok := provider.Lookup(req.Kind); !ok { + writeError(w, http.StatusBadRequest, "unknown provider kind: "+req.Kind) + return + } + + // Special case for ollama: the per-kind form in the dashboard + // does NOT carry ollama tuning fields (those live in the + // runtime-config sections + env). When the admin switches back + // to ollama from a remote provider, the dashboard sends an empty + // blob; we synthesize the config here from the live runtime-cfg + // snapshot applied to the env-derived defaults so the next + // Start() has everything it needs (model, GGUF cache dir, llama + // bin dir, transport, …). + cfgBytes := req.Config + if req.Kind == provider.KindOllama && (len(cfgBytes) == 0 || string(cfgBytes) == "{}" || string(cfgBytes) == "null") { + envCfg := embedSvc.Config() + if envCfg == nil { + writeError(w, http.StatusInternalServerError, "ollama config: live cfg unavailable") + return + } + // Merge the latest runtime-cfg overrides on top of env so a + // recent PUT /admin/runtime-config (which doesn't auto-apply + // while a remote provider is active) takes effect on switch. 
+ if s.Deps.RuntimeCfg != nil { + snap, snapErr := s.Deps.RuntimeCfg.Get(r.Context()) + if snapErr == nil { + snap.ApplyTo(envCfg) + } + } + built, buildErr := embeddings.BuildOllamaConfigFromEnv(envCfg) + if buildErr != nil { + writeError(w, http.StatusInternalServerError, "build ollama config: "+buildErr.Error()) + return + } + cfgBytes = built + } + if len(cfgBytes) == 0 { + writeError(w, http.StatusBadRequest, "config is required") + return + } + + // Resolve the actor for the audit field. mustBeAdmin returns a nil + // authContext when CIX_AUTH_DISABLED=true (a legitimate deployment + // mode), so guard before dereferencing — matches the ac != nil + // pattern used elsewhere (e.g. tunnels.go). + actorID := "" + if user != nil { + actorID = user.User.ID + } + + // Validate-then-persist: apply the swap live FIRST (SwitchProvider + // builds + Start()s the new provider before touching live state, so a + // bad config / unreachable endpoint / missing key fails here without + // changing anything). Only persist once the swap succeeded. Persisting + // first would let a transient switch failure leave the DB pointing at a + // provider that won't start — which the boot path then re-adopts, + // turning one failed switch into a delayed boot brick. + if err := embedSvc.SwitchProvider(r.Context(), req.Kind, cfgBytes); err != nil { + writeError(w, http.StatusInternalServerError, "switch provider: "+err.Error()) + return + } + + if err := s.Deps.EmbeddingsCfg.Save(r.Context(), embeddingscfg.Snapshot{ + Kind: req.Kind, + Config: cfgBytes, + }, actorID); err != nil { + // The live provider IS switched but we couldn't persist it. Report + // the inconsistency loudly; a container restart will revert to the + // previous persisted provider (safe — both old config and old + // vectors are intact), so this degrades rather than corrupts. 
+ s.Deps.Logger.Error("embedding provider switched live but persist failed; will revert on restart", + "kind", req.Kind, "err", err) + writeError(w, http.StatusInternalServerError, "provider switched but persist failed (will revert on restart): "+err.Error()) + return + } + + writeJSON(w, http.StatusAccepted, activeProviderPayload{ + Kind: req.Kind, + Config: cfgBytes, + ID: embedSvc.EmbeddingModel(), + }) +} + +// TestEmbeddingProvider — POST /api/v1/admin/embedding-providers/{kind}/test. +// +// Builds a throw-away provider from the submitted config (without +// persisting it), Starts it, then Stops it. Returns the detected +// dimension and a success flag, or a typed error message the +// dashboard can render verbatim. +func (s *Server) TestEmbeddingProvider(w http.ResponseWriter, r *http.Request) { + if _, ok := s.mustBeAdmin(w, r); !ok { + return + } + kind := chi.URLParam(r, "kind") + if kind == "" { + writeError(w, http.StatusBadRequest, "kind is required") + return + } + if _, ok := provider.Lookup(kind); !ok { + writeError(w, http.StatusBadRequest, "unknown provider kind: "+kind) + return + } + // The /test endpoint Start()s a throw-away provider. For the HTTP-only + // providers that is a harmless one-shot connect probe, but ollama's + // Start() spawns a real llama-server child — on the SAME socket path as + // the live sidecar, which would conflict with the running process. The + // dashboard never tests ollama (it is configured via the runtime-config + // form, not this endpoint), so reject it here rather than risk spawning + // a competing sidecar from an ad-hoc admin call. 
+ if kind == provider.KindOllama { + writeError(w, http.StatusBadRequest, + "ollama is configured via the runtime-config form, not the provider test endpoint; testing it here would spawn a competing llama-server sidecar") + return + } + + body, err := io.ReadAll(io.LimitReader(r.Body, 64*1024)) + if err != nil { + writeError(w, http.StatusBadRequest, "read body: "+err.Error()) + return + } + if len(body) == 0 { + writeError(w, http.StatusBadRequest, "config body is required") + return + } + + prov, err := provider.Build(r.Context(), kind, body, embeddings.EnvSecrets(), s.Deps.Logger) + if err != nil { + writeError(w, http.StatusBadRequest, "build provider: "+err.Error()) + return + } + if err := prov.Start(r.Context()); err != nil { + // Distinguish missing-key from other failures — the dashboard + // renders these differently (banner vs error toast). + if errors.Is(err, provider.ErrMissingAPIKey) { + writeError(w, http.StatusBadRequest, err.Error()) + return + } + writeError(w, http.StatusBadGateway, err.Error()) + return + } + dim := prov.Dimension() + _ = prov.Stop(r.Context()) + writeJSON(w, http.StatusOK, testProviderResponse{OK: true, Dimension: dim}) +} diff --git a/server/internal/httpapi/admin_embeddings_test.go b/server/internal/httpapi/admin_embeddings_test.go new file mode 100644 index 0000000..d629d3c --- /dev/null +++ b/server/internal/httpapi/admin_embeddings_test.go @@ -0,0 +1,198 @@ +package httpapi + +import ( + "bytes" + "encoding/json" + "net/http" + "net/http/httptest" + "testing" + + "github.com/dvcdsys/code-index/server/internal/embeddingscfg" +) + +// TestListEmbeddingProviders_AdminSeesAllThree confirms the registered +// kinds (ollama, openai, voyage) show up with their schemas and the +// secret-env readiness flags. 
+func TestListEmbeddingProviders_AdminSeesAllThree(t *testing.T) { + f := newAdminFixture(t) + f.Deps.EmbeddingsCfg = embeddingscfg.New(f.Deps.DB) + f.Router = NewRouter(f.Deps) + cookie := adminCookie(t, f) + + req := withCookie(httptest.NewRequest(http.MethodGet, "/api/v1/admin/embedding-providers", nil), cookie) + rr := httptest.NewRecorder() + f.Router.ServeHTTP(rr, req) + if rr.Code != http.StatusOK { + t.Fatalf("status = %d (%s)", rr.Code, rr.Body.String()) + } + + var body struct { + Providers []struct { + Kind string `json:"kind"` + Schema json.RawMessage `json:"schema"` + SecretEnvs []struct { + Name string `json:"name"` + Set bool `json:"set"` + } `json:"secret_envs"` + } `json:"providers"` + } + if err := json.Unmarshal(rr.Body.Bytes(), &body); err != nil { + t.Fatalf("decode: %v", err) + } + got := map[string]bool{} + for _, p := range body.Providers { + got[p.Kind] = true + if len(p.Schema) == 0 { + t.Errorf("kind %s: empty schema", p.Kind) + } + } + for _, want := range []string{"ollama", "openai", "voyage"} { + if !got[want] { + t.Errorf("missing kind %q in providers list", want) + } + } +} + +func TestListEmbeddingProviders_ViewerForbidden(t *testing.T) { + f := newAdminFixture(t) + f.Deps.EmbeddingsCfg = embeddingscfg.New(f.Deps.DB) + f.Router = NewRouter(f.Deps) + cookie := viewerCookie(t, f) + + req := withCookie(httptest.NewRequest(http.MethodGet, "/api/v1/admin/embedding-providers", nil), cookie) + rr := httptest.NewRecorder() + f.Router.ServeHTTP(rr, req) + if rr.Code != http.StatusForbidden { + t.Fatalf("status = %d, want 403 (body=%s)", rr.Code, rr.Body.String()) + } +} + +// The following three tests close the per-endpoint 403 gating gap +// required by the project's auth rule: a non-admin (viewer) caller must +// be rejected on EVERY admin embedding-provider route, not just the list. 
+ +func TestGetActiveEmbeddingProvider_ViewerForbidden(t *testing.T) { + f := newAdminFixture(t) + f.Deps.EmbeddingsCfg = embeddingscfg.New(f.Deps.DB) + f.Router = NewRouter(f.Deps) + cookie := viewerCookie(t, f) + + req := withCookie(httptest.NewRequest(http.MethodGet, + "/api/v1/admin/embedding-providers/active", nil), cookie) + rr := httptest.NewRecorder() + f.Router.ServeHTTP(rr, req) + if rr.Code != http.StatusForbidden { + t.Fatalf("status = %d, want 403 (body=%s)", rr.Code, rr.Body.String()) + } +} + +func TestSwitchEmbeddingProvider_ViewerForbidden(t *testing.T) { + f := newAdminFixture(t) + f.Deps.EmbeddingsCfg = embeddingscfg.New(f.Deps.DB) + f.Router = NewRouter(f.Deps) + cookie := viewerCookie(t, f) + + body, _ := json.Marshal(map[string]any{"kind": "ollama", "config": map[string]any{}}) + req := withCookie(httptest.NewRequest(http.MethodPut, + "/api/v1/admin/embedding-providers/active", bytes.NewReader(body)), cookie) + rr := httptest.NewRecorder() + f.Router.ServeHTTP(rr, req) + if rr.Code != http.StatusForbidden { + t.Fatalf("status = %d, want 403 (body=%s)", rr.Code, rr.Body.String()) + } +} + +func TestTestEmbeddingProvider_ViewerForbidden(t *testing.T) { + f := newAdminFixture(t) + f.Deps.EmbeddingsCfg = embeddingscfg.New(f.Deps.DB) + f.Router = NewRouter(f.Deps) + cookie := viewerCookie(t, f) + + req := withCookie(httptest.NewRequest(http.MethodPost, + "/api/v1/admin/embedding-providers/voyage/test", bytes.NewReader([]byte(`{}`))), cookie) + rr := httptest.NewRecorder() + f.Router.ServeHTTP(rr, req) + if rr.Code != http.StatusForbidden { + t.Fatalf("status = %d, want 403 (body=%s)", rr.Code, rr.Body.String()) + } +} + +func TestSwitchEmbeddingProvider_RejectsUnknownKind(t *testing.T) { + f := newAdminFixture(t) + f.Deps.EmbeddingsCfg = embeddingscfg.New(f.Deps.DB) + f.Router = NewRouter(f.Deps) + cookie := adminCookie(t, f) + + body, _ := json.Marshal(map[string]any{ + "kind": "nonexistent", + "config": map[string]any{}, + }) + req := 
withCookie(httptest.NewRequest(http.MethodPut, + "/api/v1/admin/embedding-providers/active", bytes.NewReader(body)), cookie) + rr := httptest.NewRecorder() + f.Router.ServeHTTP(rr, req) + // EmbeddingSvc is nil in this fixture, so the handler returns 503 before + // it reaches kind validation; with a live service an unknown kind yields + // 400. Accept either, as long as the switch is not accepted (200/202). + if rr.Code == http.StatusOK || rr.Code == http.StatusAccepted { + t.Fatalf("unknown kind accepted: status %d (%s)", rr.Code, rr.Body.String()) + } +} + +func TestTestEmbeddingProvider_BadKind(t *testing.T) { + f := newAdminFixture(t) + f.Deps.EmbeddingsCfg = embeddingscfg.New(f.Deps.DB) + f.Router = NewRouter(f.Deps) + cookie := adminCookie(t, f) + + body := []byte(`{}`) + req := withCookie(httptest.NewRequest(http.MethodPost, + "/api/v1/admin/embedding-providers/garbage/test", bytes.NewReader(body)), cookie) + rr := httptest.NewRecorder() + f.Router.ServeHTTP(rr, req) + if rr.Code != http.StatusBadRequest { + t.Fatalf("status = %d, want 400 (body=%s)", rr.Code, rr.Body.String()) + } +} + +// TestTestEmbeddingProvider_RejectsOllama guards the /test endpoint against +// kind=ollama: Start()ing a throw-away ollama provider would spawn a second +// llama-server child on the live sidecar's socket. The endpoint must reject +// it (400) without ever building/starting a provider. 
+func TestTestEmbeddingProvider_RejectsOllama(t *testing.T) { + f := newAdminFixture(t) + f.Deps.EmbeddingsCfg = embeddingscfg.New(f.Deps.DB) + f.Router = NewRouter(f.Deps) + cookie := adminCookie(t, f) + + req := withCookie(httptest.NewRequest(http.MethodPost, + "/api/v1/admin/embedding-providers/ollama/test", bytes.NewReader([]byte(`{}`))), cookie) + rr := httptest.NewRecorder() + f.Router.ServeHTTP(rr, req) + if rr.Code != http.StatusBadRequest { + t.Fatalf("status = %d, want 400 (body=%s)", rr.Code, rr.Body.String()) + } +} + +func TestTestEmbeddingProvider_MissingAPIKey(t *testing.T) { + f := newAdminFixture(t) + f.Deps.EmbeddingsCfg = embeddingscfg.New(f.Deps.DB) + f.Router = NewRouter(f.Deps) + cookie := adminCookie(t, f) + + // Submit a voyage config naming an env var that almost certainly isn't + // set in CI. Expect a 400 with ErrMissingAPIKey wrapped in the body. + body, _ := json.Marshal(map[string]any{ + "api_key_env": "CIX_TEST_MISSING_VOYAGE_KEY_XYZ", + "model": "voyage-3", + "output_dimension": 1024, + "output_dtype": "float", + }) + req := withCookie(httptest.NewRequest(http.MethodPost, + "/api/v1/admin/embedding-providers/voyage/test", bytes.NewReader(body)), cookie) + rr := httptest.NewRecorder() + f.Router.ServeHTTP(rr, req) + if rr.Code != http.StatusBadRequest { + t.Fatalf("status = %d, want 400 (body=%s)", rr.Code, rr.Body.String()) + } +} diff --git a/server/internal/httpapi/admin_server.go b/server/internal/httpapi/admin_server.go index b579bc3..b6b49d5 100644 --- a/server/internal/httpapi/admin_server.go +++ b/server/internal/httpapi/admin_server.go @@ -19,6 +19,7 @@ import ( "time" "github.com/dvcdsys/code-index/server/internal/embeddings" + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" "github.com/dvcdsys/code-index/server/internal/runtimecfg" "github.com/google/uuid" @@ -261,17 +262,22 @@ func (s *Server) GetSidecarStatus(w http.ResponseWriter, r *http.Request) { } st := embedSvc.Status() + // "ready" must be true 
for a healthy provider of EITHER flavour: a + // running ollama sidecar (StateRunning) OR an operational HTTP-only + // provider (StateRemote — openai/voyage, which have no managed + // process to be "running"). Comparing only against "running" + // regressed every remote provider to a permanently not-ready status. body := map[string]any{ "state": st.State, - "ready": st.Ready, + "ready": st.State == provider.StateRunning || st.State == provider.StateRemote, "in_flight": st.InFlight, "restart_in_flight": restartInFlight.Load(), } if st.PID > 0 { body["pid"] = st.PID } - if st.Uptime > 0 { - body["uptime_seconds"] = int(st.Uptime.Seconds()) + if st.UptimeSeconds > 0 { + body["uptime_seconds"] = st.UptimeSeconds } if st.Model != "" { body["model"] = st.Model diff --git a/server/internal/httpapi/router.go b/server/internal/httpapi/router.go index 2be7bf5..59fad07 100644 --- a/server/internal/httpapi/router.go +++ b/server/internal/httpapi/router.go @@ -13,6 +13,7 @@ import ( "github.com/dvcdsys/code-index/server/internal/apikeys" "github.com/dvcdsys/code-index/server/internal/embeddings" + "github.com/dvcdsys/code-index/server/internal/embeddingscfg" "github.com/dvcdsys/code-index/server/internal/gitrepos" "github.com/dvcdsys/code-index/server/internal/githubtokens" "github.com/dvcdsys/code-index/server/internal/groups" @@ -73,14 +74,21 @@ type Deps struct { // tests). Phase 5 uses it for semantic search. EmbeddingSvc EmbeddingsQuerier // VectorStore is the chromem-go backed vector store (Phase 4). Nil-safe: - // semantic search returns empty results when absent. - VectorStore *vectorstore.Store + // semantic search returns empty results when absent. Typed as the + // vectorstore.Interface so production can supply a *vectorstore.Holder + // (swappable on provider switch) while tests pass a raw *Store. + VectorStore vectorstore.Interface // Indexer drives the three-phase index protocol (Phase 5). Nil-safe: the // indexing endpoints return 503 when absent. 
Indexer *indexer.Service // RuntimeCfg backs the dashboard's /admin/runtime-config endpoints. Nil // in router-only tests; admin handlers return 503 when absent. RuntimeCfg *runtimecfg.Service + // EmbeddingsCfg persists the pluggable-provider selection + config + // blob in runtime_settings. Read by the /embedding-providers admin + // handlers; nil in router-only tests (those handlers 503 when + // absent). + EmbeddingsCfg *embeddingscfg.Service // VersionCheck polls GitHub for newer server releases. Nil = feature // off; GetStatus then omits the version-check fields entirely. VersionCheck *versioncheck.Service @@ -184,5 +192,14 @@ func NewRouter(d Deps) http.Handler { // one chi route per OpenAPI operation, dispatching to Server methods. openapi.HandlerFromMux(srv, r) + // Embedding-provider admin endpoints — mounted directly because + // they are not yet in openapi.yaml. The handlers each gate on + // mustBeAdmin; nothing reaches them without an admin session / + // API key. + r.Get("/api/v1/admin/embedding-providers", srv.ListEmbeddingProviders) + r.Get("/api/v1/admin/embedding-providers/active", srv.GetActiveEmbeddingProvider) + r.Put("/api/v1/admin/embedding-providers/active", srv.SwitchEmbeddingProvider) + r.Post("/api/v1/admin/embedding-providers/{kind}/test", srv.TestEmbeddingProvider) + return r } diff --git a/server/internal/httpapi/search.go b/server/internal/httpapi/search.go index 62242f8..060354d 100644 --- a/server/internal/httpapi/search.go +++ b/server/internal/httpapi/search.go @@ -374,7 +374,7 @@ func groupByFile(items []searchResultItem) []fileGroupResult { // applyPostLangFilter=true). 
func fetchVectorResults( ctx context.Context, - store *vectorstore.Store, + store vectorstore.Interface, projectPath string, qEmb []float32, n int, diff --git a/server/internal/httpapi/server.go b/server/internal/httpapi/server.go index 6030acf..0276bbc 100644 --- a/server/internal/httpapi/server.go +++ b/server/internal/httpapi/server.go @@ -72,31 +72,47 @@ func (s *Server) GetStatus(w http.ResponseWriter, r *http.Request) { _ = s.Deps.DB.QueryRowContext(r.Context(), `SELECT COUNT(*) FROM index_runs WHERE status = 'running'`).Scan(&activeJobs) } + // Provider-aware status. Footer uses these fields: + // embedding_provider — active provider kind ("ollama"/"openai"/"voyage") + // embedding_provider_manages_process — true only for ollama; the + // footer renders a red/green dot based on liveness when true, + // and a permanent green dot otherwise (HTTP-only providers + // have no managed process to "die"). + // embedding_model — Provider.ID() of the live active provider + // model_loaded — true when the active provider reports Ready; + // for non-managed providers we don't ping per /status poll so + // it stays true unless the env-key check inside Ready fails. + providerKind := "" + managesProcess := false modelLoaded := false - if s.Deps.EmbeddingSvc != nil { - readyCtx, cancel := context.WithTimeout(r.Context(), 500*time.Millisecond) - modelLoaded = s.Deps.EmbeddingSvc.Ready(readyCtx) == nil - cancel() - } - // PR-E — embedding_model must reflect the LIVE config (after any - // dashboard runtime override + restart), not the boot-time value - // stamped into Deps. Fall back to Deps when the service is a fake or - // disabled, so test fixtures still get a stable string. 
model := s.Deps.EmbeddingModel if es, ok := s.Deps.EmbeddingSvc.(*embeddings.Service); ok && es != nil { - if cfg := es.Config(); cfg != nil && cfg.EmbeddingModel != "" { - model = cfg.EmbeddingModel + providerKind = es.CurrentKind() + if id := es.EmbeddingModel(); id != "" { + model = id } + st := es.Status() + managesProcess = st.ManagesProcess + readyCtx, cancel := context.WithTimeout(r.Context(), 500*time.Millisecond) + modelLoaded = es.Ready(readyCtx) == nil + cancel() + } else if s.Deps.EmbeddingSvc != nil { + // Test fakes that only satisfy EmbeddingsQuerier. + readyCtx, cancel := context.WithTimeout(r.Context(), 500*time.Millisecond) + modelLoaded = s.Deps.EmbeddingSvc.Ready(readyCtx) == nil + cancel() } resp := map[string]any{ - "status": "ok", - "backend": s.Deps.Backend, - "server_version": s.Deps.ServerVersion, - "api_version": s.Deps.APIVersion, - "model_loaded": modelLoaded, - "embedding_model": model, - "projects": projectCount, - "active_indexing_jobs": activeJobs, + "status": "ok", + "backend": s.Deps.Backend, + "server_version": s.Deps.ServerVersion, + "api_version": s.Deps.APIVersion, + "model_loaded": modelLoaded, + "embedding_model": model, + "embedding_provider": providerKind, + "embedding_provider_manages_process": managesProcess, + "projects": projectCount, + "active_indexing_jobs": activeJobs, } // Version-check fields — folded in only when the service is wired. 
// `update_available` is always present (false when unknown) so the @@ -237,7 +253,7 @@ func (s *Server) enrichProjectStorage(out *openapi.Project, p *projects.Project) if cfg == nil { return } - sqlitePath := cfg.DynamicSQLitePath() + sqlitePath := cfg.SQLitePath if sqlitePath != "" { out.SqlitePath = ptrString(sqlitePath) if info, err := os.Stat(sqlitePath); err == nil { @@ -245,9 +261,11 @@ func (s *Server) enrichProjectStorage(out *openapi.Project, p *projects.Project) out.SqliteSizeBytes = &sz } } - if cfg.ChromaPersistDir != "" { + // Chroma dir is namespaced by the ACTIVE provider's identity path, so + // the displayed path tracks whatever provider is live now. + if comps := es.StoragePath(); cfg.ChromaPersistDir != "" && len(comps) > 0 { col := vectorstore.CollectionName(p.HostPath) - dir := filepath.Join(cfg.DynamicChromaPersistDir(), col) + dir := filepath.Join(cfg.ChromaDirFor(comps), col) out.ChromaPath = ptrString(dir) if sz, ok := dirSizeBytes(dir); ok { out.ChromaSizeBytes = &sz @@ -597,7 +615,7 @@ func (s *Server) GetProjectSummary(w http.ResponseWriter, r *http.Request, path } writeJSON(w, http.StatusOK, openapi.ProjectSummary{ - PathHash: projects.HashPath(p.HostPath), + PathHash: p.PathHash, HostPath: p.HostPath, Status: p.Status, Languages: langs, @@ -1028,7 +1046,7 @@ func projectToOpenAPI(p *projects.Project) openapi.Project { } } out := openapi.Project{ - PathHash: projects.HashPath(p.HostPath), + PathHash: p.PathHash, HostPath: p.HostPath, ContainerPath: p.ContainerPath, Languages: langs, diff --git a/server/internal/indexer/indexer.go b/server/internal/indexer/indexer.go index 1d0d008..c08afb1 100644 --- a/server/internal/indexer/indexer.go +++ b/server/internal/indexer/indexer.go @@ -90,7 +90,7 @@ type TokenAwareEmbedder interface { // Service owns sessions and wires dependencies for the three-phase protocol. 
type Service struct { db *sql.DB - vs *vectorstore.Store + vs vectorstore.Interface emb Embedder logger *slog.Logger @@ -115,11 +115,19 @@ type Service struct { // SetEmbeddingModel from main; empty string keeps the column NULL so // unit tests that skip the setter don't need to know about drift. embeddingModel string + + // embeddingModelLookup, when non-nil, takes precedence over the static + // embeddingModel string above. Used by main.go to bind the indexer + // to a live function (embeddings.Service.EmbeddingModel) so a provider + // switch made at runtime is reflected in the next FinishIndexing write + // without requiring a process restart. Tests typically use the static + // SetEmbeddingModel API and leave this nil. + embeddingModelLookup func() string } // New constructs a Service. All deps are required except logger (falls back to // slog.Default). -func New(db *sql.DB, vs *vectorstore.Store, emb Embedder, logger *slog.Logger) *Service { +func New(db *sql.DB, vs vectorstore.Interface, emb Embedder, logger *slog.Logger) *Service { if logger == nil { logger = slog.Default() } @@ -151,17 +159,34 @@ func (s *Service) SetEmbedIncludePath(v bool) { // projects.indexed_with_model at FinishIndexing. Called from main once the // runtime config is resolved; empty string disables the write (the column // stays NULL — desired for tests that don't care about drift tracking). +// +// In production this is superseded by SetEmbeddingModelLookup, which binds +// the indexer to a live function so provider switches at runtime take +// effect without a process restart. The static setter remains for tests. func (s *Service) SetEmbeddingModel(model string) { s.embeddingModel = model } -// EmbeddingModel returns the identifier most recently passed to -// SetEmbeddingModel. 
Used by callers (repojobs) that need to compare -// the live model against projects.indexed_with_model to decide whether -// an incremental reindex is safe (same model = vectors comparable) or -// whether a full reindex is required (model change = embedding-space -// drift, all vectors must be regenerated). +// SetEmbeddingModelLookup binds the indexer to a live function returning +// the current Provider.ID() — typically embeddings.Service.EmbeddingModel. +// When set, this takes precedence over SetEmbeddingModel so a runtime +// provider switch (admin PUT /admin/embedding-providers/active) flows into +// the next FinishIndexing write without a process restart. +func (s *Service) SetEmbeddingModelLookup(fn func() string) { + s.embeddingModelLookup = fn +} + +// EmbeddingModel returns the current embedding-model fingerprint. Prefers +// the live lookup when one is bound (production); falls back to the static +// string set via SetEmbeddingModel (tests). Used by callers (repojobs) that +// need to compare the live model against projects.indexed_with_model to +// decide whether an incremental reindex is safe (same model = vectors +// comparable) or whether a full reindex is required (model change = +// embedding-space drift, all vectors must be regenerated). func (s *Service) EmbeddingModel() string { + if s.embeddingModelLookup != nil { + return s.embeddingModelLookup() + } return s.embeddingModel } @@ -757,13 +782,17 @@ func (s *Service) FinishIndexing( // projects whose vectors were produced under a different model than the // one currently loaded in the sidecar. NULLIF keeps the column NULL when // SetEmbeddingModel was never called (tests / pre-PR-E codepaths). + // Reads through EmbeddingModel() so live provider switches (set via + // SetEmbeddingModelLookup) are honoured at write time — the value goes + // into the row in its prefixed form ("ollama:" / "voyage:..."), + // matching the format the drift-detector and dashboard compare against. 
if _, err := s.db.ExecContext(ctx, `UPDATE projects SET stats = ?, languages = ?, status = 'indexed', last_indexed_at = ?, updated_at = ?, indexed_with_model = NULLIF(?, '') WHERE host_path = ?`, - statsJSON, langsJSON, now, now, s.embeddingModel, projectPath, + statsJSON, langsJSON, now, now, s.EmbeddingModel(), projectPath, ); err != nil { return "", 0, 0, fmt.Errorf("update project stats: %w", err) } diff --git a/server/internal/projects/projects.go b/server/internal/projects/projects.go index 3871804..47f1e7d 100644 --- a/server/internal/projects/projects.go +++ b/server/internal/projects/projects.go @@ -55,7 +55,15 @@ type Stats struct { // Project is the full project record returned from the database. type Project struct { - HostPath string + HostPath string + // PathHash is the STORED path_hash column — the canonical URL identity + // the dashboard links to and GetByHash resolves against. It is returned + // verbatim rather than recomputed from HostPath: a project's host_path + // and its stored path_hash can legitimately diverge (e.g. a local + // project whose host_path is the bare filesystem path while path_hash + // is keyed as sha1("local:{machine}:{path}")), and recomputing from + // host_path would yield a hash that no lookup matches → 404. + PathHash string ContainerPath string Languages []string Settings Settings @@ -242,7 +250,7 @@ func findOverlap(ctx context.Context, db *sql.DB, candidate string) (string, err // Get retrieves a project by its host_path. Returns ErrNotFound if absent. 
func Get(ctx context.Context, db *sql.DB, hostPath string) (*Project, error) { row := db.QueryRowContext(ctx, - `SELECT host_path, container_path, languages, settings, stats, status, created_at, updated_at, last_indexed_at, indexed_with_model, owner_user_id, display_path, machine_id, machine_label + `SELECT host_path, container_path, languages, settings, stats, status, created_at, updated_at, last_indexed_at, indexed_with_model, owner_user_id, display_path, machine_id, machine_label, path_hash FROM projects WHERE host_path = ?`, hostPath, ) return scanProject(hostPath, row) @@ -270,7 +278,7 @@ func GetByHash(ctx context.Context, db *sql.DB, pathHash string) (*Project, erro // List returns all projects ordered by created_at descending. func List(ctx context.Context, db *sql.DB) ([]Project, error) { rows, err := db.QueryContext(ctx, - `SELECT host_path, container_path, languages, settings, stats, status, created_at, updated_at, last_indexed_at, indexed_with_model, owner_user_id, display_path, machine_id, machine_label + `SELECT host_path, container_path, languages, settings, stats, status, created_at, updated_at, last_indexed_at, indexed_with_model, owner_user_id, display_path, machine_id, machine_label, path_hash FROM projects ORDER BY created_at DESC`, ) if err != nil { @@ -368,12 +376,13 @@ func scanProject(hostPath string, row *sql.Row) (*Project, error) { displayPath *string machineID *string machineLabel *string + pathHash *string ) err := row.Scan( &hp, &containerPath, &langsJSON, &settingsJSON, &statsJSON, &status, &createdAt, &updatedAt, &lastIndexedAt, &indexedWithModel, &ownerUserID, - &displayPath, &machineID, &machineLabel, + &displayPath, &machineID, &machineLabel, &pathHash, ) if errors.Is(err, sql.ErrNoRows) { return nil, fmt.Errorf("%w: %s", ErrNotFound, hostPath) @@ -381,7 +390,7 @@ func scanProject(hostPath string, row *sql.Row) (*Project, error) { if err != nil { return nil, fmt.Errorf("scan project row: %w", err) } - return buildProject(hp, 
containerPath, langsJSON, settingsJSON, statsJSON, status, createdAt, updatedAt, lastIndexedAt, indexedWithModel, ownerUserID, displayPath, machineID, machineLabel) + return buildProject(hp, containerPath, langsJSON, settingsJSON, statsJSON, status, createdAt, updatedAt, lastIndexedAt, indexedWithModel, ownerUserID, displayPath, machineID, machineLabel, pathHash) } func scanProjectRow(rows *sql.Rows) (*Project, error) { @@ -396,19 +405,20 @@ func scanProjectRow(rows *sql.Rows) (*Project, error) { displayPath *string machineID *string machineLabel *string + pathHash *string ) if err := rows.Scan( &hostPath, &containerPath, &langsJSON, &settingsJSON, &statsJSON, &status, &createdAt, &updatedAt, &lastIndexedAt, &indexedWithModel, &ownerUserID, - &displayPath, &machineID, &machineLabel, + &displayPath, &machineID, &machineLabel, &pathHash, ); err != nil { return nil, fmt.Errorf("scan project: %w", err) } - return buildProject(hostPath, containerPath, langsJSON, settingsJSON, statsJSON, status, createdAt, updatedAt, lastIndexedAt, indexedWithModel, ownerUserID, displayPath, machineID, machineLabel) + return buildProject(hostPath, containerPath, langsJSON, settingsJSON, statsJSON, status, createdAt, updatedAt, lastIndexedAt, indexedWithModel, ownerUserID, displayPath, machineID, machineLabel, pathHash) } -func buildProject(hostPath, containerPath, langsJSON, settingsJSON, statsJSON, status, createdAt, updatedAt string, lastIndexedAt, indexedWithModel, ownerUserID, displayPath, machineID, machineLabel *string) (*Project, error) { +func buildProject(hostPath, containerPath, langsJSON, settingsJSON, statsJSON, status, createdAt, updatedAt string, lastIndexedAt, indexedWithModel, ownerUserID, displayPath, machineID, machineLabel, pathHash *string) (*Project, error) { var langs []string if err := json.Unmarshal([]byte(langsJSON), &langs); err != nil { langs = nil @@ -428,8 +438,17 @@ func buildProject(hostPath, containerPath, langsJSON, settingsJSON, statsJSON, s if 
displayPath != nil && *displayPath != "" { dp = *displayPath } + // Fall back to the host-path hash only when the stored column is + // absent (pre-m7 rows backfill it on Open, so this is belt-and-braces). + ph := "" + if pathHash != nil && *pathHash != "" { + ph = *pathHash + } else { + ph = hashPath(hostPath) + } return &Project{ HostPath: hostPath, + PathHash: ph, ContainerPath: containerPath, Languages: langs, Settings: settings, diff --git a/server/internal/projects/projects_test.go b/server/internal/projects/projects_test.go index 5d9a18e..296f8d0 100644 --- a/server/internal/projects/projects_test.go +++ b/server/internal/projects/projects_test.go @@ -47,6 +47,61 @@ func TestCreateAndGet(t *testing.T) { } } +// TestGet_ReturnsStoredPathHashNotRecomputed guards the dashboard 404 +// regression: a project whose host_path and stored path_hash legitimately +// diverge — e.g. a local project keyed as sha1("local:{machine}:{path}") +// while host_path stays the bare filesystem path — must surface the STORED +// hash, because that is what GetByHash resolves against. Recomputing the +// hash from host_path would hand the dashboard a link no lookup matches → +// "project not found". 
+func TestGet_ReturnsStoredPathHashNotRecomputed(t *testing.T) { + d := openTestDB(t) + ctx := context.Background() + + const host = "/Users/me/proj" + const stored = "deadbeefcafe0001" // intentionally != hashPath(host) + if hashPath(host) == stored { + t.Fatal("precondition: stored hash must differ from the bare host-path hash") + } + now := "2026-01-01T00:00:00Z" + if _, err := d.ExecContext(ctx, + `INSERT INTO projects (host_path, container_path, languages, settings, stats, status, created_at, updated_at, path_hash, display_path, machine_id) + VALUES (?, ?, '[]', '{}', '{}', 'indexed', ?, ?, ?, ?, ?)`, + host, host, now, now, stored, host, "machine-xyz", + ); err != nil { + t.Fatalf("seed insert: %v", err) + } + + got, err := Get(ctx, d, host) + if err != nil { + t.Fatalf("Get: %v", err) + } + if got.PathHash != stored { + t.Errorf("Get PathHash = %q, want stored %q (must not recompute from host_path)", got.PathHash, stored) + } + + list, err := List(ctx, d) + if err != nil { + t.Fatalf("List: %v", err) + } + if len(list) != 1 || list[0].PathHash != stored { + t.Errorf("List PathHash = %+v, want [%q]", list, stored) + } + + // The stored hash must resolve back to the project (the dashboard + // click path: link hash → GetByHash → detail). + byHash, err := GetByHash(ctx, d, stored) + if err != nil { + t.Fatalf("GetByHash(stored): %v", err) + } + if byHash.HostPath != host { + t.Errorf("GetByHash HostPath = %q, want %q", byHash.HostPath, host) + } + if byHash.PathHash != stored { + t.Errorf("GetByHash PathHash = %q, want %q", byHash.PathHash, stored) + } +} + // Create preserves the host_path verbatim — matching Python which does not // normalise. Stripping trailing slashes here would silently change the stored // value and break subsequent lookups that hash the caller's original path. 
@@ -257,7 +312,6 @@ func TestHashPath_MatchesPython(t *testing.T) { } } - // TestCreate_MachineNamespacingAvoidsCollision verifies that the same // filesystem path indexed from two different machines becomes two distinct // projects (different identity key + hash), while the same machine+path diff --git a/server/internal/repojobs/repojobs.go b/server/internal/repojobs/repojobs.go index 3bbbda9..a957fc8 100644 --- a/server/internal/repojobs/repojobs.go +++ b/server/internal/repojobs/repojobs.go @@ -114,7 +114,7 @@ type Deps struct { GitRepos *gitrepos.Service GithubTokens *githubtokens.Service Indexer *indexer.Service - VectorStore *vectorstore.Store + VectorStore vectorstore.Interface DataDir string // root for cloned repos: /repos// Logger *slog.Logger // DefaultPollIntervalSeconds / MinPollIntervalSeconds resolve the poll diff --git a/server/internal/storage/chromamigrate.go b/server/internal/storage/chromamigrate.go new file mode 100644 index 0000000..78ae18e --- /dev/null +++ b/server/internal/storage/chromamigrate.go @@ -0,0 +1,102 @@ +package storage + +import ( + "fmt" + "log/slog" + "os" + "path/filepath" + "strings" + + "github.com/dvcdsys/code-index/server/internal/embeddings/provider" +) + +// MigrateFlatChromaToNested moves pre-unification FLAT vector-store dirs +// into the unified NESTED layout. Old ollama-only builds wrote one dir per +// model as a flat sibling of the chroma container: +// +// _ e.g. /data/chroma_nomic_embed_text +// +// The unified scheme namespaces by provider identity as nested dirs INSIDE +// the container, one path segment per identity field: +// +// //[/] e.g. /data/chroma/ollama/nomic_embed_text +// +// Because the old build only ever ran ollama, EVERY flat sibling is a +// legacy ollama model dir — there is no prefix-guessing (that ambiguity is +// exactly what the nested layout removes: the provider kind is now its own +// directory level, never glued onto the model slug). We move each +// _/ollama//. 
+// +// StorageSlug(ModelSafeName(m)) == StorageSlug(m) (ModelSafeName collapses +// only a subset — '/','-' — of the runes StorageSlug collapses, and '_' is +// preserved), so the destination matches the dir the running server +// resolves for "ollama:{model}" — even for a model whose name normalises to +// a kind-looking slug, e.g. "openai-community/x" → ollama/openai_community_x +// (correct), where the old flat scheme silently orphaned it. +// +// Idempotent: once moved, the dirs live inside {chromaBase}/ and no longer match +// the flat sibling pattern, so a re-run is a cheap no-op. The caller is +// responsible for first relocating any legacy Python ChromaDB store that +// occupies {chromaBase} itself (vectorstore.DetectLegacyAndBackup), so this can +// safely use {chromaBase} as the container. +// +// LEGACY-MIGRATION (remove next release): one-time shim. Drop once every +// deployment has booted on the nested layout. +func MigrateFlatChromaToNested(chromaBase string, logger *slog.Logger) error { + if logger == nil { + logger = slog.Default() + } + if chromaBase == "" { + return nil + } + if err := os.MkdirAll(chromaBase, 0o755); err != nil { + return fmt.Errorf("create chroma container %s: %w", chromaBase, err) + } + + parent := filepath.Dir(chromaBase) + prefix := filepath.Base(chromaBase) + "_" // e.g. "chroma_" + + entries, err := os.ReadDir(parent) + if err != nil { + if os.IsNotExist(err) { + return nil // nothing indexed yet + } + return fmt.Errorf("read chroma parent %s: %w", parent, err) + } + + for _, e := range entries { + if !e.IsDir() { + continue // skip files (e.g. *.tar backups) + } + name := e.Name() + if !strings.HasPrefix(name, prefix) { + continue + } + if strings.Contains(name, ".python-backup.") { + continue // a backup DetectLegacyAndBackup created — leave it + } + suffix := strings.TrimPrefix(name, prefix) + if suffix == "" { + continue // exactly the base name + trailing "_" + } + + src := filepath.Join(parent, name) + // Every flat sibling is a legacy ollama model dir. 
Canonicalise the + // suffix through StorageSlug so it matches the path the server + // resolves for this identity ("ollama:{model}"). + dst := filepath.Join(chromaBase, provider.KindOllama, provider.StorageSlug(suffix)) + if fileExists(dst) { + logger.Warn("storage: skipping flat→nested chroma move, target already exists (no clobber)", + "src", src, "dst", dst) + continue + } + if err := os.MkdirAll(filepath.Dir(dst), 0o755); err != nil { + return fmt.Errorf("create nested parent %s: %w", filepath.Dir(dst), err) + } + logger.Info("storage: migrating legacy flat chroma dir to nested layout", "src", src, "dst", dst) + if err := os.Rename(src, dst); err != nil { + return fmt.Errorf("rename %s -> %s: %w", src, dst, err) + } + } + return nil +} diff --git a/server/internal/storage/chromamigrate_test.go b/server/internal/storage/chromamigrate_test.go new file mode 100644 index 0000000..d7ccd4c --- /dev/null +++ b/server/internal/storage/chromamigrate_test.go @@ -0,0 +1,147 @@ +package storage + +import ( + "os" + "path/filepath" + "testing" +) + +func mkdir(t *testing.T, path string) { + t.Helper() + if err := os.MkdirAll(path, 0o755); err != nil { + t.Fatalf("mkdir %s: %v", path, err) + } +} + +func TestMigrateFlatChromaToNested(t *testing.T) { + dir := t.TempDir() + base := filepath.Join(dir, "chroma") // container; flat legacy dirs are chroma_{model} + + // Legacy flat ollama dir (should move into chroma/ollama/{model}). + mkdir(t, base+"_awhiteside_coderankembed_q8_0_gguf") + // A file that matches the prefix but is not a dir (ignored). 
+ if err := os.WriteFile(base+"_backup.tar", []byte("x"), 0o644); err != nil { + t.Fatal(err) + } + + if err := MigrateFlatChromaToNested(base, quietLogger()); err != nil { + t.Fatalf("migrate: %v", err) + } + + if dirExists(base + "_awhiteside_coderankembed_q8_0_gguf") { + t.Errorf("legacy flat dir should have been moved away") + } + if !dirExists(filepath.Join(base, "ollama", "awhiteside_coderankembed_q8_0_gguf")) { + t.Errorf("expected nested dir chroma/ollama/awhiteside_coderankembed_q8_0_gguf") + } + if !fileExists(base + "_backup.tar") { + t.Errorf("non-dir entry must be ignored, not moved") + } +} + +// TestMigrateFlatChromaToNested_KindLookingModelName is the #8 regression: +// a legacy ollama model whose name normalises to a known-kind-looking slug +// (e.g. "openai-community/x" → "openai_community_x") was SKIPPED by the old +// prefix heuristic and silently orphaned. With the nested layout the +// provider kind is its own path segment, so the dir is correctly moved to +// chroma/ollama/openai_community_x — exactly where the server resolves +// "ollama:openai-community/x". +func TestMigrateFlatChromaToNested_KindLookingModelName(t *testing.T) { + dir := t.TempDir() + base := filepath.Join(dir, "chroma") + mkdir(t, base+"_openai_community_x") // ModelSafeName("openai-community/x") + + if err := MigrateFlatChromaToNested(base, quietLogger()); err != nil { + t.Fatalf("migrate: %v", err) + } + if !dirExists(filepath.Join(base, "ollama", "openai_community_x")) { + t.Errorf("kind-looking legacy ollama dir must move to chroma/ollama/openai_community_x, not be skipped") + } + if dirExists(base + "_openai_community_x") { + t.Errorf("legacy flat dir should have been moved away") + } +} + +// TestMigrateFlatChromaToNested_StrictNormalizesSpecialChars: the legacy +// suffix was written by ModelSafeName (only '/'->'_' and '-'->'_'), so a +// model like "nomic-embed-text:v1.5" left a dir holding a '.'/':'. 
The +// running server resolves via StorageSlug (every non-[a-z0-9_] -> '_'), so +// the migration must canonicalise identically or the vectors are orphaned. +func TestMigrateFlatChromaToNested_StrictNormalizesSpecialChars(t *testing.T) { + dir := t.TempDir() + base := filepath.Join(dir, "chroma") + mkdir(t, base+"_nomic_embed_text_v1.5") // '.' survives in the legacy name + + if err := MigrateFlatChromaToNested(base, quietLogger()); err != nil { + t.Fatalf("migrate: %v", err) + } + if !dirExists(filepath.Join(base, "ollama", "nomic_embed_text_v1_5")) { + t.Errorf("expected strict-normalized nested dir chroma/ollama/nomic_embed_text_v1_5") + } + if dirExists(base + "_nomic_embed_text_v1.5") { + t.Errorf("legacy dir should have been moved away") + } +} + +func TestMigrateFlatChromaToNested_NoClobber(t *testing.T) { + dir := t.TempDir() + base := filepath.Join(dir, "chroma") + // Both the legacy source and its would-be nested target already exist. + mkdir(t, base+"_model_x") + mkdir(t, filepath.Join(base, "ollama", "model_x")) + if err := os.WriteFile(filepath.Join(base+"_model_x", "marker"), []byte("src"), 0o644); err != nil { + t.Fatal(err) + } + + if err := MigrateFlatChromaToNested(base, quietLogger()); err != nil { + t.Fatalf("migrate: %v", err) + } + // Source left in place (not clobbered into the existing target). 
+ if !dirExists(base + "_model_x") { + t.Errorf("source dir must be preserved when target already exists") + } + if _, err := os.Stat(filepath.Join(base+"_model_x", "marker")); err != nil { + t.Errorf("source contents must be intact: %v", err) + } +} + +func TestMigrateFlatChromaToNested_Idempotent(t *testing.T) { + dir := t.TempDir() + base := filepath.Join(dir, "chroma") + mkdir(t, base+"_legacy_model") + + for i := 0; i < 2; i++ { + if err := MigrateFlatChromaToNested(base, quietLogger()); err != nil { + t.Fatalf("run %d: %v", i, err) + } + } + if !dirExists(filepath.Join(base, "ollama", "legacy_model")) { + t.Errorf("expected nested dir after idempotent runs") + } + if dirExists(base + "_legacy_model") { + t.Errorf("legacy flat dir should be gone after first run") + } +} + +func TestMigrateFlatChromaToNested_IgnoresPythonBackup(t *testing.T) { + dir := t.TempDir() + base := filepath.Join(dir, "chroma") + backup := base + "_model.python-backup.20250101-000000" + mkdir(t, backup) + + if err := MigrateFlatChromaToNested(base, quietLogger()); err != nil { + t.Fatalf("migrate: %v", err) + } + // A python-backup dir must be left exactly where it is. + if !dirExists(backup) { + t.Errorf("python-backup dir must not be moved") + } + if dirExists(filepath.Join(base, "ollama")) { + t.Errorf("python-backup dir must not be treated as a legacy ollama namespace") + } +} + +func dirExists(path string) bool { + info, err := os.Stat(path) + return err == nil && info.IsDir() +} diff --git a/server/internal/storage/dbmigrate.go b/server/internal/storage/dbmigrate.go new file mode 100644 index 0000000..932ac19 --- /dev/null +++ b/server/internal/storage/dbmigrate.go @@ -0,0 +1,145 @@ +// Package storage holds boot-time, OS-level migrations of on-disk storage +// artefacts (the SQLite system DB file and the chromem-go vector-store +// directories). 
These run BEFORE the DB / vector store are opened, so +// they operate on plain files rather than live handles — which is why +// they live here rather than in internal/db or internal/vectorstore. +// +// Background. Earlier builds (a Python-era artefact ported 1:1 to Go) +// namespaced BOTH the SQLite DB and the chroma dir by the embedding model +// name. That was only safe while the model was a fixed env var; runtime +// model/provider switching turned it into a bug (vectors of different +// dimensions colliding in one collection) and a footgun (a model change +// silently spawning a parallel DB with empty accounts). The unified +// design keeps ONE model-independent system DB and namespaces ONLY the +// vector store by the active provider identity. These migrations move +// existing deployments onto that layout without a reindex. +package storage + +import ( + "database/sql" + "fmt" + "log/slog" + "os" + "time" + + "github.com/dvcdsys/code-index/server/internal/db" + + _ "modernc.org/sqlite" +) + +// AdoptLegacyModelDB makes the model-independent system DB at target the +// canonical store, adopting a legacy per-model DB file when present. +// +// target is the literal cfg.SQLitePath (no model suffix). legacy is the +// old per-model filename (cfg.LegacyDynamicSQLitePath()). The function is +// idempotent and safe to run on every boot. +// +// Cases: +// - legacy == target, or legacy missing → nothing to do. +// - target missing → adopt legacy (checkpoint its WAL, rename it in). +// - target present AND a real unified DB (has schema_migrations AND +// users) → leave it; if a legacy file also lingers, warn that it is +// now stale. +// - target present but a pre-auth FOSSIL (lacks those tables) → move the +// fossil aside to {target}.pre-unify-{timestamp} (with its WAL/SHM +// sidecars) and adopt legacy in its place. 
+// +// On adoption the legacy WAL is drained via PRAGMA wal_checkpoint(TRUNCATE) +// and only the main .db file is renamed; the regenerable -wal/-shm +// sidecars are removed rather than carried across (they are +// host/inode-sensitive). +// +// LEGACY-MIGRATION (remove next release): one-time adoption shim. Once +// every deployment has booted on the unified layout, delete this function +// and its call in cmd/cix-server/main.go (and Config.ModelSafeName / +// Config.LegacyDynamicSQLitePath, which exist only to feed it). +func AdoptLegacyModelDB(target, legacy string, logger *slog.Logger) error { + if logger == nil { + logger = slog.Default() + } + if target == "" || legacy == "" || legacy == target { + return nil + } + if !fileExists(legacy) { + return nil // fresh install or already adopted + } + + if fileExists(target) { + real, err := db.HasTables(target, "schema_migrations", "users") + if err != nil { + return fmt.Errorf("inspect target db %s: %w", target, err) + } + if real { + logger.Warn("storage: legacy per-model DB still present but target is already the unified system DB; leaving legacy untouched (safe to delete manually)", + "target", target, "legacy", legacy) + return nil + } + // Fossil occupying the target path — move it aside. 
+ aside := fmt.Sprintf("%s.pre-unify-%s", target, time.Now().UTC().Format("20060102-150405")) + logger.Warn("storage: moving pre-unification fossil DB aside to free the system DB path", + "fossil", target, "moved_to", aside) + if err := renameWithSidecars(target, aside); err != nil { + return fmt.Errorf("move fossil aside: %w", err) + } + } + + logger.Info("storage: adopting legacy per-model DB as the model-independent system DB", + "legacy", legacy, "target", target) + if err := checkpointWAL(legacy); err != nil { + return fmt.Errorf("checkpoint legacy db %s: %w", legacy, err) + } + if err := os.Rename(legacy, target); err != nil { + return fmt.Errorf("rename %s -> %s: %w", legacy, target, err) + } + // The legacy WAL was truncated by the checkpoint; remove any leftover + // regenerable sidecars so they cannot shadow the moved DB. + removeIfExists(legacy + "-wal") + removeIfExists(legacy + "-shm") + return nil +} + +// checkpointWAL opens path with a single connection, drains its WAL into +// the main database file via wal_checkpoint(TRUNCATE), and closes. After +// this the main .db file is self-contained and safe to rename. +func checkpointWAL(path string) error { + // WAL + busy_timeout via the DSN; single connection so the checkpoint + // is not racing a sibling connection. + dsn := "file:" + path + "?_pragma=journal_mode(WAL)&_pragma=busy_timeout(5000)" + sdb, err := sql.Open(db.DriverName, dsn) + if err != nil { + return fmt.Errorf("open: %w", err) + } + defer sdb.Close() + sdb.SetMaxOpenConns(1) + if _, err := sdb.Exec(`PRAGMA wal_checkpoint(TRUNCATE)`); err != nil { + return fmt.Errorf("wal_checkpoint: %w", err) + } + return nil +} + +// renameWithSidecars renames a SQLite DB file and its -wal/-shm sidecars +// (best effort for the sidecars — they may not exist). 
+func renameWithSidecars(from, to string) error { + if err := os.Rename(from, to); err != nil { + return err + } + for _, sfx := range []string{"-wal", "-shm"} { + if fileExists(from + sfx) { + if err := os.Rename(from+sfx, to+sfx); err != nil { + return fmt.Errorf("rename sidecar %s: %w", from+sfx, err) + } + } + } + return nil +} + +func fileExists(path string) bool { + _, err := os.Stat(path) + return err == nil +} + +func removeIfExists(path string) { + if fileExists(path) { + _ = os.Remove(path) + } +} diff --git a/server/internal/storage/dbmigrate_test.go b/server/internal/storage/dbmigrate_test.go new file mode 100644 index 0000000..7ff731e --- /dev/null +++ b/server/internal/storage/dbmigrate_test.go @@ -0,0 +1,180 @@ +package storage + +import ( + "database/sql" + "io" + "log/slog" + "path/filepath" + "testing" + + "github.com/dvcdsys/code-index/server/internal/db" + + _ "modernc.org/sqlite" +) + +func quietLogger() *slog.Logger { + return slog.New(slog.NewTextHandler(io.Discard, nil)) +} + +// makeUnifiedDB creates a real system DB at path via db.OpenWith (so it +// has schema_migrations + users), then writes a sentinel row so we can +// prove the exact file survived an adoption. +func makeUnifiedDB(t *testing.T, path, sentinel string) { + t.Helper() + d, err := db.OpenWith(db.OpenOptions{Path: path}) + if err != nil { + t.Fatalf("OpenWith(%s): %v", path, err) + } + if _, err := d.Exec(`CREATE TABLE IF NOT EXISTS sentinel (v TEXT)`); err != nil { + t.Fatalf("create sentinel: %v", err) + } + if _, err := d.Exec(`INSERT INTO sentinel (v) VALUES (?)`, sentinel); err != nil { + t.Fatalf("insert sentinel: %v", err) + } + if err := d.Close(); err != nil { + t.Fatalf("close: %v", err) + } +} + +// makeFossilDB creates a pre-auth fossil: a bare DB with only a projects +// table, no users / schema_migrations. 
+func makeFossilDB(t *testing.T, path string) { + t.Helper() + sdb, err := sql.Open("sqlite", "file:"+path+"?_pragma=journal_mode(WAL)") + if err != nil { + t.Fatalf("open fossil: %v", err) + } + if _, err := sdb.Exec(`CREATE TABLE projects (host_path TEXT)`); err != nil { + t.Fatalf("create fossil projects: %v", err) + } + if err := sdb.Close(); err != nil { + t.Fatalf("close fossil: %v", err) + } +} + +func readSentinel(t *testing.T, path string) string { + t.Helper() + sdb, err := sql.Open("sqlite", "file:"+path+"?_pragma=journal_mode(WAL)") + if err != nil { + t.Fatalf("open %s: %v", path, err) + } + defer sdb.Close() + var v string + if err := sdb.QueryRow(`SELECT v FROM sentinel LIMIT 1`).Scan(&v); err != nil { + t.Fatalf("read sentinel from %s: %v", path, err) + } + return v +} + +func TestAdoptLegacyModelDB_AdoptIntoAbsentTarget(t *testing.T) { + dir := t.TempDir() + target := filepath.Join(dir, "projects.db") + legacy := filepath.Join(dir, "projects_model.db") + makeUnifiedDB(t, legacy, "hello") + + if err := AdoptLegacyModelDB(target, legacy, quietLogger()); err != nil { + t.Fatalf("adopt: %v", err) + } + if fileExists(legacy) { + t.Errorf("legacy should have been renamed away") + } + if !fileExists(target) { + t.Fatalf("target should exist after adoption") + } + if got := readSentinel(t, target); got != "hello" { + t.Errorf("sentinel = %q, want hello (wrong file adopted?)", got) + } +} + +func TestAdoptLegacyModelDB_NoopWhenTargetIsUnified(t *testing.T) { + dir := t.TempDir() + target := filepath.Join(dir, "projects.db") + legacy := filepath.Join(dir, "projects_model.db") + makeUnifiedDB(t, target, "target-data") + makeUnifiedDB(t, legacy, "legacy-data") + + if err := AdoptLegacyModelDB(target, legacy, quietLogger()); err != nil { + t.Fatalf("adopt: %v", err) + } + // Target untouched, legacy left in place (stale). 
+ if got := readSentinel(t, target); got != "target-data" { + t.Errorf("target sentinel = %q, want target-data (must not be overwritten)", got) + } + if !fileExists(legacy) { + t.Errorf("legacy should be left in place when target is already unified") + } +} + +func TestAdoptLegacyModelDB_FossilMovedAside(t *testing.T) { + dir := t.TempDir() + target := filepath.Join(dir, "projects.db") + legacy := filepath.Join(dir, "projects_model.db") + makeFossilDB(t, target) + makeUnifiedDB(t, legacy, "real") + + if err := AdoptLegacyModelDB(target, legacy, quietLogger()); err != nil { + t.Fatalf("adopt: %v", err) + } + if got := readSentinel(t, target); got != "real" { + t.Errorf("target sentinel = %q, want real (legacy not adopted over fossil)", got) + } + // Fossil moved aside to a *.pre-unify-* file. + matches, _ := filepath.Glob(target + ".pre-unify-*") + if len(matches) == 0 { + t.Errorf("expected fossil moved aside to %s.pre-unify-*", target) + } + if fileExists(legacy) { + t.Errorf("legacy should have been adopted (renamed away)") + } +} + +func TestAdoptLegacyModelDB_Idempotent(t *testing.T) { + dir := t.TempDir() + target := filepath.Join(dir, "projects.db") + legacy := filepath.Join(dir, "projects_model.db") + makeUnifiedDB(t, legacy, "once") + + for i := 0; i < 2; i++ { + if err := AdoptLegacyModelDB(target, legacy, quietLogger()); err != nil { + t.Fatalf("adopt run %d: %v", i, err) + } + } + if got := readSentinel(t, target); got != "once" { + t.Errorf("sentinel = %q, want once", got) + } +} + +func TestAdoptLegacyModelDB_LegacyEqualsTarget(t *testing.T) { + dir := t.TempDir() + target := filepath.Join(dir, "projects.db") + makeUnifiedDB(t, target, "same") + if err := AdoptLegacyModelDB(target, target, quietLogger()); err != nil { + t.Fatalf("adopt: %v", err) + } + if got := readSentinel(t, target); got != "same" { + t.Errorf("sentinel = %q, want same", got) + } +} + +// TestAdoptLegacyModelDB_WALDrainedAndSidecarsGone writes rows under WAL, +// adopts, and asserts 
the data survives the checkpoint+rename and the +// legacy -wal/-shm sidecars are removed. +func TestAdoptLegacyModelDB_WALDrainedAndSidecarsGone(t *testing.T) { + dir := t.TempDir() + target := filepath.Join(dir, "projects.db") + legacy := filepath.Join(dir, "projects_model.db") + makeUnifiedDB(t, legacy, "wal-data") + + if err := AdoptLegacyModelDB(target, legacy, quietLogger()); err != nil { + t.Fatalf("adopt: %v", err) + } + if got := readSentinel(t, target); got != "wal-data" { + t.Errorf("sentinel = %q, want wal-data", got) + } + if fileExists(legacy + "-wal") { + t.Errorf("legacy -wal sidecar should be gone") + } + if fileExists(legacy + "-shm") { + t.Errorf("legacy -shm sidecar should be gone") + } +} diff --git a/server/internal/vectorstore/holder.go b/server/internal/vectorstore/holder.go new file mode 100644 index 0000000..2ab1a0d --- /dev/null +++ b/server/internal/vectorstore/holder.go @@ -0,0 +1,122 @@ +package vectorstore + +import ( + "context" + "errors" + "sync" +) + +// Interface is the vector-store surface consumed by the indexer, the +// search handlers, and repojobs. Both *Store (direct) and *Holder +// (swappable) satisfy it, so a caller can hold either: tests pass a raw +// *Store, while production passes a *Holder so the active store can be +// reopened under a new directory on a provider switch without rewiring +// every holder. The method set mirrors *Store exactly. +type Interface interface { + UpsertChunks(ctx context.Context, projectPath string, chunks []Chunk, embeddings [][]float32) error + Search(ctx context.Context, projectPath string, queryEmbedding []float32, limit int, where map[string]string) ([]SearchResult, error) + DeleteByFile(ctx context.Context, projectPath, filePath string) error + DeleteCollection(projectPath string) error + Count(projectPath string) int +} + +// Compile-time assertions that both implementations satisfy Interface. 
+var ( + _ Interface = (*Store)(nil) + _ Interface = (*Holder)(nil) +) + +// Holder is a concurrency-safe, swappable wrapper around a *Store. It +// exists so the active vector store can be reopened under a new on-disk +// directory at runtime — e.g. when an admin switches the embedding +// provider (PUT /admin/embedding-providers/active), the new provider's +// vectors live in a different, dimension-isolated namespace, so the +// Service reopens a *Store at the new path and atomically Swap()s it in. +// +// All read/write proxies take the RLock; Swap takes the Lock. A search +// in flight during a Swap therefore either completes against the old +// store or starts against the new one — never observes a torn pointer. +// Every current holder of a raw *Store (indexer, httpapi Deps, repojobs) +// holds a *Holder instead and calls the identical method set. +// +// Discarding the old *Store after Swap needs no Close: chromem-go persists +// each write synchronously to disk and keeps no background goroutines or +// open file handles, so the replaced store is simply reclaimed by GC. +type Holder struct { + mu sync.RWMutex + store *Store +} + +// errNotInitialised is returned by write proxies when the Holder has no +// store (only possible before the first Swap / NewHolder(nil)). +var errNotInitialised = errors.New("vectorstore: holder has no active store") + +// NewHolder wraps an initial store (which may be nil; callers must Swap a +// real store in before writes succeed). +func NewHolder(s *Store) *Holder { return &Holder{store: s} } + +// Swap installs newStore as the active store and returns the previous one +// (nil on first install). The caller may discard the returned store; no +// Close is required (see type doc). +func (h *Holder) Swap(newStore *Store) (old *Store) { + h.mu.Lock() + old = h.store + h.store = newStore + h.mu.Unlock() + return old +} + +// current returns the active store under the read lock. 
+func (h *Holder) current() *Store { + h.mu.RLock() + s := h.store + h.mu.RUnlock() + return s +} + +// UpsertChunks proxies to the active store. +func (h *Holder) UpsertChunks(ctx context.Context, projectPath string, chunks []Chunk, embeddings [][]float32) error { + s := h.current() + if s == nil { + return errNotInitialised + } + return s.UpsertChunks(ctx, projectPath, chunks, embeddings) +} + +// Search proxies to the active store. A nil store yields (nil, nil), +// matching the empty-collection contract so callers degrade to no +// results rather than erroring. +func (h *Holder) Search(ctx context.Context, projectPath string, queryEmbedding []float32, limit int, where map[string]string) ([]SearchResult, error) { + s := h.current() + if s == nil { + return nil, nil + } + return s.Search(ctx, projectPath, queryEmbedding, limit, where) +} + +// DeleteByFile proxies to the active store. +func (h *Holder) DeleteByFile(ctx context.Context, projectPath, filePath string) error { + s := h.current() + if s == nil { + return errNotInitialised + } + return s.DeleteByFile(ctx, projectPath, filePath) +} + +// DeleteCollection proxies to the active store. +func (h *Holder) DeleteCollection(projectPath string) error { + s := h.current() + if s == nil { + return errNotInitialised + } + return s.DeleteCollection(projectPath) +} + +// Count proxies to the active store; a nil store reports 0. 
+func (h *Holder) Count(projectPath string) int { + s := h.current() + if s == nil { + return 0 + } + return s.Count(projectPath) +} diff --git a/server/internal/vectorstore/holder_test.go b/server/internal/vectorstore/holder_test.go new file mode 100644 index 0000000..27efbbd --- /dev/null +++ b/server/internal/vectorstore/holder_test.go @@ -0,0 +1,113 @@ +package vectorstore + +import ( + "context" + "sync" + "testing" +) + +func storeWithOneChunk(t *testing.T, project string) *Store { + t.Helper() + s, err := Open(t.TempDir()) + if err != nil { + t.Fatalf("open store: %v", err) + } + chunks := []Chunk{{ + Content: "hello", FilePath: "a.go", StartLine: 1, EndLine: 2, Language: "go", + }} + embs := [][]float32{{1, 0, 0, 0}} + if err := s.UpsertChunks(context.Background(), project, chunks, embs); err != nil { + t.Fatalf("upsert: %v", err) + } + return s +} + +func emptyStore(t *testing.T) *Store { + t.Helper() + s, err := Open(t.TempDir()) + if err != nil { + t.Fatalf("open store: %v", err) + } + return s +} + +func TestHolderProxyAndSwap(t *testing.T) { + const project = "/proj" + a := storeWithOneChunk(t, project) + b := emptyStore(t) + + h := NewHolder(a) + if got := h.Count(project); got != 1 { + t.Fatalf("Count via holder = %d, want 1", got) + } + + old := h.Swap(b) + if old != a { + t.Errorf("Swap should return the previous store") + } + if got := h.Count(project); got != 0 { + t.Errorf("Count after swap to empty store = %d, want 0", got) + } +} + +func TestHolderNilGuards(t *testing.T) { + h := NewHolder(nil) + if got := h.Count("/p"); got != 0 { + t.Errorf("nil-store Count = %d, want 0", got) + } + res, err := h.Search(context.Background(), "/p", []float32{1, 0}, 5, nil) + if err != nil || res != nil { + t.Errorf("nil-store Search = (%v, %v), want (nil, nil)", res, err) + } + if err := h.UpsertChunks(context.Background(), "/p", nil, nil); err == nil { + t.Errorf("nil-store UpsertChunks should error") + } + if err := h.DeleteByFile(context.Background(), 
"/p", "f"); err == nil { + t.Errorf("nil-store DeleteByFile should error") + } + if err := h.DeleteCollection("/p"); err == nil { + t.Errorf("nil-store DeleteCollection should error") + } +} + +// TestHolderConcurrentSwap runs under -race: many goroutines Search/Count +// while another goroutine repeatedly Swaps between two valid stores. The +// RWMutex must guarantee no torn pointer / data race. +func TestHolderConcurrentSwap(t *testing.T) { + const project = "/proj" + a := storeWithOneChunk(t, project) + b := storeWithOneChunk(t, project) + h := NewHolder(a) + + var wg sync.WaitGroup + stop := make(chan struct{}) + + // Readers. + for i := 0; i < 8; i++ { + wg.Add(1) + go func() { + defer wg.Done() + for { + select { + case <-stop: + return + default: + _ = h.Count(project) + _, _ = h.Search(context.Background(), project, []float32{1, 0, 0, 0}, 1, nil) + } + } + }() + } + // Swapper. + wg.Add(1) + go func() { + defer wg.Done() + stores := []*Store{a, b} + for i := 0; i < 2000; i++ { + h.Swap(stores[i%2]) + } + close(stop) + }() + + wg.Wait() +} diff --git a/server/internal/vectorstore/store.go b/server/internal/vectorstore/store.go index 94630e6..374b3af 100644 --- a/server/internal/vectorstore/store.go +++ b/server/internal/vectorstore/store.go @@ -65,7 +65,7 @@ func collectionName(projectPath string) string { // CollectionName is the exported alias for the per-project chromem-go // collection identifier. The dashboard's project-detail card uses it to -// resolve the on-disk directory under cfg.DynamicChromaPersistDir(). +// resolve the on-disk directory under cfg.ChromaDirFor(activeComponents). 
func CollectionName(projectPath string) string { return collectionName(projectPath) } // docID format: "{md5hex(filePath)[:12]}:{startLine}-{endLine}:{idx}" diff --git a/skills/cix/SKILL.md b/skills/cix/SKILL.md index 37cc432..227693d 100644 --- a/skills/cix/SKILL.md +++ b/skills/cix/SKILL.md @@ -133,6 +133,40 @@ The watcher auto-reindexes on file change — manual `reindex` is rarely needed. `cix status` shows whether the watcher is running and the last-sync timestamp. +### Servers — talk to more than one cix backend + +`cix` can be configured with several **named servers** (e.g. a local +box and a remote corporate server). One is the **default**; every +command targets the default unless you pass `--server `. + +```bash +cix config show # lists servers; * marks the default +cix --server corporate search "rate limiter" # run any command against a named server +cix search "rate limiter" --server corporate # --server is global; either position works +``` + +Servers are managed through `cix config` (persisted in +`~/.cix/config.yaml`): + +```bash +cix config set server.corporate.url https://cix.corp.internal +cix config set server.corporate.key +cix config set default_server corporate # change which server is the default +cix config unset server.corporate # remove a server +cix config unset server.corporate.key # clear just its key +``` + +The legacy single-server keys still work and operate on the **default** +server, so existing setups keep working unchanged: +`cix config set api.url ` / `cix config set api.key `. The +`--api-url` / `--api-key` flags override the selected server's URL/key +for a single invocation. + +**Agent rule:** use the default server (no flag) unless the user names a +specific server. Only add `--server ` when the task explicitly +targets that named backend; never guess an alias — run `cix config show` +to see the configured names if unsure. + --- ## Search quality — what scores mean