From c80d23b2e6d3bb3ade5c942145e6320ab1c961b8 Mon Sep 17 00:00:00 2001
From: Grigory Panov
Date: Fri, 19 Jun 2026 17:03:12 +0200
Subject: [PATCH 01/33] Add design spec for databricks dbconnect init/sync

Brainstormed design for porting the dbconnect-init.sh demo into a real
CLI subcommand namespace with init + sync commands, a shared phase
pipeline, full target resolution, a surgical TOML merge, and a stable
--json schema.

Co-authored-by: Isaac
---
 .../2026-06-19-dbconnect-init-sync-design.md | 323 ++++++++++++++++++
 1 file changed, 323 insertions(+)
 create mode 100644 docs/superpowers/specs/2026-06-19-dbconnect-init-sync-design.md

diff --git a/docs/superpowers/specs/2026-06-19-dbconnect-init-sync-design.md b/docs/superpowers/specs/2026-06-19-dbconnect-init-sync-design.md
new file mode 100644
index 00000000000..47d93f6bf47
--- /dev/null
+++ b/docs/superpowers/specs/2026-06-19-dbconnect-init-sync-design.md
@@ -0,0 +1,323 @@
+# `databricks dbconnect init` / `sync` — Design
+
+**Date:** 2026-06-19
+**Status:** Approved for planning
+**Branch context:** Databricks CLI (Go)
+
+## Summary
+
+Promote the proven `dbconnect-init.sh` demo into a real CLI subcommand namespace,
+`databricks dbconnect`, with two commands: `init` and `sync`. Starting from the
+compute target the user already selected (cluster / serverless / job), the
+command derives and provisions a matching local Python environment: the right
+Python version, the right `databricks-connect` version, and dependency
+constraints so local resolution matches the Databricks runtime — no version
+guessing.
+
+The behavior is already implemented and verified as a 367-line bash script
+(`dbconnect-init.sh` in the `databricks-vscode` repo). This design ports the
+same phase pipeline to Go, with real API calls, a robust TOML merge, a
+package-manager seam, and structured output.
+
+## Reference implementation (the spec)
+
+- **Script (source of truth for the pipeline):**
+  `/Users/grigory.panov/work/databricks-vscode/packages/databricks-vscode/resources/python/dbconnect-init.sh`
+- **VS Code consumer (context on how it's invoked + the `--json` consumer):**
+  `/Users/grigory.panov/work/databricks-vscode/packages/databricks-vscode/src/language/VpexEnvironmentSetup.ts`
+
+The script logs each `=== Phase N ===` header; the Go port matches those
+outcomes. We can diff Go behavior against a live run of the script.
+
+## Design decisions (resolved during brainstorming)
+
+1. **Constraint source of truth:** configurable base URL, defaulting to the
+   existing `databricks-environments` GitHub raw repo. Swap the default when an
+   official endpoint exists. (Overridable via flag + env var.)
+2. **Default target resolution:** when no `--cluster`/`--serverless`/`--job`
+   flag is given, resolve from the **bundle's configured target** (the way
+   bundle commands do), NOT from the VS Code `vscode.overrides.json` artifact.
+   The standalone CLI does not read VS Code files.
+3. **Package managers:** **uv only** in this PR, at full parity with the script.
+   A `PackageManager` interface is the seam; pip and conda land in later PRs as
+   additional files in the same package (no subpackages, no speculative stubs).
+4. **`--json`:** a clean, documented, stable schema. VS Code adapts to it; we
+   are not bound by the current TypeScript interface (which today parses phase
+   headers from stdout, not JSON).
+5. **TOML merge:** **surgical line edits** that preserve the user's formatting
+   and comments. There is no format-preserving TOML editor in Go (`go-toml/v2`
+   reformats just like the already-vendored `BurntSushi/toml`), so we use
+   BurntSushi only to READ the fetched values and validate structure, and apply
+   targeted line edits to write. No new dependency.
+6. **Target resolution scope:** serverless is the working happy path; **cluster
+   and job compute resolution are also real** in this PR (SDK
+   `GetByClusterId` → `SparkVersion` → DBR → envKey, with nearest-supported
+   fallback). Unsupported runtimes return a clear error, never a crash or a
+   hard stub.
+
+## Architecture
+
+### Package layout
+
+```
+cmd/dbconnect/
+  dbconnect.go    New() *cobra.Command — "dbconnect" group, registers init + sync
+  init.go         init subcommand: flag wiring + RunE -> pipeline.Run(Init)
+  sync.go         sync subcommand: flag wiring + RunE -> pipeline.Run(Sync)
+  output.go       text + --json rendering of the result/plan/errors
+
+libs/dbconnect/
+  pipeline.go     the shared phase pipeline (Mode = Init|Sync); orchestrates phases
+  target.go       target resolution: flags + bundle target -> ResolvedTarget (envKey)
+  envkey.go       DBR/serverless version -> envKey mapping (+ nearest-supported fallback)
+  constraints.go  fetch constraint pyproject.toml (configurable base URL) + offline cache
+  merge.go        surgical TOML merge of the 3 managed regions
+  pkgmanager.go   PackageManager interface; uvManager implementation (uv.go)
+  result.go       structured Result/Plan/Phase types (the --json schema)
+```
+
+**Rationale:** `cmd/dbconnect/` stays thin (Cobra wiring + rendering), mirroring
+`cmd/psql/psql.go`. All logic lives in `libs/dbconnect/` so it is unit-testable
+without a Cobra command. The `PackageManager` **interface** — not a directory
+split — is what lets pip/conda land cleanly later; subpackages would create
+import-cycle pressure (pipeline → pkgmanager → shared types) and would be
+speculative scaffolding for deferred code.
+
+**Registration:** one line in `cmd/cmd.go`:
+`cli.AddCommand(dbconnect.New())`, in the `development` ("Developer Tools")
+group alongside `psql`. Hand-written workflow command — does NOT touch
+`.codegen/` or run `generate-cligen`.
+
+### Control flow
+
+`init` and `sync` build the same `Pipeline` and call `Run(ctx)`. They differ in
+exactly one phase behavior (Phase 3/4: write-fresh vs merge-into-existing),
+selected by `Mode`. Every other phase is shared and runs once.
+
+## The phase pipeline
+
+`Pipeline.Run(ctx)` executes the script's phases in order. Each phase is a
+method that returns an error and appends a `PhaseResult` to the accumulating
+`Result`. `Mode` (Init|Sync) only changes Phase 3/4.
+
+| # | Phase | Go behavior | Δ from script |
+|---|-------|-------------|---------------|
+| 0 | Preflight | We *are* the CLI; auth comes from the resolved workspace client (`root.MustWorkspaceClient`). Discover uv from PATH + standard install locations (`~/.local/bin`, `$XDG_BIN_HOME`, Homebrew bins); bootstrap via the official installer if missing. Honor `UV_INDEX_URL` from `~/.config/pip/pip.conf` if unset. | No `databricks` binary probe. Auth via SDK, not `current-user me` shell-out. |
+| 1 | Resolve target → envKey | Flags first (`--cluster`/`--serverless`/`--job`); else the bundle's configured target. Produce `ResolvedTarget{envKey, pythonVersion?}`. Preserve three-state messaging. | API calls, not file read. |
+| 2 | Fetch constraints | GET `{baseURL}/{envKey}/pyproject.toml` via the CLI's HTTP client. Offline cache under the user cache dir; on network failure fall back to cache with a warning, else a clear error. | Configurable base URL + cache. |
+| 3 | Baseline / idempotency | **Init:** write a fresh managed `pyproject.toml` (back up any existing to `.bak`). **Sync:** restore from `.bak` if present, else back it up, then merge. | Same idempotency model. |
+| 4 | Merge managed regions | Surgical line edits to the 3 managed regions (see Merge section). | Robust merge, not regex. |
+| 5 | Ensure Python | `PackageManager.EnsurePython(version)` — version from the resolved target, not hardcoded. | Version from target. |
+| 6 | Provision | `PackageManager.Provision()` → `.venv` (uv: `uv sync`). | Interface seam. |
+| 7 | Post-provision (pip seed) | `PackageManager.PostProvision()` — uv seeds pip into `.venv`; carries the script's full rationale comment (VS Code's `ms-python.vscode-python-envs` falls back to `python -m pip list` when its `uv --version` probe fails on the GUI PATH; uv venvs have no pip; `uv sync` strips pip, so seed runs after every sync). | uv-specific, behind the interface. |
+| 8 | Validate | Assert `.venv` Python minor == target; `databricks-connect` matches the pin read from the fetched file. Populate `Result`. | Same asserts, structured output. |
+
+**`--check` (dry-run):** runs phases 0–2 (read-only: discover, resolve, fetch),
+then computes and prints the plan + the unified diff that phase 4 would write,
+and stops before any mutation. Mutating phases (3–8) are gated on `!check`.
+
+**Errors:** each phase wraps with `%w` and context. Structured errors carry a
+stable `code` (e.g. `no_target_selected`, `cluster_unsupported`,
+`constraint_fetch_failed`) so consumers branch on the code, never on message
+text (repo rule: compare errors with sentinels, never `err.Error()` strings).
+
+**Cancellation:** phases respect `ctx`; long shell-outs (uv) run via
+`libs/process` with the context so Ctrl-C / VS Code cancel terminates them.
+
+## Target resolution → envKey
+
+### Stage A — pick the target (ordered precedence, early-return style)
+
+1. `--cluster ` → SDK `w.Clusters.GetByClusterId(id)` → `SparkVersion` (the
+   DBR string, e.g. `15.4.x-scala2.12`).
+2. `--serverless ` → serverless target, version `N`.
+3. `--job ` → `w.Jobs.Get(id)`, read the job's compute (job cluster
+   `SparkVersion`, or serverless if the task is serverless).
+4. No flag → the **bundle's configured target** (loaded the same way bundle
+   commands load it), read the selected target's compute.
+
+Flags are mutually exclusive (`cmd.MarkFlagsMutuallyExclusive`), rejected at
+parse time (repo rule: reject incompatible inputs early with an actionable
+error).
+
+### Stage B — three-state messaging (preserved from script lines 179–192)
+
+- **serverless selected** → proceed.
+- **cluster selected** → resolve its DBR → envKey (implemented, not a stub). If
+  the DBR maps to no supported envKey, a clear "runtime X not yet supported"
+  error.
+- **nothing selected** (bundle has no compute target) → actionable error: "No
+  compute target is selected. Select a cluster or serverless target, or pass
+  --cluster/--serverless/--job."
+
+### Stage C — version → envKey mapping (`envkey.go`)
+
+- **Serverless:** `vN` → `serverless/serverless-vN`.
+- **Cluster/job DBR:** parse major.minor from `SparkVersion`, map to an envKey
+  via a small in-repo table, with **nearest-supported fallback** — if the exact
+  DBR isn't in the table, pick the closest supported one and warn, naming both.
+
+The table maps version → envKey *path* only. The constraint *pins* always come
+from the fetched file, never from the table.
+
+## Surgical TOML merge (`merge.go`)
+
+**Goal:** touch only the 3 managed regions; preserve every byte the user owns
+(comments, ordering, whitespace, their dependencies).
+
+**Read side (BurntSushi, already vendored):** parse the *fetched* env file into
+a struct to extract the managed values authoritatively:
+- `project.requires-python` (string)
+- the `databricks-connect` pin from `dependency-groups.dev`
+- `tool.uv.constraint-dependencies` ([]string)
+
+Also parse the *target* file with BurntSushi purely to validate it is
+well-formed and to locate which regions exist before editing. We never write via
+BurntSushi.
+
+**Write side (structured line edits)** — three idempotent transforms:
+
+1. **`requires-python`** — replace the value of the existing `requires-python =`
+   line under `[project]`, preserving indentation; if `[project]` exists but the
+   key doesn't, insert it.
+2. **`databricks-connect` pin** — within `[dependency-groups].dev`, replace the
+   existing `"databricks-connect..."` element in place (preserve indentation and
+   trailing-comma style).
+3. **`[tool.uv].constraint-dependencies`** — replace the whole managed block:
+   drop any existing block we previously wrote, append a freshly rendered one.
+   Bracketed with a discreet `# managed by databricks dbconnect` marker so
+   re-merges replace exactly our block without clobbering a user's own
+   `[tool.uv]` settings.
+
+**Edge cases the tests must cover** (where the script's regex breaks):
+- multiline vs single-line arrays for the dev group and constraints
+- single vs double quotes, trailing commas, comment lines inside arrays
+- `[project]` present but no `requires-python`
+- no `[tool.uv]` yet vs a pre-existing one (ours or the user's)
+- CRLF files (Windows) — normalize on read, restore on write
+
+**Idempotency:** merging twice produces byte-identical output.
+
+**`--check` diff:** the merge produces the new content in memory; `--check`
+renders a unified diff (old vs new) and writes nothing.
+
+## Output, flags & the `--json` schema
+
+### Flags (shared by both subcommands)
+
+| Flag | Type | Meaning |
+|------|------|---------|
+| `--cluster` | string | target a cluster (mutually exclusive) |
+| `--serverless` | string | target serverless `vN` (mutually exclusive) |
+| `--job` | string | target a job's compute (mutually exclusive) |
+| `--check` | bool | dry-run: print plan + diff, mutate nothing |
+| `--json` | bool | machine-readable output (wired via existing `cmdio` output plumbing) |
+| `--constraint-source` | string | override the constraints base URL; default = `databricks-environments` repo. Also via env var. Advanced/hidden. |
+
+### `--json` schema (the documented contract)
+
+```jsonc
+{
+  "mode": "init" | "sync",
+  "check": false,
+  "target": {
+    "kind": "serverless" | "cluster" | "job",
+    "cluster_id": "…",
+    "spark_version": "15.4.x-…",
+    "env_key": "serverless/serverless-v4",
+    "python_version": "3.12",
+    "fallback": { "requested": "…", "resolved": "…" }
+  },
+  "constraints": {
+    "source_url": "https://…/serverless-v4/pyproject.toml",
+    "from_cache": false,
+    "requires_python": ">=3.12",
+    "databricks_connect": "databricks-connect~=17.2.0",
+    "constraint_count": 42
+  },
+  "plan": {
+    "pyproject_path": "/abs/pyproject.toml",
+    "backup_path": "/abs/pyproject.toml.bak",
+    "diff": "--- …\n+++ …\n@@ …",
+    "changed_regions": ["requires-python", "databricks-connect", "tool.uv.constraint-dependencies"]
+  },
+  "phases": [
+    {"name": "preflight", "status": "ok", "detail": "uv 0.5.1"},
+    {"name": "provision", "status": "ok"}
+  ],
+  "result": {
+    "status": "success" | "failed",
+    "venv_path": "/abs/.venv",
+    "python_version": "3.12",
+    "databricks_connect_installed": "17.2.0"
+  },
+  "error": { "code": "no_target_selected", "message": "…" }
+}
+```
+
+- Under `--check`, `plan` is computed and emitted; `phases` and `result` are
+  empty/omitted.
+- `error` is present only on failure; `error.code` is an enumerated, documented,
+  stable set.
+
+**Text output** mirrors the script's `=== Phase N ===` headers and final success
+summary (so VS Code's phase-regex narration keeps working). `--json` emits the
+struct above and suppresses decorative phase logging.
+
+## Testing
+
+### Unit tests (`libs/dbconnect/`, table-driven)
+
+- **`merge_test.go`** — golden input pyproject + fetched constraints → expected
+  merged output, covering every edge case above. Idempotency test (merge twice →
+  identical). Diff test for `--check`.
+- **`envkey_test.go`** — version→envKey incl. nearest-supported fallback and the
+  unsupported-runtime error.
+- **`target_test.go`** — precedence (flag > bundle), mutual exclusivity, the
+  three states with their exact messages; SDK calls behind a small stubbed
+  interface.
+- **`constraints_test.go`** — fetch success, cache hit on network failure, hard
+  failure with clear error; uses `httptest`.
+
+### Acceptance tests (`acceptance/dbconnect//`)
+
+Golden `output.txt` per the repo pattern (`acceptance/quickstart/`). uv and
+network are unavailable in the sandbox, so these cover the deterministic,
+mockable surface (resolution, messaging, merge, `--check`, `--json` shape) using
+`libs/testserver` for the constraint fetch and stubbed compute:
+
+- `serverless-check` — `--serverless v4 --check`: plan + diff, no mutation.
+- `serverless-json` — `--json` shape on the resolve+plan path.
+- `no-target` — the "nothing selected" error + message.
+- `cluster-unsupported` — a DBR with no envKey → clear error.
+- `flag-conflict` — `--cluster x --serverless y` rejected at parse.
+
+Phases needing a live uv/`.venv` (5–8) are exercised by unit tests with the
+package-manager interface stubbed. A full end-to-end uv run is validated
+manually against the script ("diff against a live script run") and noted as a
+manual check, not an acceptance test.
+
+### Build / quality gate
+
+`./task build`, `./task test`, `./task lint-q`, `./task fmt-q` all green.
+`NEXT_CHANGELOG.md` entry under **CLI**: new `databricks dbconnect init` /
+`sync` commands.
+
+## Out of scope (this PR)
+
+- pip & conda managers (interface only).
+- Flipping the `--constraint-source` default to an official endpoint.
+- Any new third-party dependency.
+
+## Risks to verify during planning
+
+1. **Cluster-DBR envKey data:** the `databricks-environments` repo currently
+   publishes `serverless/serverless-vN` paths. Full cluster/job resolution needs
+   real DBR→envKey paths (e.g. `cluster/dbr-15.4`). If the repo doesn't publish
+   them yet, the envKey table is the gap — surface it explicitly and decide
+   whether to seed the table from another source or narrow to the runtimes the
+   repo actually publishes. The nearest-supported fallback + "runtime X not yet
+   supported" error covers the rest gracefully.
+2. **`--json` / `cmdio` wiring:** confirm the exact mechanism the CLI uses for
+   JSON output (global `--output json` vs a local `--json` flag) and follow the
+   existing convention rather than inventing a parallel switch.

From 59ca8c434b0177dead73fd91bbfd242df865cd08 Mon Sep 17 00:00:00 2001
From: Grigory Panov
Date: Fri, 19 Jun 2026 17:09:30 +0200
Subject: [PATCH 02/33] Add implementation plan for databricks dbconnect init/sync

Bite-sized, TDD task breakdown (11 tasks) covering the command scaffold,
result types, envKey mapping, constraint fetch+cache, surgical TOML
merge, target resolution, uv package manager, the phase pipeline, Cobra
wiring, acceptance tests, and changelog.

Co-authored-by: Isaac
---
 .../plans/2026-06-19-dbconnect-init-sync.md | 1003 +++++++++++++++++
 1 file changed, 1003 insertions(+)
 create mode 100644 docs/superpowers/plans/2026-06-19-dbconnect-init-sync.md

diff --git a/docs/superpowers/plans/2026-06-19-dbconnect-init-sync.md b/docs/superpowers/plans/2026-06-19-dbconnect-init-sync.md
new file mode 100644
index 00000000000..b33f49bbc91
--- /dev/null
+++ b/docs/superpowers/plans/2026-06-19-dbconnect-init-sync.md
@@ -0,0 +1,1003 @@
+# `databricks dbconnect init` / `sync` Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Add a `databricks dbconnect` command namespace with `init` and `sync` subcommands that provision a local Python `.venv` matched to the user's Databricks compute target (Python version, `databricks-connect` pin, and dependency constraints).
+
+**Architecture:** A thin Cobra layer in `cmd/dbconnect/` wires flags and rendering; all logic lives in a unit-testable `libs/dbconnect/` package built around a shared phase `Pipeline` (parameterized by `Mode = Init|Sync`) and a `PackageManager` interface (uv only in this PR). Target resolution uses the SDK Clusters/Jobs APIs and the bundle's configured target; constraints are fetched from a configurable base URL; the `pyproject.toml` merge is surgical (formatting/comment-preserving).
+
+**Tech Stack:** Go, Cobra, `github.com/databricks/databricks-sdk-go` (compute/jobs APIs), `github.com/BurntSushi/toml` (read-only parsing — already vendored), `libs/cmdio` (output), `libs/process` (uv shell-outs), `libs/cmdctx`/`cmd/root` (workspace client + bundle).
+
+## Global Constraints
+
+- **No new third-party dependency.** Use the already-vendored `github.com/BurntSushi/toml` for reading TOML; never use it to write the user's file.
+- **Hand-written command, not codegen.** Do NOT touch `.codegen/` or run `generate-cligen`.
+- **`--json` is the global `--output json` flag**, accessed via `root.OutputType(cmd)` returning `flags.Output` (`flags.OutputText`/`flags.OutputJSON`); render with `cmdio.Render(ctx, v)`. Do NOT add a custom `--json` flag.
+- **Errors:** wrap with `%w`; compare with `errors.Is`/`errors.As` against sentinels, never `err.Error()` string content. Structured errors carry a stable `code`.
+- **Env vars** in library/product code via `github.com/databricks/cli/libs/env` (`env.Get(ctx, ...)`/`env.Lookup(ctx, ...)`), not `os.Getenv`.
+- **Logging** via `github.com/databricks/cli/libs/log` (`log.Warnf`/`Debugf`), stdout via `cmdio.LogString`. Paths printed with `filepath.ToSlash`.
+- **Context** passed as an argument, never stored in a struct; never `context.Background()` outside `main`; tests use `t.Context()`.
+- **Modern Go idioms:** `for i := range N`, `min`/`max` builtins, `switch` for same-decision alternatives, early-return for ordered precedence, collapse `if err != nil { return err }; return nil` to `return err`.
+- **Test fixture hosts** use a reserved TLD (`.test`/`.invalid`).
+- **Reference URLs in comments** when integrating an external tool/endpoint (uv installer, constraint repo, pip.conf).
+- One focused PR; `NEXT_CHANGELOG.md` entry under **CLI**.
+
+## Constants (verbatim, used across tasks)
+
+- Default constraint base URL: `https://raw.githubusercontent.com/pietern/databricks-environments/main`
+- Constraint base URL override env var: `DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE`
+- envKey for serverless: `serverless/serverless-{vN}` (e.g. `serverless/serverless-v4`)
+- envKey for clusters/jobs: `dbr/{spark_version}` where `{spark_version}` is the cluster's `SparkVersion` verbatim (e.g. `dbr/15.4.x-scala2.12`)
+- Backup suffix: `.bak`
+- Managed-block marker (start): `# managed by databricks dbconnect — do not edit`
+- Managed-block marker (end): `# end managed by databricks dbconnect`
+- uv installer URL (comment reference only): `https://astral.sh/uv/install.sh`
+
+## File Structure
+
+```
+cmd/dbconnect/
+  dbconnect.go    New() *cobra.Command; "dbconnect" group; registers init + sync
+  init.go         newInitCommand(): flag wiring + RunE -> pipeline.Run(Init)
+  sync.go         newSyncCommand(): flag wiring + RunE -> pipeline.Run(Sync)
+  output.go       renderResult(ctx, cmd, *dbconnect.Result) — text vs JSON
+
+libs/dbconnect/
+  result.go       Mode, Result, Plan, TargetInfo, ConstraintInfo, PhaseResult, PipelineError, error codes
+  envkey.go       EnvKeyForServerless, EnvKeyForSparkVersion, PythonMinorFromRequires
+  constraints.go  Constraints struct; FetchConstraints(ctx, baseURL, envKey) (+ cache)
+  merge.go        MergeManaged(target []byte, c Constraints) (merged []byte, regions []string, err error)
+  target.go       TargetResolver, ResolveTarget(...) (*TargetInfo, error)
+  pkgmanager.go   PackageManager interface
+  uv.go           uvManager implements PackageManager
+  pipeline.go     Pipeline struct + Run(ctx)
+
+acceptance/dbconnect/
+  serverless-check/ , serverless-json/ , no-target/ , cluster-unsupported/ , flag-conflict/
+```
+
+---
+
+### Task 1: Scaffold the command namespace + registration
+
+**Files:**
+- Create: `cmd/dbconnect/dbconnect.go`
+- Create: `cmd/dbconnect/init.go`
+- Create: `cmd/dbconnect/sync.go`
+- Modify: `cmd/cmd.go` (import + `cli.AddCommand(dbconnect.New())`)
+- Test: `acceptance/dbconnect/help/` (golden `output.txt`)
+
+**Interfaces:**
+- Produces: `func New() *cobra.Command` (the `dbconnect` group); private `newInitCommand()`/`newSyncCommand() *cobra.Command`.
+
+- [ ] **Step 1: Create the namespace command.** `cmd/dbconnect/dbconnect.go`:
+
+```go
+package dbconnect
+
+import "github.com/spf13/cobra"
+
+// New returns the `dbconnect` command group.
+func New() *cobra.Command {
+	cmd := &cobra.Command{
+		Use:     "dbconnect",
+		Short:   "Set up a local Python environment matched to your Databricks compute",
+		GroupID: "development",
+		Long: `Set up a local Python environment matched to your Databricks compute target.
+
+Derives the Python version, databricks-connect version, and dependency
+constraints from the selected compute (cluster, serverless, or job) so that
+local resolution matches the Databricks runtime.`,
+	}
+	cmd.AddCommand(newInitCommand())
+	cmd.AddCommand(newSyncCommand())
+	return cmd
+}
+```
+
+- [ ] **Step 2: Create stub subcommands.** `cmd/dbconnect/init.go`:
+
+```go
+package dbconnect
+
+import (
+	"github.com/databricks/cli/cmd/root"
+	"github.com/spf13/cobra"
+)
+
+func newInitCommand() *cobra.Command {
+	cmd := &cobra.Command{
+		Use:   "init",
+		Short: "Create a fresh pyproject.toml and provision a matched .venv",
+	}
+	cmd.PreRunE = root.MustWorkspaceClient
+	cmd.RunE = func(cmd *cobra.Command, args []string) error {
+		return nil
+	}
+	return cmd
+}
+```
+
+`cmd/dbconnect/sync.go` is identical except `Use: "sync"`, `Short: "Merge managed dependencies into an existing pyproject.toml and re-provision"`, and `newSyncCommand`.
+
+- [ ] **Step 3: Register in `cmd/cmd.go`.** Add import `"github.com/databricks/cli/cmd/dbconnect"` (alphabetical, near the `psql` import) and, in the "other subcommands" block next to `cli.AddCommand(psql.New())`:
+
+```go
+	cli.AddCommand(dbconnect.New())
+```
+
+- [ ] **Step 4: Build.**
+
+Run: `./task build`
+Expected: builds clean.
+
+- [ ] **Step 5: Verify the command appears.**
+
+Run: `./bin/databricks dbconnect --help`
+Expected: shows `init` and `sync` subcommands.
+
+- [ ] **Step 6: Add a help acceptance test.** Create `acceptance/dbconnect/help/script` containing:
+
+```
+$CLI dbconnect --help
+$CLI dbconnect init --help
+```
+
+Then generate the golden output:
+
+Run: `go test ./acceptance -run 'TestAccept/dbconnect/help' -tail -test.v -update`
+Expected: creates `acceptance/dbconnect/help/output.txt`; re-running without `-update` passes.
+
+- [ ] **Step 7: Commit.**
+
+```bash
+git add cmd/dbconnect/ cmd/cmd.go acceptance/dbconnect/help/
+git commit -m "Add dbconnect command namespace scaffold"
+```
+
+---
+
+### Task 2: Result types + error codes (`result.go`)
+
+**Files:**
+- Create: `libs/dbconnect/result.go`
+- Test: `libs/dbconnect/result_test.go`
+
+**Interfaces:**
+- Produces:
+  - `type Mode int` with `const ( ModeInit Mode = iota; ModeSync )` and `func (m Mode) String() string` (`"init"`/`"sync"`).
+  - `type ErrorCode string` with consts: `ErrNoTargetSelected="no_target_selected"`, `ErrClusterUnsupported="cluster_unsupported"`, `ErrConstraintFetchFailed="constraint_fetch_failed"`, `ErrMergeFailed="merge_failed"`, `ErrProvisionFailed="provision_failed"`, `ErrValidationFailed="validation_failed"`, `ErrUvUnavailable="uv_unavailable"`.
+  - `type PipelineError struct { Code ErrorCode; Msg string; Err error }` with `func (e *PipelineError) Error() string` and `func (e *PipelineError) Unwrap() error`.
+  - `func NewError(code ErrorCode, err error, format string, args ...any) *PipelineError`.
+  - Structs (all with `json:"..."` tags matching the spec): `TargetInfo{Kind, ClusterID, SparkVersion, EnvKey, PythonVersion string; Fallback *FallbackInfo}`, `FallbackInfo{Requested, Resolved string}`, `ConstraintInfo{SourceURL string; FromCache bool; RequiresPython, DatabricksConnect string; ConstraintCount int}`, `Plan{PyprojectPath, BackupPath, Diff string; ChangedRegions []string}`, `PhaseResult{Name, Status, Detail string}`, `ResultDetail{Status, VenvPath, PythonVersion, DatabricksConnectInstalled string}`, `Result{Mode string; Check bool; Target *TargetInfo; Constraints *ConstraintInfo; Plan *Plan; Phases []PhaseResult; Result *ResultDetail; Error *PipelineError}`.
+
+- [ ] **Step 1: Write the failing test.** `libs/dbconnect/result_test.go`:
+
+```go
+package dbconnect
+
+import (
+	"errors"
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+)
+
+func TestPipelineErrorWrapsAndExposesCode(t *testing.T) {
+	base := errors.New("boom")
+	err := NewError(ErrConstraintFetchFailed, base, "fetch %s", "x")
+	assert.Equal(t, "fetch x: boom", err.Error())
+	assert.Equal(t, ErrConstraintFetchFailed, err.Code)
+	assert.True(t, errors.Is(err, base))
+}
+
+func TestModeString(t *testing.T) {
+	assert.Equal(t, "init", ModeInit.String())
+	assert.Equal(t, "sync", ModeSync.String())
+}
+```
+
+- [ ] **Step 2: Run test to verify it fails.**
+
+Run: `go test ./libs/dbconnect/ -run 'TestPipelineError|TestModeString' -v`
+Expected: FAIL (undefined symbols).
+
+- [ ] **Step 3: Implement `result.go`** with the types from the Interfaces block. `NewError` formats `Msg` via `fmt.Sprintf(format, args...)`; `Error()` returns `Msg` plus `": "+Err.Error()` when `Err != nil`; `Unwrap()` returns `Err`.
+
+- [ ] **Step 4: Run tests to verify they pass.**
+
+Run: `go test ./libs/dbconnect/ -run 'TestPipelineError|TestModeString' -v`
+Expected: PASS.
+
+- [ ] **Step 5: Commit.**
+
+```bash
+git add libs/dbconnect/result.go libs/dbconnect/result_test.go
+git commit -m "Add dbconnect result types and error codes"
+```
+
+---
+
+### Task 3: envKey mapping + Python-version parsing (`envkey.go`)
+
+**Files:**
+- Create: `libs/dbconnect/envkey.go`
+- Test: `libs/dbconnect/envkey_test.go`
+
+**Interfaces:**
+- Produces:
+  - `func EnvKeyForServerless(version string) string` — normalizes `"4"`, `"v4"` → `"serverless/serverless-v4"`.
+  - `func EnvKeyForSparkVersion(sparkVersion string) string` — returns `"dbr/"+sparkVersion`.
+  - `func PythonMinorFromRequires(requiresPython string) (string, error)` — parses a PEP 440 `requires-python` (e.g. `"==3.12.*"`, `">=3.12"`, `"==3.12.3"`) and returns `"3.12"`. Error if no `MAJOR.MINOR` can be extracted.
+
+- [ ] **Step 1: Write the failing test.** `libs/dbconnect/envkey_test.go`:
+
+```go
+package dbconnect
+
+import (
+	"testing"
+
+	"github.com/stretchr/testify/assert"
+	"github.com/stretchr/testify/require"
+)
+
+func TestEnvKeyForServerless(t *testing.T) {
+	for _, in := range []string{"4", "v4", "V4"} {
+		assert.Equal(t, "serverless/serverless-v4", EnvKeyForServerless(in))
+	}
+}
+
+func TestEnvKeyForSparkVersion(t *testing.T) {
+	assert.Equal(t, "dbr/15.4.x-scala2.12", EnvKeyForSparkVersion("15.4.x-scala2.12"))
+}
+
+func TestPythonMinorFromRequires(t *testing.T) {
+	cases := map[string]string{
+		"==3.12.*": "3.12",
+		">=3.12":   "3.12",
+		"==3.12.3": "3.12",
+		"~=3.11":   "3.11",
+	}
+	for in, want := range cases {
+		got, err := PythonMinorFromRequires(in)
+		require.NoError(t, err)
+		assert.Equal(t, want, got)
+	}
+	_, err := PythonMinorFromRequires("garbage")
+	assert.Error(t, err)
+}
+```
+
+- [ ] **Step 2: Run test to verify it fails.**
+
+Run: `go test ./libs/dbconnect/ -run 'TestEnvKey|TestPythonMinor' -v`
+Expected: FAIL (undefined).
+
+- [ ] **Step 3: Implement `envkey.go`.** `EnvKeyForServerless`: lowercase, strip leading `v`, format `serverless/serverless-v%s`. `EnvKeyForSparkVersion`: `"dbr/" + sparkVersion`. `PythonMinorFromRequires`: use ``regexp.MustCompile(`(\d+)\.(\d+)`)``, `FindStringSubmatch`; on no match return `fmt.Errorf("cannot parse python version from %q", requiresPython)`.
+
+- [ ] **Step 4: Run tests to verify they pass.**
+
+Run: `go test ./libs/dbconnect/ -run 'TestEnvKey|TestPythonMinor' -v`
+Expected: PASS.
+
+- [ ] **Step 5: Commit.**
+
+```bash
+git add libs/dbconnect/envkey.go libs/dbconnect/envkey_test.go
+git commit -m "Add dbconnect envKey mapping and python-version parsing"
+```
+
+---
+
+### Task 4: Constraint fetch + cache + parse (`constraints.go`)
+
+**Files:**
+- Create: `libs/dbconnect/constraints.go`
+- Test: `libs/dbconnect/constraints_test.go`
+
+**Interfaces:**
+- Consumes: `ErrConstraintFetchFailed`, `NewError` (Task 2); `PythonMinorFromRequires` (Task 3).
+- Produces:
+  - `type Constraints struct { EnvKey, SourceURL string; FromCache bool; RequiresPython, DatabricksConnect string; ConstraintDeps []string }`.
+  - `func FetchConstraints(ctx context.Context, baseURL, envKey, cacheDir string) (*Constraints, error)` — GET `baseURL+"/"+envKey+"/pyproject.toml"`; on HTTP success, parse and write a cache copy to `cacheDir/<sanitized envKey>.toml`; on network/HTTP error, fall back to the cached file with a `log.Warnf` if present, else return `NewError(ErrConstraintFetchFailed, ...)`. `FromCache` reflects which path served the bytes.
+  - `func parseConstraints(data []byte) (requiresPython, dbconnect string, deps []string, err error)` — uses `toml.Unmarshal` into a struct mirroring `project.requires-python`, `dependency-groups.dev`, `tool.uv.constraint-dependencies`; selects the `dev` element whose despaced value starts with `databricks-connect`.
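
Two small rules in this task can be sketched without the TOML library: picking the `databricks-connect` pin out of the `dev` group, and the cache file name that Step 3 derives by replacing `/` with `__`. The helper names `selectDBConnectPin` and `cacheFileName` are hypothetical and exist only for this illustration; the plan itself folds the first rule into `parseConstraints`.

```go
package main

import (
	"fmt"
	"strings"
)

// selectDBConnectPin returns the first dev-group entry whose value, with
// spaces removed, starts with "databricks-connect". The entry is returned
// exactly as written, so the caller keeps the original spacing.
func selectDBConnectPin(dev []string) (string, bool) {
	for _, entry := range dev {
		if strings.HasPrefix(strings.ReplaceAll(entry, " ", ""), "databricks-connect") {
			return entry, true
		}
	}
	return "", false
}

// cacheFileName flattens an envKey into a single file name by replacing
// path separators with "__" and appending ".toml".
func cacheFileName(envKey string) string {
	return strings.ReplaceAll(envKey, "/", "__") + ".toml"
}

func main() {
	pin, ok := selectDBConnectPin([]string{"pytest~=8.0", "databricks-connect ~= 17.2.0"})
	fmt.Println(pin, ok)
	fmt.Println(cacheFileName("serverless/serverless-v4"))
}
```

One thing worth checking during implementation: a plain prefix test would also accept a differently named package that happens to begin with `databricks-connect`, so a stricter match on the character following the name may be preferable if such packages can appear in the `dev` group.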
+ +- [ ] **Step 1: Write the failing test.** `libs/dbconnect/constraints_test.go`: + +```go +package dbconnect + +import ( + "net/http" + "net/http/httptest" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +const sampleToml = `[project] +requires-python = "==3.12.*" + +[dependency-groups] +dev = [ + "databricks-connect~=17.2.0", + "pytest~=8.0", +] + +[tool.uv] +constraint-dependencies = [ + "pydantic~=2.10.6", + "anyio~=4.6.2", +] +` + +func TestParseConstraints(t *testing.T) { + rp, dbc, deps, err := parseConstraints([]byte(sampleToml)) + require.NoError(t, err) + assert.Equal(t, "==3.12.*", rp) + assert.Equal(t, "databricks-connect~=17.2.0", dbc) + assert.Equal(t, []string{"pydantic~=2.10.6", "anyio~=4.6.2"}, deps) +} + +func TestFetchConstraintsHTTP(t *testing.T) { + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + assert.Equal(t, "/serverless/serverless-v4/pyproject.toml", r.URL.Path) + _, _ = w.Write([]byte(sampleToml)) + })) + defer srv.Close() + + c, err := FetchConstraints(t.Context(), srv.URL, "serverless/serverless-v4", t.TempDir()) + require.NoError(t, err) + assert.False(t, c.FromCache) + assert.Equal(t, "databricks-connect~=17.2.0", c.DatabricksConnect) + assert.Len(t, c.ConstraintDeps, 2) +} + +func TestFetchConstraintsFallsBackToCache(t *testing.T) { + cacheDir := t.TempDir() + // First, a successful fetch populates the cache. + good := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write([]byte(sampleToml)) + })) + _, err := FetchConstraints(t.Context(), good.URL, "serverless/serverless-v4", cacheDir) + require.NoError(t, err) + good.Close() + + // Now the server is down; fetch must serve the cache. 
+ c, err := FetchConstraints(t.Context(), good.URL, "serverless/serverless-v4", cacheDir) + require.NoError(t, err) + assert.True(t, c.FromCache) +} +``` + +- [ ] **Step 2: Run test to verify it fails.** + +Run: `go test ./libs/dbconnect/ -run 'TestParseConstraints|TestFetchConstraints' -v` +Expected: FAIL (undefined). + +- [ ] **Step 3: Implement `constraints.go`.** Build the URL; use an `http.Client` with the request's context (`http.NewRequestWithContext`). On a 2xx, read the body, call `parseConstraints`, write bytes to `filepath.Join(cacheDir, sanitize(envKey)+".toml")` (sanitize replaces `/` with `__`), set `FromCache=false`. On any transport error or non-2xx, attempt to read the cache file: if present, parse it, set `FromCache=true`, `log.Warnf(ctx, "constraint fetch failed, using cached copy: %v", err)`; if absent, return `NewError(ErrConstraintFetchFailed, err, "fetch constraints for %s", envKey)`. `parseConstraints` despaces each dev entry with `strings.ReplaceAll(s, " ", "")` before the `HasPrefix("databricks-connect")` check, but stores the original string. Add a comment citing the constraint repo URL. + +- [ ] **Step 4: Run tests to verify they pass.** + +Run: `go test ./libs/dbconnect/ -run 'TestParseConstraints|TestFetchConstraints' -v` +Expected: PASS. + +- [ ] **Step 5: Commit.** + +```bash +git add libs/dbconnect/constraints.go libs/dbconnect/constraints_test.go +git commit -m "Add dbconnect constraint fetch with offline cache" +``` + +--- + +### Task 5: Surgical TOML merge (`merge.go`) + +**Files:** +- Create: `libs/dbconnect/merge.go` +- Test: `libs/dbconnect/merge_test.go` +- Test fixtures: `libs/dbconnect/testdata/merge/*.toml` + +**Interfaces:** +- Consumes: `Constraints` (Task 4). +- Produces: + - `func MergeManaged(target []byte, c Constraints) (merged []byte, regions []string, err error)` — applies the three managed transforms below, preserving all other bytes (comments/order/whitespace). 
`regions` lists which of `"requires-python"`, `"databricks-connect"`, `"tool.uv.constraint-dependencies"` were changed. Idempotent: `MergeManaged(MergeManaged(x)) == MergeManaged(x)`. + - `func RenderFreshPyproject(projectName string, c Constraints) []byte` — produces a complete managed `pyproject.toml` for `init` on a project that has none (used by Task 8 only when no file exists; if a file exists, `init` overwrites via MergeManaged after backup). + +The three transforms: +1. `[project].requires-python` — replace the value of an existing `requires-python = ...` line within the `[project]` table, preserving indentation. If `[project]` exists without the key, insert the line directly under the `[project]` header. +2. The `databricks-connect` element inside `[dependency-groups].dev` — replace the existing element matching `"databricks-connect..."` in place, preserving leading indentation and trailing comma. +3. `[tool.uv].constraint-dependencies` — replace the marker-bracketed managed block; if no managed block exists, drop any plain `[tool.uv]` table we own and append a freshly rendered, marker-bracketed `[tool.uv]` block at end of file. 
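
As an illustration of the first transform, here is a minimal sketch of a line-level `requires-python` edit. It assumes `[project]` appears as a bare table header on its own line and that a trailing comment on the `requires-python` line may be dropped. `setRequiresPython` is a hypothetical name, not part of the planned API, and `%q` is used as a stand-in for TOML string quoting.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// tableHeader matches any TOML table header at the start of a line.
	tableHeader = regexp.MustCompile(`^\[`)
	// requiresPythonKey captures leading whitespace so it can be preserved.
	requiresPythonKey = regexp.MustCompile(`^(\s*)requires-python\s*=`)
)

// setRequiresPython (hypothetical name) rewrites or inserts requires-python
// inside the [project] table and leaves every other line untouched.
// It reports whether the text changed.
func setRequiresPython(src, value string) (string, bool) {
	lines := strings.Split(src, "\n")
	newLine := fmt.Sprintf("requires-python = %q", value)
	inProject := false
	projectHeader := -1
	for i, line := range lines {
		if tableHeader.MatchString(line) {
			inProject = strings.TrimSpace(line) == "[project]"
			if inProject {
				projectHeader = i
			}
			continue
		}
		if !inProject {
			continue
		}
		if m := requiresPythonKey.FindStringSubmatch(line); m != nil {
			replaced := m[1] + newLine
			changed := replaced != line
			lines[i] = replaced
			return strings.Join(lines, "\n"), changed
		}
	}
	if projectHeader < 0 {
		return src, false // no [project] table: nothing to edit
	}
	// Key missing: insert directly under the [project] header.
	out := append([]string{}, lines[:projectHeader+1]...)
	out = append(out, newLine)
	out = append(out, lines[projectHeader+1:]...)
	return strings.Join(out, "\n"), true
}

func main() {
	in := "[project]\nname = \"demo\"\n# keep this comment\nrequires-python = \">=3.10\"\n"
	out, changed := setRequiresPython(in, "==3.12.*")
	fmt.Print(out)
	fmt.Println("changed:", changed)
}
```

Because the function returns `changed == false` when the line already holds the target value, running it on its own output is a no-op, which is the property the idempotency requirement asks for.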
+ +- [ ] **Step 1: Write the failing tests.** `libs/dbconnect/merge_test.go`: + +```go +package dbconnect + +import ( + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +func testConstraints() Constraints { + return Constraints{ + RequiresPython: "==3.12.*", + DatabricksConnect: "databricks-connect~=17.2.0", + ConstraintDeps: []string{"pydantic~=2.10.6", "anyio~=4.6.2"}, + } +} + +func TestMergeReplacesRequiresPythonPreservingComments(t *testing.T) { + in := []byte(`[project] +name = "demo" +# keep this comment +requires-python = ">=3.10" + +[dependency-groups] +dev = [ + "databricks-connect~=16.0.0", + "pytest~=8.0", +] +`) + out, regions, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + assert.Contains(t, string(out), `requires-python = "==3.12.*"`) + assert.Contains(t, string(out), "# keep this comment") + assert.Contains(t, string(out), `"databricks-connect~=17.2.0",`) + assert.Contains(t, string(out), `"pytest~=8.0",`) + assert.Contains(t, regions, "requires-python") + assert.Contains(t, regions, "databricks-connect") + assert.Contains(t, regions, "tool.uv.constraint-dependencies") + assert.Contains(t, string(out), "pydantic~=2.10.6") +} + +func TestMergeIsIdempotent(t *testing.T) { + in := []byte(`[project] +requires-python = ">=3.10" + +[dependency-groups] +dev = [ + "databricks-connect~=16.0.0", +] +`) + once, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + twice, _, err := MergeManaged(once, testConstraints()) + require.NoError(t, err) + assert.Equal(t, string(once), string(twice)) +} + +func TestMergeInsertsRequiresPythonWhenMissing(t *testing.T) { + in := []byte(`[project] +name = "demo" + +[dependency-groups] +dev = ["databricks-connect~=16.0.0"] +`) + out, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + assert.Contains(t, string(out), `requires-python = "==3.12.*"`) +} + +func TestMergeReplacesExistingManagedToolUvBlock(t *testing.T) 
{ + in := []byte(`[project] +requires-python = ">=3.10" + +[dependency-groups] +dev = ["databricks-connect~=16.0.0"] + +` + managedMarkerStart + ` +[tool.uv] +constraint-dependencies = [ + "stale~=1.0.0", +] +` + managedMarkerEnd + ` +`) + out, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + assert.NotContains(t, string(out), "stale~=1.0.0") + assert.Contains(t, string(out), "pydantic~=2.10.6") + // Only one managed block remains. + assert.Equal(t, 1, countOccurrences(string(out), managedMarkerStart)) +} +``` + +Add a tiny `countOccurrences` helper at the bottom of the test file using `strings.Count`. + +- [ ] **Step 2: Run tests to verify they fail.** + +Run: `go test ./libs/dbconnect/ -run 'TestMerge' -v` +Expected: FAIL (undefined `MergeManaged`, `managedMarkerStart`). + +- [ ] **Step 3: Implement `merge.go`.** Define `const managedMarkerStart = "# managed by databricks dbconnect — do not edit"` and `const managedMarkerEnd = "# end managed by databricks dbconnect"`. Normalize CRLF→LF on entry, restore the original line ending on exit (detect by presence of `\r\n` in input). Work on `strings.Split(s, "\n")`: + - **requires-python:** scan for a line matching `^\s*requires-python\s*=` (regexp) after the `[project]` header and before the next `^\[`; replace its value preserving the leading whitespace capture group. If absent, insert `requires-python = ""` right after the `[project]` header line. + - **databricks-connect:** scan within `[dependency-groups]` for a line containing `"databricks-connect`; replace the quoted token, preserving indentation and a trailing comma if the original had one. Record region only if a replacement happened. + - **tool.uv block:** if a marker-bracketed block exists, replace the lines between (and including) the markers; else remove any existing `[tool.uv]` table (header to next `^\[` or EOF) and append a freshly rendered block. 
Render: + ``` + + [tool.uv] + constraint-dependencies = [ + "dep1", + "dep2", + ] + + ``` + separated from prior content by exactly one blank line; file ends with a single trailing newline. + - `RenderFreshPyproject` builds a minimal `[project]` + `[dependency-groups].dev` (with the dbconnect pin) + the marker-bracketed `[tool.uv]` block. + +- [ ] **Step 4: Run tests to verify they pass.** + +Run: `go test ./libs/dbconnect/ -run 'TestMerge' -v` +Expected: PASS. + +- [ ] **Step 5: Add CRLF + quote-style edge-case tests** and make them pass (extend `merge_test.go`): + +```go +func TestMergePreservesCRLF(t *testing.T) { + in := []byte("[project]\r\nrequires-python = \">=3.10\"\r\n\r\n[dependency-groups]\r\ndev = [\"databricks-connect~=16.0.0\"]\r\n") + out, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + assert.Contains(t, string(out), "\r\n") + assert.Contains(t, string(out), `requires-python = "==3.12.*"`) +} +``` + +Run: `go test ./libs/dbconnect/ -run 'TestMerge' -v` +Expected: PASS. + +- [ ] **Step 6: Commit.** + +```bash +git add libs/dbconnect/merge.go libs/dbconnect/merge_test.go +git commit -m "Add surgical formatting-preserving pyproject.toml merge" +``` + +--- + +### Task 6: Target resolution (`target.go`) + +**Files:** +- Create: `libs/dbconnect/target.go` +- Test: `libs/dbconnect/target_test.go` + +**Interfaces:** +- Consumes: `TargetInfo`, `ErrNoTargetSelected`, `ErrClusterUnsupported`, `NewError` (Task 2); `EnvKeyForServerless`, `EnvKeyForSparkVersion` (Task 3). +- Produces: + - `type ComputeClient interface { GetClusterSparkVersion(ctx context.Context, clusterID string) (string, error); GetJobSparkVersion(ctx context.Context, jobID string) (sparkVersion string, isServerless bool, serverlessVersion string, err error) }`: a narrow seam over the SDK so that tests can stub it. All four results are named because Go reads a mixed list such as `(string, isServerless bool, ...)` as two `bool` results named `string` and `isServerless`, which would not match the `(string, bool, string, error)` stub used in the tests. (A job resolves to either a Spark version or a serverless marker plus version.) + - `type TargetFlags struct { Cluster, Serverless, Job string }`.
+ - `type BundleTarget struct { ClusterID string; Serverless bool; Selected bool }` — the three-state result of reading the bundle's configured target (`Selected=false` ⇒ nothing selected). + - `func ResolveTarget(ctx context.Context, f TargetFlags, c ComputeClient, bt BundleTarget) (*TargetInfo, error)` — precedence: cluster flag → serverless flag → job flag → bundle target. Produces `TargetInfo` with `EnvKey` set; `PythonVersion` is filled later from the fetched constraints (left empty here). Three-state errors when falling back to the bundle. + - `func ValidateTargetFlags(f TargetFlags) error` — at most one of the three set (the Cobra layer also marks them mutually exclusive; this guards the library path). + +- [ ] **Step 1: Write the failing test.** `libs/dbconnect/target_test.go`: + +```go +package dbconnect + +import ( + "context" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +type stubCompute struct { + clusterVersion string + clusterErr error +} + +func (s stubCompute) GetClusterSparkVersion(_ context.Context, _ string) (string, error) { + return s.clusterVersion, s.clusterErr +} +func (s stubCompute) GetJobSparkVersion(_ context.Context, _ string) (string, bool, string, error) { + return "", false, "", nil +} + +func TestResolveServerlessFlag(t *testing.T) { + ti, err := ResolveTarget(t.Context(), TargetFlags{Serverless: "v4"}, stubCompute{}, BundleTarget{}) + require.NoError(t, err) + assert.Equal(t, "serverless", ti.Kind) + assert.Equal(t, "serverless/serverless-v4", ti.EnvKey) +} + +func TestResolveClusterFlag(t *testing.T) { + c := stubCompute{clusterVersion: "15.4.x-scala2.12"} + ti, err := ResolveTarget(t.Context(), TargetFlags{Cluster: "abc"}, c, BundleTarget{}) + require.NoError(t, err) + assert.Equal(t, "cluster", ti.Kind) + assert.Equal(t, "dbr/15.4.x-scala2.12", ti.EnvKey) + assert.Equal(t, "abc", ti.ClusterID) +} + +func TestResolveBundleNothingSelected(t *testing.T) { + _, err := 
ResolveTarget(t.Context(), TargetFlags{}, stubCompute{}, BundleTarget{Selected: false}) + var pe *PipelineError + require.ErrorAs(t, err, &pe) + assert.Equal(t, ErrNoTargetSelected, pe.Code) +} + +func TestResolveBundleServerless(t *testing.T) { + ti, err := ResolveTarget(t.Context(), TargetFlags{}, stubCompute{}, BundleTarget{Selected: true, Serverless: true}) + require.NoError(t, err) + assert.Equal(t, "serverless/serverless-v4", ti.EnvKey) +} + +func TestValidateTargetFlagsMutuallyExclusive(t *testing.T) { + assert.Error(t, ValidateTargetFlags(TargetFlags{Cluster: "a", Serverless: "v4"})) + assert.NoError(t, ValidateTargetFlags(TargetFlags{Cluster: "a"})) +} +``` + +Note: `TestResolveBundleServerless` encodes the spec rule that a bundle serverless target with no recorded version defaults to `v4` (the script's documented stand-in). Add a code comment to that effect. + +- [ ] **Step 2: Run test to verify it fails.** + +Run: `go test ./libs/dbconnect/ -run 'TestResolve|TestValidateTargetFlags' -v` +Expected: FAIL (undefined). + +- [ ] **Step 3: Implement `target.go`** with ordered-precedence early returns. Cluster flag → `GetClusterSparkVersion` → `Kind:"cluster"`, `EnvKey: EnvKeyForSparkVersion(v)`. Serverless flag → normalize, `EnvKey: EnvKeyForServerless(v)`. Job flag → `GetJobSparkVersion`; serverless job → serverless envKey (default `v4`), else cluster envKey. No flag → read `bt`: `!Selected` → `NewError(ErrNoTargetSelected, nil, "No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job.")`; `Serverless` → serverless `v4` (with the documented-default comment); `ClusterID != ""` → resolve via `GetClusterSparkVersion`. `ValidateTargetFlags` counts non-empty fields; >1 → error naming the conflicting flags. + +- [ ] **Step 4: Run tests to verify they pass.** + +Run: `go test ./libs/dbconnect/ -run 'TestResolve|TestValidateTargetFlags' -v` +Expected: PASS. 
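
The `ValidateTargetFlags` rule from Step 3 (count the non-empty flags and reject more than one) might look like the minimal sketch below. The exact error wording is an assumption; the plan only requires that the error name the conflicting flags.

```go
package main

import (
	"fmt"
	"strings"
)

// TargetFlags mirrors the struct described in the Interfaces list.
type TargetFlags struct{ Cluster, Serverless, Job string }

// ValidateTargetFlags returns an error when more than one target flag is set.
// The message format here is illustrative only.
func ValidateTargetFlags(f TargetFlags) error {
	var set []string
	if f.Cluster != "" {
		set = append(set, "--cluster")
	}
	if f.Serverless != "" {
		set = append(set, "--serverless")
	}
	if f.Job != "" {
		set = append(set, "--job")
	}
	if len(set) > 1 {
		return fmt.Errorf("flags %s are mutually exclusive", strings.Join(set, ", "))
	}
	return nil
}

func main() {
	fmt.Println(ValidateTargetFlags(TargetFlags{Cluster: "a", Serverless: "v4"}))
	fmt.Println(ValidateTargetFlags(TargetFlags{Cluster: "a"}))
}
```

Keeping this check in the library, in addition to `cmd.MarkFlagsMutuallyExclusive` in the Cobra layer, protects callers that construct a `Pipeline` directly.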
+ +- [ ] **Step 5: Commit.** + +```bash +git add libs/dbconnect/target.go libs/dbconnect/target_test.go +git commit -m "Add dbconnect target resolution with three-state messaging" +``` + +--- + +### Task 7: PackageManager interface + uv implementation (`pkgmanager.go`, `uv.go`) + +**Files:** +- Create: `libs/dbconnect/pkgmanager.go` +- Create: `libs/dbconnect/uv.go` +- Test: `libs/dbconnect/uv_test.go` + +**Interfaces:** +- Consumes: `libs/process` for shell-outs; `ErrUvUnavailable`, `ErrProvisionFailed`, `NewError` (Task 2). +- Produces: + - `type PackageManager interface { Name() string; EnsureAvailable(ctx context.Context) (version string, err error); EnsurePython(ctx context.Context, minor string) error; Provision(ctx context.Context, projectDir string) error; PostProvision(ctx context.Context, projectDir string) error; Validate(ctx context.Context, projectDir string) (pythonVersion, dbconnectVersion string, err error) }`. + - `type uvManager struct { bin string }` implementing it; `func newUvManager() *uvManager`. + - `func discoverUv(ctx context.Context) (string, error)` — search `exec.LookPath`, then `~/.local/bin/uv`, `$XDG_BIN_HOME/uv`, `/opt/homebrew/bin/uv`, `/usr/local/bin/uv`. Returns the path or `NewError(ErrUvUnavailable, ...)`. (Bootstrapping via the installer is invoked by `EnsureAvailable` when discovery fails — guarded so tests can stub.) + +Because real uv shell-outs can't run in unit tests, `uv_test.go` covers `discoverUv` path logic (with a fake bin dir on a temp `PATH`) and the argument construction of each command via a small indirection: `uvManager` builds `[]string` arg slices through unexported helpers (`syncArgs()`, `pipSeedArgs(py string)`, `pythonInstallArgs(minor string)`) that are unit-tested directly. 
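
The argument-helper indirection described above can be sketched as follows. The helper names and the returned slices are taken from the test in Step 1; everything else (the `main` function, how the slices are later passed to a process runner) is illustrative.

```go
package main

import "fmt"

// uvManager holds the resolved path to the uv binary.
type uvManager struct{ bin string }

// syncArgs returns the arguments for `uv sync`.
func (m *uvManager) syncArgs() []string {
	return []string{"sync"}
}

// pythonInstallArgs returns the arguments for `uv python install <minor>`.
func (m *uvManager) pythonInstallArgs(minor string) []string {
	return []string{"python", "install", minor}
}

// pipSeedArgs returns the arguments that seed pip into the venv whose
// interpreter lives at py.
func (m *uvManager) pipSeedArgs(py string) []string {
	return []string{"pip", "install", "pip", "--python", py}
}

func main() {
	m := &uvManager{bin: "uv"}
	fmt.Println(m.bin, m.syncArgs())
	fmt.Println(m.bin, m.pythonInstallArgs("3.12"))
	fmt.Println(m.bin, m.pipSeedArgs("/p/.venv/bin/python"))
}
```

Returning plain slices keeps the command construction testable without spawning a process; only the thin layer that hands the slice to the runner remains untested at the unit level.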
+ +- [ ] **Step 1: Write the failing test.** `libs/dbconnect/uv_test.go`: + +```go +package dbconnect + +import ( + "os" + "path/filepath" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +func TestUvArgs(t *testing.T) { + m := &uvManager{bin: "uv"} + assert.Equal(t, []string{"sync"}, m.syncArgs()) + assert.Equal(t, []string{"python", "install", "3.12"}, m.pythonInstallArgs("3.12")) + assert.Equal(t, []string{"pip", "install", "pip", "--python", "/p/.venv/bin/python"}, m.pipSeedArgs("/p/.venv/bin/python")) +} + +func TestDiscoverUvFindsBinOnPath(t *testing.T) { + dir := t.TempDir() + bin := filepath.Join(dir, "uv") + require.NoError(t, os.WriteFile(bin, []byte("#!/bin/sh\n"), 0o755)) + t.Setenv("PATH", dir) + got, err := discoverUv(t.Context()) + require.NoError(t, err) + assert.Equal(t, bin, got) +} +``` + +- [ ] **Step 2: Run test to verify it fails.** + +Run: `go test ./libs/dbconnect/ -run 'TestUv|TestDiscoverUv' -v` +Expected: FAIL (undefined). + +- [ ] **Step 3: Implement `pkgmanager.go` (interface only) and `uv.go`.** `discoverUv` uses `exec.LookPath` first, then the candidate list (expand `~` via `os.UserHomeDir`, read `XDG_BIN_HOME` via `env.Lookup`). The arg helpers return the slices asserted above. `Provision` runs `uv sync` in `projectDir` via `process.Background` (or the repo's standard `process` runner) with `ctx`. `PostProvision` runs `uv pip install pip --python ` and carries the full Phase 7 rationale comment from the script (VS Code pip fallback; uv venvs lack pip; `uv sync` strips pip). `Validate` runs `uv run --no-project python -c` to read the Python minor and `importlib.metadata.version("databricks-connect")`. `EnsureAvailable` calls `discoverUv`; on failure, runs the installer (`curl ... | sh`) with a reference-URL comment, then re-discovers; on still-missing, returns `ErrUvUnavailable`. 
+ +- [ ] **Step 4: Run tests to verify they pass.** + +Run: `go test ./libs/dbconnect/ -run 'TestUv|TestDiscoverUv' -v` +Expected: PASS. + +- [ ] **Step 5: Commit.** + +```bash +git add libs/dbconnect/pkgmanager.go libs/dbconnect/uv.go libs/dbconnect/uv_test.go +git commit -m "Add PackageManager interface and uv implementation" +``` + +--- + +### Task 8: The pipeline (`pipeline.go`) + +**Files:** +- Create: `libs/dbconnect/pipeline.go` +- Test: `libs/dbconnect/pipeline_test.go` + +**Interfaces:** +- Consumes: every type above. +- Produces: + - `type Pipeline struct { Mode Mode; Check bool; ProjectDir string; ConstraintBaseURL string; CacheDir string; Flags TargetFlags; Compute ComputeClient; Bundle BundleTarget; PM PackageManager }`. + - `func (p *Pipeline) Run(ctx context.Context) (*Result, error)` — executes phases 1–8 (preflight folded into PM `EnsureAvailable`), honoring `Check` (stop after computing the plan/diff; no mutation). Returns a fully populated `*Result`; on a phase error, sets `Result.Error` and returns the error too. + - Phase methods are unexported (`resolve`, `fetch`, `mergePlan`, `applyMerge`, `provision`, `validate`), each appending a `PhaseResult`. + +Mode behavior: `ModeInit` — if `pyproject.toml` exists, back up to `.bak` then `MergeManaged`; if absent, `RenderFreshPyproject`. `ModeSync` — restore from `.bak` if present (else back up), then `MergeManaged`. + +- [ ] **Step 1: Write the failing test** (drives the full pipeline with stubbed Compute + PM + httptest constraint server, against a temp project dir). 
`libs/dbconnect/pipeline_test.go`: + +```go +package dbconnect + +import ( + "context" + "net/http" + "net/http/httptest" + "os" + "path/filepath" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +type fakePM struct{ py, dbc string } + +func (fakePM) Name() string { return "fake" } +func (fakePM) EnsureAvailable(context.Context) (string, error) { return "fake 1.0", nil } +func (fakePM) EnsurePython(context.Context, string) error { return nil } +func (fakePM) Provision(context.Context, string) error { return nil } +func (fakePM) PostProvision(context.Context, string) error { return nil } +func (f fakePM) Validate(context.Context, string) (string, string, error) { + return f.py, f.dbc, nil +} + +func writeProject(t *testing.T) string { + dir := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(dir, "pyproject.toml"), []byte(`[project] +name = "demo" +requires-python = ">=3.10" + +[dependency-groups] +dev = ["databricks-connect~=16.0.0"] +`), 0o644)) + return dir +} + +func newTestServer(t *testing.T) *httptest.Server { + return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write([]byte(sampleToml)) + })) +} + +func TestPipelineCheckMutatesNothing(t *testing.T) { + dir := writeProject(t) + before, _ := os.ReadFile(filepath.Join(dir, "pyproject.toml")) + srv := newTestServer(t) + defer srv.Close() + + p := &Pipeline{ + Mode: ModeSync, Check: true, ProjectDir: dir, + ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), + Flags: TargetFlags{Serverless: "v4"}, + Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: "17.2.0"}, + } + res, err := p.Run(t.Context()) + require.NoError(t, err) + assert.True(t, res.Check) + require.NotNil(t, res.Plan) + assert.Contains(t, res.Plan.Diff, "==3.12.*") + after, _ := os.ReadFile(filepath.Join(dir, "pyproject.toml")) + assert.Equal(t, string(before), string(after)) // unchanged +} + +func TestPipelineSyncProvisionsAndValidates(t 
*testing.T) { + dir := writeProject(t) + srv := newTestServer(t) + defer srv.Close() + + p := &Pipeline{ + Mode: ModeSync, ProjectDir: dir, + ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), + Flags: TargetFlags{Serverless: "v4"}, + Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: "17.2.0"}, + } + res, err := p.Run(t.Context()) + require.NoError(t, err) + require.NotNil(t, res.Result) + assert.Equal(t, "success", res.Result.Status) + assert.Equal(t, "3.12", res.Result.PythonVersion) + merged, _ := os.ReadFile(filepath.Join(dir, "pyproject.toml")) + // writeProject uses a single-line dev array with no trailing comma, and the merge preserves that, so match without the comma. + assert.Contains(t, string(merged), `"databricks-connect~=17.2.0"`) + assert.FileExists(t, filepath.Join(dir, "pyproject.toml.bak")) +} +``` + +(`sampleToml`, `stubCompute` come from earlier test files in the same package.) + +- [ ] **Step 2: Run tests to verify they fail.** + +Run: `go test ./libs/dbconnect/ -run 'TestPipeline' -v` +Expected: FAIL (undefined `Pipeline`). + +- [ ] **Step 3: Implement `pipeline.go`.** `Run`: `EnsureAvailable` (record phase + `ErrUvUnavailable` on fail) → `resolve` (ResolveTarget) → `fetch` (FetchConstraints; fill `TargetInfo.PythonVersion` from `PythonMinorFromRequires(c.RequiresPython)`, build `ConstraintInfo`) → `mergePlan` (read existing file or empty; compute merged bytes via `MergeManaged`/`RenderFreshPyproject`; build `Plan` with a unified diff, using a small diff helper or `libs/textutil` if present, else a minimal line diff; set `ChangedRegions`). If `Check`, populate `Result` (Mode, Check, Target, Constraints, Plan) and return. Else `applyMerge` (Mode-specific backup/restore then write bytes) → `EnsurePython(py)` → `Provision` → `PostProvision` → `Validate` (assert minor==`py`; `databricks-connect` major matches the pin's major, else `ErrValidationFailed`) → populate `Result.Result`. Each phase appends a `PhaseResult{Name,Status,Detail}`. + +- [ ] **Step 4: Run tests to verify they pass.** + +Run: `go test ./libs/dbconnect/ -run 'TestPipeline' -v` +Expected: PASS.
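
The final `Validate` comparison in Step 3 (Python minor must equal the expected minor, and the installed `databricks-connect` major must equal the pin's major) could be sketched as below. `pinMajor` and `validateEnv` are hypothetical names, and plain `fmt.Errorf` errors stand in for `ErrValidationFailed`; the sketch also assumes the pin carries no extras containing digits before the version.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// firstNumber matches the first run of digits in a string.
var firstNumber = regexp.MustCompile(`\d+`)

// pinMajor (hypothetical name) extracts the major version from a pin such as
// "databricks-connect~=17.2.0".
func pinMajor(pin string) string {
	spec := strings.TrimPrefix(strings.ReplaceAll(pin, " ", ""), "databricks-connect")
	return firstNumber.FindString(spec)
}

// validateEnv (hypothetical name) checks the provisioned environment against
// the expected Python minor and the databricks-connect pin.
func validateEnv(wantMinor, gotMinor, pin, gotDBConnect string) error {
	if gotMinor != wantMinor {
		return fmt.Errorf("python %s does not match expected %s", gotMinor, wantMinor)
	}
	want, got := pinMajor(pin), firstNumber.FindString(gotDBConnect)
	if want == "" || want != got {
		return fmt.Errorf("databricks-connect %s does not satisfy %s", gotDBConnect, pin)
	}
	return nil
}

func main() {
	fmt.Println(validateEnv("3.12", "3.12", "databricks-connect~=17.2.0", "17.2.0"))
	fmt.Println(validateEnv("3.12", "3.11", "databricks-connect~=17.2.0", "17.2.0"))
	fmt.Println(validateEnv("3.12", "3.12", "databricks-connect~=17.2.0", "16.4.1"))
}
```

Comparing only the major version is deliberately loose: it catches a wrong release line without re-implementing PEP 440 compatible-release matching, which uv has already enforced during `uv sync`.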
+ +- [ ] **Step 5: Run the whole package.** + +Run: `go test ./libs/dbconnect/ -v` +Expected: all PASS. + +- [ ] **Step 6: Commit.** + +```bash +git add libs/dbconnect/pipeline.go libs/dbconnect/pipeline_test.go +git commit -m "Add dbconnect pipeline orchestrating all phases" +``` + +--- + +### Task 9: Wire the Cobra layer (flags, bundle/compute adapters, rendering) + +**Files:** +- Modify: `cmd/dbconnect/init.go`, `cmd/dbconnect/sync.go` +- Create: `cmd/dbconnect/output.go` +- Create: `cmd/dbconnect/compute.go` (SDK adapter implementing `dbconnect.ComputeClient`) + +**Interfaces:** +- Consumes: `dbconnect.Pipeline`, `dbconnect.ComputeClient`, `dbconnect.Result`, `root.OutputType`, `cmdctx.WorkspaceClient`, `root.TryConfigureBundle`. +- Produces: `func runPipeline(cmd *cobra.Command, mode dbconnect.Mode) error`; `type sdkCompute struct{ w *databricks.WorkspaceClient }` implementing `ComputeClient` via `w.Clusters.GetByClusterId` (→ `.SparkVersion`) and `w.Jobs.Get`. + +- [ ] **Step 1: Implement the shared `runPipeline`** in `init.go` (sync.go calls it with `ModeSync`). Read flags (`--cluster/--serverless/--job/--check/--constraint-source`), build `TargetFlags`, `ValidateTargetFlags`, resolve `ProjectDir` (cwd), `CacheDir` (`os.UserCacheDir()/databricks/dbconnect`), `ConstraintBaseURL` (flag → `env.Lookup(ctx, "DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE")` → default constant), `Compute: sdkCompute{w}`, `Bundle:` from `root.TryConfigureBundle` (map `ClusterId`/serverless mode → `BundleTarget`), `PM: newUvManager()`. Mark the three target flags mutually exclusive via `cmd.MarkFlagsMutuallyExclusive`. Call `p.Run(ctx)`, then `renderResult`. + +- [ ] **Step 2: Implement `output.go`** `renderResult(cmd, res, err)`: when `root.OutputType(cmd) == flags.OutputJSON`, `cmdio.Render(ctx, res)`; else print the phase headers + a success/plan summary mirroring the script (`=== Phase N ===` style via `cmdio.LogString`). 
On error, JSON path still renders `res` (with `res.Error` set); text path returns the wrapped error. + +- [ ] **Step 3: Implement `compute.go`** adapter. `GetClusterSparkVersion`: `d, err := w.Clusters.GetByClusterId(ctx, id)`; return `d.SparkVersion`. `GetJobSparkVersion`: `w.Jobs.Get`; inspect the job's first task/job-cluster for a `SparkVersion` or serverless. Add a comment if the job compute shape is non-obvious. + +- [ ] **Step 4: Build + manual smoke.** + +Run: `./task build && ./bin/databricks dbconnect init --serverless v4 --check --output json` +Expected: prints the JSON plan; no files changed. (Network to the constraint repo required; if offline, expect the `constraint_fetch_failed` code.) + +- [ ] **Step 5: Commit.** + +```bash +git add cmd/dbconnect/ +git commit -m "Wire dbconnect Cobra layer: flags, compute adapter, rendering" +``` + +--- + +### Task 10: Acceptance tests + +**Files:** +- Create: `acceptance/dbconnect/serverless-check/{script,output.txt}` +- Create: `acceptance/dbconnect/no-target/{script,output.txt}` +- Create: `acceptance/dbconnect/cluster-unsupported/{script,output.txt}` +- Create: `acceptance/dbconnect/flag-conflict/{script,output.txt}` +- Create: `acceptance/dbconnect/serverless-json/{script,output.txt}` +- Possibly: per-case `test.toml` to stub the constraint server + workspace (follow `acceptance/quickstart/` and any testserver-backed case). + +**Interfaces:** Consumes the built CLI via the acceptance harness `$CLI`. + +- [ ] **Step 1: Inspect an existing testserver-backed acceptance case** to copy the pattern for stubbing HTTP + the workspace client. + +Run: `ls acceptance/cmd/ acceptance/auth/ && sed -n '1,40p' acceptance/quickstart/script 2>/dev/null` +Expected: see how `script`, `output.txt`, and `test.toml` cooperate (env, replacements, stubbed server). + +- [ ] **Step 2: Write `flag-conflict`** (no network needed). 
`script`: + +``` +$CLI dbconnect init --cluster abc --serverless v4 +``` + +Generate golden: + +Run: `go test ./acceptance -run 'TestAccept/dbconnect/flag-conflict' -tail -test.v -update` +Expected: `output.txt` shows the mutual-exclusion error and non-zero exit. + +- [ ] **Step 3: Write `no-target`** (bundle with no compute selected, no flags). Provide a minimal `databricks.yml` fixture in the case dir; `script`: + +``` +$CLI dbconnect init +``` + +Run: `go test ./acceptance -run 'TestAccept/dbconnect/no-target' -tail -test.v -update` +Expected: `output.txt` shows the "No compute target is selected…" message. + +- [ ] **Step 4: Write `serverless-check`, `serverless-json`, `cluster-unsupported`** using the stubbed constraint server (point `DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE` at the test server via `test.toml`/`script`). `serverless-check` runs `--serverless v4 --check`; `serverless-json` adds `--output json`; `cluster-unsupported` points `--cluster` at a stubbed cluster whose DBR has no constraint dir (server 404) → `cluster_unsupported`/`constraint_fetch_failed`. + +Run: `go test ./acceptance -run 'TestAccept/dbconnect' -tail -test.v -update` +Expected: all goldens created. + +- [ ] **Step 5: Verify without `-update`.** + +Run: `go test ./acceptance -run 'TestAccept/dbconnect' -tail -test.v` +Expected: all PASS. + +- [ ] **Step 6: Commit.** + +```bash +git add acceptance/dbconnect/ +git commit -m "Add dbconnect acceptance tests" +``` + +--- + +### Task 11: Changelog, lint, fmt, full suite + +**Files:** +- Modify: `NEXT_CHANGELOG.md` + +- [ ] **Step 1: Add the changelog entry** under `### CLI` in `NEXT_CHANGELOG.md`: + +```markdown +* Add `databricks dbconnect init` and `databricks dbconnect sync` to provision a local Python environment (Python version, `databricks-connect` pin, and dependency constraints) matched to the selected Databricks compute target. 
+``` + +- [ ] **Step 2: Format changed files.** + +Run: `./task fmt-q` +Expected: no diff or auto-applied formatting only. + +- [ ] **Step 3: Lint changed files.** + +Run: `./task lint-q` +Expected: clean (fix anything reported). + +- [ ] **Step 4: Full test suite.** + +Run: `./task test` +Expected: all PASS. + +- [ ] **Step 5: Commit.** + +```bash +git add NEXT_CHANGELOG.md +git commit -m "Add changelog entry for dbconnect init/sync" +``` + +--- + +## Self-Review + +**Spec coverage:** +- Namespace + `init`/`sync` → Tasks 1, 9. ✓ +- Phase pipeline (0–8) → Task 8 (preflight folded into PM.EnsureAvailable, Task 7). ✓ +- Shared flags `--cluster/--serverless/--job/--check/--json` → Task 9; `--json` realized as global `--output json` per Global Constraints. ✓ +- Target resolution via API + three-state messaging + full cluster/job → Tasks 6, 9. ✓ +- Robust surgical TOML merge of the 3 managed regions → Task 5. ✓ +- Constraint fetch (configurable URL) + offline cache → Task 4. ✓ +- Structured `--json` schema + `--check` dry-run → Tasks 2, 8, 9. ✓ +- uv branch incl. pip-seed (Phase 7) rationale → Task 7. ✓ +- Acceptance cases (serverless happy/check, no-target, cluster-stub→unsupported, --check, --json) → Task 10. ✓ +- Unit tests for merge/envkey/target/constraints → Tasks 3–8. ✓ +- Changelog + lint + fmt → Task 11. ✓ +- "uv only now, pip/conda later" → PackageManager interface (Task 7), no pip/conda files. ✓ +- No new dependency → uses vendored BurntSushi (read-only) + stdlib. ✓ + +**Placeholder scan:** No "TBD"/"handle edge cases"/"similar to". Each code step shows code; each run step shows command + expected output. The one explicit investigation step (Task 10 Step 1) is a deliberate "inspect existing pattern" action, not a placeholder. 
+ +**Type consistency:** `MergeManaged`, `FetchConstraints`, `ResolveTarget`, `TargetFlags`, `BundleTarget`, `ComputeClient`, `PackageManager`, `Pipeline`, `Result`/`Plan`/`TargetInfo`/`ConstraintInfo` names are used identically across Tasks 2–9. `managedMarkerStart`/`managedMarkerEnd` consistent between Task 5 impl and tests. uv arg-helper names (`syncArgs`,`pythonInstallArgs`,`pipSeedArgs`) consistent between Task 7 impl and tests. + +**Known follow-ups (out of scope, noted for the implementer):** confirm the exact `databricks.yml` shape used to derive `BundleTarget` from `TryConfigureBundle` (cluster_id vs serverless mode) during Task 9; the SDK `Jobs.Get` compute shape may need a small comment per the repo's "non-obvious backend quirk" rule. From a0efbc19109132adca1660e49afc29fcc3c7db29 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Fri, 19 Jun 2026 17:15:32 +0200 Subject: [PATCH 03/33] Add dbconnect command namespace scaffold --- acceptance/dbconnect/help/output.txt | 33 ++++++++++++++++++++++++++++ acceptance/dbconnect/help/script | 2 ++ cmd/cmd.go | 2 ++ cmd/dbconnect/dbconnect.go | 20 +++++++++++++++++ cmd/dbconnect/init.go | 18 +++++++++++++++ cmd/dbconnect/sync.go | 18 +++++++++++++++ 6 files changed, 93 insertions(+) create mode 100644 acceptance/dbconnect/help/output.txt create mode 100644 acceptance/dbconnect/help/script create mode 100644 cmd/dbconnect/dbconnect.go create mode 100644 cmd/dbconnect/init.go create mode 100644 cmd/dbconnect/sync.go diff --git a/acceptance/dbconnect/help/output.txt b/acceptance/dbconnect/help/output.txt new file mode 100644 index 00000000000..4811f013d5f --- /dev/null +++ b/acceptance/dbconnect/help/output.txt @@ -0,0 +1,33 @@ +>>> $CLI dbconnect --help +Set up a local Python environment matched to your Databricks compute + +Usage: + databricks dbconnect [command] + +Available Commands: + init Create a fresh pyproject.toml and provision a matched .venv + sync Merge managed dependencies into an existing pyproject.toml 
and re-provision + +Flags: + -h, --help help for dbconnect + +Global Flags: + --debug enable debug logging + -o, --output type output type: text or json (default text) + -p, --profile string ~/.databrickscfg profile + -t, --target string bundle target to use (if applicable) + +>>> $CLI dbconnect init --help +Create a fresh pyproject.toml and provision a matched .venv + +Usage: + databricks dbconnect init [flags] + +Flags: + -h, --help help for init + +Global Flags: + --debug enable debug logging + -o, --output type output type: text or json (default text) + -p, --profile string ~/.databrickscfg profile + -t, --target string bundle target to use (if applicable) diff --git a/acceptance/dbconnect/help/script b/acceptance/dbconnect/help/script new file mode 100644 index 00000000000..962d7c3f64e --- /dev/null +++ b/acceptance/dbconnect/help/script @@ -0,0 +1,2 @@ +$CLI dbconnect --help +$CLI dbconnect init --help diff --git a/cmd/cmd.go b/cmd/cmd.go index 718d3a8fda3..47aef433604 100644 --- a/cmd/cmd.go +++ b/cmd/cmd.go @@ -15,6 +15,7 @@ import ( "github.com/databricks/cli/cmd/cache" "github.com/databricks/cli/cmd/completion" "github.com/databricks/cli/cmd/configure" + "github.com/databricks/cli/cmd/dbconnect" "github.com/databricks/cli/cmd/experimental" "github.com/databricks/cli/cmd/fs" "github.com/databricks/cli/cmd/labs" @@ -120,6 +121,7 @@ func New(ctx context.Context) *cobra.Command { cli.AddCommand(cache.New()) cli.AddCommand(experimental.New()) cli.AddCommand(psql.New()) + cli.AddCommand(dbconnect.New()) cli.AddCommand(configure.New()) cli.AddCommand(fs.New()) cli.AddCommand(labs.New(ctx)) diff --git a/cmd/dbconnect/dbconnect.go b/cmd/dbconnect/dbconnect.go new file mode 100644 index 00000000000..5ccaeb1ac44 --- /dev/null +++ b/cmd/dbconnect/dbconnect.go @@ -0,0 +1,20 @@ +package dbconnect + +import "github.com/spf13/cobra" + +// New returns the `dbconnect` command group. 
+func New() *cobra.Command { + cmd := &cobra.Command{ + Use: "dbconnect", + Short: "Set up a local Python environment matched to your Databricks compute", + GroupID: "development", + Long: `Set up a local Python environment matched to your Databricks compute target. + +Derives the Python version, databricks-connect version, and dependency +constraints from the selected compute (cluster, serverless, or job) so that +local resolution matches the Databricks runtime.`, + } + cmd.AddCommand(newInitCommand()) + cmd.AddCommand(newSyncCommand()) + return cmd +} diff --git a/cmd/dbconnect/init.go b/cmd/dbconnect/init.go new file mode 100644 index 00000000000..84e017a0e79 --- /dev/null +++ b/cmd/dbconnect/init.go @@ -0,0 +1,18 @@ +package dbconnect + +import ( + "github.com/databricks/cli/cmd/root" + "github.com/spf13/cobra" +) + +func newInitCommand() *cobra.Command { + cmd := &cobra.Command{ + Use: "init", + Short: "Create a fresh pyproject.toml and provision a matched .venv", + } + cmd.PreRunE = root.MustWorkspaceClient + cmd.RunE = func(cmd *cobra.Command, args []string) error { + return nil + } + return cmd +} diff --git a/cmd/dbconnect/sync.go b/cmd/dbconnect/sync.go new file mode 100644 index 00000000000..d2cbeb00de0 --- /dev/null +++ b/cmd/dbconnect/sync.go @@ -0,0 +1,18 @@ +package dbconnect + +import ( + "github.com/databricks/cli/cmd/root" + "github.com/spf13/cobra" +) + +func newSyncCommand() *cobra.Command { + cmd := &cobra.Command{ + Use: "sync", + Short: "Merge managed dependencies into an existing pyproject.toml and re-provision", + } + cmd.PreRunE = root.MustWorkspaceClient + cmd.RunE = func(cmd *cobra.Command, args []string) error { + return nil + } + return cmd +} From 050bf987d9f5faaccfc7cc49f92f16ec1e96a542 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 09:43:27 +0200 Subject: [PATCH 04/33] Fix dbconnect help acceptance golden Regenerate the golden from the built binary; the prior hand-written version showed the command Short text 
instead of the rendered Long help. Co-authored-by: Isaac --- acceptance/dbconnect/help/output.txt | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/acceptance/dbconnect/help/output.txt b/acceptance/dbconnect/help/output.txt index 4811f013d5f..1ef0c45aa82 100644 --- a/acceptance/dbconnect/help/output.txt +++ b/acceptance/dbconnect/help/output.txt @@ -1,5 +1,8 @@ ->>> $CLI dbconnect --help -Set up a local Python environment matched to your Databricks compute +Set up a local Python environment matched to your Databricks compute target. + +Derives the Python version, databricks-connect version, and dependency +constraints from the selected compute (cluster, serverless, or job) so that +local resolution matches the Databricks runtime. Usage: databricks dbconnect [command] @@ -17,7 +20,7 @@ Global Flags: -p, --profile string ~/.databrickscfg profile -t, --target string bundle target to use (if applicable) ->>> $CLI dbconnect init --help +Use "databricks dbconnect [command] --help" for more information about a command. Create a fresh pyproject.toml and provision a matched .venv Usage: From ee4da532463b173b03d81e100922b7fe216eb463 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 09:47:31 +0200 Subject: [PATCH 05/33] Add dbconnect result types and error codes --- libs/dbconnect/result.go | 125 ++++++++++++++++++++++++++++++++++ libs/dbconnect/result_test.go | 21 ++++++ 2 files changed, 146 insertions(+) create mode 100644 libs/dbconnect/result.go create mode 100644 libs/dbconnect/result_test.go diff --git a/libs/dbconnect/result.go b/libs/dbconnect/result.go new file mode 100644 index 00000000000..18d31a53edb --- /dev/null +++ b/libs/dbconnect/result.go @@ -0,0 +1,125 @@ +package dbconnect + +import "fmt" + +// Mode represents the dbconnect operation mode. +type Mode int + +const ( + ModeInit Mode = iota + ModeSync +) + +// String returns the string representation of the Mode. 
+func (m Mode) String() string { + switch m { + case ModeInit: + return "init" + case ModeSync: + return "sync" + default: + return "unknown" + } +} + +// ErrorCode represents a dbconnect error code. +type ErrorCode string + +const ( + ErrNoTargetSelected ErrorCode = "no_target_selected" + ErrClusterUnsupported ErrorCode = "cluster_unsupported" + ErrConstraintFetchFailed ErrorCode = "constraint_fetch_failed" + ErrMergeFailed ErrorCode = "merge_failed" + ErrProvisionFailed ErrorCode = "provision_failed" + ErrValidationFailed ErrorCode = "validation_failed" + ErrUvUnavailable ErrorCode = "uv_unavailable" +) + +// PipelineError represents an error during the dbconnect pipeline. +type PipelineError struct { + Code ErrorCode + Msg string + Err error +} + +// Error returns the error message. +func (e *PipelineError) Error() string { + if e.Err != nil { + return e.Msg + ": " + e.Err.Error() + } + return e.Msg +} + +// Unwrap returns the wrapped error. +func (e *PipelineError) Unwrap() error { + return e.Err +} + +// NewError creates a new PipelineError with the given code and error. +func NewError(code ErrorCode, err error, format string, args ...any) *PipelineError { + return &PipelineError{ + Code: code, + Msg: fmt.Sprintf(format, args...), + Err: err, + } +} + +// TargetInfo contains information about the target environment. +type TargetInfo struct { + Kind string `json:"kind"` + ClusterID string `json:"cluster_id"` + SparkVersion string `json:"spark_version"` + EnvKey string `json:"env_key"` + PythonVersion string `json:"python_version"` + Fallback *FallbackInfo `json:"fallback,omitempty"` +} + +// FallbackInfo contains fallback information. +type FallbackInfo struct { + Requested string `json:"requested"` + Resolved string `json:"resolved"` +} + +// ConstraintInfo contains constraint information. 
+type ConstraintInfo struct { + SourceURL string `json:"source_url"` + FromCache bool `json:"from_cache"` + RequiresPython string `json:"requires_python"` + DatabricksConnect string `json:"databricks_connect"` + ConstraintCount int `json:"constraint_count"` +} + +// Plan contains the deployment plan. +type Plan struct { + PyprojectPath string `json:"pyproject_path"` + BackupPath string `json:"backup_path"` + Diff string `json:"diff"` + ChangedRegions []string `json:"changed_regions"` +} + +// PhaseResult contains the result of a single phase. +type PhaseResult struct { + Name string `json:"name"` + Status string `json:"status"` + Detail string `json:"detail"` +} + +// ResultDetail contains the final result details. +type ResultDetail struct { + Status string `json:"status"` + VenvPath string `json:"venv_path"` + PythonVersion string `json:"python_version"` + DatabricksConnectInstalled string `json:"databricks_connect_installed"` +} + +// Result contains the overall result of the dbconnect operation. 
+type Result struct { + Mode string `json:"mode"` + Check bool `json:"check"` + Target *TargetInfo `json:"target,omitempty"` + Constraints *ConstraintInfo `json:"constraints,omitempty"` + Plan *Plan `json:"plan,omitempty"` + Phases []PhaseResult `json:"phases,omitempty"` + Result *ResultDetail `json:"result,omitempty"` + Error *PipelineError `json:"error,omitempty"` +} diff --git a/libs/dbconnect/result_test.go b/libs/dbconnect/result_test.go new file mode 100644 index 00000000000..53650ca3638 --- /dev/null +++ b/libs/dbconnect/result_test.go @@ -0,0 +1,21 @@ +package dbconnect + +import ( + "errors" + "testing" + + "github.com/stretchr/testify/assert" +) + +func TestPipelineErrorWrapsAndExposesCode(t *testing.T) { + base := errors.New("boom") + err := NewError(ErrConstraintFetchFailed, base, "fetch %s", "x") + assert.Equal(t, "fetch x: boom", err.Error()) + assert.Equal(t, ErrConstraintFetchFailed, err.Code) + assert.True(t, errors.Is(err, base)) +} + +func TestModeString(t *testing.T) { + assert.Equal(t, "init", ModeInit.String()) + assert.Equal(t, "sync", ModeSync.String()) +} From d7ad457b561ff08b879a3565ed172d7081348827 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 09:51:05 +0200 Subject: [PATCH 06/33] Address review: remove noise comments and Mode.String default - Remove noise doc comments from Error() and Unwrap() (idiomatic for standard interface methods) - Replace thin NewError doc comment with meaningful info about fmt.Sprintf and nil handling - Remove YAGNI default case from Mode.String(), use if/return instead Co-authored-by: Isaac --- libs/dbconnect/result.go | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-) diff --git a/libs/dbconnect/result.go b/libs/dbconnect/result.go index 18d31a53edb..dcf7fbc7831 100644 --- a/libs/dbconnect/result.go +++ b/libs/dbconnect/result.go @@ -12,14 +12,10 @@ const ( // String returns the string representation of the Mode. 
func (m Mode) String() string { - switch m { - case ModeInit: + if m == ModeInit { return "init" - case ModeSync: - return "sync" - default: - return "unknown" } + return "sync" } // ErrorCode represents a dbconnect error code. @@ -42,7 +38,6 @@ type PipelineError struct { Err error } -// Error returns the error message. func (e *PipelineError) Error() string { if e.Err != nil { return e.Msg + ": " + e.Err.Error() @@ -50,12 +45,12 @@ func (e *PipelineError) Error() string { return e.Msg } -// Unwrap returns the wrapped error. func (e *PipelineError) Unwrap() error { return e.Err } -// NewError creates a new PipelineError with the given code and error. +// NewError creates a new PipelineError. The message is formatted using fmt.Sprintf(format, args...), +// and err may be nil. func NewError(code ErrorCode, err error, format string, args ...any) *PipelineError { return &PipelineError{ Code: code, From 99f5c154b4106f92c1205bfeaab9db76c254a6c1 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 09:52:25 +0200 Subject: [PATCH 07/33] Add dbconnect envKey mapping and python-version parsing --- libs/dbconnect/envkey.go | 30 ++++++++++++++++++++++++++++++ libs/dbconnect/envkey_test.go | 34 ++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+) create mode 100644 libs/dbconnect/envkey.go create mode 100644 libs/dbconnect/envkey_test.go diff --git a/libs/dbconnect/envkey.go b/libs/dbconnect/envkey.go new file mode 100644 index 00000000000..e3428301344 --- /dev/null +++ b/libs/dbconnect/envkey.go @@ -0,0 +1,30 @@ +package dbconnect + +import ( + "fmt" + "regexp" + "strings" +) + +// EnvKeyForServerless returns the environment key for a serverless version. 
+func EnvKeyForServerless(version string) string { + // Strip leading 'v' or 'V' and lowercase + normalized := strings.TrimPrefix(strings.TrimPrefix(version, "v"), "V") + normalized = strings.ToLower(normalized) + return fmt.Sprintf("serverless/serverless-v%s", normalized) +} + +// EnvKeyForSparkVersion returns the environment key for a Spark version. +func EnvKeyForSparkVersion(sparkVersion string) string { + return "dbr/" + sparkVersion +} + +// PythonMinorFromRequires parses a PEP 440 requires-python string and extracts MAJOR.MINOR. +func PythonMinorFromRequires(requiresPython string) (string, error) { + re := regexp.MustCompile(`(\d+)\.(\d+)`) + match := re.FindStringSubmatch(requiresPython) + if match == nil { + return "", fmt.Errorf("cannot parse python version from %q", requiresPython) + } + return fmt.Sprintf("%s.%s", match[1], match[2]), nil +} diff --git a/libs/dbconnect/envkey_test.go b/libs/dbconnect/envkey_test.go new file mode 100644 index 00000000000..4c8e368d303 --- /dev/null +++ b/libs/dbconnect/envkey_test.go @@ -0,0 +1,34 @@ +package dbconnect + +import ( + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +func TestEnvKeyForServerless(t *testing.T) { + for _, in := range []string{"4", "v4", "V4"} { + assert.Equal(t, "serverless/serverless-v4", EnvKeyForServerless(in)) + } +} + +func TestEnvKeyForSparkVersion(t *testing.T) { + assert.Equal(t, "dbr/15.4.x-scala2.12", EnvKeyForSparkVersion("15.4.x-scala2.12")) +} + +func TestPythonMinorFromRequires(t *testing.T) { + cases := map[string]string{ + "==3.12.*": "3.12", + ">=3.12": "3.12", + "==3.12.3": "3.12", + "~=3.11": "3.11", + } + for in, want := range cases { + got, err := PythonMinorFromRequires(in) + require.NoError(t, err) + assert.Equal(t, want, got) + } + _, err := PythonMinorFromRequires("garbage") + assert.Error(t, err) +} From 28f06c87cd4ae3a49aba9d00f911564fb7a9ea0b Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 09:54:34 
+0200 Subject: [PATCH 08/33] Address review: simplify envKey normalization, hoist regex - Replace double TrimPrefix calls with simpler strings.TrimPrefix(strings.ToLower(version), "v") - Hoist pythonVersionRe to package-level var to avoid repeated compilation - Remove noise comment that restated the code Co-authored-by: Isaac --- libs/dbconnect/envkey.go | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/libs/dbconnect/envkey.go b/libs/dbconnect/envkey.go index e3428301344..726cbf1d97c 100644 --- a/libs/dbconnect/envkey.go +++ b/libs/dbconnect/envkey.go @@ -6,11 +6,11 @@ import ( "strings" ) +var pythonVersionRe = regexp.MustCompile(`(\d+)\.(\d+)`) + // EnvKeyForServerless returns the environment key for a serverless version. func EnvKeyForServerless(version string) string { - // Strip leading 'v' or 'V' and lowercase - normalized := strings.TrimPrefix(strings.TrimPrefix(version, "v"), "V") - normalized = strings.ToLower(normalized) + normalized := strings.TrimPrefix(strings.ToLower(version), "v") return fmt.Sprintf("serverless/serverless-v%s", normalized) } @@ -21,8 +21,7 @@ func EnvKeyForSparkVersion(sparkVersion string) string { // PythonMinorFromRequires parses a PEP 440 requires-python string and extracts MAJOR.MINOR. 
func PythonMinorFromRequires(requiresPython string) (string, error) { - re := regexp.MustCompile(`(\d+)\.(\d+)`) - match := re.FindStringSubmatch(requiresPython) + match := pythonVersionRe.FindStringSubmatch(requiresPython) if match == nil { return "", fmt.Errorf("cannot parse python version from %q", requiresPython) } From 563f415a194ee1182179a89fbb3e32ff64601a53 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 09:56:33 +0200 Subject: [PATCH 09/33] Add dbconnect constraint fetch with offline cache --- libs/dbconnect/constraints.go | 141 +++++++++++++++++++++++++++++ libs/dbconnect/constraints_test.go | 64 +++++++++++++ 2 files changed, 205 insertions(+) create mode 100644 libs/dbconnect/constraints.go create mode 100644 libs/dbconnect/constraints_test.go diff --git a/libs/dbconnect/constraints.go b/libs/dbconnect/constraints.go new file mode 100644 index 00000000000..2dd0eb24dcc --- /dev/null +++ b/libs/dbconnect/constraints.go @@ -0,0 +1,141 @@ +package dbconnect + +import ( + "context" + "fmt" + "io" + "net/http" + "os" + "path/filepath" + "strings" + + "github.com/BurntSushi/toml" + "github.com/databricks/cli/libs/log" +) + +// Constraints holds the parsed contents of a per-environment pyproject.toml. +type Constraints struct { + // EnvKey is the environment key used to look up the constraints. + EnvKey string + // SourceURL is the URL from which the constraints were fetched. + SourceURL string + // FromCache is true when the data came from the on-disk cache rather than a live fetch. + FromCache bool + // RequiresPython is the PEP 440 python version specifier from [project].requires-python. + RequiresPython string + // DatabricksConnect is the full dependency string for databricks-connect from [dependency-groups].dev. + DatabricksConnect string + // ConstraintDeps is the list of entries from [tool.uv].constraint-dependencies. 
+ ConstraintDeps []string +} + +// sanitizeEnvKey replaces path separators with double-underscores to produce a flat filename. +func sanitizeEnvKey(envKey string) string { + return strings.ReplaceAll(envKey, "/", "__") +} + +// FetchConstraints fetches the pyproject.toml for envKey from baseURL, caches it in cacheDir, +// and falls back to the cached copy on network or HTTP errors. +// +// Constraint files are hosted at: +// https://github.com/databricks/databricks-dbconnect-constraints +func FetchConstraints(ctx context.Context, baseURL, envKey, cacheDir string) (*Constraints, error) { + url := baseURL + "/" + envKey + "/pyproject.toml" + cachePath := filepath.Join(cacheDir, sanitizeEnvKey(envKey)+".toml") + + data, fetchErr := fetchURL(ctx, url) + if fetchErr == nil { + // Write the cache copy; ignore errors so a read-only cacheDir is non-fatal. + _ = os.WriteFile(cachePath, data, 0o600) + rp, dbc, deps, err := parseConstraints(data) + if err != nil { + return nil, fmt.Errorf("parse constraints for %s: %w", envKey, err) + } + return &Constraints{ + EnvKey: envKey, + SourceURL: url, + FromCache: false, + RequiresPython: rp, + DatabricksConnect: dbc, + ConstraintDeps: deps, + }, nil + } + + // Network or HTTP failure: attempt to serve from cache. + cached, readErr := os.ReadFile(cachePath) + if readErr != nil { + return nil, NewError(ErrConstraintFetchFailed, fetchErr, "fetch constraints for %s", envKey) + } + + log.Warnf(ctx, "constraint fetch failed, using cached copy: %v", fetchErr) + rp, dbc, deps, err := parseConstraints(cached) + if err != nil { + return nil, fmt.Errorf("parse cached constraints for %s: %w", envKey, err) + } + return &Constraints{ + EnvKey: envKey, + SourceURL: url, + FromCache: true, + RequiresPython: rp, + DatabricksConnect: dbc, + ConstraintDeps: deps, + }, nil +} + +// fetchURL performs an HTTP GET and returns the body bytes, or an error on non-2xx or transport failure. 
+func fetchURL(ctx context.Context, url string) ([]byte, error) { + req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil) + if err != nil { + return nil, fmt.Errorf("build request for %s: %w", url, err) + } + resp, err := http.DefaultClient.Do(req) + if err != nil { + return nil, fmt.Errorf("GET %s: %w", url, err) + } + defer resp.Body.Close() + if resp.StatusCode < 200 || resp.StatusCode >= 300 { + return nil, fmt.Errorf("GET %s: unexpected status %s", url, resp.Status) + } + data, err := io.ReadAll(resp.Body) + if err != nil { + return nil, fmt.Errorf("read body from %s: %w", url, err) + } + return data, nil +} + +// pyprojectTOML mirrors the pyproject.toml fields we care about. +type pyprojectTOML struct { + Project struct { + RequiresPython string `toml:"requires-python"` + } `toml:"project"` + DependencyGroups struct { + Dev []string `toml:"dev"` + } `toml:"dependency-groups"` + Tool struct { + UV struct { + ConstraintDependencies []string `toml:"constraint-dependencies"` + } `toml:"uv"` + } `toml:"tool"` +} + +// parseConstraints parses a pyproject.toml byte slice and extracts requires-python, +// the databricks-connect entry from dependency-groups.dev, and constraint-dependencies. +func parseConstraints(data []byte) (requiresPython, dbconnect string, deps []string, err error) { + var p pyprojectTOML + if err = toml.Unmarshal(data, &p); err != nil { + return "", "", nil, fmt.Errorf("unmarshal pyproject.toml: %w", err) + } + + requiresPython = p.Project.RequiresPython + + for _, entry := range p.DependencyGroups.Dev { + // Despace before matching so whitespace variants like "databricks-connect ~=17" also match. 
+ if strings.HasPrefix(strings.ReplaceAll(entry, " ", ""), "databricks-connect") { + dbconnect = entry + break + } + } + + deps = p.Tool.UV.ConstraintDependencies + return requiresPython, dbconnect, deps, nil +} diff --git a/libs/dbconnect/constraints_test.go b/libs/dbconnect/constraints_test.go new file mode 100644 index 00000000000..9a5275e0e6d --- /dev/null +++ b/libs/dbconnect/constraints_test.go @@ -0,0 +1,64 @@ +package dbconnect + +import ( + "net/http" + "net/http/httptest" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +const sampleToml = `[project] +requires-python = "==3.12.*" + +[dependency-groups] +dev = [ + "databricks-connect~=17.2.0", + "pytest~=8.0", +] + +[tool.uv] +constraint-dependencies = [ + "pydantic~=2.10.6", + "anyio~=4.6.2", +] +` + +func TestParseConstraints(t *testing.T) { + rp, dbc, deps, err := parseConstraints([]byte(sampleToml)) + require.NoError(t, err) + assert.Equal(t, "==3.12.*", rp) + assert.Equal(t, "databricks-connect~=17.2.0", dbc) + assert.Equal(t, []string{"pydantic~=2.10.6", "anyio~=4.6.2"}, deps) +} + +func TestFetchConstraintsHTTP(t *testing.T) { + srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + assert.Equal(t, "/serverless/serverless-v4/pyproject.toml", r.URL.Path) + _, _ = w.Write([]byte(sampleToml)) + })) + defer srv.Close() + + c, err := FetchConstraints(t.Context(), srv.URL, "serverless/serverless-v4", t.TempDir()) + require.NoError(t, err) + assert.False(t, c.FromCache) + assert.Equal(t, "databricks-connect~=17.2.0", c.DatabricksConnect) + assert.Len(t, c.ConstraintDeps, 2) +} + +func TestFetchConstraintsFallsBackToCache(t *testing.T) { + cacheDir := t.TempDir() + // First, a successful fetch populates the cache. 
+ good := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write([]byte(sampleToml)) + })) + _, err := FetchConstraints(t.Context(), good.URL, "serverless/serverless-v4", cacheDir) + require.NoError(t, err) + good.Close() + + // Now the server is down; fetch must serve the cache. + c, err := FetchConstraints(t.Context(), good.URL, "serverless/serverless-v4", cacheDir) + require.NoError(t, err) + assert.True(t, c.FromCache) +} From f76db1fb19bc948a7f8fdfaa64ff7ebd7754d3ea Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:02:24 +0200 Subject: [PATCH 10/33] Add surgical formatting-preserving pyproject.toml merge Co-authored-by: Isaac --- libs/dbconnect/merge.go | 290 +++++++++++++++++++++++++++++++++++ libs/dbconnect/merge_test.go | 147 ++++++++++++++++++ 2 files changed, 437 insertions(+) create mode 100644 libs/dbconnect/merge.go create mode 100644 libs/dbconnect/merge_test.go diff --git a/libs/dbconnect/merge.go b/libs/dbconnect/merge.go new file mode 100644 index 00000000000..5855f7ee7ff --- /dev/null +++ b/libs/dbconnect/merge.go @@ -0,0 +1,290 @@ +package dbconnect + +import ( + "fmt" + "regexp" + "strings" +) + +// managedMarkerStart and managedMarkerEnd bracket the region of pyproject.toml that +// "databricks dbconnect" owns. Everything between them is rewritten on each merge; +// everything outside is preserved byte-for-byte. +const ( + managedMarkerStart = "# managed by databricks dbconnect — do not edit" + managedMarkerEnd = "# end managed by databricks dbconnect" +) + +// Region names reported back to the caller via MergeManaged's regions return value. +const ( + regionRequiresPython = "requires-python" + regionDatabricksConnect = "databricks-connect" + regionToolUv = "tool.uv.constraint-dependencies" +) + +var ( + // tableHeaderRe matches a TOML table header line such as "[project]" or "[tool.uv]". 
+ tableHeaderRe = regexp.MustCompile(`^\s*\[[^\]]+\]\s*$`) + // requiresPythonRe captures the leading whitespace of a requires-python assignment so it + // can be preserved when the value is replaced. + requiresPythonRe = regexp.MustCompile(`^(\s*)requires-python\s*=`) +) + +// MergeManaged applies the three managed transforms to target, preserving every other +// byte (comments, ordering, whitespace). It returns the merged bytes and the list of +// regions that actually changed. The operation is idempotent: feeding its own output +// back in produces identical bytes. +func MergeManaged(target []byte, c Constraints) (merged []byte, regions []string, err error) { + s := string(target) + + // Detect and normalize line endings. We process on "\n" and restore "\r\n" on exit. + crlf := strings.Contains(s, "\r\n") + if crlf { + s = strings.ReplaceAll(s, "\r\n", "\n") + } + + lines := strings.Split(s, "\n") + + lines, rpChanged := mergeRequiresPython(lines, c.RequiresPython) + if rpChanged { + regions = append(regions, regionRequiresPython) + } + + lines, dbcChanged := mergeDatabricksConnect(lines, c.DatabricksConnect) + if dbcChanged { + regions = append(regions, regionDatabricksConnect) + } + + lines, uvChanged := mergeToolUv(lines, c.ConstraintDeps) + if uvChanged { + regions = append(regions, regionToolUv) + } + + out := strings.Join(lines, "\n") + if crlf { + out = strings.ReplaceAll(out, "\n", "\r\n") + } + return []byte(out), regions, nil +} + +// tableBounds returns the line index of the header matching name (e.g. "[project]") and +// the index of the first line after the table body (the next table header or EOF). If the +// table is absent, found is false. 
+func tableBounds(lines []string, name string) (header, end int, found bool) { + header = -1 + for i, line := range lines { + if strings.TrimSpace(line) == name { + header = i + break + } + } + if header == -1 { + return -1, -1, false + } + end = len(lines) + for i := header + 1; i < len(lines); i++ { + if tableHeaderRe.MatchString(lines[i]) { + end = i + break + } + } + return header, end, true +} + +// mergeRequiresPython replaces the value of requires-python within [project], preserving +// the line's leading whitespace. If the key is absent, it is inserted directly under the +// [project] header. Returns whether the line slice changed. +func mergeRequiresPython(lines []string, value string) ([]string, bool) { + header, end, found := tableBounds(lines, "[project]") + if !found { + return lines, false + } + + want := func(indent string) string { + return fmt.Sprintf(`%srequires-python = "%s"`, indent, value) + } + + for i := header + 1; i < end; i++ { + m := requiresPythonRe.FindStringSubmatch(lines[i]) + if m == nil { + continue + } + replacement := want(m[1]) + if lines[i] == replacement { + return lines, false + } + lines[i] = replacement + return lines, true + } + + // Key absent: insert directly under the [project] header. + inserted := make([]string, 0, len(lines)+1) + inserted = append(inserted, lines[:header+1]...) + inserted = append(inserted, want("")) + inserted = append(inserted, lines[header+1:]...) + return inserted, true +} + +// dbconnectLineRe captures, for a line holding a databricks-connect dependency element: +// (1) the leading whitespace, and (2) any trailing comma (with optional trailing space), +// so that indentation and comma style are preserved when the quoted token is replaced. +var dbconnectLineRe = regexp.MustCompile(`^(\s*)"databricks-connect[^"]*"(\s*,?\s*)$`) + +// mergeDatabricksConnect replaces the databricks-connect element inside +// [dependency-groups].dev.
It handles both the multi-line array form (one element per +// line) and the single-line array form (dev = ["databricks-connect~=..."]). +func mergeDatabricksConnect(lines []string, value string) ([]string, bool) { + header, end, found := tableBounds(lines, "[dependency-groups]") + if !found { + return lines, false + } + + for i := header + 1; i < end; i++ { + // Multi-line element form: a standalone line holding only the quoted token. + if m := dbconnectLineRe.FindStringSubmatch(lines[i]); m != nil { + replacement := fmt.Sprintf(`%s"%s"%s`, m[1], value, m[2]) + if lines[i] == replacement { + return lines, false + } + lines[i] = replacement + return lines, true + } + // Single-line array form: replace the quoted databricks-connect token in place. + if strings.Contains(lines[i], `"databricks-connect`) { + replaced := dbconnectTokenRe.ReplaceAllString(lines[i], `"`+value+`"`) + if replaced == lines[i] { + return lines, false + } + lines[i] = replaced + return lines, true + } + } + return lines, false +} + +// dbconnectTokenRe matches a quoted databricks-connect element anywhere in a line, used +// for the single-line array form. +var dbconnectTokenRe = regexp.MustCompile(`"databricks-connect[^"]*"`) + +// mergeToolUv rewrites the managed [tool.uv] constraint-dependencies block. If a +// marker-bracketed block already exists, its contents are replaced in place. Otherwise any +// plain [tool.uv] table is removed and a fresh marker-bracketed block is appended at EOF. +func mergeToolUv(lines []string, deps []string) ([]string, bool) { + block := renderToolUvBlock(deps) + + start, stop, found := markerBounds(lines) + if found { + existing := lines[start : stop+1] + if equalLines(existing, block) { + return lines, false + } + out := make([]string, 0, len(lines)-(stop-start+1)+len(block)) + out = append(out, lines[:start]...) + out = append(out, block...) + out = append(out, lines[stop+1:]...) 
+ return out, true + } + + // No managed block: drop any plain [tool.uv] table we may have written previously, + // then append a fresh managed block at EOF. + if header, end, ok := tableBounds(lines, "[tool.uv]"); ok { + out := make([]string, 0, len(lines)) + out = append(out, lines[:header]...) + out = append(out, lines[end:]...) + lines = out + } + + lines = appendManagedBlock(lines, block) + return lines, true +} + +// markerBounds returns the indices of the managed marker start and end lines, if present. +func markerBounds(lines []string) (start, stop int, found bool) { + start, stop = -1, -1 + for i, line := range lines { + if strings.TrimSpace(line) == managedMarkerStart { + start = i + break + } + } + if start == -1 { + return -1, -1, false + } + for i := start + 1; i < len(lines); i++ { + if strings.TrimSpace(lines[i]) == managedMarkerEnd { + stop = i + break + } + } + if stop == -1 { + return -1, -1, false + } + return start, stop, true +} + +// renderToolUvBlock builds the marker-bracketed [tool.uv] block lines (no surrounding +// blank lines). +func renderToolUvBlock(deps []string) []string { + block := []string{ + managedMarkerStart, + "[tool.uv]", + "constraint-dependencies = [", + } + for _, d := range deps { + block = append(block, fmt.Sprintf(" %q,", d)) + } + block = append(block, "]", managedMarkerEnd) + return block +} + +// appendManagedBlock appends block to lines, ensuring exactly one blank line separates it +// from prior content and the file ends with a single trailing newline. +func appendManagedBlock(lines []string, block []string) []string { + // strings.Split on a trailing "\n" leaves a final empty element; drop trailing empty + // lines so we control the spacing precisely. + for len(lines) > 0 && lines[len(lines)-1] == "" { + lines = lines[:len(lines)-1] + } + + out := make([]string, 0, len(lines)+len(block)+2) + out = append(out, lines...) 
+ if len(out) > 0 { + out = append(out, "") // exactly one blank line before the managed block + } + out = append(out, block...) + out = append(out, "") // trailing newline after final join + return out +} + +// equalLines reports whether two line slices are identical. +func equalLines(a, b []string) bool { + if len(a) != len(b) { + return false + } + for i := range a { + if a[i] != b[i] { + return false + } + } + return true +} + +// RenderFreshPyproject produces a complete managed pyproject.toml for a project that has +// none, with [project], [dependency-groups].dev (carrying the databricks-connect pin), and +// the marker-bracketed [tool.uv] constraint block. +func RenderFreshPyproject(projectName string, c Constraints) []byte { + var b strings.Builder + b.WriteString("[project]\n") + b.WriteString(fmt.Sprintf("name = %q\n", projectName)) + b.WriteString(fmt.Sprintf("requires-python = %q\n", c.RequiresPython)) + b.WriteString("\n") + b.WriteString("[dependency-groups]\n") + b.WriteString("dev = [\n") + b.WriteString(fmt.Sprintf(" %q,\n", c.DatabricksConnect)) + b.WriteString("]\n") + b.WriteString("\n") + for _, line := range renderToolUvBlock(c.ConstraintDeps) { + b.WriteString(line) + b.WriteString("\n") + } + return []byte(b.String()) +} diff --git a/libs/dbconnect/merge_test.go b/libs/dbconnect/merge_test.go new file mode 100644 index 00000000000..712283351c2 --- /dev/null +++ b/libs/dbconnect/merge_test.go @@ -0,0 +1,147 @@ +package dbconnect + +import ( + "strings" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +func testConstraints() Constraints { + return Constraints{ + RequiresPython: "==3.12.*", + DatabricksConnect: "databricks-connect~=17.2.0", + ConstraintDeps: []string{"pydantic~=2.10.6", "anyio~=4.6.2"}, + } +} + +func TestMergeReplacesRequiresPythonPreservingComments(t *testing.T) { + in := []byte(`[project] +name = "demo" +# keep this comment +requires-python = ">=3.10" + +[dependency-groups] 
+dev = [ + "databricks-connect~=16.0.0", + "pytest~=8.0", +] +`) + out, regions, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + assert.Contains(t, string(out), `requires-python = "==3.12.*"`) + assert.Contains(t, string(out), "# keep this comment") + assert.Contains(t, string(out), `"databricks-connect~=17.2.0",`) + assert.Contains(t, string(out), `"pytest~=8.0",`) + assert.Contains(t, regions, "requires-python") + assert.Contains(t, regions, "databricks-connect") + assert.Contains(t, regions, "tool.uv.constraint-dependencies") + assert.Contains(t, string(out), "pydantic~=2.10.6") +} + +func TestMergeIsIdempotent(t *testing.T) { + in := []byte(`[project] +requires-python = ">=3.10" + +[dependency-groups] +dev = [ + "databricks-connect~=16.0.0", +] +`) + once, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + twice, _, err := MergeManaged(once, testConstraints()) + require.NoError(t, err) + assert.Equal(t, string(once), string(twice)) +} + +func TestMergeInsertsRequiresPythonWhenMissing(t *testing.T) { + in := []byte(`[project] +name = "demo" + +[dependency-groups] +dev = ["databricks-connect~=16.0.0"] +`) + out, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + assert.Contains(t, string(out), `requires-python = "==3.12.*"`) +} + +func TestMergeReplacesExistingManagedToolUvBlock(t *testing.T) { + in := []byte(`[project] +requires-python = ">=3.10" + +[dependency-groups] +dev = ["databricks-connect~=16.0.0"] + +` + managedMarkerStart + ` +[tool.uv] +constraint-dependencies = [ + "stale~=1.0.0", +] +` + managedMarkerEnd + ` +`) + out, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + assert.NotContains(t, string(out), "stale~=1.0.0") + assert.Contains(t, string(out), "pydantic~=2.10.6") + // Only one managed block remains. 
+ assert.Equal(t, 1, countOccurrences(string(out), managedMarkerStart)) +} + +func TestMergePreservesCRLF(t *testing.T) { + in := []byte("[project]\r\nrequires-python = \">=3.10\"\r\n\r\n[dependency-groups]\r\ndev = [\"databricks-connect~=16.0.0\"]\r\n") + out, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + assert.Contains(t, string(out), "\r\n") + assert.Contains(t, string(out), `requires-python = "==3.12.*"`) +} + +func TestMergeReplacesSingleLineDevArray(t *testing.T) { + in := []byte(`[project] +requires-python = ">=3.10" + +[dependency-groups] +dev = ["databricks-connect~=16.0.0", "pytest~=8.0"] +`) + out, regions, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + // Sibling element and single-line array layout are preserved. + assert.Contains(t, string(out), `dev = ["databricks-connect~=17.2.0", "pytest~=8.0"]`) + assert.Contains(t, regions, "databricks-connect") +} + +func TestMergePreservesMultiLineTrailingComma(t *testing.T) { + in := []byte(`[project] +requires-python = ">=3.10" + +[dependency-groups] +dev = [ + "databricks-connect~=16.0.0", +] +`) + out, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + // The trailing comma on the managed element is preserved. + assert.Contains(t, string(out), ` "databricks-connect~=17.2.0",`) +} + +func TestRenderFreshPyproject(t *testing.T) { + out := RenderFreshPyproject("demo", testConstraints()) + s := string(out) + assert.Contains(t, s, `name = "demo"`) + assert.Contains(t, s, `requires-python = "==3.12.*"`) + assert.Contains(t, s, `"databricks-connect~=17.2.0",`) + assert.Contains(t, s, managedMarkerStart) + assert.Contains(t, s, managedMarkerEnd) + assert.Contains(t, s, "pydantic~=2.10.6") + // A fresh render is itself a no-op under MergeManaged (already fully managed). 
+ merged, _, err := MergeManaged(out, testConstraints()) + require.NoError(t, err) + assert.Equal(t, s, string(merged)) +} + +func countOccurrences(s, substr string) int { + return strings.Count(s, substr) +} From 8a0fa12a76b6a39b1d135b87f7a35933d0a5ff04 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:11:59 +0200 Subject: [PATCH 11/33] Fix tool.uv merge to preserve user-authored keys Co-authored-by: Isaac --- libs/dbconnect/merge.go | 66 +++++++++++++++++++++++++++++--- libs/dbconnect/merge_test.go | 74 ++++++++++++++++++++++++++++++++++++ 2 files changed, 134 insertions(+), 6 deletions(-) diff --git a/libs/dbconnect/merge.go b/libs/dbconnect/merge.go index 5855f7ee7ff..d57d241a4ce 100644 --- a/libs/dbconnect/merge.go +++ b/libs/dbconnect/merge.go @@ -184,19 +184,73 @@ func mergeToolUv(lines []string, deps []string) ([]string, bool) { return out, true } - // No managed block: drop any plain [tool.uv] table we may have written previously, - // then append a fresh managed block at EOF. + // No managed block: reconcile any plain [tool.uv] table, then append a fresh managed + // block at EOF. If the table is effectively ours (its only meaningful key is + // constraint-dependencies, from a pre-marker run), drop it whole. Otherwise the table + // holds user-authored keys, so we preserve it and strip only our constraint-dependencies. if header, end, ok := tableBounds(lines, "[tool.uv]"); ok { - out := make([]string, 0, len(lines)) - out = append(out, lines[:header]...) - out = append(out, lines[end:]...) - lines = out + if toolUvHasOnlyConstraintDeps(lines, header, end) { + out := make([]string, 0, len(lines)) + out = append(out, lines[:header]...) + out = append(out, lines[end:]...) 
+ lines = out + } else { + lines = removeConstraintDeps(lines, header, end) + } } lines = appendManagedBlock(lines, block) return lines, true } +// constraintDepsRe matches the start of a constraint-dependencies assignment within a +// [tool.uv] table, capturing its leading whitespace. +var constraintDepsRe = regexp.MustCompile(`^\s*constraint-dependencies\s*=`) + +// toolUvHasOnlyConstraintDeps reports whether the [tool.uv] table body spanning +// (header, end) contains no meaningful key other than constraint-dependencies. Blank lines +// and comment-only lines are ignored when deciding "only". +func toolUvHasOnlyConstraintDeps(lines []string, header, end int) bool { + for i := header + 1; i < end; i++ { + trimmed := strings.TrimSpace(lines[i]) + if trimmed == "" || strings.HasPrefix(trimmed, "#") { + continue + } + if !constraintDepsRe.MatchString(lines[i]) { + return false + } + } + return true +} + +// removeConstraintDeps strips a constraint-dependencies key from the [tool.uv] table body +// spanning (header, end), leaving the table header and all other user keys in place. It +// handles both the single-line array form and the multi-line array form (the value spans +// several lines until a line whose trimmed content is "]"). +func removeConstraintDeps(lines []string, header, end int) []string { + for i := header + 1; i < end; i++ { + if !constraintDepsRe.MatchString(lines[i]) { + continue + } + last := i + // Multi-line array form: extend through the closing "]" line. The single-line form + // already contains the closing bracket, so this loop does not advance. + if !strings.Contains(lines[i], "]") { + for j := i + 1; j < end; j++ { + last = j + if strings.TrimSpace(lines[j]) == "]" { + break + } + } + } + out := make([]string, 0, len(lines)-(last-i+1)) + out = append(out, lines[:i]...) + out = append(out, lines[last+1:]...) + return out + } + return lines +} + // markerBounds returns the indices of the managed marker start and end lines, if present. 
func markerBounds(lines []string) (start, stop int, found bool) { start, stop = -1, -1 diff --git a/libs/dbconnect/merge_test.go b/libs/dbconnect/merge_test.go index 712283351c2..d4a576dc507 100644 --- a/libs/dbconnect/merge_test.go +++ b/libs/dbconnect/merge_test.go @@ -96,6 +96,80 @@ func TestMergePreservesCRLF(t *testing.T) { require.NoError(t, err) assert.Contains(t, string(out), "\r\n") assert.Contains(t, string(out), `requires-python = "==3.12.*"`) + // Merging the CRLF output again must be byte-identical (idempotent under \r\n). + twice, _, err := MergeManaged(out, testConstraints()) + require.NoError(t, err) + assert.Equal(t, string(out), string(twice)) +} + +func TestMergePreservesUserToolUvKeys(t *testing.T) { + in := []byte(`[project] +requires-python = ">=3.10" + +[dependency-groups] +dev = ["databricks-connect~=16.0.0"] + +[tool.uv] +package = true +dev-dependencies = ["ruff"] +`) + out, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + s := string(out) + assert.Contains(t, s, "[tool.uv]") + assert.Contains(t, s, "package = true") + assert.Contains(t, s, `dev-dependencies = ["ruff"]`) + assert.Contains(t, s, managedMarkerStart) + assert.Contains(t, s, "pydantic~=2.10.6") + // The user's keys must live outside the managed marker block. + start := strings.Index(s, managedMarkerStart) + require.GreaterOrEqual(t, start, 0) + assert.NotContains(t, s[start:], "package = true") + assert.NotContains(t, s[start:], `dev-dependencies = ["ruff"]`) +} + +func TestMergeStripsStaleConstraintDepsFromUserToolUv(t *testing.T) { + in := []byte(`[project] +requires-python = ">=3.10" + +[dependency-groups] +dev = ["databricks-connect~=16.0.0"] + +[tool.uv] +package = true +constraint-dependencies = ["old~=1.0"] +`) + out, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + s := string(out) + assert.Contains(t, s, "package = true") + // The stale constraint must be gone from the user table; the managed block has the new deps. 
+ assert.NotContains(t, s, "old~=1.0") + assert.Contains(t, s, "pydantic~=2.10.6") + // Merge-twice is byte-identical. + twice, _, err := MergeManaged(out, testConstraints()) + require.NoError(t, err) + assert.Equal(t, string(out), string(twice)) +} + +func TestMergeRemovesOwnedOnlyToolUv(t *testing.T) { + in := []byte(`[project] +requires-python = ">=3.10" + +[dependency-groups] +dev = ["databricks-connect~=16.0.0"] + +[tool.uv] +constraint-dependencies = ["old~=1.0"] +`) + out, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + s := string(out) + assert.NotContains(t, s, "old~=1.0") + assert.Contains(t, s, "pydantic~=2.10.6") + // The plain table was removed and replaced by exactly one managed block. + assert.Equal(t, 1, countOccurrences(s, "[tool.uv]")) + assert.Equal(t, 1, countOccurrences(s, managedMarkerStart)) } func TestMergeReplacesSingleLineDevArray(t *testing.T) { From ffe111aec733996a11ac572bb06ffeea89c3a490 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:15:07 +0200 Subject: [PATCH 12/33] Detect multi-line owned-only [tool.uv] to avoid stray empty table Co-authored-by: Isaac --- libs/dbconnect/merge.go | 10 ++++++++++ libs/dbconnect/merge_test.go | 27 +++++++++++++++++++++++++++ 2 files changed, 37 insertions(+) diff --git a/libs/dbconnect/merge.go b/libs/dbconnect/merge.go index d57d241a4ce..86205b88826 100644 --- a/libs/dbconnect/merge.go +++ b/libs/dbconnect/merge.go @@ -219,6 +219,16 @@ func toolUvHasOnlyConstraintDeps(lines []string, header, end int) bool { if !constraintDepsRe.MatchString(lines[i]) { return false } + // Multi-line array form: skip the continuation lines through the closing "]" + // so the whole managed key counts as ignorable (mirrors removeConstraintDeps). + // The single-line form already holds the "]" and does not advance i. 
+ if !strings.Contains(lines[i], "]") { + for i++; i < end; i++ { + if strings.TrimSpace(lines[i]) == "]" { + break + } + } + } } return true } diff --git a/libs/dbconnect/merge_test.go b/libs/dbconnect/merge_test.go index d4a576dc507..caf8ce3ad43 100644 --- a/libs/dbconnect/merge_test.go +++ b/libs/dbconnect/merge_test.go @@ -172,6 +172,33 @@ constraint-dependencies = ["old~=1.0"] assert.Equal(t, 1, countOccurrences(s, managedMarkerStart)) } +func TestMergeRemovesOwnedOnlyMultiLineToolUv(t *testing.T) { + in := []byte(`[project] +requires-python = ">=3.10" + +[dependency-groups] +dev = ["databricks-connect~=16.0.0"] + +[tool.uv] +constraint-dependencies = [ + "old~=1.0", +] +`) + out, _, err := MergeManaged(in, testConstraints()) + require.NoError(t, err) + s := string(out) + assert.NotContains(t, s, "old~=1.0") + assert.Contains(t, s, "pydantic~=2.10.6") + // The multi-line owned-only table was removed whole, leaving exactly one + // [tool.uv] (inside the managed block) and no stray empty header. + assert.Equal(t, 1, countOccurrences(s, "[tool.uv]")) + assert.Equal(t, 1, countOccurrences(s, managedMarkerStart)) + // Merge-twice is byte-identical. 
+ twice, _, err := MergeManaged(out, testConstraints()) + require.NoError(t, err) + assert.Equal(t, string(out), string(twice)) +} + func TestMergeReplacesSingleLineDevArray(t *testing.T) { in := []byte(`[project] requires-python = ">=3.10" From 801b6c1add237b9d1ef7731ee501779ab8316d0e Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:17:13 +0200 Subject: [PATCH 13/33] Add dbconnect target resolution with three-state messaging --- libs/dbconnect/target.go | 131 ++++++++++++++++++++++++++++++++++ libs/dbconnect/target_test.go | 55 ++++++++++++++ 2 files changed, 186 insertions(+) create mode 100644 libs/dbconnect/target.go create mode 100644 libs/dbconnect/target_test.go diff --git a/libs/dbconnect/target.go b/libs/dbconnect/target.go new file mode 100644 index 00000000000..861f914402b --- /dev/null +++ b/libs/dbconnect/target.go @@ -0,0 +1,131 @@ +package dbconnect + +import ( + "context" + "fmt" + "strings" +) + +// ComputeClient is a narrow seam over the SDK so tests can stub it. +// The real adapter is wired in Task 9. +type ComputeClient interface { + // GetClusterSparkVersion returns the Spark version string for a cluster. + GetClusterSparkVersion(ctx context.Context, clusterID string) (string, error) + // GetJobSparkVersion returns either a Spark version (isServerless=false) or a + // serverless marker (isServerless=true) for a job, plus a recorded version string. + GetJobSparkVersion(ctx context.Context, jobID string) (sparkVersion string, isServerless bool, version string, err error) +} + +// TargetFlags holds the mutually-exclusive compute target flags from the CLI. +type TargetFlags struct { + Cluster string + Serverless string + Job string +} + +// BundleTarget is the three-state result of reading the bundle's configured +// target. Selected=false means nothing was configured. 
+type BundleTarget struct { + ClusterID string + Serverless bool + Selected bool +} + +// ValidateTargetFlags returns an error if more than one of the three flags is set. +// Cobra marks them mutually exclusive too; this guards the library path. +func ValidateTargetFlags(f TargetFlags) error { + var set []string + if f.Cluster != "" { + set = append(set, "--cluster") + } + if f.Serverless != "" { + set = append(set, "--serverless") + } + if f.Job != "" { + set = append(set, "--job") + } + if len(set) > 1 { + return fmt.Errorf("flags %s are mutually exclusive; specify at most one", strings.Join(set, " and ")) + } + return nil +} + +// ResolveTarget resolves the compute target using ordered precedence: +// --cluster flag → --serverless flag → --job flag → bundle target. +// PythonVersion is left empty; it is filled later from constraint data. +func ResolveTarget(ctx context.Context, f TargetFlags, c ComputeClient, bt BundleTarget) (*TargetInfo, error) { + if f.Cluster != "" { + v, err := c.GetClusterSparkVersion(ctx, f.Cluster) + if err != nil { + return nil, fmt.Errorf("resolving cluster %s: %w", f.Cluster, err) + } + return &TargetInfo{ + Kind: "cluster", + ClusterID: f.Cluster, + EnvKey: EnvKeyForSparkVersion(v), + }, nil + } + + if f.Serverless != "" { + return &TargetInfo{ + Kind: "serverless", + EnvKey: EnvKeyForServerless(f.Serverless), + }, nil + } + + if f.Job != "" { + _, isServerless, version, err := c.GetJobSparkVersion(ctx, f.Job) + if err != nil { + return nil, fmt.Errorf("resolving job %s: %w", f.Job, err) + } + if isServerless { + // Default to v4 when the job is serverless; the serverless env version + // is not recorded in the bundle/project (documented stand-in from the + // original script). 
+ v := version + if v == "" { + v = "v4" + } + return &TargetInfo{ + Kind: "serverless", + EnvKey: EnvKeyForServerless(v), + }, nil + } + return &TargetInfo{ + Kind: "cluster", + EnvKey: EnvKeyForSparkVersion(version), + }, nil + } + + // Fall back to bundle target. + if !bt.Selected { + return nil, NewError(ErrNoTargetSelected, nil, + "No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job.") + } + + if bt.Serverless { + // Default to serverless-v4: the serverless env version is not recorded + // in the bundle/project (documented stand-in from the original script). + return &TargetInfo{ + Kind: "serverless", + EnvKey: EnvKeyForServerless("v4"), + }, nil + } + + if bt.ClusterID != "" { + v, err := c.GetClusterSparkVersion(ctx, bt.ClusterID) + if err != nil { + return nil, fmt.Errorf("resolving bundle cluster %s: %w", bt.ClusterID, err) + } + return &TargetInfo{ + Kind: "cluster", + ClusterID: bt.ClusterID, + EnvKey: EnvKeyForSparkVersion(v), + }, nil + } + + // Bundle target is selected but has neither serverless nor a cluster ID — + // treat this the same as nothing selected so the user gets a clear message. + return nil, NewError(ErrNoTargetSelected, nil, + "No compute target is selected. 
Select a cluster or serverless target, or pass --cluster/--serverless/--job.") +} diff --git a/libs/dbconnect/target_test.go b/libs/dbconnect/target_test.go new file mode 100644 index 00000000000..3d67e24657d --- /dev/null +++ b/libs/dbconnect/target_test.go @@ -0,0 +1,55 @@ +package dbconnect + +import ( + "context" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +type stubCompute struct { + clusterVersion string + clusterErr error +} + +func (s stubCompute) GetClusterSparkVersion(_ context.Context, _ string) (string, error) { + return s.clusterVersion, s.clusterErr +} +func (s stubCompute) GetJobSparkVersion(_ context.Context, _ string) (string, bool, string, error) { + return "", false, "", nil +} + +func TestResolveServerlessFlag(t *testing.T) { + ti, err := ResolveTarget(t.Context(), TargetFlags{Serverless: "v4"}, stubCompute{}, BundleTarget{}) + require.NoError(t, err) + assert.Equal(t, "serverless", ti.Kind) + assert.Equal(t, "serverless/serverless-v4", ti.EnvKey) +} + +func TestResolveClusterFlag(t *testing.T) { + c := stubCompute{clusterVersion: "15.4.x-scala2.12"} + ti, err := ResolveTarget(t.Context(), TargetFlags{Cluster: "abc"}, c, BundleTarget{}) + require.NoError(t, err) + assert.Equal(t, "cluster", ti.Kind) + assert.Equal(t, "dbr/15.4.x-scala2.12", ti.EnvKey) + assert.Equal(t, "abc", ti.ClusterID) +} + +func TestResolveBundleNothingSelected(t *testing.T) { + _, err := ResolveTarget(t.Context(), TargetFlags{}, stubCompute{}, BundleTarget{Selected: false}) + var pe *PipelineError + require.ErrorAs(t, err, &pe) + assert.Equal(t, ErrNoTargetSelected, pe.Code) +} + +func TestResolveBundleServerless(t *testing.T) { + ti, err := ResolveTarget(t.Context(), TargetFlags{}, stubCompute{}, BundleTarget{Selected: true, Serverless: true}) + require.NoError(t, err) + assert.Equal(t, "serverless/serverless-v4", ti.EnvKey) +} + +func TestValidateTargetFlagsMutuallyExclusive(t *testing.T) { + assert.Error(t, 
ValidateTargetFlags(TargetFlags{Cluster: "a", Serverless: "v4"})) + assert.NoError(t, ValidateTargetFlags(TargetFlags{Cluster: "a"})) +} From 98a1affe718b8d0a9416eeeb069e7a983d7c68d1 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:19:51 +0200 Subject: [PATCH 14/33] gofmt dbconnect target.go field alignment Co-authored-by: Isaac --- libs/dbconnect/target.go | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/libs/dbconnect/target.go b/libs/dbconnect/target.go index 861f914402b..16f8ddadd62 100644 --- a/libs/dbconnect/target.go +++ b/libs/dbconnect/target.go @@ -18,9 +18,9 @@ type ComputeClient interface { // TargetFlags holds the mutually-exclusive compute target flags from the CLI. type TargetFlags struct { - Cluster string + Cluster string Serverless string - Job string + Job string } // BundleTarget is the three-state result of reading the bundle's configured From a72fb4132d27944441ce572ddbf7d080ce7eba14 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:20:06 +0200 Subject: [PATCH 15/33] gofmt dbconnect result.go field alignment Co-authored-by: Isaac --- libs/dbconnect/result.go | 50 ++++++++++++++++++++-------------------- 1 file changed, 25 insertions(+), 25 deletions(-) diff --git a/libs/dbconnect/result.go b/libs/dbconnect/result.go index dcf7fbc7831..44a5b2af34a 100644 --- a/libs/dbconnect/result.go +++ b/libs/dbconnect/result.go @@ -61,12 +61,12 @@ func NewError(code ErrorCode, err error, format string, args ...any) *PipelineEr // TargetInfo contains information about the target environment. 
type TargetInfo struct { - Kind string `json:"kind"` - ClusterID string `json:"cluster_id"` - SparkVersion string `json:"spark_version"` - EnvKey string `json:"env_key"` - PythonVersion string `json:"python_version"` - Fallback *FallbackInfo `json:"fallback,omitempty"` + Kind string `json:"kind"` + ClusterID string `json:"cluster_id"` + SparkVersion string `json:"spark_version"` + EnvKey string `json:"env_key"` + PythonVersion string `json:"python_version"` + Fallback *FallbackInfo `json:"fallback,omitempty"` } // FallbackInfo contains fallback information. @@ -77,18 +77,18 @@ type FallbackInfo struct { // ConstraintInfo contains constraint information. type ConstraintInfo struct { - SourceURL string `json:"source_url"` - FromCache bool `json:"from_cache"` - RequiresPython string `json:"requires_python"` - DatabricksConnect string `json:"databricks_connect"` - ConstraintCount int `json:"constraint_count"` + SourceURL string `json:"source_url"` + FromCache bool `json:"from_cache"` + RequiresPython string `json:"requires_python"` + DatabricksConnect string `json:"databricks_connect"` + ConstraintCount int `json:"constraint_count"` } // Plan contains the deployment plan. type Plan struct { - PyprojectPath string `json:"pyproject_path"` - BackupPath string `json:"backup_path"` - Diff string `json:"diff"` + PyprojectPath string `json:"pyproject_path"` + BackupPath string `json:"backup_path"` + Diff string `json:"diff"` ChangedRegions []string `json:"changed_regions"` } @@ -101,20 +101,20 @@ type PhaseResult struct { // ResultDetail contains the final result details. type ResultDetail struct { - Status string `json:"status"` - VenvPath string `json:"venv_path"` - PythonVersion string `json:"python_version"` + Status string `json:"status"` + VenvPath string `json:"venv_path"` + PythonVersion string `json:"python_version"` DatabricksConnectInstalled string `json:"databricks_connect_installed"` } // Result contains the overall result of the dbconnect operation. 
type Result struct { - Mode string `json:"mode"` - Check bool `json:"check"` - Target *TargetInfo `json:"target,omitempty"` - Constraints *ConstraintInfo `json:"constraints,omitempty"` - Plan *Plan `json:"plan,omitempty"` - Phases []PhaseResult `json:"phases,omitempty"` - Result *ResultDetail `json:"result,omitempty"` - Error *PipelineError `json:"error,omitempty"` + Mode string `json:"mode"` + Check bool `json:"check"` + Target *TargetInfo `json:"target,omitempty"` + Constraints *ConstraintInfo `json:"constraints,omitempty"` + Plan *Plan `json:"plan,omitempty"` + Phases []PhaseResult `json:"phases,omitempty"` + Result *ResultDetail `json:"result,omitempty"` + Error *PipelineError `json:"error,omitempty"` } From 5f5461feb5c3eea8952728edccc5e926d0e86447 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:22:40 +0200 Subject: [PATCH 16/33] Add PackageManager interface and uv implementation Co-authored-by: Isaac --- libs/dbconnect/pkgmanager.go | 31 +++++++ libs/dbconnect/uv.go | 166 +++++++++++++++++++++++++++++++++++ libs/dbconnect/uv_test.go | 27 ++++++ 3 files changed, 224 insertions(+) create mode 100644 libs/dbconnect/pkgmanager.go create mode 100644 libs/dbconnect/uv.go create mode 100644 libs/dbconnect/uv_test.go diff --git a/libs/dbconnect/pkgmanager.go b/libs/dbconnect/pkgmanager.go new file mode 100644 index 00000000000..84d7c04b47d --- /dev/null +++ b/libs/dbconnect/pkgmanager.go @@ -0,0 +1,31 @@ +package dbconnect + +import "context" + +// PackageManager manages the Python environment for a dbconnect project. +type PackageManager interface { + // Name returns the name of the package manager (e.g. "uv"). + Name() string + + // EnsureAvailable ensures the package manager binary is present, installing + // it if necessary. It returns the version string on success. + EnsureAvailable(ctx context.Context) (version string, err error) + + // EnsurePython ensures the requested Python minor version (e.g. 
"3.12") is + // available via the package manager. + EnsurePython(ctx context.Context, minor string) error + + // Provision installs the project dependencies inside projectDir. + Provision(ctx context.Context, projectDir string) error + + // PostProvision seeds pip into the virtual environment inside projectDir. + // This step is required because VS Code's ms-python.vscode-python-envs + // extension falls back to `python -m pip list` when its `uv --version` + // probe fails on the GUI PATH; uv venvs contain no pip; and `uv sync` + // strips pip, so seeding must run after every sync. + PostProvision(ctx context.Context, projectDir string) error + + // Validate reads the Python minor version and databricks-connect version + // from the virtual environment inside projectDir. + Validate(ctx context.Context, projectDir string) (pythonVersion, dbconnectVersion string, err error) +} diff --git a/libs/dbconnect/uv.go b/libs/dbconnect/uv.go new file mode 100644 index 00000000000..07916d45dbc --- /dev/null +++ b/libs/dbconnect/uv.go @@ -0,0 +1,166 @@ +package dbconnect + +import ( + "context" + "os" + "os/exec" + "path/filepath" + "strings" + + "github.com/databricks/cli/libs/env" + "github.com/databricks/cli/libs/process" +) + +// uvManager implements PackageManager using the uv tool. +// https://docs.astral.sh/uv/ +type uvManager struct { + bin string +} + +// newUvManager returns a uvManager whose binary path is resolved lazily via +// EnsureAvailable. +func newUvManager() *uvManager { + return &uvManager{} +} + +// Name returns "uv". +func (m *uvManager) Name() string { + return "uv" +} + +// EnsureAvailable discovers or installs uv and records the binary path. +// It runs the official installer when uv is not found on the PATH or in the +// standard candidate locations. 
+// https://docs.astral.sh/uv/getting-started/installation/ +func (m *uvManager) EnsureAvailable(ctx context.Context) (string, error) { + bin, err := discoverUv(ctx) + if err != nil { + // Install uv using the official installer script. + // https://astral.sh/uv/install.sh + _, installErr := process.Background(ctx, []string{"sh", "-c", "curl -LsSf https://astral.sh/uv/install.sh | sh"}) + if installErr != nil { + return "", NewError(ErrUvUnavailable, installErr, "uv installation failed") + } + bin, err = discoverUv(ctx) + if err != nil { + return "", err + } + } + m.bin = bin + + version, err := process.Background(ctx, []string{m.bin, "version"}) + if err != nil { + return "", NewError(ErrProvisionFailed, err, "uv version check failed") + } + return strings.TrimSpace(version), nil +} + +// EnsurePython installs the requested Python minor version via uv. +func (m *uvManager) EnsurePython(ctx context.Context, minor string) error { + args := append([]string{m.bin}, m.pythonInstallArgs(minor)...) + _, err := process.Background(ctx, args) + if err != nil { + return NewError(ErrProvisionFailed, err, "uv python install %s failed", minor) + } + return nil +} + +// Provision runs `uv sync` inside projectDir to install project dependencies. +func (m *uvManager) Provision(ctx context.Context, projectDir string) error { + args := append([]string{m.bin}, m.syncArgs()...) + _, err := process.Background(ctx, args, process.WithDir(projectDir)) + if err != nil { + return NewError(ErrProvisionFailed, err, "uv sync failed") + } + return nil +} + +// PostProvision seeds pip into the project's virtual environment. +// +// VS Code's ms-python.vscode-python-envs extension falls back to +// `python -m pip list` when its `uv --version` probe fails on the GUI PATH. +// uv virtual environments do not include pip by default, and `uv sync` strips +// pip if it was previously present. 
Seeding pip after every sync ensures the +// VS Code integration works correctly regardless of how the environment was +// activated. +func (m *uvManager) PostProvision(ctx context.Context, projectDir string) error { + venvPython := filepath.Join(projectDir, ".venv", "bin", "python") + args := append([]string{m.bin}, m.pipSeedArgs(venvPython)...) + _, err := process.Background(ctx, args, process.WithDir(projectDir)) + if err != nil { + return NewError(ErrProvisionFailed, err, "uv pip seed failed") + } + return nil +} + +// Validate reads the Python minor version and databricks-connect package +// version from the project's virtual environment. +func (m *uvManager) Validate(ctx context.Context, projectDir string) (string, string, error) { + pyCode := `import sys, importlib.metadata; print(f"{sys.version_info.major}.{sys.version_info.minor}"); print(importlib.metadata.version("databricks-connect"))` + out, err := process.Background(ctx, + []string{m.bin, "run", "--no-project", "python", "-c", pyCode}, + process.WithDir(projectDir), + ) + if err != nil { + return "", "", NewError(ErrValidationFailed, err, "uv run python validation failed") + } + lines := strings.Split(strings.TrimSpace(out), "\n") + if len(lines) < 2 { + return "", "", NewError(ErrValidationFailed, nil, "unexpected output from uv run: %q", out) + } + return strings.TrimSpace(lines[0]), strings.TrimSpace(lines[1]), nil +} + +// syncArgs returns the argument slice for `uv sync` (without the binary). +func (m *uvManager) syncArgs() []string { + return []string{"sync"} +} + +// pythonInstallArgs returns the argument slice for `uv python install `. +func (m *uvManager) pythonInstallArgs(minor string) []string { + return []string{"python", "install", minor} +} + +// pipSeedArgs returns the argument slice for seeding pip into the venv. 
+func (m *uvManager) pipSeedArgs(venvPython string) []string { + return []string{"pip", "install", "pip", "--python", venvPython} +} + +// discoverUv searches for the uv binary on PATH and in well-known install +// locations. It returns NewError(ErrUvUnavailable, ...) if uv is not found. +// +// Candidate locations follow the uv installer defaults: +// https://docs.astral.sh/uv/getting-started/installation/ +// XDG_BIN_HOME is a uv convention; the XDG spec itself only names ~/.local/bin: +// https://specifications.freedesktop.org/basedir-spec/latest/ +func discoverUv(ctx context.Context) (string, error) { + // Prefer PATH lookup first; it respects user customisation. + if p, err := exec.LookPath("uv"); err == nil { + return p, nil + } + + home, _ := os.UserHomeDir() + + // XDG_BIN_HOME defaults to $HOME/.local/bin when unset. + xdgBinHome, _ := env.Lookup(ctx, "XDG_BIN_HOME") + + candidates := []string{ + filepath.Join(home, ".local", "bin", "uv"), + filepath.Join(xdgBinHome, "uv"), + "/opt/homebrew/bin/uv", + "/usr/local/bin/uv", + } + + for _, c := range candidates { + if !filepath.IsAbs(c) { + // Skip relative paths: filepath.Join drops empty elements, so an empty home or xdgBinHome yields "uv" or ".local/bin/uv".
+ continue + } + if _, err := os.Stat(c); err == nil { + return c, nil + } + } + + return "", NewError(ErrUvUnavailable, nil, + "uv not found on PATH or in well-known locations (%s)", strings.Join(candidates, ", ")) +} diff --git a/libs/dbconnect/uv_test.go b/libs/dbconnect/uv_test.go new file mode 100644 index 00000000000..6552cd60fa3 --- /dev/null +++ b/libs/dbconnect/uv_test.go @@ -0,0 +1,27 @@ +package dbconnect + +import ( + "os" + "path/filepath" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +func TestUvArgs(t *testing.T) { + m := &uvManager{bin: "uv"} + assert.Equal(t, []string{"sync"}, m.syncArgs()) + assert.Equal(t, []string{"python", "install", "3.12"}, m.pythonInstallArgs("3.12")) + assert.Equal(t, []string{"pip", "install", "pip", "--python", "/p/.venv/bin/python"}, m.pipSeedArgs("/p/.venv/bin/python")) +} + +func TestDiscoverUvFindsBinOnPath(t *testing.T) { + dir := t.TempDir() + bin := filepath.Join(dir, "uv") + require.NoError(t, os.WriteFile(bin, []byte("#!/bin/sh\n"), 0o755)) + t.Setenv("PATH", dir) + got, err := discoverUv(t.Context()) + require.NoError(t, err) + assert.Equal(t, bin, got) +} From 6d88a131b50e65375c5c45dceb6ae698e334fa79 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:25:37 +0200 Subject: [PATCH 17/33] Make uv venv python path cross-platform; explain --no-project --- libs/dbconnect/uv.go | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-) diff --git a/libs/dbconnect/uv.go b/libs/dbconnect/uv.go index 07916d45dbc..a3be9684263 100644 --- a/libs/dbconnect/uv.go +++ b/libs/dbconnect/uv.go @@ -5,6 +5,7 @@ import ( "os" "os/exec" "path/filepath" + "runtime" "strings" "github.com/databricks/cli/libs/env" @@ -75,6 +76,15 @@ func (m *uvManager) Provision(ctx context.Context, projectDir string) error { return nil } +// venvPython returns the path to the virtualenv's Python interpreter, +// accounting for the Windows (Scripts/python.exe) vs Unix 
(bin/python) layout. +func venvPython(projectDir string) string { + if runtime.GOOS == "windows" { + return filepath.Join(projectDir, ".venv", "Scripts", "python.exe") + } + return filepath.Join(projectDir, ".venv", "bin", "python") +} + // PostProvision seeds pip into the project's virtual environment. // // VS Code's ms-python.vscode-python-envs extension falls back to @@ -84,8 +94,7 @@ func (m *uvManager) Provision(ctx context.Context, projectDir string) error { // VS Code integration works correctly regardless of how the environment was // activated. func (m *uvManager) PostProvision(ctx context.Context, projectDir string) error { - venvPython := filepath.Join(projectDir, ".venv", "bin", "python") - args := append([]string{m.bin}, m.pipSeedArgs(venvPython)...) + args := append([]string{m.bin}, m.pipSeedArgs(venvPython(projectDir))...) _, err := process.Background(ctx, args, process.WithDir(projectDir)) if err != nil { return NewError(ErrProvisionFailed, err, "uv pip seed failed") @@ -97,6 +106,8 @@ func (m *uvManager) PostProvision(ctx context.Context, projectDir string) error // version from the project's virtual environment. func (m *uvManager) Validate(ctx context.Context, projectDir string) (string, string, error) { pyCode := `import sys, importlib.metadata; print(f"{sys.version_info.major}.{sys.version_info.minor}"); print(importlib.metadata.version("databricks-connect"))` + // --no-project runs the interpreter from the created .venv without re-resolving/syncing + // the project's declared dependencies, so validation observes exactly what was installed. 
out, err := process.Background(ctx, []string{m.bin, "run", "--no-project", "python", "-c", pyCode}, process.WithDir(projectDir), From 37e3c87915f12ea90aab3c88ad1790b0f0e5063b Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:33:35 +0200 Subject: [PATCH 18/33] Add dbconnect pipeline orchestrating all phases Co-authored-by: Isaac --- libs/dbconnect/pipeline.go | 431 ++++++++++++++++++++++++++++++++ libs/dbconnect/pipeline_test.go | 200 +++++++++++++++ 2 files changed, 631 insertions(+) create mode 100644 libs/dbconnect/pipeline.go create mode 100644 libs/dbconnect/pipeline_test.go diff --git a/libs/dbconnect/pipeline.go b/libs/dbconnect/pipeline.go new file mode 100644 index 00000000000..0e1ab3d913e --- /dev/null +++ b/libs/dbconnect/pipeline.go @@ -0,0 +1,431 @@ +package dbconnect + +import ( + "context" + "errors" + "fmt" + "os" + "path/filepath" + "strings" + + "github.com/hexops/gotextdiff" + "github.com/hexops/gotextdiff/myers" + "github.com/hexops/gotextdiff/span" +) + +// Pipeline orchestrates the dbconnect init/sync phases against a project directory. +type Pipeline struct { + Mode Mode + Check bool + ProjectDir string + ConstraintBaseURL string + CacheDir string + Flags TargetFlags + Bundle BundleTarget + Compute ComputeClient + PM PackageManager +} + +// Run executes all pipeline phases in order and returns a fully populated Result. +// On a phase error, Result.Error is set and the same error is also returned. +func (p *Pipeline) Run(ctx context.Context) (*Result, error) { + res := &Result{ + Mode: p.Mode.String(), + Check: p.Check, + } + + // Phase 0: ensure the package manager is available. 
+ phase := PhaseResult{Name: "preflight"} + version, err := p.PM.EnsureAvailable(ctx) + if err != nil { + phase.Status = "failed" + phase.Detail = err.Error() + res.Phases = append(res.Phases, phase) + pe := NewError(ErrUvUnavailable, err, "%s unavailable", p.PM.Name()) + res.Error = pe + return res, pe + } + phase.Status = "ok" + phase.Detail = p.PM.Name() + " " + version + res.Phases = append(res.Phases, phase) + + // Phase 1: resolve the compute target. + target, err := p.resolve(ctx, res) + if err != nil { + return res, err + } + + // Phase 2: fetch constraints. + c, err := p.fetch(ctx, res, target) + if err != nil { + return res, err + } + + // Phase 2b: fill in the python version on the target info from the constraints. + pyMinor, err := PythonMinorFromRequires(c.RequiresPython) + if err != nil { + pe := NewError(ErrConstraintFetchFailed, err, "cannot determine python version from constraints") + res.Error = pe + return res, pe + } + target.PythonVersion = pyMinor + + // Phase 3: compute the merge plan (in-memory, no disk writes yet). + plan, mergedBytes, err := p.mergePlan(ctx, res, c) + if err != nil { + return res, err + } + res.Plan = plan + + // Check mode stops here — phases 4+ mutate disk. + if p.Check { + return res, nil + } + + // Phase 4: write the merged content to disk (mode-specific backup/restore). + if err := p.applyMerge(ctx, res, mergedBytes); err != nil { + return res, err + } + + // Phase 5: ensure the required Python version is installed. + if err := p.ensurePython(ctx, res, pyMinor); err != nil { + return res, err + } + + // Phase 6: provision the virtual environment. + if err := p.provision(ctx, res); err != nil { + return res, err + } + + // Phase 7: post-provision (pip seed). + if err := p.postProvision(ctx, res); err != nil { + return res, err + } + + // Phase 8: validate the environment. 
+ if err := p.validate(ctx, res, pyMinor, c.DatabricksConnect); err != nil { + return res, err + } + + return res, nil +} + +// resolve runs ResolveTarget and appends a phase result. +func (p *Pipeline) resolve(ctx context.Context, res *Result) (*TargetInfo, error) { + phase := PhaseResult{Name: "resolve"} + target, err := ResolveTarget(ctx, p.Flags, p.Compute, p.Bundle) + if err != nil { + phase.Status = "failed" + phase.Detail = err.Error() + res.Phases = append(res.Phases, phase) + var pe *PipelineError + if !errors.As(err, &pe) { + pe = NewError(ErrNoTargetSelected, err, "target resolution failed") + } + res.Error = pe + return nil, pe + } + phase.Status = "ok" + phase.Detail = fmt.Sprintf("kind=%s envKey=%s", target.Kind, target.EnvKey) + res.Phases = append(res.Phases, phase) + res.Target = target + return target, nil +} + +// fetch fetches constraints for the resolved target and appends a phase result. +func (p *Pipeline) fetch(ctx context.Context, res *Result, target *TargetInfo) (*Constraints, error) { + phase := PhaseResult{Name: "fetch"} + c, err := FetchConstraints(ctx, p.ConstraintBaseURL, target.EnvKey, p.CacheDir) + if err != nil { + phase.Status = "failed" + phase.Detail = err.Error() + res.Phases = append(res.Phases, phase) + var pe *PipelineError + if !errors.As(err, &pe) { + pe = NewError(ErrConstraintFetchFailed, err, "fetch constraints failed") + } + res.Error = pe + return nil, pe + } + phase.Status = "ok" + phase.Detail = fmt.Sprintf("source=%s fromCache=%v", c.SourceURL, c.FromCache) + res.Phases = append(res.Phases, phase) + res.Constraints = &ConstraintInfo{ + SourceURL: c.SourceURL, + FromCache: c.FromCache, + RequiresPython: c.RequiresPython, + DatabricksConnect: c.DatabricksConnect, + ConstraintCount: len(c.ConstraintDeps), + } + return c, nil +} + +// pyprojectPath returns the path to pyproject.toml in the project directory. 
+func (p *Pipeline) pyprojectPath() string { + return filepath.Join(p.ProjectDir, "pyproject.toml") +} + +// backupPath returns the path to the pyproject.toml backup file. +func (p *Pipeline) backupPath() string { + return filepath.Join(p.ProjectDir, "pyproject.toml.bak") +} + +// mergePlan computes the merged pyproject.toml bytes (without writing to disk) +// and builds the Plan with a unified diff. +func (p *Pipeline) mergePlan(_ context.Context, res *Result, c *Constraints) (*Plan, []byte, error) { + phase := PhaseResult{Name: "plan"} + pyproject := p.pyprojectPath() + backup := p.backupPath() + + // Determine base bytes for the merge. For sync with a backup, the backup is + // the canonical base so the merge starts from the original unmanaged state. + var baseBytes []byte + if p.Mode == ModeSync { + if data, err := os.ReadFile(backup); err == nil { + baseBytes = data + } + } + + // Fall back to the current pyproject.toml if no base was found above. + if baseBytes == nil { + if data, err := os.ReadFile(pyproject); err == nil { + baseBytes = data + } + } + + var mergedBytes []byte + var changedRegions []string + + if baseBytes == nil { + // No existing pyproject.toml — render a fresh one. + // Extract the project name from the directory name as a reasonable default. + projectName := filepath.Base(p.ProjectDir) + mergedBytes = RenderFreshPyproject(projectName, *c) + changedRegions = []string{regionRequiresPython, regionDatabricksConnect, regionToolUv} + } else { + var err error + mergedBytes, changedRegions, err = MergeManaged(baseBytes, *c) + if err != nil { + pe := NewError(ErrMergeFailed, err, "merge managed regions failed") + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return nil, nil, pe + } + } + + // Build a unified diff for the plan. 
+ oldStr := "" + newStr := string(mergedBytes) + oldName := "pyproject.toml" + newName := "pyproject.toml" + if baseBytes != nil { + oldStr = string(baseBytes) + oldName = "pyproject.toml" + newName = "pyproject.toml.new" + } + edits := myers.ComputeEdits(span.URIFromPath(oldName), oldStr, newStr) + diff := fmt.Sprint(gotextdiff.ToUnified(oldName, newName, oldStr, edits)) + + plan := &Plan{ + PyprojectPath: pyproject, + BackupPath: backup, + Diff: diff, + ChangedRegions: changedRegions, + } + + phase.Status = "ok" + phase.Detail = fmt.Sprintf("changed=%s", strings.Join(changedRegions, ",")) + res.Phases = append(res.Phases, phase) + return plan, mergedBytes, nil +} + +// applyMerge writes the merged bytes to disk, performing the mode-specific +// backup or restore first. +func (p *Pipeline) applyMerge(_ context.Context, res *Result, mergedBytes []byte) error { + phase := PhaseResult{Name: "apply"} + pyproject := p.pyprojectPath() + backup := p.backupPath() + + switch p.Mode { + case ModeInit: + // Back up only if a pyproject.toml already exists. + if _, err := os.Stat(pyproject); err == nil { + if err := copyFile(pyproject, backup); err != nil { + pe := NewError(ErrMergeFailed, err, "backup pyproject.toml failed") + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } + } + case ModeSync: + if _, err := os.Stat(backup); err != nil { + // No backup yet — create one from the current pyproject.toml. + if _, statErr := os.Stat(pyproject); statErr == nil { + if err := copyFile(pyproject, backup); err != nil { + pe := NewError(ErrMergeFailed, err, "backup pyproject.toml failed") + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } + } + } + // When a backup already exists, mergePlan already used it as the base — no + // additional restore step is needed here. 
+ } + + if err := os.WriteFile(pyproject, mergedBytes, 0o644); err != nil { + pe := NewError(ErrMergeFailed, err, "write pyproject.toml failed") + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } + + phase.Status = "ok" + res.Phases = append(res.Phases, phase) + return nil +} + +// ensurePython ensures the required Python version is installed. +func (p *Pipeline) ensurePython(ctx context.Context, res *Result, pyMinor string) error { + phase := PhaseResult{Name: "ensure-python"} + if err := p.PM.EnsurePython(ctx, pyMinor); err != nil { + pe := NewError(ErrProvisionFailed, err, "ensure python %s failed", pyMinor) + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } + phase.Status = "ok" + phase.Detail = pyMinor + res.Phases = append(res.Phases, phase) + return nil +} + +// provision installs project dependencies into the virtual environment. +func (p *Pipeline) provision(ctx context.Context, res *Result) error { + phase := PhaseResult{Name: "provision"} + if err := p.PM.Provision(ctx, p.ProjectDir); err != nil { + pe := NewError(ErrProvisionFailed, err, "provision failed") + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } + phase.Status = "ok" + res.Phases = append(res.Phases, phase) + return nil +} + +// postProvision seeds pip into the virtual environment. 
+func (p *Pipeline) postProvision(ctx context.Context, res *Result) error { + phase := PhaseResult{Name: "post-provision"} + if err := p.PM.PostProvision(ctx, p.ProjectDir); err != nil { + pe := NewError(ErrProvisionFailed, err, "post-provision failed") + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } + phase.Status = "ok" + res.Phases = append(res.Phases, phase) + return nil +} + +// validate reads the Python and databricks-connect versions from the venv and +// populates Result.Result. +func (p *Pipeline) validate(ctx context.Context, res *Result, expectedPyMinor, dbcPin string) error { + phase := PhaseResult{Name: "validate"} + pyVer, dbcVer, err := p.PM.Validate(ctx, p.ProjectDir) + if err != nil { + pe := NewError(ErrValidationFailed, err, "validation failed") + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } + + // Assert the installed Python minor matches the target. + if pyVer != expectedPyMinor { + pe := NewError(ErrValidationFailed, nil, + "python version mismatch: want %s, got %s", expectedPyMinor, pyVer) + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } + + // Assert the installed databricks-connect major matches the pin's major. + // dbcPin is e.g. "databricks-connect~=17.2.0"; dbcVer is e.g. "17.2.0". 
+ pinMajor := dbcMajorFromPin(dbcPin) + installedMajor := majorVersion(dbcVer) + if pinMajor != "" && installedMajor != "" && pinMajor != installedMajor { + pe := NewError(ErrValidationFailed, nil, + "databricks-connect major version mismatch: want %s.x, got %s", pinMajor, dbcVer) + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } + + phase.Status = "ok" + phase.Detail = fmt.Sprintf("python=%s databricks-connect=%s", pyVer, dbcVer) + res.Phases = append(res.Phases, phase) + + venvPath := filepath.Join(p.ProjectDir, ".venv") + res.Result = &ResultDetail{ + Status: "success", + VenvPath: venvPath, + PythonVersion: pyVer, + DatabricksConnectInstalled: dbcVer, + } + return nil +} + +// dbcMajorFromPin extracts the major version number from a databricks-connect +// pin string such as "databricks-connect~=17.2.0". Returns "" if unparseable. +func dbcMajorFromPin(pin string) string { + // Strip the "databricks-connect" prefix and any operator (~=, ==, >=, etc.). + // The first digit sequence is the major version. + for i, c := range pin { + if c >= '0' && c <= '9' { + return majorVersion(pin[i:]) + } + } + return "" +} + +// majorVersion returns the major portion of a version string (digits before the +// first dot), e.g. "17" from "17.2.0". Returns "" if not parseable. +func majorVersion(v string) string { + dot := strings.Index(v, ".") + if dot <= 0 { + return "" + } + return v[:dot] +} + +// copyFile copies src to dst, creating or overwriting dst. 
+func copyFile(src, dst string) error { + data, err := os.ReadFile(src) + if err != nil { + return fmt.Errorf("read %s: %w", src, err) + } + if err := os.WriteFile(dst, data, 0o644); err != nil { + return fmt.Errorf("write %s: %w", dst, err) + } + return nil +} diff --git a/libs/dbconnect/pipeline_test.go b/libs/dbconnect/pipeline_test.go new file mode 100644 index 00000000000..28bed3f089d --- /dev/null +++ b/libs/dbconnect/pipeline_test.go @@ -0,0 +1,200 @@ +package dbconnect + +import ( + "context" + "net/http" + "net/http/httptest" + "os" + "path/filepath" + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" +) + +type fakePM struct{ py, dbc string } + +func (fakePM) Name() string { return "fake" } +func (fakePM) EnsureAvailable(context.Context) (string, error) { return "fake 1.0", nil } +func (fakePM) EnsurePython(context.Context, string) error { return nil } +func (fakePM) Provision(context.Context, string) error { return nil } +func (fakePM) PostProvision(context.Context, string) error { return nil } +func (f fakePM) Validate(context.Context, string) (string, string, error) { + return f.py, f.dbc, nil +} + +func writeProject(t *testing.T) string { + dir := t.TempDir() + require.NoError(t, os.WriteFile(filepath.Join(dir, "pyproject.toml"), []byte(`[project] +name = "demo" +requires-python = ">=3.10" + +[dependency-groups] +dev = ["databricks-connect~=16.0.0"] +`), 0o644)) + return dir +} + +func newTestServer(t *testing.T) *httptest.Server { + return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write([]byte(sampleToml)) + })) +} + +func TestPipelineCheckMutatesNothing(t *testing.T) { + dir := writeProject(t) + before, _ := os.ReadFile(filepath.Join(dir, "pyproject.toml")) + srv := newTestServer(t) + defer srv.Close() + + p := &Pipeline{ + Mode: ModeSync, Check: true, ProjectDir: dir, + ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), + Flags: TargetFlags{Serverless: 
"v4"}, + Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: "17.2.0"}, + } + res, err := p.Run(t.Context()) + require.NoError(t, err) + assert.True(t, res.Check) + require.NotNil(t, res.Plan) + assert.Contains(t, res.Plan.Diff, "==3.12.*") + after, _ := os.ReadFile(filepath.Join(dir, "pyproject.toml")) + assert.Equal(t, string(before), string(after)) // unchanged +} + +func TestPipelineSyncProvisionsAndValidates(t *testing.T) { + dir := writeProject(t) + srv := newTestServer(t) + defer srv.Close() + + p := &Pipeline{ + Mode: ModeSync, ProjectDir: dir, + ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), + Flags: TargetFlags{Serverless: "v4"}, + Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: "17.2.0"}, + } + res, err := p.Run(t.Context()) + require.NoError(t, err) + require.NotNil(t, res.Result) + assert.Equal(t, "success", res.Result.Status) + assert.Equal(t, "3.12", res.Result.PythonVersion) + merged, _ := os.ReadFile(filepath.Join(dir, "pyproject.toml")) + assert.Contains(t, string(merged), `"databricks-connect~=17.2.0"`) + assert.FileExists(t, filepath.Join(dir, "pyproject.toml.bak")) +} + +func TestPipelineInitCreatesNewPyproject(t *testing.T) { + dir := t.TempDir() + srv := newTestServer(t) + defer srv.Close() + + p := &Pipeline{ + Mode: ModeInit, ProjectDir: dir, + ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), + Flags: TargetFlags{Serverless: "v4"}, + Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: "17.2.0"}, + } + res, err := p.Run(t.Context()) + require.NoError(t, err) + require.NotNil(t, res.Result) + assert.Equal(t, "success", res.Result.Status) + data, readErr := os.ReadFile(filepath.Join(dir, "pyproject.toml")) + require.NoError(t, readErr) + assert.Contains(t, string(data), `"databricks-connect~=17.2.0",`) + // No backup created when pyproject.toml did not previously exist. 
+ assert.NoFileExists(t, filepath.Join(dir, "pyproject.toml.bak")) +} + +func TestPipelineInitBacksUpExistingPyproject(t *testing.T) { + dir := writeProject(t) + srv := newTestServer(t) + defer srv.Close() + + p := &Pipeline{ + Mode: ModeInit, ProjectDir: dir, + ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), + Flags: TargetFlags{Serverless: "v4"}, + Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: "17.2.0"}, + } + res, err := p.Run(t.Context()) + require.NoError(t, err) + require.NotNil(t, res.Result) + assert.FileExists(t, filepath.Join(dir, "pyproject.toml.bak")) +} + +func TestPipelineNoTarget(t *testing.T) { + dir := writeProject(t) + srv := newTestServer(t) + defer srv.Close() + + p := &Pipeline{ + Mode: ModeSync, ProjectDir: dir, + ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), + Flags: TargetFlags{}, + Compute: stubCompute{}, PM: fakePM{}, + } + res, err := p.Run(t.Context()) + require.Error(t, err) + require.NotNil(t, res) + require.NotNil(t, res.Error) + assert.Equal(t, ErrNoTargetSelected, res.Error.Code) +} + +func TestPipelineSyncRestoresBackupBeforeMerge(t *testing.T) { + dir := t.TempDir() + // Write an original pyproject.toml and a pre-existing .bak. + original := []byte(`[project] +name = "demo" +requires-python = ">=3.9" + +[dependency-groups] +dev = ["databricks-connect~=15.0.0"] +`) + require.NoError(t, os.WriteFile(filepath.Join(dir, "pyproject.toml.bak"), original, 0o644)) + // Current pyproject.toml has been mutated by a previous run. 
+ mutated := []byte(`[project] +name = "demo" +requires-python = "==3.12.*" + +[dependency-groups] +dev = ["databricks-connect~=17.2.0"] +`) + require.NoError(t, os.WriteFile(filepath.Join(dir, "pyproject.toml"), mutated, 0o644)) + + srv := newTestServer(t) + defer srv.Close() + + p := &Pipeline{ + Mode: ModeSync, ProjectDir: dir, + ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), + Flags: TargetFlags{Serverless: "v4"}, + Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: "17.2.0"}, + } + res, err := p.Run(t.Context()) + require.NoError(t, err) + require.NotNil(t, res) + // The bak content (requires-python = ">=3.9") was the base; merged result should + // contain the newly pinned version. + data, _ := os.ReadFile(filepath.Join(dir, "pyproject.toml")) + assert.Contains(t, string(data), `"databricks-connect~=17.2.0"`) + assert.Contains(t, string(data), `requires-python = "==3.12.*"`) +} + +func TestPipelineResultPopulatesConstraintInfo(t *testing.T) { + dir := writeProject(t) + srv := newTestServer(t) + defer srv.Close() + + p := &Pipeline{ + Mode: ModeSync, Check: true, ProjectDir: dir, + ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), + Flags: TargetFlags{Serverless: "v4"}, + Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: "17.2.0"}, + } + res, err := p.Run(t.Context()) + require.NoError(t, err) + require.NotNil(t, res.Constraints) + assert.Equal(t, "==3.12.*", res.Constraints.RequiresPython) + assert.Equal(t, "databricks-connect~=17.2.0", res.Constraints.DatabricksConnect) + assert.Equal(t, 2, res.Constraints.ConstraintCount) +} From 33162f13bf43f6feef0cea96a0e28bcd6a1de28d Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:37:24 +0200 Subject: [PATCH 19/33] Fix: use correct error code for python version parse failure in pipeline The PythonMinorFromRequires call happens after a successful network fetch, so wrapping its error with ErrConstraintFetchFailed was a misattribution. 
Use ErrValidationFailed instead, which correctly signals that the constraint file content failed to parse rather than that the fetch itself failed. Co-authored-by: Isaac --- libs/dbconnect/pipeline.go | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libs/dbconnect/pipeline.go b/libs/dbconnect/pipeline.go index 0e1ab3d913e..4abfad62420 100644 --- a/libs/dbconnect/pipeline.go +++ b/libs/dbconnect/pipeline.go @@ -64,7 +64,7 @@ func (p *Pipeline) Run(ctx context.Context) (*Result, error) { // Phase 2b: fill in the python version on the target info from the constraints. pyMinor, err := PythonMinorFromRequires(c.RequiresPython) if err != nil { - pe := NewError(ErrConstraintFetchFailed, err, "cannot determine python version from constraints") + pe := NewError(ErrValidationFailed, err, "failed to parse python version from constraints") res.Error = pe return res, pe } From 29662f66fc613de21f6346ab6d7544b41dcbeda8 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:39:50 +0200 Subject: [PATCH 20/33] Fix: append PhaseResult on python-version parse failure in pipeline fetch Co-authored-by: Isaac --- libs/dbconnect/pipeline.go | 1 + 1 file changed, 1 insertion(+) diff --git a/libs/dbconnect/pipeline.go b/libs/dbconnect/pipeline.go index 4abfad62420..02512fa4891 100644 --- a/libs/dbconnect/pipeline.go +++ b/libs/dbconnect/pipeline.go @@ -65,6 +65,7 @@ func (p *Pipeline) Run(ctx context.Context) (*Result, error) { pyMinor, err := PythonMinorFromRequires(c.RequiresPython) if err != nil { pe := NewError(ErrValidationFailed, err, "failed to parse python version from constraints") + res.Phases = append(res.Phases, PhaseResult{Name: "parse-python-version", Status: "failed", Detail: pe.Error()}) res.Error = pe return res, pe } From 668211ab239ec490ada7239940826a8f538dedb0 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:42:12 +0200 Subject: [PATCH 21/33] Fix: add success PhaseResult for parse-python-version in fetch phase 
Co-authored-by: Isaac --- libs/dbconnect/pipeline.go | 1 + 1 file changed, 1 insertion(+) diff --git a/libs/dbconnect/pipeline.go b/libs/dbconnect/pipeline.go index 02512fa4891..209bbcb4fd0 100644 --- a/libs/dbconnect/pipeline.go +++ b/libs/dbconnect/pipeline.go @@ -69,6 +69,7 @@ func (p *Pipeline) Run(ctx context.Context) (*Result, error) { res.Error = pe return res, pe } + res.Phases = append(res.Phases, PhaseResult{Name: "parse-python-version", Status: "ok", Detail: pyMinor}) target.PythonVersion = pyMinor // Phase 3: compute the merge plan (in-memory, no disk writes yet). From 8f4631c8f3d818ef71ad1de2e27dfbbc2f61afce Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:49:04 +0200 Subject: [PATCH 22/33] Fail validation when databricks-connect major version is unparseable Co-authored-by: Isaac --- libs/dbconnect/pipeline.go | 51 +++++++++++++++++------ libs/dbconnect/pipeline_test.go | 73 +++++++++++++++++++++++++++++++++ 2 files changed, 111 insertions(+), 13 deletions(-) diff --git a/libs/dbconnect/pipeline.go b/libs/dbconnect/pipeline.go index 209bbcb4fd0..19f38914adb 100644 --- a/libs/dbconnect/pipeline.go +++ b/libs/dbconnect/pipeline.go @@ -371,16 +371,36 @@ func (p *Pipeline) validate(ctx context.Context, res *Result, expectedPyMinor, d // Assert the installed databricks-connect major matches the pin's major. // dbcPin is e.g. "databricks-connect~=17.2.0"; dbcVer is e.g. "17.2.0". 
- pinMajor := dbcMajorFromPin(dbcPin) - installedMajor := majorVersion(dbcVer) - if pinMajor != "" && installedMajor != "" && pinMajor != installedMajor { - pe := NewError(ErrValidationFailed, nil, - "databricks-connect major version mismatch: want %s.x, got %s", pinMajor, dbcVer) - phase.Status = "failed" - phase.Detail = pe.Error() - res.Phases = append(res.Phases, phase) - res.Error = pe - return pe + if dbcPin != "" { + pinMajor := dbcMajorFromPin(dbcPin) + if pinMajor == "" { + pe := NewError(ErrValidationFailed, nil, + "cannot determine databricks-connect major version from pin %q", dbcPin) + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } + installedMajor := majorVersion(dbcVer) + if installedMajor == "" { + pe := NewError(ErrValidationFailed, nil, + "cannot determine installed databricks-connect major version from %q", dbcVer) + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } + if pinMajor != installedMajor { + pe := NewError(ErrValidationFailed, nil, + "databricks-connect major version mismatch: want %s.x, got %s", pinMajor, dbcVer) + phase.Status = "failed" + phase.Detail = pe.Error() + res.Phases = append(res.Phases, phase) + res.Error = pe + return pe + } } phase.Status = "ok" @@ -411,12 +431,17 @@ func dbcMajorFromPin(pin string) string { } // majorVersion returns the major portion of a version string (digits before the -// first dot), e.g. "17" from "17.2.0". Returns "" if not parseable. +// first dot), e.g. "17" from "17.2.0". A bare integer like "17" returns "17". +// Returns "" for an empty string. func majorVersion(v string) string { - dot := strings.Index(v, ".") - if dot <= 0 { + if v == "" { return "" } + dot := strings.Index(v, ".") + if dot < 0 { + // No dot — the whole string is the major component. 
+ return v + } return v[:dot] } diff --git a/libs/dbconnect/pipeline_test.go b/libs/dbconnect/pipeline_test.go index 28bed3f089d..e27a22b94f4 100644 --- a/libs/dbconnect/pipeline_test.go +++ b/libs/dbconnect/pipeline_test.go @@ -198,3 +198,76 @@ func TestPipelineResultPopulatesConstraintInfo(t *testing.T) { assert.Equal(t, "databricks-connect~=17.2.0", res.Constraints.DatabricksConnect) assert.Equal(t, 2, res.Constraints.ConstraintCount) } + +// newServerWithDBC returns a test server that serves a constraints TOML with the +// given databricks-connect pin value in the dev dependency group. +func newServerWithDBC(t *testing.T, dbcPin string) *httptest.Server { + t.Helper() + body := `[project] +requires-python = "==3.12.*" + +[dependency-groups] +dev = ["` + dbcPin + `"] + +[tool.uv] +constraint-dependencies = [] +` + return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + _, _ = w.Write([]byte(body)) + })) +} + +func TestPipelineValidateRejectsUnparseablePin(t *testing.T) { + dir := writeProject(t) + // Serve a TOML whose dev group has a malformed databricks-connect entry + // (no version digits after the package name). + srv := newServerWithDBC(t, "databricks-connect") + defer srv.Close() + + p := &Pipeline{ + Mode: ModeSync, ProjectDir: dir, + ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), + Flags: TargetFlags{Serverless: "v4"}, + Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: "17.2.0"}, + } + res, err := p.Run(t.Context()) + require.Error(t, err) + require.NotNil(t, res.Error) + assert.Equal(t, ErrValidationFailed, res.Error.Code) +} + +func TestPipelineValidateRejectsUnparseableInstalledVersion(t *testing.T) { + dir := writeProject(t) + // sampleToml pins databricks-connect~=17.2.0. A bare integer such as "17" + // is a parseable installed version (majorVersion("17") returns "17"), so + // it would pass validation.
Use an empty installed version string instead + // to simulate an installed version that cannot be parsed. + srv := newTestServer(t) + defer srv.Close() + + p := &Pipeline{ + Mode: ModeSync, ProjectDir: dir, + ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), + Flags: TargetFlags{Serverless: "v4"}, + Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: ""}, + } + res, err := p.Run(t.Context()) + require.Error(t, err) + require.NotNil(t, res.Error) + assert.Equal(t, ErrValidationFailed, res.Error.Code) +} + +func TestMajorVersion(t *testing.T) { + cases := []struct { + input string + want string + }{ + {"17.2.0", "17"}, + {"17", "17"}, + {"", ""}, + {"3.12", "3"}, + } + for _, tc := range cases { + assert.Equal(t, tc.want, majorVersion(tc.input), "input=%q", tc.input) + } +} From 0ffdabf1882ae4a3bb8e727c959db59db60c718f Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:55:50 +0200 Subject: [PATCH 23/33] Wire dbconnect Cobra layer: flags, compute adapter, rendering Co-authored-by: Isaac --- cmd/dbconnect/compute.go | 60 ++++++++++++++++++++ cmd/dbconnect/init.go | 118 ++++++++++++++++++++++++++++++++++++++- cmd/dbconnect/output.go | 59 ++++++++++++++++++++ cmd/dbconnect/sync.go | 4 +- libs/dbconnect/uv.go | 6 ++ 5 files changed, 245 insertions(+), 2 deletions(-) create mode 100644 cmd/dbconnect/compute.go create mode 100644 cmd/dbconnect/output.go diff --git a/cmd/dbconnect/compute.go b/cmd/dbconnect/compute.go new file mode 100644 index 00000000000..97e702e3190 --- /dev/null +++ b/cmd/dbconnect/compute.go @@ -0,0 +1,60 @@ +package dbconnect + +import ( + "context" + "fmt" + "strconv" + + databricks "github.com/databricks/databricks-sdk-go" +) + +// sdkCompute adapts the Databricks SDK to the dbconnect.ComputeClient interface. +type sdkCompute struct { + w *databricks.WorkspaceClient +} + +// GetClusterSparkVersion returns the Spark version string for the given cluster.
+func (c sdkCompute) GetClusterSparkVersion(ctx context.Context, clusterID string) (string, error) { + d, err := c.w.Clusters.GetByClusterId(ctx, clusterID) + if err != nil { + return "", fmt.Errorf("get cluster %s: %w", clusterID, err) + } + return d.SparkVersion, nil +} + +// GetJobSparkVersion inspects the job's configuration to determine compute type. +// +// A job is considered serverless when it has non-empty Environments (JobEnvironment +// entries), which signals the Databricks serverless runtime. A job with classic compute +// uses JobClusters; we read SparkVersion from the first job cluster's NewCluster spec. +// If neither indicator is present the job's compute cannot be determined. +func (c sdkCompute) GetJobSparkVersion(ctx context.Context, jobID string) (sparkVersion string, isServerless bool, version string, err error) { + id, err := strconv.ParseInt(jobID, 10, 64) + if err != nil { + return "", false, "", fmt.Errorf("invalid job ID %q: must be an integer: %w", jobID, err) + } + + job, err := c.w.Jobs.GetByJobId(ctx, id) + if err != nil { + return "", false, "", fmt.Errorf("get job %d: %w", id, err) + } + + if job.Settings == nil { + return "", false, "", fmt.Errorf("job %d has no settings", id) + } + + // Serverless jobs have Environments populated; classic compute uses JobClusters. 
+ if len(job.Settings.Environments) > 0 { + return "", true, "", nil + } + + if len(job.Settings.JobClusters) > 0 { + sv := job.Settings.JobClusters[0].NewCluster.SparkVersion + if sv == "" { + return "", false, "", fmt.Errorf("could not determine compute for job %d: first job cluster has no spark_version", id) + } + return sv, false, sv, nil + } + + return "", false, "", fmt.Errorf("could not determine compute for job %d: no environments or job clusters found", id) +} diff --git a/cmd/dbconnect/init.go b/cmd/dbconnect/init.go index 84e017a0e79..b2f9e4a7f76 100644 --- a/cmd/dbconnect/init.go +++ b/cmd/dbconnect/init.go @@ -1,18 +1,134 @@ package dbconnect import ( + "context" + "os" + "path/filepath" + "github.com/databricks/cli/cmd/root" + "github.com/databricks/cli/libs/cmdctx" + libsdbconnect "github.com/databricks/cli/libs/dbconnect" + "github.com/databricks/cli/libs/env" "github.com/spf13/cobra" ) +const ( + // defaultConstraintBaseURL is the default URL for the constraint source. + defaultConstraintBaseURL = "https://raw.githubusercontent.com/pietern/databricks-environments/main" + + // envConstraintSource is the environment variable for overriding the constraint source URL. + envConstraintSource = "DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE" +) + func newInitCommand() *cobra.Command { cmd := &cobra.Command{ Use: "init", Short: "Create a fresh pyproject.toml and provision a matched .venv", } cmd.PreRunE = root.MustWorkspaceClient + addTargetFlags(cmd) cmd.RunE = func(cmd *cobra.Command, args []string) error { - return nil + return runPipeline(cmd, libsdbconnect.ModeInit) } return cmd } + +// addTargetFlags adds the shared target flags to a command. +func addTargetFlags(cmd *cobra.Command) { + cmd.Flags().String("cluster", "", "cluster ID to use as the compute target") + cmd.Flags().String("serverless", "", "serverless version to use as the compute target (e.g. 
v4)") + cmd.Flags().String("job", "", "job ID to use as the compute target") + cmd.Flags().Bool("check", false, "compute the plan without writing files or provisioning") + cmd.Flags().String("constraint-source", "", "URL for the constraint source (overrides "+envConstraintSource+")") + // Hide constraint-source from casual --help output; it is a power-user escape hatch. + _ = cmd.Flags().MarkHidden("constraint-source") + cmd.MarkFlagsMutuallyExclusive("cluster", "serverless", "job") +} + +// runPipeline builds and runs the dbconnect Pipeline for the given mode. +func runPipeline(cmd *cobra.Command, mode libsdbconnect.Mode) error { + ctx := cmd.Context() + + cluster, _ := cmd.Flags().GetString("cluster") + serverless, _ := cmd.Flags().GetString("serverless") + job, _ := cmd.Flags().GetString("job") + check, _ := cmd.Flags().GetBool("check") + constraintSource, _ := cmd.Flags().GetString("constraint-source") + + targetFlags := libsdbconnect.TargetFlags{ + Cluster: cluster, + Serverless: serverless, + Job: job, + } + if err := libsdbconnect.ValidateTargetFlags(targetFlags); err != nil { + return err + } + + // Resolve constraint base URL: flag → env var → default constant. 
+ constraintBaseURL := resolveConstraintBaseURL(ctx, constraintSource) + + projectDir, err := os.Getwd() + if err != nil { + return err + } + + cacheDir, err := os.UserCacheDir() + if err != nil { + return err + } + cacheDir = filepath.Join(cacheDir, "databricks", "dbconnect") + + bt := bundleTarget(cmd) + + w := cmdctx.WorkspaceClient(ctx) + p := &libsdbconnect.Pipeline{ + Mode: mode, + Check: check, + ProjectDir: projectDir, + ConstraintBaseURL: constraintBaseURL, + CacheDir: cacheDir, + Flags: targetFlags, + Compute: sdkCompute{w: w}, + Bundle: bt, + PM: libsdbconnect.NewUvManager(), + } + + res, pipelineErr := p.Run(ctx) + return renderResult(cmd, ctx, res, pipelineErr) +} + +// resolveConstraintBaseURL returns the constraint base URL using ordered precedence: +// flag → env var → default constant. +func resolveConstraintBaseURL(ctx context.Context, flagValue string) string { + if flagValue != "" { + return flagValue + } + if v, ok := env.Lookup(ctx, envConstraintSource); ok { + return v + } + return defaultConstraintBaseURL +} + +// bundleTarget reads the active bundle (if any) and maps its compute configuration +// to a libsdbconnect.BundleTarget. +// +// Only the top-level bundle.cluster_id field is consulted here; serverless is not +// recorded in the bundle config, so Selected=true is set only when a cluster ID is +// present. If the bundle is absent or has no cluster_id, Selected=false is returned +// so the pipeline falls through to requiring an explicit flag. +// +// TODO: extend once bundle config exposes a serverless field at the bundle level. 
+func bundleTarget(cmd *cobra.Command) libsdbconnect.BundleTarget { + b := root.TryConfigureBundle(cmd) + if b == nil { + return libsdbconnect.BundleTarget{Selected: false} + } + clusterID := b.Config.Bundle.ClusterId + if clusterID == "" { + return libsdbconnect.BundleTarget{Selected: false} + } + return libsdbconnect.BundleTarget{ + ClusterID: clusterID, + Selected: true, + } +} diff --git a/cmd/dbconnect/output.go b/cmd/dbconnect/output.go new file mode 100644 index 00000000000..816bff2dc0e --- /dev/null +++ b/cmd/dbconnect/output.go @@ -0,0 +1,59 @@ +package dbconnect + +import ( + "context" + "fmt" + "path/filepath" + + "github.com/databricks/cli/cmd/root" + "github.com/databricks/cli/libs/cmdio" + "github.com/databricks/cli/libs/flags" + libsdbconnect "github.com/databricks/cli/libs/dbconnect" + "github.com/spf13/cobra" +) + +// renderResult renders the pipeline result to the command's output. +// In JSON mode it renders the full structured result (even on error). +// In text mode it prints phase headers and a summary, then returns the error. +func renderResult(cmd *cobra.Command, ctx context.Context, res *libsdbconnect.Result, pipelineErr error) error { + if root.OutputType(cmd) == flags.OutputJSON { + return cmdio.Render(ctx, res) + } + + // Text mode: print phase headers. + for i, phase := range res.Phases { + cmdio.LogString(ctx, fmt.Sprintf("=== Phase %d: %s ===", i, phase.Name)) + if phase.Detail != "" { + cmdio.LogString(ctx, fmt.Sprintf(" status=%s %s", phase.Status, phase.Detail)) + } else { + cmdio.LogString(ctx, fmt.Sprintf(" status=%s", phase.Status)) + } + } + + if pipelineErr != nil { + return pipelineErr + } + + // Print a final success / check summary. 
+ if res.Check { + if res.Plan != nil { + cmdio.LogString(ctx, fmt.Sprintf("Plan: %s", filepath.ToSlash(res.Plan.PyprojectPath))) + if len(res.Plan.ChangedRegions) > 0 { + for _, region := range res.Plan.ChangedRegions { + cmdio.LogString(ctx, fmt.Sprintf(" changed region: %s", region)) + } + } + } + cmdio.LogString(ctx, "Check complete. No files were modified.") + return nil + } + + if res.Result != nil { + cmdio.LogString(ctx, fmt.Sprintf("Success: python=%s databricks-connect=%s venv=%s", + res.Result.PythonVersion, + res.Result.DatabricksConnectInstalled, + filepath.ToSlash(res.Result.VenvPath), + )) + } + return nil +} diff --git a/cmd/dbconnect/sync.go b/cmd/dbconnect/sync.go index d2cbeb00de0..8e23fda5f4a 100644 --- a/cmd/dbconnect/sync.go +++ b/cmd/dbconnect/sync.go @@ -2,6 +2,7 @@ package dbconnect import ( "github.com/databricks/cli/cmd/root" + libsdbconnect "github.com/databricks/cli/libs/dbconnect" "github.com/spf13/cobra" ) @@ -11,8 +12,9 @@ func newSyncCommand() *cobra.Command { Short: "Merge managed dependencies into an existing pyproject.toml and re-provision", } cmd.PreRunE = root.MustWorkspaceClient + addTargetFlags(cmd) cmd.RunE = func(cmd *cobra.Command, args []string) error { - return nil + return runPipeline(cmd, libsdbconnect.ModeSync) } return cmd } diff --git a/libs/dbconnect/uv.go b/libs/dbconnect/uv.go index a3be9684263..2ad19dec06d 100644 --- a/libs/dbconnect/uv.go +++ b/libs/dbconnect/uv.go @@ -24,6 +24,12 @@ func newUvManager() *uvManager { return &uvManager{} } +// NewUvManager returns a PackageManager backed by the uv tool. +// This is the exported constructor for use outside this package. +func NewUvManager() PackageManager { + return newUvManager() +} + // Name returns "uv". 
func (m *uvManager) Name() string { return "uv" From 03fa3ee7c1057dfca7df9b1fd9e8e1c07d5948c3 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 10:57:50 +0200 Subject: [PATCH 24/33] gofmt dbconnect output.go Co-authored-by: Isaac --- cmd/dbconnect/output.go | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/cmd/dbconnect/output.go b/cmd/dbconnect/output.go index 816bff2dc0e..d51627e7ac9 100644 --- a/cmd/dbconnect/output.go +++ b/cmd/dbconnect/output.go @@ -7,8 +7,8 @@ import ( "github.com/databricks/cli/cmd/root" "github.com/databricks/cli/libs/cmdio" - "github.com/databricks/cli/libs/flags" libsdbconnect "github.com/databricks/cli/libs/dbconnect" + "github.com/databricks/cli/libs/flags" "github.com/spf13/cobra" ) From 04aedef9f6c61eace349fd0b2910bf5e0e9aa5ad Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 11:02:19 +0200 Subject: [PATCH 25/33] Fix dbconnect JSON error schema, uv version probe, nil-result rendering - Add json tags to PipelineError (code/message/-) so --output json emits the documented contract instead of Go field names - Change uv version probe from "version" subcommand to --version flag to avoid project-scoped failure when no pyproject.toml exists in cwd - Guard renderResult against nil res: synthesize a minimal Result with error populated so JSON mode always emits a structured object - Use i+1 for 1-based phase numbering in text output - Add comment explaining why ValidateTargetFlags is kept alongside MarkFlagsMutuallyExclusive Co-authored-by: Isaac --- cmd/dbconnect/init.go | 3 +++ cmd/dbconnect/output.go | 17 ++++++++++++++++- libs/dbconnect/result.go | 6 +++--- libs/dbconnect/uv.go | 3 ++- 4 files changed, 24 insertions(+), 5 deletions(-) diff --git a/cmd/dbconnect/init.go b/cmd/dbconnect/init.go index b2f9e4a7f76..d7191832d2d 100644 --- a/cmd/dbconnect/init.go +++ b/cmd/dbconnect/init.go @@ -60,6 +60,9 @@ func runPipeline(cmd *cobra.Command, mode libsdbconnect.Mode) error { Serverless: 
serverless, Job: job, } + // ValidateTargetFlags is kept despite MarkFlagsMutuallyExclusive above: + // it also validates the library path (no Cobra equivalent) and guards + // non-Cobra call paths such as tests that invoke runPipeline directly. if err := libsdbconnect.ValidateTargetFlags(targetFlags); err != nil { return err } diff --git a/cmd/dbconnect/output.go b/cmd/dbconnect/output.go index d51627e7ac9..e3f9fe8f250 100644 --- a/cmd/dbconnect/output.go +++ b/cmd/dbconnect/output.go @@ -2,6 +2,7 @@ package dbconnect import ( "context" + "errors" "fmt" "path/filepath" @@ -16,13 +17,27 @@ import ( // In JSON mode it renders the full structured result (even on error). // In text mode it prints phase headers and a summary, then returns the error. func renderResult(cmd *cobra.Command, ctx context.Context, res *libsdbconnect.Result, pipelineErr error) error { + // Guard against a nil result (e.g. pipeline failed before constructing one). + // Always emit a structured object in JSON mode so callers can rely on the schema. + if res == nil { + res = &libsdbconnect.Result{} + if pipelineErr != nil { + var pe *libsdbconnect.PipelineError + if errors.As(pipelineErr, &pe) { + res.Error = pe + } else { + res.Error = libsdbconnect.NewError(libsdbconnect.ErrProvisionFailed, pipelineErr, "%s", pipelineErr.Error()) + } + } + } + if root.OutputType(cmd) == flags.OutputJSON { return cmdio.Render(ctx, res) } // Text mode: print phase headers. 
for i, phase := range res.Phases { - cmdio.LogString(ctx, fmt.Sprintf("=== Phase %d: %s ===", i, phase.Name)) + cmdio.LogString(ctx, fmt.Sprintf("=== Phase %d: %s ===", i+1, phase.Name)) if phase.Detail != "" { cmdio.LogString(ctx, fmt.Sprintf(" status=%s %s", phase.Status, phase.Detail)) } else { diff --git a/libs/dbconnect/result.go b/libs/dbconnect/result.go index 44a5b2af34a..5fbcbd6ba8e 100644 --- a/libs/dbconnect/result.go +++ b/libs/dbconnect/result.go @@ -33,9 +33,9 @@ const ( // PipelineError represents an error during the dbconnect pipeline. type PipelineError struct { - Code ErrorCode - Msg string - Err error + Code ErrorCode `json:"code"` + Msg string `json:"message"` + Err error `json:"-"` } func (e *PipelineError) Error() string { diff --git a/libs/dbconnect/uv.go b/libs/dbconnect/uv.go index 2ad19dec06d..ff9faee1405 100644 --- a/libs/dbconnect/uv.go +++ b/libs/dbconnect/uv.go @@ -55,7 +55,8 @@ func (m *uvManager) EnsureAvailable(ctx context.Context) (string, error) { } m.bin = bin - version, err := process.Background(ctx, []string{m.bin, "version"}) + // Use --version (not "version") to avoid project-scoped sub-command that requires pyproject.toml. 
+ version, err := process.Background(ctx, []string{m.bin, "--version"}) if err != nil { return "", NewError(ErrProvisionFailed, err, "uv version check failed") } From fceace296a5af20c1341d6b984b6e495493b3a3b Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 11:18:59 +0200 Subject: [PATCH 26/33] Add dbconnect acceptance tests Add acceptance tests for the dbconnect init/sync feature: - flag-conflict: verifies Cobra mutual exclusion of --cluster/--serverless/--job - no-target: verifies error when no compute target is selected - serverless-check: verifies --serverless v4 --check with stubbed constraint server - serverless-json: verifies --output json with full Result struct - cluster-unsupported: verifies constraint fetch failure for unsupported DBR version - help/test.toml: opts out of bundle-engine matrix for the help case Each case stubs the test server via [[Server]] in test.toml and uses DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE=$DATABRICKS_HOST to point the constraint fetch at the local test server. 
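For readers who want to see why pointing DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE at the test server is enough to exercise both the success and the 404 paths, here is a minimal Go sketch. It is not the code from this series: `resolveBaseURL`, `fetchConstraints`, and `newStubServer` are hypothetical names, the injected `lookupEnv` function stands in for the CLI's env lookup, and the `<base>/<envKey>/pyproject.toml` URL layout and error wording are assumptions inferred from the expected outputs in these tests.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// resolveBaseURL applies the precedence described in the patch:
// flag, then environment variable, then the default.
// lookupEnv is injected (an assumption of this sketch) so the example
// does not depend on the real process environment.
func resolveBaseURL(flagValue string, lookupEnv func(string) (string, bool), def string) string {
	if flagValue != "" {
		return flagValue
	}
	if v, ok := lookupEnv("DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE"); ok {
		return v
	}
	return def
}

// fetchConstraints is a hypothetical helper that downloads
// <baseURL>/<envKey>/pyproject.toml and treats any non-200 status as an error.
func fetchConstraints(baseURL, envKey string) (string, error) {
	url := baseURL + "/" + envKey + "/pyproject.toml"
	resp, err := http.Get(url)
	if err != nil {
		return "", fmt.Errorf("GET %s: %w", url, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("GET %s: unexpected status %s", url, resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", fmt.Errorf("read %s: %w", url, err)
	}
	return string(body), nil
}

// newStubServer serves one known environment; every other path gets the
// mux's default 404, which plays the role of an unsupported runtime.
func newStubServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/serverless/serverless-v4/pyproject.toml", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "[project]\nrequires-python = \">=3.12\"\n")
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	// Simulate the [Env] block of test.toml: the override points at the stub.
	env := map[string]string{"DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE": srv.URL}
	lookup := func(k string) (string, bool) { v, ok := env[k]; return v, ok }
	base := resolveBaseURL("", lookup, "https://example.invalid")

	if body, err := fetchConstraints(base, "serverless/serverless-v4"); err == nil {
		fmt.Printf("fetched %d bytes\n", len(body))
	}
	if _, err := fetchConstraints(base, "dbr/15.4.x-scala2.12"); err != nil {
		fmt.Println("unsupported runtime:", err)
	}
}
```

Because the override is resolved before any network call, a single local server can stand in for the constraint repository in every acceptance case.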
Co-authored-by: Grigory Panov --- .../cluster-unsupported/out.test.toml | 3 + .../dbconnect/cluster-unsupported/output.txt | 9 +++ .../dbconnect/cluster-unsupported/script | 1 + .../dbconnect/cluster-unsupported/test.toml | 22 +++++++ .../dbconnect/flag-conflict/out.test.toml | 3 + acceptance/dbconnect/flag-conflict/output.txt | 1 + acceptance/dbconnect/flag-conflict/script | 1 + acceptance/dbconnect/flag-conflict/test.toml | 1 + acceptance/dbconnect/help/out.test.toml | 3 + acceptance/dbconnect/help/output.txt | 6 +- acceptance/dbconnect/help/test.toml | 1 + acceptance/dbconnect/no-target/out.test.toml | 3 + acceptance/dbconnect/no-target/output.txt | 7 +++ acceptance/dbconnect/no-target/script | 1 + acceptance/dbconnect/no-target/test.toml | 5 ++ .../dbconnect/serverless-check/out.test.toml | 3 + .../dbconnect/serverless-check/output.txt | 17 ++++++ acceptance/dbconnect/serverless-check/script | 1 + .../dbconnect/serverless-check/test.toml | 21 +++++++ .../dbconnect/serverless-json/out.test.toml | 3 + .../dbconnect/serverless-json/output.txt | 57 +++++++++++++++++++ acceptance/dbconnect/serverless-json/script | 1 + .../dbconnect/serverless-json/test.toml | 21 +++++++ 23 files changed, 190 insertions(+), 1 deletion(-) create mode 100644 acceptance/dbconnect/cluster-unsupported/out.test.toml create mode 100644 acceptance/dbconnect/cluster-unsupported/output.txt create mode 100644 acceptance/dbconnect/cluster-unsupported/script create mode 100644 acceptance/dbconnect/cluster-unsupported/test.toml create mode 100644 acceptance/dbconnect/flag-conflict/out.test.toml create mode 100644 acceptance/dbconnect/flag-conflict/output.txt create mode 100644 acceptance/dbconnect/flag-conflict/script create mode 100644 acceptance/dbconnect/flag-conflict/test.toml create mode 100644 acceptance/dbconnect/help/out.test.toml create mode 100644 acceptance/dbconnect/help/test.toml create mode 100644 acceptance/dbconnect/no-target/out.test.toml create mode 100644 
acceptance/dbconnect/no-target/output.txt create mode 100644 acceptance/dbconnect/no-target/script create mode 100644 acceptance/dbconnect/no-target/test.toml create mode 100644 acceptance/dbconnect/serverless-check/out.test.toml create mode 100644 acceptance/dbconnect/serverless-check/output.txt create mode 100644 acceptance/dbconnect/serverless-check/script create mode 100644 acceptance/dbconnect/serverless-check/test.toml create mode 100644 acceptance/dbconnect/serverless-json/out.test.toml create mode 100644 acceptance/dbconnect/serverless-json/output.txt create mode 100644 acceptance/dbconnect/serverless-json/script create mode 100644 acceptance/dbconnect/serverless-json/test.toml diff --git a/acceptance/dbconnect/cluster-unsupported/out.test.toml b/acceptance/dbconnect/cluster-unsupported/out.test.toml new file mode 100644 index 00000000000..d6187dcb046 --- /dev/null +++ b/acceptance/dbconnect/cluster-unsupported/out.test.toml @@ -0,0 +1,3 @@ +Local = true +Cloud = false +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] diff --git a/acceptance/dbconnect/cluster-unsupported/output.txt b/acceptance/dbconnect/cluster-unsupported/output.txt new file mode 100644 index 00000000000..404cbd80cd0 --- /dev/null +++ b/acceptance/dbconnect/cluster-unsupported/output.txt @@ -0,0 +1,9 @@ +=== Phase 1: preflight === + status=ok uv [UV_VERSION] +=== Phase 2: resolve === + status=ok kind=cluster envKey=dbr/15.4.x-scala2.12 +=== Phase 3: fetch === + status=failed fetch constraints for dbr/15.4.x-scala2.12: GET [DATABRICKS_URL]/dbr/15.4.x-scala2.12/pyproject.toml: unexpected status 404 Not Found +Error: fetch constraints for dbr/15.4.x-scala2.12: GET [DATABRICKS_URL]/dbr/15.4.x-scala2.12/pyproject.toml: unexpected status 404 Not Found + +Exit code: 1 diff --git a/acceptance/dbconnect/cluster-unsupported/script b/acceptance/dbconnect/cluster-unsupported/script new file mode 100644 index 00000000000..347469de108 --- /dev/null +++ b/acceptance/dbconnect/cluster-unsupported/script @@ -0,0 
+1 @@ +errcode $CLI dbconnect init --cluster test-cluster-id --check diff --git a/acceptance/dbconnect/cluster-unsupported/test.toml b/acceptance/dbconnect/cluster-unsupported/test.toml new file mode 100644 index 00000000000..b152f24ecbf --- /dev/null +++ b/acceptance/dbconnect/cluster-unsupported/test.toml @@ -0,0 +1,22 @@ +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] + +[Env] +DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE = "$DATABRICKS_HOST" + +[[Server]] +Pattern = "GET /api/2.1/clusters/get" +Response.Body = ''' +{ + "cluster_id": "test-cluster-id", + "spark_version": "15.4.x-scala2.12" +} +''' + +[[Server]] +Pattern = "GET /dbr/15.4.x-scala2.12/pyproject.toml" +Response.StatusCode = 404 +Response.Body = '{"message": "Not found"}' + +[[Repls]] +Old = 'uv uv \S+(?: \([^)]+\))?' +New = 'uv [UV_VERSION]' diff --git a/acceptance/dbconnect/flag-conflict/out.test.toml b/acceptance/dbconnect/flag-conflict/out.test.toml new file mode 100644 index 00000000000..d6187dcb046 --- /dev/null +++ b/acceptance/dbconnect/flag-conflict/out.test.toml @@ -0,0 +1,3 @@ +Local = true +Cloud = false +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] diff --git a/acceptance/dbconnect/flag-conflict/output.txt b/acceptance/dbconnect/flag-conflict/output.txt new file mode 100644 index 00000000000..141152ae1cb --- /dev/null +++ b/acceptance/dbconnect/flag-conflict/output.txt @@ -0,0 +1 @@ +Error: if any flags in the group [cluster serverless job] are set none of the others can be; [cluster serverless] were all set diff --git a/acceptance/dbconnect/flag-conflict/script b/acceptance/dbconnect/flag-conflict/script new file mode 100644 index 00000000000..9c344219428 --- /dev/null +++ b/acceptance/dbconnect/flag-conflict/script @@ -0,0 +1 @@ +musterr $CLI dbconnect init --cluster abc --serverless v4 diff --git a/acceptance/dbconnect/flag-conflict/test.toml b/acceptance/dbconnect/flag-conflict/test.toml new file mode 100644 index 00000000000..c63fe3fe108 --- /dev/null +++ 
b/acceptance/dbconnect/flag-conflict/test.toml @@ -0,0 +1 @@ +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] diff --git a/acceptance/dbconnect/help/out.test.toml b/acceptance/dbconnect/help/out.test.toml new file mode 100644 index 00000000000..d6187dcb046 --- /dev/null +++ b/acceptance/dbconnect/help/out.test.toml @@ -0,0 +1,3 @@ +Local = true +Cloud = false +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] diff --git a/acceptance/dbconnect/help/output.txt b/acceptance/dbconnect/help/output.txt index 1ef0c45aa82..bbafed7a057 100644 --- a/acceptance/dbconnect/help/output.txt +++ b/acceptance/dbconnect/help/output.txt @@ -27,7 +27,11 @@ Usage: databricks dbconnect init [flags] Flags: - -h, --help help for init + --check compute the plan without writing files or provisioning + --cluster string cluster ID to use as the compute target + -h, --help help for init + --job string job ID to use as the compute target + --serverless string serverless version to use as the compute target (e.g. v4) Global Flags: --debug enable debug logging diff --git a/acceptance/dbconnect/help/test.toml b/acceptance/dbconnect/help/test.toml new file mode 100644 index 00000000000..c63fe3fe108 --- /dev/null +++ b/acceptance/dbconnect/help/test.toml @@ -0,0 +1 @@ +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] diff --git a/acceptance/dbconnect/no-target/out.test.toml b/acceptance/dbconnect/no-target/out.test.toml new file mode 100644 index 00000000000..d6187dcb046 --- /dev/null +++ b/acceptance/dbconnect/no-target/out.test.toml @@ -0,0 +1,3 @@ +Local = true +Cloud = false +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] diff --git a/acceptance/dbconnect/no-target/output.txt b/acceptance/dbconnect/no-target/output.txt new file mode 100644 index 00000000000..5af0cb576ea --- /dev/null +++ b/acceptance/dbconnect/no-target/output.txt @@ -0,0 +1,7 @@ +=== Phase 1: preflight === + status=ok uv [UV_VERSION] +=== Phase 2: resolve === + status=failed No compute target is selected. 
Select a cluster or serverless target, or pass --cluster/--serverless/--job. +Error: No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job. + +Exit code: 1 diff --git a/acceptance/dbconnect/no-target/script b/acceptance/dbconnect/no-target/script new file mode 100644 index 00000000000..77bef83565e --- /dev/null +++ b/acceptance/dbconnect/no-target/script @@ -0,0 +1 @@ +errcode $CLI dbconnect init diff --git a/acceptance/dbconnect/no-target/test.toml b/acceptance/dbconnect/no-target/test.toml new file mode 100644 index 00000000000..0d0481fb836 --- /dev/null +++ b/acceptance/dbconnect/no-target/test.toml @@ -0,0 +1,5 @@ +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] + +[[Repls]] +Old = 'uv uv \S+(?: \([^)]+\))?' +New = 'uv [UV_VERSION]' diff --git a/acceptance/dbconnect/serverless-check/out.test.toml b/acceptance/dbconnect/serverless-check/out.test.toml new file mode 100644 index 00000000000..d6187dcb046 --- /dev/null +++ b/acceptance/dbconnect/serverless-check/out.test.toml @@ -0,0 +1,3 @@ +Local = true +Cloud = false +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] diff --git a/acceptance/dbconnect/serverless-check/output.txt b/acceptance/dbconnect/serverless-check/output.txt new file mode 100644 index 00000000000..98c84da8858 --- /dev/null +++ b/acceptance/dbconnect/serverless-check/output.txt @@ -0,0 +1,17 @@ + +>>> [CLI] dbconnect init --serverless v4 --check +=== Phase 1: preflight === + status=ok uv [UV_VERSION] +=== Phase 2: resolve === + status=ok kind=serverless envKey=serverless/serverless-v4 +=== Phase 3: fetch === + status=ok source=[DATABRICKS_URL]/serverless/serverless-v4/pyproject.toml fromCache=false +=== Phase 4: parse-python-version === + status=ok 3.12 +=== Phase 5: plan === + status=ok changed=requires-python,databricks-connect,tool.uv.constraint-dependencies +Plan: [TEST_TMP_DIR]/pyproject.toml + changed region: requires-python + changed region: databricks-connect + changed region: 
tool.uv.constraint-dependencies +Check complete. No files were modified. diff --git a/acceptance/dbconnect/serverless-check/script b/acceptance/dbconnect/serverless-check/script new file mode 100644 index 00000000000..f360138e4f3 --- /dev/null +++ b/acceptance/dbconnect/serverless-check/script @@ -0,0 +1 @@ +trace $CLI dbconnect init --serverless v4 --check diff --git a/acceptance/dbconnect/serverless-check/test.toml b/acceptance/dbconnect/serverless-check/test.toml new file mode 100644 index 00000000000..47881839976 --- /dev/null +++ b/acceptance/dbconnect/serverless-check/test.toml @@ -0,0 +1,21 @@ +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] + +[Env] +DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE = "$DATABRICKS_HOST" + +[[Server]] +Pattern = "GET /serverless/serverless-v4/pyproject.toml" +Response.Body = ''' +[project] +requires-python = ">=3.12" + +[dependency-groups] +dev = ["databricks-connect~=17.2.0"] + +[tool.uv] +constraint-dependencies = ["pyarrow<19", "pandas<3"] +''' + +[[Repls]] +Old = 'uv uv \S+(?: \([^)]+\))?' 
+New = 'uv [UV_VERSION]' diff --git a/acceptance/dbconnect/serverless-json/out.test.toml b/acceptance/dbconnect/serverless-json/out.test.toml new file mode 100644 index 00000000000..d6187dcb046 --- /dev/null +++ b/acceptance/dbconnect/serverless-json/out.test.toml @@ -0,0 +1,3 @@ +Local = true +Cloud = false +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] diff --git a/acceptance/dbconnect/serverless-json/output.txt b/acceptance/dbconnect/serverless-json/output.txt new file mode 100644 index 00000000000..966ab4e556a --- /dev/null +++ b/acceptance/dbconnect/serverless-json/output.txt @@ -0,0 +1,57 @@ + +>>> [CLI] dbconnect init --serverless v4 --check --output json +{ + "mode": "init", + "check": true, + "target": { + "kind": "serverless", + "cluster_id": "", + "spark_version": "", + "env_key": "serverless/serverless-v4", + "python_version": "3.12" + }, + "constraints": { + "source_url": "[DATABRICKS_URL]/serverless/serverless-v4/pyproject.toml", + "from_cache": false, + "requires_python": "\u003e=3.12", + "databricks_connect": "databricks-connect~=17.2.0", + "constraint_count": 2 + }, + "plan": { + "pyproject_path": "[TEST_TMP_DIR]/pyproject.toml", + "backup_path": "[TEST_TMP_DIR]/pyproject.toml.bak", + "diff": "--- pyproject.toml\n+++ pyproject.toml\n@@ -1 +1,16 @@\n+[project]\n+name = \"001\"\n+requires-python = \"\u003e=3.12\"\n+\n+[dependency-groups]\n+dev = [\n+ \"databricks-connect~=17.2.0\",\n+]\n+\n+# managed by databricks dbconnect — do not edit\n+[tool.uv]\n+constraint-dependencies = [\n+ \"pyarrow\u003c19\",\n+ \"pandas\u003c3\",\n+]\n+# end managed by databricks dbconnect\n", + "changed_regions": [ + "requires-python", + "databricks-connect", + "tool.uv.constraint-dependencies" + ] + }, + "phases": [ + { + "name": "preflight", + "status": "ok", + "detail": "uv [UV_VERSION]" + }, + { + "name": "resolve", + "status": "ok", + "detail": "kind=serverless envKey=serverless/serverless-v4" + }, + { + "name": "fetch", + "status": "ok", + "detail": 
"source=[DATABRICKS_URL]/serverless/serverless-v4/pyproject.toml fromCache=false" + }, + { + "name": "parse-python-version", + "status": "ok", + "detail": "3.12" + }, + { + "name": "plan", + "status": "ok", + "detail": "changed=requires-python,databricks-connect,tool.uv.constraint-dependencies" + } + ] +} diff --git a/acceptance/dbconnect/serverless-json/script b/acceptance/dbconnect/serverless-json/script new file mode 100644 index 00000000000..68f96164406 --- /dev/null +++ b/acceptance/dbconnect/serverless-json/script @@ -0,0 +1 @@ +trace $CLI dbconnect init --serverless v4 --check --output json diff --git a/acceptance/dbconnect/serverless-json/test.toml b/acceptance/dbconnect/serverless-json/test.toml new file mode 100644 index 00000000000..d274a7486c9 --- /dev/null +++ b/acceptance/dbconnect/serverless-json/test.toml @@ -0,0 +1,21 @@ +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] + +[Env] +DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE = "$DATABRICKS_HOST" + +[[Server]] +Pattern = "GET /serverless/serverless-v4/pyproject.toml" +Response.Body = ''' +[project] +requires-python = ">=3.12" + +[dependency-groups] +dev = ["databricks-connect~=17.2.0"] + +[tool.uv] +constraint-dependencies = ["pyarrow<19", "pandas<3"] +''' + +[[Repls]] +Old = '"uv uv \S+(?: \([^)]+\))?"' +New = '"uv [UV_VERSION]"' From 8f9a1340ff577adb376e56114dc956db1ff26c04 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 11:25:42 +0200 Subject: [PATCH 27/33] Fix acceptance tests to use musterr for required-failure commands no-target and cluster-unsupported tests use commands that must fail; musterr asserts this and fails the test if the command unexpectedly succeeds. errcode is for tolerated failures only. 
Co-authored-by: Isaac --- acceptance/dbconnect/cluster-unsupported/output.txt | 2 -- acceptance/dbconnect/cluster-unsupported/script | 2 +- acceptance/dbconnect/no-target/output.txt | 2 -- acceptance/dbconnect/no-target/script | 2 +- 4 files changed, 2 insertions(+), 6 deletions(-) diff --git a/acceptance/dbconnect/cluster-unsupported/output.txt b/acceptance/dbconnect/cluster-unsupported/output.txt index 404cbd80cd0..63bc819b9b3 100644 --- a/acceptance/dbconnect/cluster-unsupported/output.txt +++ b/acceptance/dbconnect/cluster-unsupported/output.txt @@ -5,5 +5,3 @@ === Phase 3: fetch === status=failed fetch constraints for dbr/15.4.x-scala2.12: GET [DATABRICKS_URL]/dbr/15.4.x-scala2.12/pyproject.toml: unexpected status 404 Not Found Error: fetch constraints for dbr/15.4.x-scala2.12: GET [DATABRICKS_URL]/dbr/15.4.x-scala2.12/pyproject.toml: unexpected status 404 Not Found - -Exit code: 1 diff --git a/acceptance/dbconnect/cluster-unsupported/script b/acceptance/dbconnect/cluster-unsupported/script index 347469de108..c07f6635790 100644 --- a/acceptance/dbconnect/cluster-unsupported/script +++ b/acceptance/dbconnect/cluster-unsupported/script @@ -1 +1 @@ -errcode $CLI dbconnect init --cluster test-cluster-id --check +musterr $CLI dbconnect init --cluster test-cluster-id --check diff --git a/acceptance/dbconnect/no-target/output.txt b/acceptance/dbconnect/no-target/output.txt index 5af0cb576ea..b4f70ed4bb3 100644 --- a/acceptance/dbconnect/no-target/output.txt +++ b/acceptance/dbconnect/no-target/output.txt @@ -3,5 +3,3 @@ === Phase 2: resolve === status=failed No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job. Error: No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job. 
- -Exit code: 1 diff --git a/acceptance/dbconnect/no-target/script b/acceptance/dbconnect/no-target/script index 77bef83565e..d8f8e147a53 100644 --- a/acceptance/dbconnect/no-target/script +++ b/acceptance/dbconnect/no-target/script @@ -1 +1 @@ -errcode $CLI dbconnect init +musterr $CLI dbconnect init From 7cf72cd2b64750bb5f2f080f99797e175e461bbb Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 11:43:41 +0200 Subject: [PATCH 28/33] Fix lint findings in dbconnect packages Co-authored-by: Isaac --- cmd/dbconnect/init.go | 2 +- cmd/dbconnect/output.go | 11 +++++------ libs/dbconnect/constraints.go | 2 +- libs/dbconnect/envkey.go | 2 +- libs/dbconnect/merge.go | 18 +++++++++--------- libs/dbconnect/pipeline.go | 16 ++++++++-------- libs/dbconnect/result_test.go | 2 +- libs/dbconnect/target_test.go | 1 + libs/dbconnect/uv.go | 2 +- 9 files changed, 28 insertions(+), 28 deletions(-) diff --git a/cmd/dbconnect/init.go b/cmd/dbconnect/init.go index d7191832d2d..c862dee3a4d 100644 --- a/cmd/dbconnect/init.go +++ b/cmd/dbconnect/init.go @@ -97,7 +97,7 @@ func runPipeline(cmd *cobra.Command, mode libsdbconnect.Mode) error { } res, pipelineErr := p.Run(ctx) - return renderResult(cmd, ctx, res, pipelineErr) + return renderResult(ctx, cmd, res, pipelineErr) } // resolveConstraintBaseURL returns the constraint base URL using ordered precedence: diff --git a/cmd/dbconnect/output.go b/cmd/dbconnect/output.go index e3f9fe8f250..dbe7a4a1a71 100644 --- a/cmd/dbconnect/output.go +++ b/cmd/dbconnect/output.go @@ -16,14 +16,13 @@ import ( // renderResult renders the pipeline result to the command's output. // In JSON mode it renders the full structured result (even on error). // In text mode it prints phase headers and a summary, then returns the error. 
-func renderResult(cmd *cobra.Command, ctx context.Context, res *libsdbconnect.Result, pipelineErr error) error { +func renderResult(ctx context.Context, cmd *cobra.Command, res *libsdbconnect.Result, pipelineErr error) error { // Guard against a nil result (e.g. pipeline failed before constructing one). // Always emit a structured object in JSON mode so callers can rely on the schema. if res == nil { res = &libsdbconnect.Result{} if pipelineErr != nil { - var pe *libsdbconnect.PipelineError - if errors.As(pipelineErr, &pe) { + if pe, ok := errors.AsType[*libsdbconnect.PipelineError](pipelineErr); ok { res.Error = pe } else { res.Error = libsdbconnect.NewError(libsdbconnect.ErrProvisionFailed, pipelineErr, "%s", pipelineErr.Error()) @@ -41,7 +40,7 @@ func renderResult(cmd *cobra.Command, ctx context.Context, res *libsdbconnect.Re if phase.Detail != "" { cmdio.LogString(ctx, fmt.Sprintf(" status=%s %s", phase.Status, phase.Detail)) } else { - cmdio.LogString(ctx, fmt.Sprintf(" status=%s", phase.Status)) + cmdio.LogString(ctx, " status="+phase.Status) } } @@ -52,10 +51,10 @@ func renderResult(cmd *cobra.Command, ctx context.Context, res *libsdbconnect.Re // Print a final success / check summary. if res.Check { if res.Plan != nil { - cmdio.LogString(ctx, fmt.Sprintf("Plan: %s", filepath.ToSlash(res.Plan.PyprojectPath))) + cmdio.LogString(ctx, "Plan: "+filepath.ToSlash(res.Plan.PyprojectPath)) if len(res.Plan.ChangedRegions) > 0 { for _, region := range res.Plan.ChangedRegions { - cmdio.LogString(ctx, fmt.Sprintf(" changed region: %s", region)) + cmdio.LogString(ctx, " changed region: "+region) } } } diff --git a/libs/dbconnect/constraints.go b/libs/dbconnect/constraints.go index 2dd0eb24dcc..5f1f0901fa2 100644 --- a/libs/dbconnect/constraints.go +++ b/libs/dbconnect/constraints.go @@ -38,7 +38,7 @@ func sanitizeEnvKey(envKey string) string { // and falls back to the cached copy on network or HTTP errors. 
// // Constraint files are hosted at: -// https://github.com/databricks/databricks-dbconnect-constraints +// https://github.com/pietern/databricks-environments func FetchConstraints(ctx context.Context, baseURL, envKey, cacheDir string) (*Constraints, error) { url := baseURL + "/" + envKey + "/pyproject.toml" cachePath := filepath.Join(cacheDir, sanitizeEnvKey(envKey)+".toml") diff --git a/libs/dbconnect/envkey.go b/libs/dbconnect/envkey.go index 726cbf1d97c..3f53bb2b94e 100644 --- a/libs/dbconnect/envkey.go +++ b/libs/dbconnect/envkey.go @@ -11,7 +11,7 @@ var pythonVersionRe = regexp.MustCompile(`(\d+)\.(\d+)`) // EnvKeyForServerless returns the environment key for a serverless version. func EnvKeyForServerless(version string) string { normalized := strings.TrimPrefix(strings.ToLower(version), "v") - return fmt.Sprintf("serverless/serverless-v%s", normalized) + return "serverless/serverless-v" + normalized } // EnvKeyForSparkVersion returns the environment key for a Spark version. diff --git a/libs/dbconnect/merge.go b/libs/dbconnect/merge.go index 86205b88826..e6c475ebc92 100644 --- a/libs/dbconnect/merge.go +++ b/libs/dbconnect/merge.go @@ -168,7 +168,7 @@ var dbconnectTokenRe = regexp.MustCompile(`"databricks-connect[^"]*"`) // mergeToolUv rewrites the managed [tool.uv] constraint-dependencies block. If a // marker-bracketed block already exists, its contents are replaced in place. Otherwise any // plain [tool.uv] table is removed and a fresh marker-bracketed block is appended at EOF. -func mergeToolUv(lines []string, deps []string) ([]string, bool) { +func mergeToolUv(lines, deps []string) ([]string, bool) { block := renderToolUvBlock(deps) start, stop, found := markerBounds(lines) @@ -242,20 +242,20 @@ func removeConstraintDeps(lines []string, header, end int) []string { if !constraintDepsRe.MatchString(lines[i]) { continue } - last := i // Multi-line array form: extend through the closing "]" line. 
The single-line form // already contains the closing bracket, so this loop does not advance. + end2 := i + 1 if !strings.Contains(lines[i], "]") { for j := i + 1; j < end; j++ { - last = j + end2 = j + 1 if strings.TrimSpace(lines[j]) == "]" { break } } } - out := make([]string, 0, len(lines)-(last-i+1)) + out := make([]string, 0, len(lines)-(end2-i)) out = append(out, lines[:i]...) - out = append(out, lines[last+1:]...) + out = append(out, lines[end2:]...) return out } return lines @@ -302,7 +302,7 @@ func renderToolUvBlock(deps []string) []string { // appendManagedBlock appends block to lines, ensuring exactly one blank line separates it // from prior content and the file ends with a single trailing newline. -func appendManagedBlock(lines []string, block []string) []string { +func appendManagedBlock(lines, block []string) []string { // strings.Split on a trailing "\n" leaves a final empty element; drop trailing empty // lines so we control the spacing precisely. for len(lines) > 0 && lines[len(lines)-1] == "" { @@ -338,12 +338,12 @@ func equalLines(a, b []string) bool { func RenderFreshPyproject(projectName string, c Constraints) []byte { var b strings.Builder b.WriteString("[project]\n") - b.WriteString(fmt.Sprintf("name = %q\n", projectName)) - b.WriteString(fmt.Sprintf("requires-python = %q\n", c.RequiresPython)) + fmt.Fprintf(&b, "name = %q\n", projectName) + fmt.Fprintf(&b, "requires-python = %q\n", c.RequiresPython) b.WriteString("\n") b.WriteString("[dependency-groups]\n") b.WriteString("dev = [\n") - b.WriteString(fmt.Sprintf(" %q,\n", c.DatabricksConnect)) + fmt.Fprintf(&b, " %q,\n", c.DatabricksConnect) b.WriteString("]\n") b.WriteString("\n") for _, line := range renderToolUvBlock(c.ConstraintDeps) { diff --git a/libs/dbconnect/pipeline.go b/libs/dbconnect/pipeline.go index 19f38914adb..ead9f095a3f 100644 --- a/libs/dbconnect/pipeline.go +++ b/libs/dbconnect/pipeline.go @@ -120,8 +120,8 @@ func (p *Pipeline) resolve(ctx context.Context, res *Result) 
(*TargetInfo, error phase.Status = "failed" phase.Detail = err.Error() res.Phases = append(res.Phases, phase) - var pe *PipelineError - if !errors.As(err, &pe) { + pe, ok := errors.AsType[*PipelineError](err) + if !ok { pe = NewError(ErrNoTargetSelected, err, "target resolution failed") } res.Error = pe @@ -142,8 +142,8 @@ func (p *Pipeline) fetch(ctx context.Context, res *Result, target *TargetInfo) ( phase.Status = "failed" phase.Detail = err.Error() res.Phases = append(res.Phases, phase) - var pe *PipelineError - if !errors.As(err, &pe) { + pe, ok := errors.AsType[*PipelineError](err) + if !ok { pe = NewError(ErrConstraintFetchFailed, err, "fetch constraints failed") } res.Error = pe @@ -238,7 +238,7 @@ func (p *Pipeline) mergePlan(_ context.Context, res *Result, c *Constraints) (*P } phase.Status = "ok" - phase.Detail = fmt.Sprintf("changed=%s", strings.Join(changedRegions, ",")) + phase.Detail = "changed=" + strings.Join(changedRegions, ",") res.Phases = append(res.Phases, phase) return plan, mergedBytes, nil } @@ -437,12 +437,12 @@ func majorVersion(v string) string { if v == "" { return "" } - dot := strings.Index(v, ".") - if dot < 0 { + before, _, ok := strings.Cut(v, ".") + if !ok { // No dot — the whole string is the major component. return v } - return v[:dot] + return before } // copyFile copies src to dst, creating or overwriting dst. 
diff --git a/libs/dbconnect/result_test.go b/libs/dbconnect/result_test.go index 53650ca3638..643f505151f 100644 --- a/libs/dbconnect/result_test.go +++ b/libs/dbconnect/result_test.go @@ -12,7 +12,7 @@ func TestPipelineErrorWrapsAndExposesCode(t *testing.T) { err := NewError(ErrConstraintFetchFailed, base, "fetch %s", "x") assert.Equal(t, "fetch x: boom", err.Error()) assert.Equal(t, ErrConstraintFetchFailed, err.Code) - assert.True(t, errors.Is(err, base)) + assert.ErrorIs(t, err, base) } func TestModeString(t *testing.T) { diff --git a/libs/dbconnect/target_test.go b/libs/dbconnect/target_test.go index 3d67e24657d..6e1bdb1d040 100644 --- a/libs/dbconnect/target_test.go +++ b/libs/dbconnect/target_test.go @@ -16,6 +16,7 @@ type stubCompute struct { func (s stubCompute) GetClusterSparkVersion(_ context.Context, _ string) (string, error) { return s.clusterVersion, s.clusterErr } + func (s stubCompute) GetJobSparkVersion(_ context.Context, _ string) (string, bool, string, error) { return "", false, "", nil } diff --git a/libs/dbconnect/uv.go b/libs/dbconnect/uv.go index ff9faee1405..f3a41917399 100644 --- a/libs/dbconnect/uv.go +++ b/libs/dbconnect/uv.go @@ -157,7 +157,7 @@ func discoverUv(ctx context.Context) (string, error) { return p, nil } - home, _ := os.UserHomeDir() + home, _ := env.UserHomeDir(ctx) // XDG_BIN_HOME defaults to $HOME/.local/bin when unset. xdgBinHome, _ := env.Lookup(ctx, "XDG_BIN_HOME") From 1ac61fe5d7ac579f86c918ed9af93f49b0ceccd8 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 11:44:58 +0200 Subject: [PATCH 29/33] Add dbconnect changelog entry; fix constraint repo URL comment Also standardize the serverless-json acceptance uv-version replacement regex to the unwrapped form used by the sibling cases. 
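For illustration, the unwrapped uv-version replacement can be sketched in Go as below. This is a minimal sketch that assumes the acceptance harness applies each `[[Repls]]` entry as a regular-expression substitution over captured output. `normalizeUvVersion` and the sample version strings are hypothetical, chosen for the example; they are not part of the CLI or the harness.

```go
package main

import (
	"fmt"
	"regexp"
)

// uvVersionRe mirrors the unwrapped Old pattern: the literal "uv uv ", a
// version token, and an optional parenthesized build suffix.
var uvVersionRe = regexp.MustCompile(`uv uv \S+(?: \([^)]+\))?`)

// normalizeUvVersion (hypothetical helper) swaps any uv version banner for the
// stable placeholder used in the expected output files.
func normalizeUvVersion(s string) string {
	return uvVersionRe.ReplaceAllString(s, "uv [UV_VERSION]")
}

func main() {
	// Text-mode phase line with a build suffix.
	fmt.Println(normalizeUvVersion("status=ok uv uv 0.7.12 (abc123 2025-06-06)"))
	// JSON-mode detail field with a build suffix; the closing quote survives.
	fmt.Println(normalizeUvVersion(`"detail": "uv uv 0.7.12 (abc123 2025-06-06)"`))
	// JSON-mode detail field without a suffix; \S+ also consumes the closing quote.
	fmt.Println(normalizeUvVersion(`"detail": "uv uv 0.7.12"`))
}
```

One consequence under these assumptions: because the pattern no longer includes the surrounding quotes, `\S+` can absorb a closing `"` when the version has no parenthesized suffix, so the JSON case relies on uv printing that suffix.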
Co-authored-by: Isaac --- NEXT_CHANGELOG.md | 1 + acceptance/dbconnect/serverless-json/test.toml | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/NEXT_CHANGELOG.md b/NEXT_CHANGELOG.md index 1cec48c3f50..513d4980da9 100644 --- a/NEXT_CHANGELOG.md +++ b/NEXT_CHANGELOG.md @@ -5,6 +5,7 @@ ### Notable Changes ### CLI +* Add `databricks dbconnect init` and `databricks dbconnect sync` to provision a local Python environment (Python version, `databricks-connect` pin, and dependency constraints) matched to the selected Databricks compute target. ### Bundles * `bundle run` now prints the modern job run URL (`/jobs//runs/`) so that non-admin users permitted to view the run are taken to the run instead of the workspace homepage. diff --git a/acceptance/dbconnect/serverless-json/test.toml b/acceptance/dbconnect/serverless-json/test.toml index d274a7486c9..47881839976 100644 --- a/acceptance/dbconnect/serverless-json/test.toml +++ b/acceptance/dbconnect/serverless-json/test.toml @@ -17,5 +17,5 @@ constraint-dependencies = ["pyarrow<19", "pandas<3"] ''' [[Repls]] -Old = '"uv uv \S+(?: \([^)]+\))?"' -New = '"uv [UV_VERSION]"' +Old = 'uv uv \S+(?: \([^)]+\))?' 
+New = 'uv [UV_VERSION]' From 995b6a4f90b90f319fadb03975cfa3ae06ab64a0 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 11:55:09 +0200 Subject: [PATCH 30/33] Fix dbconnect JSON-mode exit code; populate spark_version; drop unused cluster-unsupported scaffolding Co-authored-by: Isaac --- acceptance/dbconnect/json-error/out.test.toml | 3 +++ acceptance/dbconnect/json-error/output.txt | 20 ++++++++++++++++++ acceptance/dbconnect/json-error/script | 1 + acceptance/dbconnect/json-error/test.toml | 5 +++++ acceptance/dbconnect/no-target/output.txt | 4 ++-- cmd/dbconnect/output.go | 11 +++++++++- libs/dbconnect/result.go | 18 +++++----------- libs/dbconnect/target.go | 21 +++++++++++-------- libs/dbconnect/target_test.go | 1 + 9 files changed, 59 insertions(+), 25 deletions(-) create mode 100644 acceptance/dbconnect/json-error/out.test.toml create mode 100644 acceptance/dbconnect/json-error/output.txt create mode 100644 acceptance/dbconnect/json-error/script create mode 100644 acceptance/dbconnect/json-error/test.toml diff --git a/acceptance/dbconnect/json-error/out.test.toml b/acceptance/dbconnect/json-error/out.test.toml new file mode 100644 index 00000000000..d6187dcb046 --- /dev/null +++ b/acceptance/dbconnect/json-error/out.test.toml @@ -0,0 +1,3 @@ +Local = true +Cloud = false +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] diff --git a/acceptance/dbconnect/json-error/output.txt b/acceptance/dbconnect/json-error/output.txt new file mode 100644 index 00000000000..39376a2cfce --- /dev/null +++ b/acceptance/dbconnect/json-error/output.txt @@ -0,0 +1,20 @@ +{ + "mode": "init", + "check": false, + "phases": [ + { + "name": "preflight", + "status": "ok", + "detail": "uv [UV_VERSION]" + }, + { + "name": "resolve", + "status": "failed", + "detail": "No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job" + } + ], + "error": { + "code": "no_target_selected", + "message": "No compute target is selected. 
Select a cluster or serverless target, or pass --cluster/--serverless/--job" + } +} diff --git a/acceptance/dbconnect/json-error/script b/acceptance/dbconnect/json-error/script new file mode 100644 index 00000000000..0a6837bb8f3 --- /dev/null +++ b/acceptance/dbconnect/json-error/script @@ -0,0 +1 @@ +musterr $CLI dbconnect init --output json diff --git a/acceptance/dbconnect/json-error/test.toml b/acceptance/dbconnect/json-error/test.toml new file mode 100644 index 00000000000..0d0481fb836 --- /dev/null +++ b/acceptance/dbconnect/json-error/test.toml @@ -0,0 +1,5 @@ +EnvMatrix.DATABRICKS_BUNDLE_ENGINE = [] + +[[Repls]] +Old = 'uv uv \S+(?: \([^)]+\))?' +New = 'uv [UV_VERSION]' diff --git a/acceptance/dbconnect/no-target/output.txt b/acceptance/dbconnect/no-target/output.txt index b4f70ed4bb3..e7908e7098c 100644 --- a/acceptance/dbconnect/no-target/output.txt +++ b/acceptance/dbconnect/no-target/output.txt @@ -1,5 +1,5 @@ === Phase 1: preflight === status=ok uv [UV_VERSION] === Phase 2: resolve === - status=failed No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job. -Error: No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job. + status=failed No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job +Error: No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job diff --git a/cmd/dbconnect/output.go b/cmd/dbconnect/output.go index dbe7a4a1a71..3a1dcc28b9a 100644 --- a/cmd/dbconnect/output.go +++ b/cmd/dbconnect/output.go @@ -31,7 +31,16 @@ func renderResult(ctx context.Context, cmd *cobra.Command, res *libsdbconnect.Re } if root.OutputType(cmd) == flags.OutputJSON { - return cmdio.Render(ctx, res) + if err := cmdio.Render(ctx, res); err != nil { + return err + } + // The JSON object is the only thing written to stdout. 
On failure we still + // need a non-zero exit, but returning pipelineErr would make the root print + // "Error: ..." to stderr. ErrAlreadyPrinted exits non-zero without that. + if pipelineErr != nil { + return root.ErrAlreadyPrinted + } + return nil } // Text mode: print phase headers. diff --git a/libs/dbconnect/result.go b/libs/dbconnect/result.go index 5fbcbd6ba8e..b0917b062be 100644 --- a/libs/dbconnect/result.go +++ b/libs/dbconnect/result.go @@ -23,7 +23,6 @@ type ErrorCode string const ( ErrNoTargetSelected ErrorCode = "no_target_selected" - ErrClusterUnsupported ErrorCode = "cluster_unsupported" ErrConstraintFetchFailed ErrorCode = "constraint_fetch_failed" ErrMergeFailed ErrorCode = "merge_failed" ErrProvisionFailed ErrorCode = "provision_failed" @@ -61,18 +60,11 @@ func NewError(code ErrorCode, err error, format string, args ...any) *PipelineEr // TargetInfo contains information about the target environment. type TargetInfo struct { - Kind string `json:"kind"` - ClusterID string `json:"cluster_id"` - SparkVersion string `json:"spark_version"` - EnvKey string `json:"env_key"` - PythonVersion string `json:"python_version"` - Fallback *FallbackInfo `json:"fallback,omitempty"` -} - -// FallbackInfo contains fallback information. -type FallbackInfo struct { - Requested string `json:"requested"` - Resolved string `json:"resolved"` + Kind string `json:"kind"` + ClusterID string `json:"cluster_id"` + SparkVersion string `json:"spark_version"` + EnvKey string `json:"env_key"` + PythonVersion string `json:"python_version"` } // ConstraintInfo contains constraint information. 
diff --git a/libs/dbconnect/target.go b/libs/dbconnect/target.go index 16f8ddadd62..b151cce49dc 100644 --- a/libs/dbconnect/target.go +++ b/libs/dbconnect/target.go @@ -60,9 +60,10 @@ func ResolveTarget(ctx context.Context, f TargetFlags, c ComputeClient, bt Bundl return nil, fmt.Errorf("resolving cluster %s: %w", f.Cluster, err) } return &TargetInfo{ - Kind: "cluster", - ClusterID: f.Cluster, - EnvKey: EnvKeyForSparkVersion(v), + Kind: "cluster", + ClusterID: f.Cluster, + SparkVersion: v, + EnvKey: EnvKeyForSparkVersion(v), }, nil } @@ -92,15 +93,16 @@ func ResolveTarget(ctx context.Context, f TargetFlags, c ComputeClient, bt Bundl }, nil } return &TargetInfo{ - Kind: "cluster", - EnvKey: EnvKeyForSparkVersion(version), + Kind: "cluster", + SparkVersion: version, + EnvKey: EnvKeyForSparkVersion(version), }, nil } // Fall back to bundle target. if !bt.Selected { return nil, NewError(ErrNoTargetSelected, nil, - "No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job.") + "No compute target is selected. 
Select a cluster or serverless target, or pass --cluster/--serverless/--job") } if bt.Serverless { @@ -118,9 +120,10 @@ func ResolveTarget(ctx context.Context, f TargetFlags, c ComputeClient, bt Bundl return nil, fmt.Errorf("resolving bundle cluster %s: %w", bt.ClusterID, err) } return &TargetInfo{ - Kind: "cluster", - ClusterID: bt.ClusterID, - EnvKey: EnvKeyForSparkVersion(v), + Kind: "cluster", + ClusterID: bt.ClusterID, + SparkVersion: v, + EnvKey: EnvKeyForSparkVersion(v), }, nil } diff --git a/libs/dbconnect/target_test.go b/libs/dbconnect/target_test.go index 6e1bdb1d040..24c1250f95a 100644 --- a/libs/dbconnect/target_test.go +++ b/libs/dbconnect/target_test.go @@ -33,6 +33,7 @@ func TestResolveClusterFlag(t *testing.T) { ti, err := ResolveTarget(t.Context(), TargetFlags{Cluster: "abc"}, c, BundleTarget{}) require.NoError(t, err) assert.Equal(t, "cluster", ti.Kind) + assert.Equal(t, "15.4.x-scala2.12", ti.SparkVersion) assert.Equal(t, "dbr/15.4.x-scala2.12", ti.EnvKey) assert.Equal(t, "abc", ti.ClusterID) } From bccdae8896bb585cc0f34c66b9c98a23a0c1d291 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 17:29:20 +0200 Subject: [PATCH 31/33] Bridge pip.conf index-url to UV_INDEX_URL and surface uv stderr in errors Co-authored-by: Isaac --- libs/dbconnect/uv.go | 88 ++++++++++++++++++++++++++++++---- libs/dbconnect/uv_test.go | 99 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 179 insertions(+), 8 deletions(-) diff --git a/libs/dbconnect/uv.go b/libs/dbconnect/uv.go index f3a41917399..81aea7262b4 100644 --- a/libs/dbconnect/uv.go +++ b/libs/dbconnect/uv.go @@ -1,10 +1,13 @@ package dbconnect import ( + "bufio" "context" + "errors" "os" "os/exec" "path/filepath" + "regexp" "runtime" "strings" @@ -58,7 +61,7 @@ func (m *uvManager) EnsureAvailable(ctx context.Context) (string, error) { // Use --version (not "version") to avoid project-scoped sub-command that requires pyproject.toml. 
version, err := process.Background(ctx, []string{m.bin, "--version"}) if err != nil { - return "", NewError(ErrProvisionFailed, err, "uv version check failed") + return "", uvFailure(ErrProvisionFailed, err, "uv version check") } return strings.TrimSpace(version), nil } @@ -66,9 +69,15 @@ func (m *uvManager) EnsureAvailable(ctx context.Context) (string, error) { // EnsurePython installs the requested Python minor version via uv. func (m *uvManager) EnsurePython(ctx context.Context, minor string) error { args := append([]string{m.bin}, m.pythonInstallArgs(minor)...) - _, err := process.Background(ctx, args) + indexURL := m.resolveIndexURL(ctx) + var err error + if indexURL != "" { + _, err = process.Background(ctx, args, process.WithEnv("UV_INDEX_URL", indexURL)) + } else { + _, err = process.Background(ctx, args) + } if err != nil { - return NewError(ErrProvisionFailed, err, "uv python install %s failed", minor) + return uvFailure(ErrProvisionFailed, err, "uv python install "+minor) } return nil } @@ -76,9 +85,15 @@ func (m *uvManager) EnsurePython(ctx context.Context, minor string) error { // Provision runs `uv sync` inside projectDir to install project dependencies. func (m *uvManager) Provision(ctx context.Context, projectDir string) error { args := append([]string{m.bin}, m.syncArgs()...) - _, err := process.Background(ctx, args, process.WithDir(projectDir)) + indexURL := m.resolveIndexURL(ctx) + var err error + if indexURL != "" { + _, err = process.Background(ctx, args, process.WithDir(projectDir), process.WithEnv("UV_INDEX_URL", indexURL)) + } else { + _, err = process.Background(ctx, args, process.WithDir(projectDir)) + } if err != nil { - return NewError(ErrProvisionFailed, err, "uv sync failed") + return uvFailure(ErrProvisionFailed, err, "uv sync") } return nil } @@ -102,9 +117,15 @@ func venvPython(projectDir string) string { // activated. 
func (m *uvManager) PostProvision(ctx context.Context, projectDir string) error { args := append([]string{m.bin}, m.pipSeedArgs(venvPython(projectDir))...) - _, err := process.Background(ctx, args, process.WithDir(projectDir)) + indexURL := m.resolveIndexURL(ctx) + var err error + if indexURL != "" { + _, err = process.Background(ctx, args, process.WithDir(projectDir), process.WithEnv("UV_INDEX_URL", indexURL)) + } else { + _, err = process.Background(ctx, args, process.WithDir(projectDir)) + } if err != nil { - return NewError(ErrProvisionFailed, err, "uv pip seed failed") + return uvFailure(ErrProvisionFailed, err, "uv pip seed") } return nil } @@ -120,7 +141,7 @@ func (m *uvManager) Validate(ctx context.Context, projectDir string) (string, st process.WithDir(projectDir), ) if err != nil { - return "", "", NewError(ErrValidationFailed, err, "uv run python validation failed") + return "", "", uvFailure(ErrValidationFailed, err, "uv run python validation") } lines := strings.Split(strings.TrimSpace(out), "\n") if len(lines) < 2 { @@ -144,6 +165,57 @@ func (m *uvManager) pipSeedArgs(venvPython string) []string { return []string{"pip", "install", "pip", "--python", venvPython} } +// pipIndexURLRe matches `index-url = ` lines in pip.conf. +var pipIndexURLRe = regexp.MustCompile(`(?i)^\s*index-url\s*=\s*(\S+)`) + +// pipConfIndexURL reads ~/.config/pip/pip.conf and returns the index-url value. +// uv ignores pip.conf; on Databricks-managed machines pypi.org is blocked and +// the corporate PyPI proxy is declared via pip.conf. Bridging the value through +// UV_INDEX_URL lets uv reach the proxy. 
+// https://pip.pypa.io/en/stable/topics/configuration/ +func pipConfIndexURL(ctx context.Context) string { + home, err := env.UserHomeDir(ctx) + if err != nil || home == "" { + return "" + } + confPath := filepath.Join(home, ".config", "pip", "pip.conf") + f, err := os.Open(confPath) + if err != nil { + return "" + } + defer f.Close() + + scanner := bufio.NewScanner(f) + for scanner.Scan() { + if m := pipIndexURLRe.FindStringSubmatch(scanner.Text()); m != nil { + return strings.TrimSpace(m[1]) + } + } + return "" +} + +// resolveIndexURL returns a UV_INDEX_URL value to inject, or "" when none is +// needed. It returns "" when UV_INDEX_URL is already set in the context env +// (so the caller's explicit value is never overridden) and also when pip.conf +// has no index-url entry. +func (m *uvManager) resolveIndexURL(ctx context.Context) string { + if _, ok := env.Lookup(ctx, "UV_INDEX_URL"); ok { + return "" + } + return pipConfIndexURL(ctx) +} + +// uvFailure builds a PipelineError from a failed uv invocation, appending uv's +// stderr to the message so callers can see the actual failure reason (e.g. +// "Connection refused") rather than just the exit code. +func uvFailure(code ErrorCode, err error, action string) *PipelineError { + msg := action + " failed" + if perr, ok := errors.AsType[*process.ProcessError](err); ok && strings.TrimSpace(perr.Stderr) != "" { + msg = msg + ": " + strings.TrimSpace(perr.Stderr) + } + return NewError(code, err, "%s", msg) +} + // discoverUv searches for the uv binary on PATH and in well-known install // locations. It returns NewError(ErrUvUnavailable, ...) if uv is not found. 
// diff --git a/libs/dbconnect/uv_test.go b/libs/dbconnect/uv_test.go index 6552cd60fa3..113a3d43726 100644 --- a/libs/dbconnect/uv_test.go +++ b/libs/dbconnect/uv_test.go @@ -1,10 +1,13 @@ package dbconnect import ( + "errors" "os" "path/filepath" "testing" + "github.com/databricks/cli/libs/env" + "github.com/databricks/cli/libs/process" "github.com/stretchr/testify/assert" "github.com/stretchr/testify/require" ) @@ -25,3 +28,99 @@ func TestDiscoverUvFindsBinOnPath(t *testing.T) { require.NoError(t, err) assert.Equal(t, bin, got) } + +func TestPipConfIndexURL(t *testing.T) { + t.Run("returns_url_from_pip_conf", func(t *testing.T) { + tmp := t.TempDir() + confDir := filepath.Join(tmp, ".config", "pip") + require.NoError(t, os.MkdirAll(confDir, 0o755)) + confContent := "[global]\nindex-url = https://proxy.example/simple\n" + require.NoError(t, os.WriteFile(filepath.Join(confDir, "pip.conf"), []byte(confContent), 0o644)) + + ctx := env.WithUserHomeDir(t.Context(), tmp) + got := pipConfIndexURL(ctx) + assert.Equal(t, "https://proxy.example/simple", got) + }) + + t.Run("returns_empty_when_no_pip_conf", func(t *testing.T) { + tmp := t.TempDir() + ctx := env.WithUserHomeDir(t.Context(), tmp) + got := pipConfIndexURL(ctx) + assert.Empty(t, got) + }) + + t.Run("returns_empty_when_no_index_url_in_conf", func(t *testing.T) { + tmp := t.TempDir() + confDir := filepath.Join(tmp, ".config", "pip") + require.NoError(t, os.MkdirAll(confDir, 0o755)) + confContent := "[global]\nextra-index-url = https://other.example/simple\n" + require.NoError(t, os.WriteFile(filepath.Join(confDir, "pip.conf"), []byte(confContent), 0o644)) + + ctx := env.WithUserHomeDir(t.Context(), tmp) + got := pipConfIndexURL(ctx) + assert.Empty(t, got) + }) +} + +func TestResolveIndexURLRespectsExistingEnv(t *testing.T) { + m := &uvManager{} + + t.Run("returns_empty_when_UV_INDEX_URL_already_set", func(t *testing.T) { + // When UV_INDEX_URL is in ctx, resolveIndexURL must not override it. 
+		ctx := env.Set(t.Context(), "UV_INDEX_URL", "https://explicit.example/simple")
+
+		// Set up a pip.conf that would otherwise be used.
+		tmp := t.TempDir()
+		confDir := filepath.Join(tmp, ".config", "pip")
+		require.NoError(t, os.MkdirAll(confDir, 0o755))
+		confContent := "[global]\nindex-url = https://proxy.example/simple\n"
+		require.NoError(t, os.WriteFile(filepath.Join(confDir, "pip.conf"), []byte(confContent), 0o644))
+		ctx = env.WithUserHomeDir(ctx, tmp)
+
+		got := m.resolveIndexURL(ctx)
+		assert.Empty(t, got)
+	})
+
+	t.Run("returns_pip_conf_url_when_UV_INDEX_URL_unset", func(t *testing.T) {
+		tmp := t.TempDir()
+		confDir := filepath.Join(tmp, ".config", "pip")
+		require.NoError(t, os.MkdirAll(confDir, 0o755))
+		confContent := "[global]\nindex-url = https://proxy.example/simple\n"
+		require.NoError(t, os.WriteFile(filepath.Join(confDir, "pip.conf"), []byte(confContent), 0o644))
+
+		ctx := env.WithUserHomeDir(t.Context(), tmp)
+		got := m.resolveIndexURL(ctx)
+		assert.Equal(t, "https://proxy.example/simple", got)
+	})
+}
+
+func TestUvFailureIncludesStderr(t *testing.T) {
+	t.Run("includes_stderr_when_present", func(t *testing.T) {
+		underlying := &process.ProcessError{
+			Command: "uv sync",
+			Err:     errors.New("exit status 2"),
+			Stderr:  "error: Connection refused\n",
+		}
+		pe := uvFailure(ErrProvisionFailed, underlying, "uv sync")
+		assert.Equal(t, ErrProvisionFailed, pe.Code)
+		assert.Contains(t, pe.Msg, "Connection refused")
+		assert.NotEqual(t, byte('\n'), pe.Msg[len(pe.Msg)-1], "Msg must not end with a newline")
+	})
+
+	t.Run("omits_stderr_suffix_when_empty", func(t *testing.T) {
+		underlying := &process.ProcessError{
+			Command: "uv sync",
+			Err:     errors.New("exit status 2"),
+			Stderr:  "",
+		}
+		pe := uvFailure(ErrProvisionFailed, underlying, "uv sync")
+		assert.Equal(t, ErrProvisionFailed, pe.Code)
+		assert.Equal(t, "uv sync failed", pe.Msg)
+	})
+
+	t.Run("non_process_error_uses_action_only", func(t *testing.T) {
+		pe := uvFailure(ErrProvisionFailed,
errors.New("some other error"), "uv sync") + assert.Equal(t, ErrProvisionFailed, pe.Code) + assert.Equal(t, "uv sync failed", pe.Msg) + }) +} From 1eb07e2c1a3017c60645b29b81df95e80cffcd84 Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Mon, 22 Jun 2026 17:34:48 +0200 Subject: [PATCH 32/33] Add diagnostic logging and a failure hint for remote troubleshooting Co-authored-by: Isaac --- acceptance/dbconnect/cluster-unsupported/output.txt | 1 + acceptance/dbconnect/no-target/output.txt | 1 + cmd/dbconnect/output.go | 1 + libs/dbconnect/constraints.go | 6 ++++-- libs/dbconnect/pipeline.go | 9 +++++++++ libs/dbconnect/uv.go | 11 ++++++++++- 6 files changed, 26 insertions(+), 3 deletions(-) diff --git a/acceptance/dbconnect/cluster-unsupported/output.txt b/acceptance/dbconnect/cluster-unsupported/output.txt index 63bc819b9b3..bbfaf955a4a 100644 --- a/acceptance/dbconnect/cluster-unsupported/output.txt +++ b/acceptance/dbconnect/cluster-unsupported/output.txt @@ -4,4 +4,5 @@ status=ok kind=cluster envKey=dbr/15.4.x-scala2.12 === Phase 3: fetch === status=failed fetch constraints for dbr/15.4.x-scala2.12: GET [DATABRICKS_URL]/dbr/15.4.x-scala2.12/pyproject.toml: unexpected status 404 Not Found +For more detail, re-run with --debug, or --output json to share a structured report. Error: fetch constraints for dbr/15.4.x-scala2.12: GET [DATABRICKS_URL]/dbr/15.4.x-scala2.12/pyproject.toml: unexpected status 404 Not Found diff --git a/acceptance/dbconnect/no-target/output.txt b/acceptance/dbconnect/no-target/output.txt index e7908e7098c..c2eb8434f86 100644 --- a/acceptance/dbconnect/no-target/output.txt +++ b/acceptance/dbconnect/no-target/output.txt @@ -2,4 +2,5 @@ status=ok uv [UV_VERSION] === Phase 2: resolve === status=failed No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job +For more detail, re-run with --debug, or --output json to share a structured report. Error: No compute target is selected. 
Select a cluster or serverless target, or pass --cluster/--serverless/--job diff --git a/cmd/dbconnect/output.go b/cmd/dbconnect/output.go index 3a1dcc28b9a..57c08f026c2 100644 --- a/cmd/dbconnect/output.go +++ b/cmd/dbconnect/output.go @@ -54,6 +54,7 @@ func renderResult(ctx context.Context, cmd *cobra.Command, res *libsdbconnect.Re } if pipelineErr != nil { + cmdio.LogString(ctx, "For more detail, re-run with --debug, or --output json to share a structured report.") return pipelineErr } diff --git a/libs/dbconnect/constraints.go b/libs/dbconnect/constraints.go index 5f1f0901fa2..517db457dc9 100644 --- a/libs/dbconnect/constraints.go +++ b/libs/dbconnect/constraints.go @@ -45,8 +45,10 @@ func FetchConstraints(ctx context.Context, baseURL, envKey, cacheDir string) (*C data, fetchErr := fetchURL(ctx, url) if fetchErr == nil { - // Write the cache copy; ignore errors so a read-only cacheDir is non-fatal. - _ = os.WriteFile(cachePath, data, 0o600) + // Write the cache copy; non-fatal so a read-only cacheDir doesn't break the command. + if err := os.WriteFile(cachePath, data, 0o600); err != nil { + log.Debugf(ctx, "failed to write constraint cache %s: %v", filepath.ToSlash(cachePath), err) + } rp, dbc, deps, err := parseConstraints(data) if err != nil { return nil, fmt.Errorf("parse constraints for %s: %w", envKey, err) diff --git a/libs/dbconnect/pipeline.go b/libs/dbconnect/pipeline.go index ead9f095a3f..3df333cefea 100644 --- a/libs/dbconnect/pipeline.go +++ b/libs/dbconnect/pipeline.go @@ -8,6 +8,7 @@ import ( "path/filepath" "strings" + "github.com/databricks/cli/libs/log" "github.com/hexops/gotextdiff" "github.com/hexops/gotextdiff/myers" "github.com/hexops/gotextdiff/span" @@ -29,6 +30,14 @@ type Pipeline struct { // Run executes all pipeline phases in order and returns a fully populated Result. // On a phase error, Result.Error is set and the same error is also returned. 
func (p *Pipeline) Run(ctx context.Context) (*Result, error) { + log.Debugf(ctx, "dbconnect: mode=%s project=%s cacheDir=%s constraintBaseURL=%s flags=%+v", + p.Mode, + filepath.ToSlash(p.ProjectDir), + filepath.ToSlash(p.CacheDir), + p.ConstraintBaseURL, + p.Flags, + ) + res := &Result{ Mode: p.Mode.String(), Check: p.Check, diff --git a/libs/dbconnect/uv.go b/libs/dbconnect/uv.go index 81aea7262b4..2348a1a9e46 100644 --- a/libs/dbconnect/uv.go +++ b/libs/dbconnect/uv.go @@ -12,6 +12,7 @@ import ( "strings" "github.com/databricks/cli/libs/env" + "github.com/databricks/cli/libs/log" "github.com/databricks/cli/libs/process" ) @@ -56,6 +57,7 @@ func (m *uvManager) EnsureAvailable(ctx context.Context) (string, error) { return "", err } } + log.Debugf(ctx, "uv: discovered binary at %s", bin) m.bin = bin // Use --version (not "version") to avoid project-scoped sub-command that requires pyproject.toml. @@ -200,9 +202,16 @@ func pipConfIndexURL(ctx context.Context) string { // has no index-url entry. func (m *uvManager) resolveIndexURL(ctx context.Context) string { if _, ok := env.Lookup(ctx, "UV_INDEX_URL"); ok { + log.Debugf(ctx, "uv: UV_INDEX_URL already set in environment, not overriding") return "" } - return pipConfIndexURL(ctx) + url := pipConfIndexURL(ctx) + if url != "" { + log.Debugf(ctx, "uv: using package index %s from pip.conf", url) + } else { + log.Debugf(ctx, "uv: no UV_INDEX_URL and no index-url in pip.conf; uv will use its default index (pypi.org)") + } + return url } // uvFailure builds a PipelineError from a failed uv invocation, appending uv's From 8f656d987b002380add93235477db7ea96aa23fc Mon Sep 17 00:00:00 2001 From: Grigory Panov Date: Tue, 23 Jun 2026 13:28:35 +0200 Subject: [PATCH 33/33] Remove superpowers design/plan docs from PR These are internal process artifacts and don't belong in the databricks/cli tree. 
Co-authored-by: Isaac --- .../plans/2026-06-19-dbconnect-init-sync.md | 1003 ----------------- .../2026-06-19-dbconnect-init-sync-design.md | 323 ------ 2 files changed, 1326 deletions(-) delete mode 100644 docs/superpowers/plans/2026-06-19-dbconnect-init-sync.md delete mode 100644 docs/superpowers/specs/2026-06-19-dbconnect-init-sync-design.md diff --git a/docs/superpowers/plans/2026-06-19-dbconnect-init-sync.md b/docs/superpowers/plans/2026-06-19-dbconnect-init-sync.md deleted file mode 100644 index b33f49bbc91..00000000000 --- a/docs/superpowers/plans/2026-06-19-dbconnect-init-sync.md +++ /dev/null @@ -1,1003 +0,0 @@ -# `databricks dbconnect init` / `sync` Implementation Plan - -> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. - -**Goal:** Add a `databricks dbconnect` command namespace with `init` and `sync` subcommands that provision a local Python `.venv` matched to the user's Databricks compute target (Python version, `databricks-connect` pin, and dependency constraints). - -**Architecture:** A thin Cobra layer in `cmd/dbconnect/` wires flags and rendering; all logic lives in a unit-testable `libs/dbconnect/` package built around a shared phase `Pipeline` (parameterized by `Mode = Init|Sync`) and a `PackageManager` interface (uv only in this PR). Target resolution uses the SDK Clusters/Jobs APIs and the bundle's configured target; constraints are fetched from a configurable base URL; the `pyproject.toml` merge is surgical (formatting/comment-preserving). - -**Tech Stack:** Go, Cobra, `github.com/databricks/databricks-sdk-go` (compute/jobs APIs), `github.com/BurntSushi/toml` (read-only parsing — already vendored), `libs/cmdio` (output), `libs/process` (uv shell-outs), `libs/cmdctx`/`cmd/root` (workspace client + bundle). 
- -## Global Constraints - -- **No new third-party dependency.** Use the already-vendored `github.com/BurntSushi/toml` for reading TOML; never use it to write the user's file. -- **Hand-written command, not codegen.** Do NOT touch `.codegen/` or run `generate-cligen`. -- **`--json` is the global `--output json` flag**, accessed via `root.OutputType(cmd)` returning `flags.Output` (`flags.OutputText`/`flags.OutputJSON`); render with `cmdio.Render(ctx, v)`. Do NOT add a custom `--json` flag. -- **Errors:** wrap with `%w`; compare with `errors.Is`/`errors.As` against sentinels, never `err.Error()` string content. Structured errors carry a stable `code`. -- **Env vars** in library/product code via `github.com/databricks/cli/libs/env` (`env.Get(ctx, ...)`/`env.Lookup(ctx, ...)`), not `os.Getenv`. -- **Logging** via `github.com/databricks/cli/libs/log` (`log.Warnf`/`Debugf`), stdout via `cmdio.LogString`. Paths printed with `filepath.ToSlash`. -- **Context** passed as an argument, never stored in a struct; never `context.Background()` outside `main`; tests use `t.Context()`. -- **Modern Go idioms:** `for i := range N`, `min`/`max` builtins, `switch` for same-decision alternatives, early-return for ordered precedence, collapse `if err != nil { return err }; return nil` to `return err`. -- **Test fixture hosts** use a reserved TLD (`.test`/`.invalid`). -- **Reference URLs in comments** when integrating an external tool/endpoint (uv installer, constraint repo, pip.conf). -- One focused PR; `NEXT_CHANGELOG.md` entry under **CLI**. - -## Constants (verbatim, used across tasks) - -- Default constraint base URL: `https://raw.githubusercontent.com/pietern/databricks-environments/main` -- Constraint base URL override env var: `DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE` -- envKey for serverless: `serverless/serverless-{vN}` (e.g. `serverless/serverless-v4`) -- envKey for clusters/jobs: `dbr/{spark_version}` where `{spark_version}` is the cluster's `SparkVersion` verbatim (e.g. 
`dbr/15.4.x-scala2.12`) -- Backup suffix: `.bak` -- Managed-block marker (start): `# managed by databricks dbconnect — do not edit` -- Managed-block marker (end): `# end managed by databricks dbconnect` -- uv installer URL (comment reference only): `https://astral.sh/uv/install.sh` - -## File Structure - -``` -cmd/dbconnect/ - dbconnect.go New() *cobra.Command; "dbconnect" group; registers init + sync - init.go newInitCommand(): flag wiring + RunE -> pipeline.Run(Init) - sync.go newSyncCommand(): flag wiring + RunE -> pipeline.Run(Sync) - output.go renderResult(ctx, cmd, *dbconnect.Result) — text vs JSON - -libs/dbconnect/ - result.go Mode, Result, Plan, TargetInfo, ConstraintInfo, PhaseResult, PipelineError, error codes - envkey.go EnvKeyForServerless, EnvKeyForSparkVersion, PythonMinorFromRequires - constraints.go Constraints struct; FetchConstraints(ctx, baseURL, envKey) (+ cache) - merge.go MergeManaged(target []byte, c Constraints) (merged []byte, regions []string, err error) - target.go TargetResolver, ResolveTarget(...) (*TargetInfo, error) - pkgmanager.go PackageManager interface - uv.go uvManager implements PackageManager - pipeline.go Pipeline struct + Run(ctx) - -acceptance/dbconnect/ - serverless-check/ , serverless-json/ , no-target/ , cluster-unsupported/ , flag-conflict/ -``` - ---- - -### Task 1: Scaffold the command namespace + registration - -**Files:** -- Create: `cmd/dbconnect/dbconnect.go` -- Create: `cmd/dbconnect/init.go` -- Create: `cmd/dbconnect/sync.go` -- Modify: `cmd/cmd.go` (import + `cli.AddCommand(dbconnect.New())`) -- Test: `acceptance/dbconnect/help/` (golden `output.txt`) - -**Interfaces:** -- Produces: `func New() *cobra.Command` (the `dbconnect` group); private `newInitCommand()`/`newSyncCommand() *cobra.Command`. - -- [ ] **Step 1: Create the namespace command.** `cmd/dbconnect/dbconnect.go`: - -```go -package dbconnect - -import "github.com/spf13/cobra" - -// New returns the `dbconnect` command group. 
-func New() *cobra.Command { - cmd := &cobra.Command{ - Use: "dbconnect", - Short: "Set up a local Python environment matched to your Databricks compute", - GroupID: "development", - Long: `Set up a local Python environment matched to your Databricks compute target. - -Derives the Python version, databricks-connect version, and dependency -constraints from the selected compute (cluster, serverless, or job) so that -local resolution matches the Databricks runtime.`, - } - cmd.AddCommand(newInitCommand()) - cmd.AddCommand(newSyncCommand()) - return cmd -} -``` - -- [ ] **Step 2: Create stub subcommands.** `cmd/dbconnect/init.go`: - -```go -package dbconnect - -import ( - "github.com/databricks/cli/cmd/root" - "github.com/spf13/cobra" -) - -func newInitCommand() *cobra.Command { - cmd := &cobra.Command{ - Use: "init", - Short: "Create a fresh pyproject.toml and provision a matched .venv", - } - cmd.PreRunE = root.MustWorkspaceClient - cmd.RunE = func(cmd *cobra.Command, args []string) error { - return nil - } - return cmd -} -``` - -`cmd/dbconnect/sync.go` is identical except `Use: "sync"`, `Short: "Merge managed dependencies into an existing pyproject.toml and re-provision"`, and `newSyncCommand`. - -- [ ] **Step 3: Register in `cmd/cmd.go`.** Add import `"github.com/databricks/cli/cmd/dbconnect"` (alphabetical, near the `psql` import) and, in the "other subcommands" block next to `cli.AddCommand(psql.New())`: - -```go - cli.AddCommand(dbconnect.New()) -``` - -- [ ] **Step 4: Build.** - -Run: `./task build` -Expected: builds clean. - -- [ ] **Step 5: Verify the command appears.** - -Run: `./bin/databricks dbconnect --help` -Expected: shows `init` and `sync` subcommands. 
- -- [ ] **Step 6: Add a help acceptance test.** Create `acceptance/dbconnect/help/script` containing: - -``` -$CLI dbconnect --help -$CLI dbconnect init --help -``` - -Then generate the golden output: - -Run: `go test ./acceptance -run 'TestAccept/dbconnect/help' -tail -test.v -update` -Expected: creates `acceptance/dbconnect/help/output.txt`; re-running without `-update` passes. - -- [ ] **Step 7: Commit.** - -```bash -git add cmd/dbconnect/ cmd/cmd.go acceptance/dbconnect/help/ -git commit -m "Add dbconnect command namespace scaffold" -``` - ---- - -### Task 2: Result types + error codes (`result.go`) - -**Files:** -- Create: `libs/dbconnect/result.go` -- Test: `libs/dbconnect/result_test.go` - -**Interfaces:** -- Produces: - - `type Mode int` with `const ( ModeInit Mode = iota; ModeSync )` and `func (m Mode) String() string` (`"init"`/`"sync"`). - - `type ErrorCode string` with consts: `ErrNoTargetSelected="no_target_selected"`, `ErrClusterUnsupported="cluster_unsupported"`, `ErrConstraintFetchFailed="constraint_fetch_failed"`, `ErrMergeFailed="merge_failed"`, `ErrProvisionFailed="provision_failed"`, `ErrValidationFailed="validation_failed"`, `ErrUvUnavailable="uv_unavailable"`. - - `type PipelineError struct { Code ErrorCode; Msg string; Err error }` with `func (e *PipelineError) Error() string` and `func (e *PipelineError) Unwrap() error`. - - `func NewError(code ErrorCode, err error, format string, args ...any) *PipelineError`. 
- - Structs (all with `json:"..."` tags matching the spec): `TargetInfo{Kind, ClusterID, SparkVersion, EnvKey, PythonVersion string; Fallback *FallbackInfo}`, `FallbackInfo{Requested, Resolved string}`, `ConstraintInfo{SourceURL string; FromCache bool; RequiresPython, DatabricksConnect string; ConstraintCount int}`, `Plan{PyprojectPath, BackupPath, Diff string; ChangedRegions []string}`, `PhaseResult{Name, Status, Detail string}`, `ResultDetail{Status, VenvPath, PythonVersion, DatabricksConnectInstalled string}`, `Result{Mode string; Check bool; Target *TargetInfo; Constraints *ConstraintInfo; Plan *Plan; Phases []PhaseResult; Result *ResultDetail; Error *PipelineError}`. - -- [ ] **Step 1: Write the failing test.** `libs/dbconnect/result_test.go`: - -```go -package dbconnect - -import ( - "errors" - "testing" - - "github.com/stretchr/testify/assert" -) - -func TestPipelineErrorWrapsAndExposesCode(t *testing.T) { - base := errors.New("boom") - err := NewError(ErrConstraintFetchFailed, base, "fetch %s", "x") - assert.Equal(t, "fetch x: boom", err.Error()) - assert.Equal(t, ErrConstraintFetchFailed, err.Code) - assert.True(t, errors.Is(err, base)) -} - -func TestModeString(t *testing.T) { - assert.Equal(t, "init", ModeInit.String()) - assert.Equal(t, "sync", ModeSync.String()) -} -``` - -- [ ] **Step 2: Run test to verify it fails.** - -Run: `go test ./libs/dbconnect/ -run 'TestPipelineError|TestModeString' -v` -Expected: FAIL (undefined symbols). - -- [ ] **Step 3: Implement `result.go`** with the types from the Interfaces block. `NewError` formats `Msg` via `fmt.Sprintf(format, args...)`; `Error()` returns `Msg` plus `": "+Err.Error()` when `Err != nil`; `Unwrap()` returns `Err`. - -- [ ] **Step 4: Run tests to verify they pass.** - -Run: `go test ./libs/dbconnect/ -run 'TestPipelineError|TestModeString' -v` -Expected: PASS. 
- -- [ ] **Step 5: Commit.** - -```bash -git add libs/dbconnect/result.go libs/dbconnect/result_test.go -git commit -m "Add dbconnect result types and error codes" -``` - ---- - -### Task 3: envKey mapping + Python-version parsing (`envkey.go`) - -**Files:** -- Create: `libs/dbconnect/envkey.go` -- Test: `libs/dbconnect/envkey_test.go` - -**Interfaces:** -- Produces: - - `func EnvKeyForServerless(version string) string` — normalizes `"4"`, `"v4"` → `"serverless/serverless-v4"`. - - `func EnvKeyForSparkVersion(sparkVersion string) string` — returns `"dbr/"+sparkVersion`. - - `func PythonMinorFromRequires(requiresPython string) (string, error)` — parses a PEP 440 `requires-python` (e.g. `"==3.12.*"`, `">=3.12"`, `"==3.12.3"`) and returns `"3.12"`. Error if no `MAJOR.MINOR` can be extracted. - -- [ ] **Step 1: Write the failing test.** `libs/dbconnect/envkey_test.go`: - -```go -package dbconnect - -import ( - "testing" - - "github.com/stretchr/testify/assert" - "github.com/stretchr/testify/require" -) - -func TestEnvKeyForServerless(t *testing.T) { - for _, in := range []string{"4", "v4", "V4"} { - assert.Equal(t, "serverless/serverless-v4", EnvKeyForServerless(in)) - } -} - -func TestEnvKeyForSparkVersion(t *testing.T) { - assert.Equal(t, "dbr/15.4.x-scala2.12", EnvKeyForSparkVersion("15.4.x-scala2.12")) -} - -func TestPythonMinorFromRequires(t *testing.T) { - cases := map[string]string{ - "==3.12.*": "3.12", - ">=3.12": "3.12", - "==3.12.3": "3.12", - "~=3.11": "3.11", - } - for in, want := range cases { - got, err := PythonMinorFromRequires(in) - require.NoError(t, err) - assert.Equal(t, want, got) - } - _, err := PythonMinorFromRequires("garbage") - assert.Error(t, err) -} -``` - -- [ ] **Step 2: Run test to verify it fails.** - -Run: `go test ./libs/dbconnect/ -run 'TestEnvKey|TestPythonMinor' -v` -Expected: FAIL (undefined). - -- [ ] **Step 3: Implement `envkey.go`.** `EnvKeyForServerless`: lowercase, strip leading `v`, format `serverless/serverless-v%s`. 
`EnvKeyForSparkVersion`: `"dbr/" + sparkVersion`. `PythonMinorFromRequires`: use `regexp.MustCompile(\`(\d+)\.(\d+)\`)`, `FindStringSubmatch`; on no match return `fmt.Errorf("cannot parse python version from %q", requiresPython)`. - -- [ ] **Step 4: Run tests to verify they pass.** - -Run: `go test ./libs/dbconnect/ -run 'TestEnvKey|TestPythonMinor' -v` -Expected: PASS. - -- [ ] **Step 5: Commit.** - -```bash -git add libs/dbconnect/envkey.go libs/dbconnect/envkey_test.go -git commit -m "Add dbconnect envKey mapping and python-version parsing" -``` - ---- - -### Task 4: Constraint fetch + cache + parse (`constraints.go`) - -**Files:** -- Create: `libs/dbconnect/constraints.go` -- Test: `libs/dbconnect/constraints_test.go` - -**Interfaces:** -- Consumes: `ErrConstraintFetchFailed`, `NewError` (Task 2); `PythonMinorFromRequires` (Task 3). -- Produces: - - `type Constraints struct { EnvKey, SourceURL string; FromCache bool; RequiresPython, DatabricksConnect string; ConstraintDeps []string }`. - - `func FetchConstraints(ctx context.Context, baseURL, envKey, cacheDir string) (*Constraints, error)` — GET `baseURL+"/"+envKey+"/pyproject.toml"`; on HTTP success, parse and write a cache copy to `cacheDir/.toml`; on network/HTTP error, fall back to the cached file with a `log.Warnf` if present, else return `NewError(ErrConstraintFetchFailed, ...)`. `FromCache` reflects which path served the bytes. - - `func parseConstraints(data []byte) (requiresPython, dbconnect string, deps []string, err error)` — uses `toml.Unmarshal` into a struct mirroring `project.requires-python`, `dependency-groups.dev`, `tool.uv.constraint-dependencies`; selects the `dev` element whose despaced value starts with `databricks-connect`. 
- -- [ ] **Step 1: Write the failing test.** `libs/dbconnect/constraints_test.go`: - -```go -package dbconnect - -import ( - "net/http" - "net/http/httptest" - "testing" - - "github.com/stretchr/testify/assert" - "github.com/stretchr/testify/require" -) - -const sampleToml = `[project] -requires-python = "==3.12.*" - -[dependency-groups] -dev = [ - "databricks-connect~=17.2.0", - "pytest~=8.0", -] - -[tool.uv] -constraint-dependencies = [ - "pydantic~=2.10.6", - "anyio~=4.6.2", -] -` - -func TestParseConstraints(t *testing.T) { - rp, dbc, deps, err := parseConstraints([]byte(sampleToml)) - require.NoError(t, err) - assert.Equal(t, "==3.12.*", rp) - assert.Equal(t, "databricks-connect~=17.2.0", dbc) - assert.Equal(t, []string{"pydantic~=2.10.6", "anyio~=4.6.2"}, deps) -} - -func TestFetchConstraintsHTTP(t *testing.T) { - srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - assert.Equal(t, "/serverless/serverless-v4/pyproject.toml", r.URL.Path) - _, _ = w.Write([]byte(sampleToml)) - })) - defer srv.Close() - - c, err := FetchConstraints(t.Context(), srv.URL, "serverless/serverless-v4", t.TempDir()) - require.NoError(t, err) - assert.False(t, c.FromCache) - assert.Equal(t, "databricks-connect~=17.2.0", c.DatabricksConnect) - assert.Len(t, c.ConstraintDeps, 2) -} - -func TestFetchConstraintsFallsBackToCache(t *testing.T) { - cacheDir := t.TempDir() - // First, a successful fetch populates the cache. - good := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - _, _ = w.Write([]byte(sampleToml)) - })) - _, err := FetchConstraints(t.Context(), good.URL, "serverless/serverless-v4", cacheDir) - require.NoError(t, err) - good.Close() - - // Now the server is down; fetch must serve the cache. 
- c, err := FetchConstraints(t.Context(), good.URL, "serverless/serverless-v4", cacheDir) - require.NoError(t, err) - assert.True(t, c.FromCache) -} -``` - -- [ ] **Step 2: Run test to verify it fails.** - -Run: `go test ./libs/dbconnect/ -run 'TestParseConstraints|TestFetchConstraints' -v` -Expected: FAIL (undefined). - -- [ ] **Step 3: Implement `constraints.go`.** Build the URL; use an `http.Client` with the request's context (`http.NewRequestWithContext`). On a 2xx, read the body, call `parseConstraints`, write bytes to `filepath.Join(cacheDir, sanitize(envKey)+".toml")` (sanitize replaces `/` with `__`), set `FromCache=false`. On any transport error or non-2xx, attempt to read the cache file: if present, parse it, set `FromCache=true`, `log.Warnf(ctx, "constraint fetch failed, using cached copy: %v", err)`; if absent, return `NewError(ErrConstraintFetchFailed, err, "fetch constraints for %s", envKey)`. `parseConstraints` despaces each dev entry with `strings.ReplaceAll(s, " ", "")` before the `HasPrefix("databricks-connect")` check, but stores the original string. Add a comment citing the constraint repo URL. - -- [ ] **Step 4: Run tests to verify they pass.** - -Run: `go test ./libs/dbconnect/ -run 'TestParseConstraints|TestFetchConstraints' -v` -Expected: PASS. - -- [ ] **Step 5: Commit.** - -```bash -git add libs/dbconnect/constraints.go libs/dbconnect/constraints_test.go -git commit -m "Add dbconnect constraint fetch with offline cache" -``` - ---- - -### Task 5: Surgical TOML merge (`merge.go`) - -**Files:** -- Create: `libs/dbconnect/merge.go` -- Test: `libs/dbconnect/merge_test.go` -- Test fixtures: `libs/dbconnect/testdata/merge/*.toml` - -**Interfaces:** -- Consumes: `Constraints` (Task 4). -- Produces: - - `func MergeManaged(target []byte, c Constraints) (merged []byte, regions []string, err error)` — applies the three managed transforms below, preserving all other bytes (comments/order/whitespace). 
`regions` lists which of `"requires-python"`, `"databricks-connect"`, `"tool.uv.constraint-dependencies"` were changed. Idempotent: `MergeManaged(MergeManaged(x)) == MergeManaged(x)`. - - `func RenderFreshPyproject(projectName string, c Constraints) []byte` — produces a complete managed `pyproject.toml` for `init` on a project that has none (used by Task 8 only when no file exists; if a file exists, `init` overwrites via MergeManaged after backup). - -The three transforms: -1. `[project].requires-python` — replace the value of an existing `requires-python = ...` line within the `[project]` table, preserving indentation. If `[project]` exists without the key, insert the line directly under the `[project]` header. -2. The `databricks-connect` element inside `[dependency-groups].dev` — replace the existing element matching `"databricks-connect..."` in place, preserving leading indentation and trailing comma. -3. `[tool.uv].constraint-dependencies` — replace the marker-bracketed managed block; if no managed block exists, drop any plain `[tool.uv]` table we own and append a freshly rendered, marker-bracketed `[tool.uv]` block at end of file. 
- -- [ ] **Step 1: Write the failing tests.** `libs/dbconnect/merge_test.go`: - -```go -package dbconnect - -import ( - "testing" - - "github.com/stretchr/testify/assert" - "github.com/stretchr/testify/require" -) - -func testConstraints() Constraints { - return Constraints{ - RequiresPython: "==3.12.*", - DatabricksConnect: "databricks-connect~=17.2.0", - ConstraintDeps: []string{"pydantic~=2.10.6", "anyio~=4.6.2"}, - } -} - -func TestMergeReplacesRequiresPythonPreservingComments(t *testing.T) { - in := []byte(`[project] -name = "demo" -# keep this comment -requires-python = ">=3.10" - -[dependency-groups] -dev = [ - "databricks-connect~=16.0.0", - "pytest~=8.0", -] -`) - out, regions, err := MergeManaged(in, testConstraints()) - require.NoError(t, err) - assert.Contains(t, string(out), `requires-python = "==3.12.*"`) - assert.Contains(t, string(out), "# keep this comment") - assert.Contains(t, string(out), `"databricks-connect~=17.2.0",`) - assert.Contains(t, string(out), `"pytest~=8.0",`) - assert.Contains(t, regions, "requires-python") - assert.Contains(t, regions, "databricks-connect") - assert.Contains(t, regions, "tool.uv.constraint-dependencies") - assert.Contains(t, string(out), "pydantic~=2.10.6") -} - -func TestMergeIsIdempotent(t *testing.T) { - in := []byte(`[project] -requires-python = ">=3.10" - -[dependency-groups] -dev = [ - "databricks-connect~=16.0.0", -] -`) - once, _, err := MergeManaged(in, testConstraints()) - require.NoError(t, err) - twice, _, err := MergeManaged(once, testConstraints()) - require.NoError(t, err) - assert.Equal(t, string(once), string(twice)) -} - -func TestMergeInsertsRequiresPythonWhenMissing(t *testing.T) { - in := []byte(`[project] -name = "demo" - -[dependency-groups] -dev = ["databricks-connect~=16.0.0"] -`) - out, _, err := MergeManaged(in, testConstraints()) - require.NoError(t, err) - assert.Contains(t, string(out), `requires-python = "==3.12.*"`) -} - -func TestMergeReplacesExistingManagedToolUvBlock(t *testing.T) 
{ - in := []byte(`[project] -requires-python = ">=3.10" - -[dependency-groups] -dev = ["databricks-connect~=16.0.0"] - -` + managedMarkerStart + ` -[tool.uv] -constraint-dependencies = [ - "stale~=1.0.0", -] -` + managedMarkerEnd + ` -`) - out, _, err := MergeManaged(in, testConstraints()) - require.NoError(t, err) - assert.NotContains(t, string(out), "stale~=1.0.0") - assert.Contains(t, string(out), "pydantic~=2.10.6") - // Only one managed block remains. - assert.Equal(t, 1, countOccurrences(string(out), managedMarkerStart)) -} -``` - -Add a tiny `countOccurrences` helper at the bottom of the test file using `strings.Count`. - -- [ ] **Step 2: Run tests to verify they fail.** - -Run: `go test ./libs/dbconnect/ -run 'TestMerge' -v` -Expected: FAIL (undefined `MergeManaged`, `managedMarkerStart`). - -- [ ] **Step 3: Implement `merge.go`.** Define `const managedMarkerStart = "# managed by databricks dbconnect — do not edit"` and `const managedMarkerEnd = "# end managed by databricks dbconnect"`. Normalize CRLF→LF on entry, restore the original line ending on exit (detect by presence of `\r\n` in input). Work on `strings.Split(s, "\n")`: - - **requires-python:** scan for a line matching `^\s*requires-python\s*=` (regexp) after the `[project]` header and before the next `^\[`; replace its value preserving the leading whitespace capture group. If absent, insert `requires-python = ""` right after the `[project]` header line. - - **databricks-connect:** scan within `[dependency-groups]` for a line containing `"databricks-connect`; replace the quoted token, preserving indentation and a trailing comma if the original had one. Record region only if a replacement happened. - - **tool.uv block:** if a marker-bracketed block exists, replace the lines between (and including) the markers; else remove any existing `[tool.uv]` table (header to next `^\[` or EOF) and append a freshly rendered block. 
Render: - ``` - - [tool.uv] - constraint-dependencies = [ - "dep1", - "dep2", - ] - - ``` - separated from prior content by exactly one blank line; file ends with a single trailing newline. - - `RenderFreshPyproject` builds a minimal `[project]` + `[dependency-groups].dev` (with the dbconnect pin) + the marker-bracketed `[tool.uv]` block. - -- [ ] **Step 4: Run tests to verify they pass.** - -Run: `go test ./libs/dbconnect/ -run 'TestMerge' -v` -Expected: PASS. - -- [ ] **Step 5: Add CRLF + quote-style edge-case tests** and make them pass (extend `merge_test.go`): - -```go -func TestMergePreservesCRLF(t *testing.T) { - in := []byte("[project]\r\nrequires-python = \">=3.10\"\r\n\r\n[dependency-groups]\r\ndev = [\"databricks-connect~=16.0.0\"]\r\n") - out, _, err := MergeManaged(in, testConstraints()) - require.NoError(t, err) - assert.Contains(t, string(out), "\r\n") - assert.Contains(t, string(out), `requires-python = "==3.12.*"`) -} -``` - -Run: `go test ./libs/dbconnect/ -run 'TestMerge' -v` -Expected: PASS. - -- [ ] **Step 6: Commit.** - -```bash -git add libs/dbconnect/merge.go libs/dbconnect/merge_test.go -git commit -m "Add surgical formatting-preserving pyproject.toml merge" -``` - ---- - -### Task 6: Target resolution (`target.go`) - -**Files:** -- Create: `libs/dbconnect/target.go` -- Test: `libs/dbconnect/target_test.go` - -**Interfaces:** -- Consumes: `TargetInfo`, `ErrNoTargetSelected`, `ErrClusterUnsupported`, `NewError` (Task 2); `EnvKeyForServerless`, `EnvKeyForSparkVersion` (Task 3). -- Produces: - - `type ComputeClient interface { GetClusterSparkVersion(ctx context.Context, clusterID string) (string, error); GetJobSparkVersion(ctx context.Context, jobID string) (sparkVersion string, isServerless bool, version string, err error) }`: a narrow seam over the SDK so tests can stub it. (A job resolves to either a Spark version or a serverless marker.) - - `type TargetFlags struct { Cluster, Serverless, Job string }`.
- - `type BundleTarget struct { ClusterID string; Serverless bool; Selected bool }` — the three-state result of reading the bundle's configured target (`Selected=false` ⇒ nothing selected). - - `func ResolveTarget(ctx context.Context, f TargetFlags, c ComputeClient, bt BundleTarget) (*TargetInfo, error)` — precedence: cluster flag → serverless flag → job flag → bundle target. Produces `TargetInfo` with `EnvKey` set; `PythonVersion` is filled later from the fetched constraints (left empty here). Three-state errors when falling back to the bundle. - - `func ValidateTargetFlags(f TargetFlags) error` — at most one of the three set (the Cobra layer also marks them mutually exclusive; this guards the library path). - -- [ ] **Step 1: Write the failing test.** `libs/dbconnect/target_test.go`: - -```go -package dbconnect - -import ( - "context" - "testing" - - "github.com/stretchr/testify/assert" - "github.com/stretchr/testify/require" -) - -type stubCompute struct { - clusterVersion string - clusterErr error -} - -func (s stubCompute) GetClusterSparkVersion(_ context.Context, _ string) (string, error) { - return s.clusterVersion, s.clusterErr -} -func (s stubCompute) GetJobSparkVersion(_ context.Context, _ string) (string, bool, string, error) { - return "", false, "", nil -} - -func TestResolveServerlessFlag(t *testing.T) { - ti, err := ResolveTarget(t.Context(), TargetFlags{Serverless: "v4"}, stubCompute{}, BundleTarget{}) - require.NoError(t, err) - assert.Equal(t, "serverless", ti.Kind) - assert.Equal(t, "serverless/serverless-v4", ti.EnvKey) -} - -func TestResolveClusterFlag(t *testing.T) { - c := stubCompute{clusterVersion: "15.4.x-scala2.12"} - ti, err := ResolveTarget(t.Context(), TargetFlags{Cluster: "abc"}, c, BundleTarget{}) - require.NoError(t, err) - assert.Equal(t, "cluster", ti.Kind) - assert.Equal(t, "dbr/15.4.x-scala2.12", ti.EnvKey) - assert.Equal(t, "abc", ti.ClusterID) -} - -func TestResolveBundleNothingSelected(t *testing.T) { - _, err := 
ResolveTarget(t.Context(), TargetFlags{}, stubCompute{}, BundleTarget{Selected: false}) - var pe *PipelineError - require.ErrorAs(t, err, &pe) - assert.Equal(t, ErrNoTargetSelected, pe.Code) -} - -func TestResolveBundleServerless(t *testing.T) { - ti, err := ResolveTarget(t.Context(), TargetFlags{}, stubCompute{}, BundleTarget{Selected: true, Serverless: true}) - require.NoError(t, err) - assert.Equal(t, "serverless/serverless-v4", ti.EnvKey) -} - -func TestValidateTargetFlagsMutuallyExclusive(t *testing.T) { - assert.Error(t, ValidateTargetFlags(TargetFlags{Cluster: "a", Serverless: "v4"})) - assert.NoError(t, ValidateTargetFlags(TargetFlags{Cluster: "a"})) -} -``` - -Note: `TestResolveBundleServerless` encodes the spec rule that a bundle serverless target with no recorded version defaults to `v4` (the script's documented stand-in). Add a code comment to that effect. - -- [ ] **Step 2: Run test to verify it fails.** - -Run: `go test ./libs/dbconnect/ -run 'TestResolve|TestValidateTargetFlags' -v` -Expected: FAIL (undefined). - -- [ ] **Step 3: Implement `target.go`** with ordered-precedence early returns. Cluster flag → `GetClusterSparkVersion` → `Kind:"cluster"`, `EnvKey: EnvKeyForSparkVersion(v)`. Serverless flag → normalize, `EnvKey: EnvKeyForServerless(v)`. Job flag → `GetJobSparkVersion`; serverless job → serverless envKey (default `v4`), else cluster envKey. No flag → read `bt`: `!Selected` → `NewError(ErrNoTargetSelected, nil, "No compute target is selected. Select a cluster or serverless target, or pass --cluster/--serverless/--job.")`; `Serverless` → serverless `v4` (with the documented-default comment); `ClusterID != ""` → resolve via `GetClusterSparkVersion`. `ValidateTargetFlags` counts non-empty fields; >1 → error naming the conflicting flags. - -- [ ] **Step 4: Run tests to verify they pass.** - -Run: `go test ./libs/dbconnect/ -run 'TestResolve|TestValidateTargetFlags' -v` -Expected: PASS. 
- -- [ ] **Step 5: Commit.** - -```bash -git add libs/dbconnect/target.go libs/dbconnect/target_test.go -git commit -m "Add dbconnect target resolution with three-state messaging" -``` - ---- - -### Task 7: PackageManager interface + uv implementation (`pkgmanager.go`, `uv.go`) - -**Files:** -- Create: `libs/dbconnect/pkgmanager.go` -- Create: `libs/dbconnect/uv.go` -- Test: `libs/dbconnect/uv_test.go` - -**Interfaces:** -- Consumes: `libs/process` for shell-outs; `ErrUvUnavailable`, `ErrProvisionFailed`, `NewError` (Task 2). -- Produces: - - `type PackageManager interface { Name() string; EnsureAvailable(ctx context.Context) (version string, err error); EnsurePython(ctx context.Context, minor string) error; Provision(ctx context.Context, projectDir string) error; PostProvision(ctx context.Context, projectDir string) error; Validate(ctx context.Context, projectDir string) (pythonVersion, dbconnectVersion string, err error) }`. - - `type uvManager struct { bin string }` implementing it; `func newUvManager() *uvManager`. - - `func discoverUv(ctx context.Context) (string, error)` — search `exec.LookPath`, then `~/.local/bin/uv`, `$XDG_BIN_HOME/uv`, `/opt/homebrew/bin/uv`, `/usr/local/bin/uv`. Returns the path or `NewError(ErrUvUnavailable, ...)`. (Bootstrapping via the installer is invoked by `EnsureAvailable` when discovery fails — guarded so tests can stub.) - -Because real uv shell-outs can't run in unit tests, `uv_test.go` covers `discoverUv` path logic (with a fake bin dir on a temp `PATH`) and the argument construction of each command via a small indirection: `uvManager` builds `[]string` arg slices through unexported helpers (`syncArgs()`, `pipSeedArgs(py string)`, `pythonInstallArgs(minor string)`) that are unit-tested directly. 
- -- [ ] **Step 1: Write the failing test.** `libs/dbconnect/uv_test.go`: - -```go -package dbconnect - -import ( - "os" - "path/filepath" - "testing" - - "github.com/stretchr/testify/assert" - "github.com/stretchr/testify/require" -) - -func TestUvArgs(t *testing.T) { - m := &uvManager{bin: "uv"} - assert.Equal(t, []string{"sync"}, m.syncArgs()) - assert.Equal(t, []string{"python", "install", "3.12"}, m.pythonInstallArgs("3.12")) - assert.Equal(t, []string{"pip", "install", "pip", "--python", "/p/.venv/bin/python"}, m.pipSeedArgs("/p/.venv/bin/python")) -} - -func TestDiscoverUvFindsBinOnPath(t *testing.T) { - dir := t.TempDir() - bin := filepath.Join(dir, "uv") - require.NoError(t, os.WriteFile(bin, []byte("#!/bin/sh\n"), 0o755)) - t.Setenv("PATH", dir) - got, err := discoverUv(t.Context()) - require.NoError(t, err) - assert.Equal(t, bin, got) -} -``` - -- [ ] **Step 2: Run test to verify it fails.** - -Run: `go test ./libs/dbconnect/ -run 'TestUv|TestDiscoverUv' -v` -Expected: FAIL (undefined). - -- [ ] **Step 3: Implement `pkgmanager.go` (interface only) and `uv.go`.** `discoverUv` uses `exec.LookPath` first, then the candidate list (expand `~` via `os.UserHomeDir`, read `XDG_BIN_HOME` via `env.Lookup`). The arg helpers return the slices asserted above. `Provision` runs `uv sync` in `projectDir` via `process.Background` (or the repo's standard `process` runner) with `ctx`. `PostProvision` runs `uv pip install pip --python ` and carries the full Phase 7 rationale comment from the script (VS Code pip fallback; uv venvs lack pip; `uv sync` strips pip). `Validate` runs `uv run --no-project python -c` to read the Python minor and `importlib.metadata.version("databricks-connect")`. `EnsureAvailable` calls `discoverUv`; on failure, runs the installer (`curl ... | sh`) with a reference-URL comment, then re-discovers; on still-missing, returns `ErrUvUnavailable`. 
- -- [ ] **Step 4: Run tests to verify they pass.** - -Run: `go test ./libs/dbconnect/ -run 'TestUv|TestDiscoverUv' -v` -Expected: PASS. - -- [ ] **Step 5: Commit.** - -```bash -git add libs/dbconnect/pkgmanager.go libs/dbconnect/uv.go libs/dbconnect/uv_test.go -git commit -m "Add PackageManager interface and uv implementation" -``` - ---- - -### Task 8: The pipeline (`pipeline.go`) - -**Files:** -- Create: `libs/dbconnect/pipeline.go` -- Test: `libs/dbconnect/pipeline_test.go` - -**Interfaces:** -- Consumes: every type above. -- Produces: - - `type Pipeline struct { Mode Mode; Check bool; ProjectDir string; ConstraintBaseURL string; CacheDir string; Flags TargetFlags; Compute ComputeClient; Bundle BundleTarget; PM PackageManager }`. - - `func (p *Pipeline) Run(ctx context.Context) (*Result, error)` — executes phases 1–8 (preflight folded into PM `EnsureAvailable`), honoring `Check` (stop after computing the plan/diff; no mutation). Returns a fully populated `*Result`; on a phase error, sets `Result.Error` and returns the error too. - - Phase methods are unexported (`resolve`, `fetch`, `mergePlan`, `applyMerge`, `provision`, `validate`), each appending a `PhaseResult`. - -Mode behavior: `ModeInit` — if `pyproject.toml` exists, back up to `.bak` then `MergeManaged`; if absent, `RenderFreshPyproject`. `ModeSync` — restore from `.bak` if present (else back up), then `MergeManaged`. - -- [ ] **Step 1: Write the failing test** (drives the full pipeline with stubbed Compute + PM + httptest constraint server, against a temp project dir). 
`libs/dbconnect/pipeline_test.go`: - -```go -package dbconnect - -import ( - "context" - "net/http" - "net/http/httptest" - "os" - "path/filepath" - "testing" - - "github.com/stretchr/testify/assert" - "github.com/stretchr/testify/require" -) - -type fakePM struct{ py, dbc string } - -func (fakePM) Name() string { return "fake" } -func (fakePM) EnsureAvailable(context.Context) (string, error) { return "fake 1.0", nil } -func (fakePM) EnsurePython(context.Context, string) error { return nil } -func (fakePM) Provision(context.Context, string) error { return nil } -func (fakePM) PostProvision(context.Context, string) error { return nil } -func (f fakePM) Validate(context.Context, string) (string, string, error) { - return f.py, f.dbc, nil -} - -func writeProject(t *testing.T) string { - dir := t.TempDir() - require.NoError(t, os.WriteFile(filepath.Join(dir, "pyproject.toml"), []byte(`[project] -name = "demo" -requires-python = ">=3.10" - -[dependency-groups] -dev = ["databricks-connect~=16.0.0"] -`), 0o644)) - return dir -} - -func newTestServer(t *testing.T) *httptest.Server { - return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { - _, _ = w.Write([]byte(sampleToml)) - })) -} - -func TestPipelineCheckMutatesNothing(t *testing.T) { - dir := writeProject(t) - before, _ := os.ReadFile(filepath.Join(dir, "pyproject.toml")) - srv := newTestServer(t) - defer srv.Close() - - p := &Pipeline{ - Mode: ModeSync, Check: true, ProjectDir: dir, - ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), - Flags: TargetFlags{Serverless: "v4"}, - Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: "17.2.0"}, - } - res, err := p.Run(t.Context()) - require.NoError(t, err) - assert.True(t, res.Check) - require.NotNil(t, res.Plan) - assert.Contains(t, res.Plan.Diff, "==3.12.*") - after, _ := os.ReadFile(filepath.Join(dir, "pyproject.toml")) - assert.Equal(t, string(before), string(after)) // unchanged -} - -func TestPipelineSyncProvisionsAndValidates(t 
*testing.T) { - dir := writeProject(t) - srv := newTestServer(t) - defer srv.Close() - - p := &Pipeline{ - Mode: ModeSync, ProjectDir: dir, - ConstraintBaseURL: srv.URL, CacheDir: t.TempDir(), - Flags: TargetFlags{Serverless: "v4"}, - Compute: stubCompute{}, PM: fakePM{py: "3.12", dbc: "17.2.0"}, - } - res, err := p.Run(t.Context()) - require.NoError(t, err) - require.NotNil(t, res.Result) - assert.Equal(t, "success", res.Result.Status) - assert.Equal(t, "3.12", res.Result.PythonVersion) - merged, _ := os.ReadFile(filepath.Join(dir, "pyproject.toml")) - assert.Contains(t, string(merged), `"databricks-connect~=17.2.0",`) - assert.FileExists(t, filepath.Join(dir, "pyproject.toml.bak")) -} -``` - -(`sampleToml`, `stubCompute` come from earlier test files in the same package.) - -- [ ] **Step 2: Run tests to verify they fail.** - -Run: `go test ./libs/dbconnect/ -run 'TestPipeline' -v` -Expected: FAIL (undefined `Pipeline`). - -- [ ] **Step 3: Implement `pipeline.go`.** `Run`: `EnsureAvailable` (record phase + `ErrUvUnavailable` on fail) → `resolve` (ResolveTarget) → `fetch` (FetchConstraints; fill `TargetInfo.PythonVersion` from `PythonMinorFromRequires(c.RequiresPython)`, build `ConstraintInfo`) → `mergePlan` (read existing file or empty; compute merged bytes via `MergeManaged`/`RenderFreshPyproject`; build `Plan` with a unified diff — use a small diff helper or `libs/textutil` if present, else a minimal line diff; set `ChangedRegions`). If `Check`, populate `Result` (Mode, Check, Target, Constraints, Plan) and return. Else `applyMerge` (Mode-specific backup/restore then write bytes) → `EnsurePython(py)` → `Provision` → `PostProvision` → `Validate` (assert minor==`py`; `databricks-connect` major matches the pin's major, else `ErrValidationFailed`) → populate `Result.Result`. Each phase appends a `PhaseResult{Name,Status,Detail}`. - -- [ ] **Step 4: Run tests to verify they pass.** - -Run: `go test ./libs/dbconnect/ -run 'TestPipeline' -v` -Expected: PASS. 
- -- [ ] **Step 5: Run the whole package.** - -Run: `go test ./libs/dbconnect/ -v` -Expected: all PASS. - -- [ ] **Step 6: Commit.** - -```bash -git add libs/dbconnect/pipeline.go libs/dbconnect/pipeline_test.go -git commit -m "Add dbconnect pipeline orchestrating all phases" -``` - ---- - -### Task 9: Wire the Cobra layer (flags, bundle/compute adapters, rendering) - -**Files:** -- Modify: `cmd/dbconnect/init.go`, `cmd/dbconnect/sync.go` -- Create: `cmd/dbconnect/output.go` -- Create: `cmd/dbconnect/compute.go` (SDK adapter implementing `dbconnect.ComputeClient`) - -**Interfaces:** -- Consumes: `dbconnect.Pipeline`, `dbconnect.ComputeClient`, `dbconnect.Result`, `root.OutputType`, `cmdctx.WorkspaceClient`, `root.TryConfigureBundle`. -- Produces: `func runPipeline(cmd *cobra.Command, mode dbconnect.Mode) error`; `type sdkCompute struct{ w *databricks.WorkspaceClient }` implementing `ComputeClient` via `w.Clusters.GetByClusterId` (→ `.SparkVersion`) and `w.Jobs.Get`. - -- [ ] **Step 1: Implement the shared `runPipeline`** in `init.go` (sync.go calls it with `ModeSync`). Read flags (`--cluster/--serverless/--job/--check/--constraint-source`), build `TargetFlags`, `ValidateTargetFlags`, resolve `ProjectDir` (cwd), `CacheDir` (`os.UserCacheDir()/databricks/dbconnect`), `ConstraintBaseURL` (flag → `env.Lookup(ctx, "DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE")` → default constant), `Compute: sdkCompute{w}`, `Bundle:` from `root.TryConfigureBundle` (map `ClusterId`/serverless mode → `BundleTarget`), `PM: newUvManager()`. Mark the three target flags mutually exclusive via `cmd.MarkFlagsMutuallyExclusive`. Call `p.Run(ctx)`, then `renderResult`. - -- [ ] **Step 2: Implement `output.go`** `renderResult(cmd, res, err)`: when `root.OutputType(cmd) == flags.OutputJSON`, `cmdio.Render(ctx, res)`; else print the phase headers + a success/plan summary mirroring the script (`=== Phase N ===` style via `cmdio.LogString`). 
On error, JSON path still renders `res` (with `res.Error` set); text path returns the wrapped error. - -- [ ] **Step 3: Implement `compute.go`** adapter. `GetClusterSparkVersion`: `d, err := w.Clusters.GetByClusterId(ctx, id)`; return `d.SparkVersion`. `GetJobSparkVersion`: `w.Jobs.Get`; inspect the job's first task/job-cluster for a `SparkVersion` or serverless. Add a comment if the job compute shape is non-obvious. - -- [ ] **Step 4: Build + manual smoke.** - -Run: `./task build && ./bin/databricks dbconnect init --serverless v4 --check --output json` -Expected: prints the JSON plan; no files changed. (Network to the constraint repo required; if offline, expect the `constraint_fetch_failed` code.) - -- [ ] **Step 5: Commit.** - -```bash -git add cmd/dbconnect/ -git commit -m "Wire dbconnect Cobra layer: flags, compute adapter, rendering" -``` - ---- - -### Task 10: Acceptance tests - -**Files:** -- Create: `acceptance/dbconnect/serverless-check/{script,output.txt}` -- Create: `acceptance/dbconnect/no-target/{script,output.txt}` -- Create: `acceptance/dbconnect/cluster-unsupported/{script,output.txt}` -- Create: `acceptance/dbconnect/flag-conflict/{script,output.txt}` -- Create: `acceptance/dbconnect/serverless-json/{script,output.txt}` -- Possibly: per-case `test.toml` to stub the constraint server + workspace (follow `acceptance/quickstart/` and any testserver-backed case). - -**Interfaces:** Consumes the built CLI via the acceptance harness `$CLI`. - -- [ ] **Step 1: Inspect an existing testserver-backed acceptance case** to copy the pattern for stubbing HTTP + the workspace client. - -Run: `ls acceptance/cmd/ acceptance/auth/ && sed -n '1,40p' acceptance/quickstart/script 2>/dev/null` -Expected: see how `script`, `output.txt`, and `test.toml` cooperate (env, replacements, stubbed server). - -- [ ] **Step 2: Write `flag-conflict`** (no network needed). 
`script`: - -``` -$CLI dbconnect init --cluster abc --serverless v4 -``` - -Generate golden: - -Run: `go test ./acceptance -run 'TestAccept/dbconnect/flag-conflict' -tail -test.v -update` -Expected: `output.txt` shows the mutual-exclusion error and non-zero exit. - -- [ ] **Step 3: Write `no-target`** (bundle with no compute selected, no flags). Provide a minimal `databricks.yml` fixture in the case dir; `script`: - -``` -$CLI dbconnect init -``` - -Run: `go test ./acceptance -run 'TestAccept/dbconnect/no-target' -tail -test.v -update` -Expected: `output.txt` shows the "No compute target is selected…" message. - -- [ ] **Step 4: Write `serverless-check`, `serverless-json`, `cluster-unsupported`** using the stubbed constraint server (point `DATABRICKS_DBCONNECT_CONSTRAINT_SOURCE` at the test server via `test.toml`/`script`). `serverless-check` runs `--serverless v4 --check`; `serverless-json` adds `--output json`; `cluster-unsupported` points `--cluster` at a stubbed cluster whose DBR has no constraint dir (server 404) → `cluster_unsupported`/`constraint_fetch_failed`. - -Run: `go test ./acceptance -run 'TestAccept/dbconnect' -tail -test.v -update` -Expected: all goldens created. - -- [ ] **Step 5: Verify without `-update`.** - -Run: `go test ./acceptance -run 'TestAccept/dbconnect' -tail -test.v` -Expected: all PASS. - -- [ ] **Step 6: Commit.** - -```bash -git add acceptance/dbconnect/ -git commit -m "Add dbconnect acceptance tests" -``` - ---- - -### Task 11: Changelog, lint, fmt, full suite - -**Files:** -- Modify: `NEXT_CHANGELOG.md` - -- [ ] **Step 1: Add the changelog entry** under `### CLI` in `NEXT_CHANGELOG.md`: - -```markdown -* Add `databricks dbconnect init` and `databricks dbconnect sync` to provision a local Python environment (Python version, `databricks-connect` pin, and dependency constraints) matched to the selected Databricks compute target. 
-``` - -- [ ] **Step 2: Format changed files.** - -Run: `./task fmt-q` -Expected: no diff or auto-applied formatting only. - -- [ ] **Step 3: Lint changed files.** - -Run: `./task lint-q` -Expected: clean (fix anything reported). - -- [ ] **Step 4: Full test suite.** - -Run: `./task test` -Expected: all PASS. - -- [ ] **Step 5: Commit.** - -```bash -git add NEXT_CHANGELOG.md -git commit -m "Add changelog entry for dbconnect init/sync" -``` - ---- - -## Self-Review - -**Spec coverage:** -- Namespace + `init`/`sync` → Tasks 1, 9. ✓ -- Phase pipeline (0–8) → Task 8 (preflight folded into PM.EnsureAvailable, Task 7). ✓ -- Shared flags `--cluster/--serverless/--job/--check/--json` → Task 9; `--json` realized as global `--output json` per Global Constraints. ✓ -- Target resolution via API + three-state messaging + full cluster/job → Tasks 6, 9. ✓ -- Robust surgical TOML merge of the 3 managed regions → Task 5. ✓ -- Constraint fetch (configurable URL) + offline cache → Task 4. ✓ -- Structured `--json` schema + `--check` dry-run → Tasks 2, 8, 9. ✓ -- uv branch incl. pip-seed (Phase 7) rationale → Task 7. ✓ -- Acceptance cases (serverless happy/check, no-target, cluster-stub→unsupported, --check, --json) → Task 10. ✓ -- Unit tests for merge/envkey/target/constraints → Tasks 3–8. ✓ -- Changelog + lint + fmt → Task 11. ✓ -- "uv only now, pip/conda later" → PackageManager interface (Task 7), no pip/conda files. ✓ -- No new dependency → uses vendored BurntSushi (read-only) + stdlib. ✓ - -**Placeholder scan:** No "TBD"/"handle edge cases"/"similar to". Each code step shows code; each run step shows command + expected output. The one explicit investigation step (Task 10 Step 1) is a deliberate "inspect existing pattern" action, not a placeholder. 
- -**Type consistency:** `MergeManaged`, `FetchConstraints`, `ResolveTarget`, `TargetFlags`, `BundleTarget`, `ComputeClient`, `PackageManager`, `Pipeline`, `Result`/`Plan`/`TargetInfo`/`ConstraintInfo` names are used identically across Tasks 2–9. `managedMarkerStart`/`managedMarkerEnd` consistent between Task 5 impl and tests. uv arg-helper names (`syncArgs`,`pythonInstallArgs`,`pipSeedArgs`) consistent between Task 7 impl and tests. - -**Known follow-ups (out of scope, noted for the implementer):** confirm the exact `databricks.yml` shape used to derive `BundleTarget` from `TryConfigureBundle` (cluster_id vs serverless mode) during Task 9; the SDK `Jobs.Get` compute shape may need a small comment per the repo's "non-obvious backend quirk" rule. diff --git a/docs/superpowers/specs/2026-06-19-dbconnect-init-sync-design.md b/docs/superpowers/specs/2026-06-19-dbconnect-init-sync-design.md deleted file mode 100644 index 47d93f6bf47..00000000000 --- a/docs/superpowers/specs/2026-06-19-dbconnect-init-sync-design.md +++ /dev/null @@ -1,323 +0,0 @@ -# `databricks dbconnect init` / `sync` — Design - -**Date:** 2026-06-19 -**Status:** Approved for planning -**Branch context:** Databricks CLI (Go) - -## Summary - -Promote the proven `dbconnect-init.sh` demo into a real CLI subcommand namespace, -`databricks dbconnect`, with two commands: `init` and `sync`. Starting from the -compute target the user already selected (cluster / serverless / job), the -command derives and provisions a matching local Python environment: the right -Python version, the right `databricks-connect` version, and dependency -constraints so local resolution matches the Databricks runtime — no version -guessing. - -The behavior is already implemented and verified as a 367-line bash script -(`dbconnect-init.sh` in the `databricks-vscode` repo). This design ports the -same phase pipeline to Go, with real API calls, a robust TOML merge, a -package-manager seam, and structured output. 
- -## Reference implementation (the spec) - -- **Script (source of truth for the pipeline):** - `/Users/grigory.panov/work/databricks-vscode/packages/databricks-vscode/resources/python/dbconnect-init.sh` -- **VS Code consumer (context on how it's invoked + the `--json` consumer):** - `/Users/grigory.panov/work/databricks-vscode/packages/databricks-vscode/src/language/VpexEnvironmentSetup.ts` - -The script logs each `=== Phase N ===` header; the Go port matches those -outcomes. We can diff Go behavior against a live run of the script. - -## Design decisions (resolved during brainstorming) - -1. **Constraint source of truth:** configurable base URL, defaulting to the - existing `databricks-environments` GitHub raw repo. Swap the default when an - official endpoint exists. (Overridable via flag + env var.) -2. **Default target resolution:** when no `--cluster`/`--serverless`/`--job` - flag is given, resolve from the **bundle's configured target** (the way - bundle commands do), NOT from the VS Code `vscode.overrides.json` artifact. - The standalone CLI does not read VS Code files. -3. **Package managers:** **uv only** in this PR, at full parity with the script. - A `PackageManager` interface is the seam; pip and conda land in later PRs as - additional files in the same package (no subpackages, no speculative stubs). -4. **`--json`:** a clean, documented, stable schema. VS Code adapts to it; we - are not bound by the current TypeScript interface (which today parses phase - headers from stdout, not JSON). -5. **TOML merge:** **surgical line edits** that preserve the user's formatting - and comments. There is no format-preserving TOML editor in Go (`go-toml/v2` - reformats just like the already-vendored `BurntSushi/toml`), so we use - BurntSushi only to READ the fetched values and validate structure, and apply - targeted line edits to write. No new dependency. -6. 
**Target resolution scope:** serverless is the working happy path; **cluster - and job compute resolution are also real** in this PR (SDK - `GetByClusterId` → `SparkVersion` → DBR → envKey, with nearest-supported - fallback). Unsupported runtimes return a clear error, never a crash or a - hard stub. - -## Architecture - -### Package layout - -``` -cmd/dbconnect/ - dbconnect.go New() *cobra.Command — "dbconnect" group, registers init + sync - init.go init subcommand: flag wiring + RunE -> pipeline.Run(Init) - sync.go sync subcommand: flag wiring + RunE -> pipeline.Run(Sync) - output.go text + --json rendering of the result/plan/errors - -libs/dbconnect/ - pipeline.go the shared phase pipeline (Mode = Init|Sync); orchestrates phases - target.go target resolution: flags + bundle target -> ResolvedTarget (envKey) - envkey.go DBR/serverless version -> envKey mapping (+ nearest-supported fallback) - constraints.go fetch constraint pyproject.toml (configurable base URL) + offline cache - merge.go surgical TOML merge of the 3 managed regions - pkgmanager.go PackageManager interface; uvManager implementation (uv.go) - result.go structured Result/Plan/Phase types (the --json schema) -``` - -**Rationale:** `cmd/dbconnect/` stays thin (Cobra wiring + rendering), mirroring -`cmd/psql/psql.go`. All logic lives in `libs/dbconnect/` so it is unit-testable -without a Cobra command. The `PackageManager` **interface** — not a directory -split — is what lets pip/conda land cleanly later; subpackages would create -import-cycle pressure (pipeline → pkgmanager → shared types) and would be -speculative scaffolding for deferred code. - -**Registration:** one line in `cmd/cmd.go`: -`cli.AddCommand(dbconnect.New())`, in the `development` ("Developer Tools") -group alongside `psql`. Hand-written workflow command — does NOT touch -`.codegen/` or run `generate-cligen`. - -### Control flow - -`init` and `sync` build the same `Pipeline` and call `Run(ctx)`. 
They differ in -exactly one phase behavior (Phase 3/4: write-fresh vs merge-into-existing), -selected by `Mode`. Every other phase is shared and runs once. - -## The phase pipeline - -`Pipeline.Run(ctx)` executes the script's phases in order. Each phase is a -method that returns an error and appends a `PhaseResult` to the accumulating -`Result`. `Mode` (Init|Sync) only changes Phase 3/4. - -| # | Phase | Go behavior | Δ from script | -|---|-------|-------------|---------------| -| 0 | Preflight | We *are* the CLI; auth comes from the resolved workspace client (`root.MustWorkspaceClient`). Discover uv from PATH + standard install locations (`~/.local/bin`, `$XDG_BIN_HOME`, Homebrew bins); bootstrap via the official installer if missing. Honor `UV_INDEX_URL` from `~/.config/pip/pip.conf` if unset. | No `databricks` binary probe. Auth via SDK, not `current-user me` shell-out. | -| 1 | Resolve target → envKey | Flags first (`--cluster`/`--serverless`/`--job`); else the bundle's configured target. Produce `ResolvedTarget{envKey, pythonVersion?}`. Preserve three-state messaging. | API calls, not file read. | -| 2 | Fetch constraints | GET `{baseURL}/{envKey}/pyproject.toml` via the CLI's HTTP client. Offline cache under the user cache dir; on network failure fall back to cache with a warning, else a clear error. | Configurable base URL + cache. | -| 3 | Baseline / idempotency | **Init:** write a fresh managed `pyproject.toml` (back up any existing to `.bak`). **Sync:** restore from `.bak` if present, else back it up, then merge. | Same idempotency model. | -| 4 | Merge managed regions | Surgical line edits to the 3 managed regions (see Merge section). | Robust merge, not regex. | -| 5 | Ensure Python | `PackageManager.EnsurePython(version)` — version from the resolved target, not hardcoded. | Version from target. | -| 6 | Provision | `PackageManager.Provision()` → `.venv` (uv: `uv sync`). | Interface seam. 
| -| 7 | Post-provision (pip seed) | `PackageManager.PostProvision()` — uv seeds pip into `.venv`; carries the script's full rationale comment (VS Code's `ms-python.vscode-python-envs` falls back to `python -m pip list` when its `uv --version` probe fails on the GUI PATH; uv venvs have no pip; `uv sync` strips pip, so seed runs after every sync). | uv-specific, behind the interface. | -| 8 | Validate | Assert `.venv` Python minor == target; `databricks-connect` matches the pin read from the fetched file. Populate `Result`. | Same asserts, structured output. | - -**`--check` (dry-run):** runs phases 0–2 (read-only: discover, resolve, fetch), -then computes and prints the plan + the unified diff that phase 4 would write, -and stops before any mutation. Mutating phases (3–8) are gated on `!check`. - -**Errors:** each phase wraps with `%w` and context. Structured errors carry a -stable `code` (e.g. `no_target_selected`, `cluster_unsupported`, -`constraint_fetch_failed`) so consumers branch on the code, never on message -text (repo rule: compare errors with sentinels, never `err.Error()` strings). - -**Cancellation:** phases respect `ctx`; long shell-outs (uv) run via -`libs/process` with the context so Ctrl-C / VS Code cancel terminates them. - -## Target resolution → envKey - -### Stage A — pick the target (ordered precedence, early-return style) - -1. `--cluster ` → SDK `w.Clusters.GetByClusterId(id)` → `SparkVersion` (the - DBR string, e.g. `15.4.x-scala2.12`). -2. `--serverless ` → serverless target, version `N`. -3. `--job ` → `w.Jobs.Get(id)`, read the job's compute (job cluster - `SparkVersion`, or serverless if the task is serverless). -4. No flag → the **bundle's configured target** (loaded the same way bundle - commands load it), read the selected target's compute. - -Flags are mutually exclusive (`cmd.MarkFlagsMutuallyExclusive`), rejected at -parse time (repo rule: reject incompatible inputs early with an actionable -error). 
- -### Stage B — three-state messaging (preserved from script lines 179–192) - -- **serverless selected** → proceed. -- **cluster selected** → resolve its DBR → envKey (implemented, not a stub). If - the DBR maps to no supported envKey, a clear "runtime X not yet supported" - error. -- **nothing selected** (bundle has no compute target) → actionable error: "No - compute target is selected. Select a cluster or serverless target, or pass - --cluster/--serverless/--job." - -### Stage C — version → envKey mapping (`envkey.go`) - -- **Serverless:** `vN` → `serverless/serverless-vN`. -- **Cluster/job DBR:** parse major.minor from `SparkVersion`, map to an envKey - via a small in-repo table, with **nearest-supported fallback** — if the exact - DBR isn't in the table, pick the closest supported one and warn, naming both. - -The table maps version → envKey *path* only. The constraint *pins* always come -from the fetched file, never from the table. - -## Surgical TOML merge (`merge.go`) - -**Goal:** touch only the 3 managed regions; preserve every byte the user owns -(comments, ordering, whitespace, their dependencies). - -**Read side (BurntSushi, already vendored):** parse the *fetched* env file into -a struct to extract the managed values authoritatively: -- `project.requires-python` (string) -- the `databricks-connect` pin from `dependency-groups.dev` -- `tool.uv.constraint-dependencies` ([]string) - -Also parse the *target* file with BurntSushi purely to validate it is -well-formed and to locate which regions exist before editing. We never write via -BurntSushi. - -**Write side (structured line edits)** — three idempotent transforms: - -1. **`requires-python`** — replace the value of the existing `requires-python =` - line under `[project]`, preserving indentation; if `[project]` exists but the - key doesn't, insert it. -2. 
**`databricks-connect` pin** — within `[dependency-groups].dev`, replace the - existing `"databricks-connect..."` element in place (preserve indentation and - trailing-comma style). -3. **`[tool.uv].constraint-dependencies`** — replace the whole managed block: - drop any existing block we previously wrote, append a freshly rendered one. - Bracketed with a discreet `# managed by databricks dbconnect` marker so - re-merges replace exactly our block without clobbering a user's own - `[tool.uv]` settings. - -**Edge cases the tests must cover** (where the script's regex breaks): -- multiline vs single-line arrays for the dev group and constraints -- single vs double quotes, trailing commas, comment lines inside arrays -- `[project]` present but no `requires-python` -- no `[tool.uv]` yet vs a pre-existing one (ours or the user's) -- CRLF files (Windows) — normalize on read, restore on write - -**Idempotency:** merging twice produces byte-identical output. - -**`--check` diff:** the merge produces the new content in memory; `--check` -renders a unified diff (old vs new) and writes nothing. - -## Output, flags & the `--json` schema - -### Flags (shared by both subcommands) - -| Flag | Type | Meaning | -|------|------|---------| -| `--cluster` | string | target a cluster (mutually exclusive) | -| `--serverless` | string | target serverless `vN` (mutually exclusive) | -| `--job` | string | target a job's compute (mutually exclusive) | -| `--check` | bool | dry-run: print plan + diff, mutate nothing | -| `--json` | bool | machine-readable output (wired via existing `cmdio` output plumbing) | -| `--constraint-source` | string | override the constraints base URL; default = `databricks-environments` repo. Also via env var. Advanced/hidden. 
| - -### `--json` schema (the documented contract) - -```jsonc -{ - "mode": "init" | "sync", - "check": false, - "target": { - "kind": "serverless" | "cluster" | "job", - "cluster_id": "…", - "spark_version": "15.4.x-…", - "env_key": "serverless/serverless-v4", - "python_version": "3.12", - "fallback": { "requested": "…", "resolved": "…" } - }, - "constraints": { - "source_url": "https://…/serverless-v4/pyproject.toml", - "from_cache": false, - "requires_python": ">=3.12", - "databricks_connect": "databricks-connect~=17.2.0", - "constraint_count": 42 - }, - "plan": { - "pyproject_path": "/abs/pyproject.toml", - "backup_path": "/abs/pyproject.toml.bak", - "diff": "--- …\n+++ …\n@@ …", - "changed_regions": ["requires-python", "databricks-connect", "tool.uv.constraint-dependencies"] - }, - "phases": [ - {"name": "preflight", "status": "ok", "detail": "uv 0.5.1"}, - {"name": "provision", "status": "ok"} - ], - "result": { - "status": "success" | "failed", - "venv_path": "/abs/.venv", - "python_version": "3.12", - "databricks_connect_installed": "17.2.0" - }, - "error": { "code": "no_target_selected", "message": "…" } -} -``` - -- Under `--check`, `plan` is computed and emitted; `phases` and `result` are - empty/omitted. -- `error` is present only on failure; `error.code` is an enumerated, documented, - stable set. - -**Text output** mirrors the script's `=== Phase N ===` headers and final success -summary (so VS Code's phase-regex narration keeps working). `--json` emits the -struct above and suppresses decorative phase logging. - -## Testing - -### Unit tests (`libs/dbconnect/`, table-driven) - -- **`merge_test.go`** — golden input pyproject + fetched constraints → expected - merged output, covering every edge case above. Idempotency test (merge twice → - identical). Diff test for `--check`. -- **`envkey_test.go`** — version→envKey incl. nearest-supported fallback and the - unsupported-runtime error. 
-- **`target_test.go`** — precedence (flag > bundle), mutual exclusivity, the - three states with their exact messages; SDK calls behind a small stubbed - interface. -- **`constraints_test.go`** — fetch success, cache hit on network failure, hard - failure with clear error; uses `httptest`. - -### Acceptance tests (`acceptance/dbconnect//`) - -Golden `output.txt` per the repo pattern (`acceptance/quickstart/`). uv and -network are unavailable in the sandbox, so these cover the deterministic, -mockable surface (resolution, messaging, merge, `--check`, `--json` shape) using -`libs/testserver` for the constraint fetch and stubbed compute: - -- `serverless-check` — `--serverless v4 --check`: plan + diff, no mutation. -- `serverless-json` — `--json` shape on the resolve+plan path. -- `no-target` — the "nothing selected" error + message. -- `cluster-unsupported` — a DBR with no envKey → clear error. -- `flag-conflict` — `--cluster x --serverless y` rejected at parse. - -Phases needing a live uv/`.venv` (5–8) are exercised by unit tests with the -package-manager interface stubbed. A full end-to-end uv run is validated -manually against the script ("diff against a live script run") and noted as a -manual check, not an acceptance test. - -### Build / quality gate - -`./task build`, `./task test`, `./task lint-q`, `./task fmt-q` all green. -`NEXT_CHANGELOG.md` entry under **CLI**: new `databricks dbconnect init` / -`sync` commands. - -## Out of scope (this PR) - -- pip & conda managers (interface only). -- Flipping the `--constraint-source` default to an official endpoint. -- Any new third-party dependency. - -## Risks to verify during planning - -1. **Cluster-DBR envKey data:** the `databricks-environments` repo currently - publishes `serverless/serverless-vN` paths. Full cluster/job resolution needs - real DBR→envKey paths (e.g. `cluster/dbr-15.4`). 
If the repo doesn't publish - them yet, the envKey table is the gap — surface it explicitly and decide - whether to seed the table from another source or narrow to the runtimes the - repo actually publishes. The nearest-supported fallback + "runtime X not yet - supported" error covers the rest gracefully. -2. **`--json` / `cmdio` wiring:** confirm the exact mechanism the CLI uses for - JSON output (global `--output json` vs a local `--json` flag) and follow the - existing convention rather than inventing a parallel switch.
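To make the envKey risk above concrete, the two deterministic pieces of Stage C (the serverless mapping and extracting major.minor from a `SparkVersion`) can be sketched as follows. This is a hedged illustration: `serverlessEnvKey` and `parseDBR` are hypothetical names, accepting a bare `4` as well as `v4` is an assumed normalization, and the nearest-supported fallback is deliberately left out because the design does not define "closest".

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// serverlessEnvKey maps a serverless version such as "v4" to the envKey path
// "serverless/serverless-v4". Accepting "4" without the prefix is an
// assumption about what "normalize" means.
func serverlessEnvKey(version string) string {
	v := strings.TrimPrefix(strings.TrimSpace(version), "v")
	return "serverless/serverless-v" + v
}

// parseDBR extracts the major and minor runtime version from a SparkVersion
// string such as "15.4.x-scala2.12".
func parseDBR(sparkVersion string) (major, minor int, err error) {
	parts := strings.SplitN(sparkVersion, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unrecognized spark version %q", sparkVersion)
	}
	major, err = strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("unrecognized spark version %q: %w", sparkVersion, err)
	}
	minor, err = strconv.Atoi(parts[1])
	if err != nil {
		return 0, 0, fmt.Errorf("unrecognized spark version %q: %w", sparkVersion, err)
	}
	return major, minor, nil
}

func main() {
	fmt.Println(serverlessEnvKey("v4"))
	major, minor, err := parseDBR("15.4.x-scala2.12")
	fmt.Println(major, minor, err)
}
```

The parsed pair would then be looked up in the in-repo table; whatever that table contains, the constraint pins themselves still come from the fetched file.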