
Onboarding friction: non-git dbt projects log as project_id=global + schema_inspect fails before a warehouse is configured #962

Description

@anandgupta42

Summary

Two onboarding/activation bugs degrade the first-run experience when a user runs altimate-code
on a fresh dbt project — especially outside a git repository:

  1. Project not detected for non-git dbt projects → project_id resolves to the
    global/no-project bucket, so per-project context/state is never established.
  2. schema_inspect onboarding wall → the agent calls schema_inspect before a warehouse
    is configured (or with a guessed argument) and gets a single generic error, so it retries the
    same failing call instead of guiding the user to connect first.

Both are independently actionable and can be split into separate issues if preferred.


Failure mode 1 — Non-git dbt projects are not detected → project_id = "global"

Observed behavior

Running altimate-code inside a dbt project that is not a git repo (or opened from a
subdirectory, or in a fresh container/sandbox with no git) produces no stable project identity.
The session is bucketed under the global/no-project id even though the user is doing clearly
project-scoped dbt work (dbt_profiles → apply_patch on models → sql_analyze →
altimate_core_validate → dbt_lineage).

Root cause

Project identity is keyed on git/VCS, not on the dbt project:

  • packages/opencode/src/session/prompt.ts:365
    Telemetry.setContext({ sessionId: sessionID, projectId: Instance.project?.id ?? "" })
    When Instance.project is undefined, projectId is empty and falls into ProjectID.global.
  • packages/opencode/src/project/project.ts — sessions default to ProjectID.global and are
    only re-homed to a real id on git init (initGit, if (input.project.vcs === "git") ...).
    No git → stays global.
  • packages/opencode/src/altimate/fingerprint/index.ts:81 already detects dbt_project.yml
    (plus profiles.yml, adapter type, etc.) and tags the session dbt / data-engineering,
    but this detection is not used to establish project identity.

So a dbt project without a git repo gets no stable project id.

Impact

  • Lost per-project context/state: anything keyed on project id (session continuity, memory,
    cached schema/connection, per-project config) doesn't persist or carry across sessions →
    every run feels cold.
  • Inconsistent identity between the data-engineering fingerprint (which does recognize the
    dbt project) and the project/session layer (which does not).

Proposed fix

  • When Instance.project is absent, derive a stable project identity from the
    data-engineering fingerprint that already runs — anchor on the nearest ancestor directory
    containing dbt_project.yml (hash of its absolute path, same shape as the existing project
    id), and use that for both Telemetry.setContext and session/project re-homing.
  • Treat "dbt project root" as a first-class project anchor alongside git root in project.ts
    (walk up to find dbt_project.yml if VCS detection fails).
  • Keep global strictly for genuinely no-project sessions.

Acceptance criteria

  • Running altimate-code inside a non-git dbt project (with dbt_project.yml) yields a
    stable, non-global project_id, stable across sessions in the same directory.
  • Opening from a subdirectory of the dbt project resolves to the same id.
  • Unit test: fingerprint detects dbt_project.yml in a temp non-git dir → project id is
    non-empty and deterministic.
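
A minimal sketch of the proposed dbt-root anchor and the acceptance check, using only Node built-ins. The helper names findDbtProjectRoot and dbtProjectId are hypothetical, and SHA-1 hex is an assumption: this issue does not show the id shape that project.ts actually uses, so a real fix should reuse the existing hashing.

```typescript
import { createHash } from "node:crypto";
import { existsSync, mkdirSync, mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { dirname, join, resolve } from "node:path";

// Hypothetical helper: walk up from startDir to the nearest ancestor that
// contains dbt_project.yml. Returns undefined when no ancestor has one.
function findDbtProjectRoot(startDir: string): string | undefined {
  let dir = resolve(startDir);
  for (;;) {
    if (existsSync(join(dir, "dbt_project.yml"))) return dir;
    const parent = dirname(dir);
    if (parent === dir) return undefined; // reached the filesystem root
    dir = parent;
  }
}

// Hypothetical fallback id: hash of the absolute dbt project root path.
// SHA-1 hex is an assumption, not the id format confirmed by the codebase.
function dbtProjectId(startDir: string): string | undefined {
  const root = findDbtProjectRoot(startDir);
  return root === undefined
    ? undefined
    : createHash("sha1").update(root).digest("hex");
}

// Acceptance check: a temp dir with dbt_project.yml and no git repo.
const projectDir = mkdtempSync(join(tmpdir(), "dbt-nogit-"));
writeFileSync(join(projectDir, "dbt_project.yml"), "name: demo\n");
const modelsDir = join(projectDir, "models", "staging");
mkdirSync(modelsDir, { recursive: true });

const idFromRoot = dbtProjectId(projectDir);
const idFromSubdir = dbtProjectId(modelsDir);

// The id is defined and identical from the root and from a subdirectory.
console.log(idFromRoot !== undefined && idFromRoot === idFromSubdir); // true
```

Hashing the absolute path keeps the id stable across sessions in the same directory, but it changes if the project is moved or mounted at a different path (for example in a fresh container). Whether that is acceptable is a design decision for the fix.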

Failure mode 2 — schema_inspect onboarding wall (fails before/without a warehouse)

Observed behavior

When schema_inspect is called before any warehouse connection exists, or with a
guessed/invalid table/warehouse name, the tool fails with a single generic error. The agent
cannot distinguish "no warehouse configured" from "bad argument" from "permission denied", so it
retries the same failing call. A user wiring up a warehouse sees repeated schema_inspect errors
interleaved with warehouse_list / warehouse_add / warehouse_test.

Root cause

  • No warehouse configured: the agent invokes schema_inspect before a connection exists.
    The tool returns a generic failure rather than guiding the user to connect first.
  • Guessed/invalid args: schema_inspect is called with a table/warehouse name the agent
    invented, which fails argument validation.
  • Generic error surface: packages/opencode/src/altimate/tools/schema-inspect.ts:64:
    output: `Failed to inspect schema: ${msg}\n\nEnsure the dispatcher is running and a warehouse connection is configured.`
    The same string is returned for "no warehouse configured", "bad table name", and "permission
    denied", so the agent can't branch per cause and retries the same failing call. A
    post-connect-suggestions.ts flow exists, but there is no pre-connect guardrail.

Proposed fix

  • Preflight in schema_inspect: if no warehouse is configured, short-circuit with a
    distinct, actionable result (not a generic error) — e.g. "No warehouse connected. Connect a
    warehouse with warehouse_add, then retry" — and emit a consistent not_configured class so
    the agent branches into the connect flow instead of retrying.
  • Agent guidance (builder.txt / analyst.txt): do not call schema_inspect until a
    warehouse connection is confirmed; if not configured, switch to the connect flow.
  • Distinct, actionable errors surfaced to the model: not_configured vs validation vs
    permission vs connection, each with its own recovery hint, instead of one generic string.
  • Arg validation: when table/warehouse are not known/resolvable, prompt to discover
    (warehouse_list / schema_index) rather than calling with a guessed name.
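
The preflight and the distinct result classes could be modeled roughly as below. This is a sketch under stated assumptions, not the tool's actual API: the type names, the preflight signature, and the hint wording for permission and connection are all hypothetical. Only the four class names and the tool names (warehouse_add, warehouse_list, warehouse_test, schema_index) come from this issue.

```typescript
// The four classes named in the proposed fix.
type InspectIssueKind = "not_configured" | "validation" | "permission" | "connection";

// Hypothetical result shape: a class the agent can branch on, plus a hint.
interface InspectIssue {
  kind: InspectIssueKind;
  message: string;
  hint: string;
}

// Recovery hints per class. The first two follow the wording in this issue;
// the permission and connection hints are illustrative assumptions.
const HINTS: Record<InspectIssueKind, string> = {
  not_configured: "Connect a warehouse with warehouse_add, then retry.",
  validation: "Discover valid names with warehouse_list or schema_index before retrying.",
  permission: "Check that the connection's role can read the requested table.",
  connection: "Run warehouse_test to verify the connection.",
};

function issue(kind: InspectIssueKind, message: string): InspectIssue {
  return { kind, message, hint: HINTS[kind] };
}

// Hypothetical preflight: runs before any warehouse call is attempted.
// Returns undefined when schema_inspect may proceed.
function preflight(
  configuredWarehouses: readonly string[],
  requestedWarehouse?: string,
): InspectIssue | undefined {
  if (configuredWarehouses.length === 0) {
    return issue("not_configured", "No warehouse connected.");
  }
  if (
    requestedWarehouse !== undefined &&
    !configuredWarehouses.includes(requestedWarehouse)
  ) {
    return issue("validation", `Unknown warehouse "${requestedWarehouse}".`);
  }
  return undefined;
}

const beforeConnect = preflight([], "analytics");
console.log(beforeConnect?.kind); // not_configured
```

Because kind is a closed union, agent-side handling can switch on it and pick one recovery path per class instead of parsing a free-form error string.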

Acceptance criteria

  • Calling schema_inspect with no warehouse configured returns a non-error, actionable
    not_configured result; the agent responds by starting the connect flow rather than retrying.
  • Calling with an invalid table/warehouse returns a validation result with a discovery
    hint.
  • Tests: (a) no-warehouse → actionable not_configured, no exception; (b) invalid arg →
    validation with a discovery hint; (c) prompt/golden test asserts the connect-first ordering.
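
For test (c), one way to assert the connect-first ordering is to scan a recorded tool-call sequence. The sketch below assumes a hypothetical transcript shape (tool name plus success flag) and treats a successful warehouse_add or warehouse_test as "connection confirmed"; neither assumption is taken from the codebase, and it models a fresh session with no pre-existing connection.

```typescript
// Hypothetical transcript entry: which tool was called and whether it succeeded.
interface ToolCall {
  tool: string;
  ok: boolean;
}

// Returns the index of the first schema_inspect call made before any
// successful warehouse_add or warehouse_test, or -1 if ordering is respected.
function firstPrematureInspect(calls: readonly ToolCall[]): number {
  let connected = false;
  for (const [i, call] of calls.entries()) {
    const confirms = call.tool === "warehouse_add" || call.tool === "warehouse_test";
    if (confirms && call.ok) connected = true;
    if (call.tool === "schema_inspect" && !connected) return i;
  }
  return -1;
}

// The retry loop described under "Observed behavior": inspect before connect.
const observed: ToolCall[] = [
  { tool: "schema_inspect", ok: false },
  { tool: "warehouse_list", ok: true },
  { tool: "schema_inspect", ok: false },
  { tool: "warehouse_add", ok: true },
];

// The desired ordering: discover, connect, verify, then inspect.
const desired: ToolCall[] = [
  { tool: "warehouse_list", ok: true },
  { tool: "warehouse_add", ok: true },
  { tool: "warehouse_test", ok: true },
  { tool: "schema_inspect", ok: true },
];

console.log(firstPrematureInspect(observed)); // 0
console.log(firstPrematureInspect(desired)); // -1
```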

Notes

  • Surfaced from observed first-run behavior; the two failure modes are independent and can be
    split into separate issues.
  • The known success-path-silent telemetry artifact for read/edit/write/glob is unrelated
    to either failure mode here.

Metadata

Labels

  • dbt (dbt integration)
  • error-handling (Error messages and recovery)
  • onboarding (First-run and setup experience)
  • priority:high (High priority)
  • ux (User experience improvements)
  • warehouse (Warehouse connectivity and drivers)
