Ingest dbt manifest.json into Model/Source nodes + DEPENDS_ON lineage #576

Description

@alexisperinger-ux

What problem does this solve?

dbt writes its fully resolved dependency graph to target/manifest.json, but the indexer never ingests that file. As a result, a project's model and source lineage is invisible to the index even though dbt has already computed it. Model-to-source lineage in particular cannot be reconstructed from source files alone.

Public test bed: dbt-labs/jaffle_shop (run dbt compile to produce target/manifest.json).

Proposed solution

Add an explicit, opt-in MCP tool, ingest_dbt_manifest { project, manifest_path }, that runs after index_repository against an existing project store. The tool would:

  • Parse manifest.json with the vendored yyjson.
  • Emit model / seed / snapshot as Model / Seed / Snapshot nodes and source as Source nodes, keyed by dbt unique_id.
  • Emit each node's depends_on.nodes as DEPENDS_ON edges (model-to-model and model-to-source).
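For reference, a call to the proposed tool might be framed as a standard MCP `tools/call` request. This is only a sketch in Python: the tool name and the two argument names come from this proposal, while the helper `build_ingest_request` and the project value `"jaffle_shop"` are hypothetical placeholders, since how a project is named in the store is not specified here.

```python
import json


def build_ingest_request(project: str, manifest_path: str, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for the proposed tool.

    `build_ingest_request` is a hypothetical helper for illustration only.
    """
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {
                "name": "ingest_dbt_manifest",
                # Argument names as proposed in this issue.
                "arguments": {"project": project, "manifest_path": manifest_path},
            },
        }
    )


# Assumed values: the project name is a placeholder, the path is dbt's default output.
request = build_ingest_request("jaffle_shop", "target/manifest.json")
```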

No schema change is needed: the tool reuses cbm_store_upsert_node / cbm_store_insert_edge and the existing DEPENDS_ON edge. Only lineage-bearing resources become nodes; test / analysis / operation resources are skipped. Because it reads the graph dbt itself resolved, this is the authoritative dbt path, and it complements source-level extraction (#575).
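The mapping described above can be sketched in a few lines. This is a Python illustration of the logic only, not the proposed implementation (which would parse with yyjson and write through cbm_store_upsert_node / cbm_store_insert_edge). It relies on the manifest layout in which models, seeds, snapshots and tests live under the top-level `nodes` object and sources under `sources`, each keyed by unique_id. The function name `extract_lineage`, the choice to drop edges whose target is not an emitted node, and the sample IDs are assumptions made for the example; the sample is invented and not taken from jaffle_shop.

```python
# Resource types that become graph nodes, per the proposal; all others are skipped.
LABELS = {"model": "Model", "seed": "Seed", "snapshot": "Snapshot", "source": "Source"}


def extract_lineage(manifest: dict):
    """Return ({unique_id: label}, [(child_id, parent_id), ...]) for a parsed manifest."""
    nodes = {}
    for section in ("nodes", "sources"):
        for unique_id, entry in manifest.get(section, {}).items():
            label = LABELS.get(entry.get("resource_type"))
            if label is not None:
                nodes[unique_id] = label

    edges = []
    for unique_id, entry in manifest.get("nodes", {}).items():
        if unique_id not in nodes:
            continue  # skipped resource (e.g. a test): emit no edges from it
        for parent in entry.get("depends_on", {}).get("nodes", []):
            # Assumption: an edge is kept only if its target was emitted as a node.
            if parent in nodes:
                edges.append((unique_id, parent))  # child DEPENDS_ON parent
    return nodes, edges


# Invented sample shaped like a dbt manifest (IDs are illustrative only).
sample = {
    "nodes": {
        "model.demo.stg_orders": {
            "resource_type": "model",
            "depends_on": {"nodes": ["source.demo.raw.orders"]},
        },
        "model.demo.orders": {
            "resource_type": "model",
            "depends_on": {"nodes": ["model.demo.stg_orders"]},
        },
        "test.demo.not_null_orders_id": {
            "resource_type": "test",
            "depends_on": {"nodes": ["model.demo.orders"]},
        },
    },
    "sources": {"source.demo.raw.orders": {"resource_type": "source"}},
}

nodes, edges = extract_lineage(sample)
```

In this sample the test resource produces neither a node nor an edge, while the model-to-source and model-to-model dependencies both surface as DEPENDS_ON pairs.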

Alternatives considered

  • Auto-running ingestion inside index_repository when target/manifest.json is present: deferred, so that this change adds only an explicit, opt-in tool and leaves default indexing untouched.
  • Ingesting test / analysis nodes by default: skipped to keep the graph lineage-only.

Confirmations

  • I searched existing issues and this is not a duplicate.

Labels

  • enhancement: New feature or request
  • parsing/quality: Graph extraction bugs, false positives, missing edges