Skip to content

fix(storage): scope graph nodes per project (fix #78)#84

Open
serhiizghama wants to merge 2 commits into
agentctxhq:mainfrom
serhiizghama:fix/per-project-node-uniqueness
Open

fix(storage): scope graph nodes per project (fix #78)#84
serhiizghama wants to merge 2 commits into
agentctxhq:mainfrom
serhiizghama:fix/per-project-node-uniqueness

Conversation

@serhiizghama

Copy link
Copy Markdown
Contributor

Problem

Closes #78.

Graph nodes are project-scoped data, but their uniqueness is enforced globally on name, and upsertNode fetches by name with no project filter:

db.prepare("INSERT OR IGNORE INTO nodes (id, project_id, kind, name) VALUES (?, ?, ?, ?)")
  .run(ulid(), projectId, kind, name);
const row = db.prepare("SELECT id FROM nodes WHERE name = ?").get(name); // ← no project_id

When two projects produce a node with the same name, project B's INSERT OR IGNORE is dropped by the global UNIQUE, and the unscoped SELECT … WHERE name = ? returns project A's node id — so linkRecordToEntity links project B's record to project A's entity.

This is reached in practice, not just theory: branch nodes use the bare branch name (upsertNode(db, projectId, "branch", branch) in post-tool-use.ts), and main / master / develop are universal, so essentially every multi-project install collides. That nodes are meant to be project-scoped is confirmed by deleteProjectData (nodes WHERE project_id = ?) — which also means a reset of project A deletes the shared node project B still points at, leaving project B with dangling links.

What this changes

  • Schema migration V2 rebuilds nodes with UNIQUE(project_id, name) instead of the global name UNIQUE, as a new append-only MIGRATIONS entry. Node ids (the PRIMARY KEY referenced by record_entities/edges) are preserved, so existing links survive the rebuild.
  • upsertNode scopes its lookup with AND project_id = ?. Both halves are required — the SELECT fix alone can't work while the global UNIQUE still blocks the second project's insert.
  • applyMigrations disables foreign keys around the migration and re-checks integrity (foreign_key_check) before re-enabling them, following the SQLite procedure for table-rebuild schema changes.
  • SPEC §3.1 updated to match (the schema's normative source).

Why this approach

The migration rebuilds nodes (CREATE new → copy → DROP old → RENAME), which drops the table that record_entities/edges reference. PRAGMA foreign_keys is a no-op inside a transaction, and defer_foreign_keys is not sufficient here: dropping a parent table leaves a deferred-constraint count that a later RENAME can't clear, so COMMIT fails even when foreign_key_check reports no dangling rows. Toggling foreign_keys around the per-migration transactions (the documented SQLite approach) is the correct fix and benefits any future table-rebuild migration. The repo is pre-release, so no production data-dedup is required.

No change to single-project behavior.

Testing

npm run check is green (biome + typecheck + 305 tests).

  • test/storage/entities.test.ts — two projects sharing a node name (main) get distinct ids; a record in project B links to project B's node (verified down to project_id); single-project upsert stays idempotent; uniqueness holds per (project_id, name).
  • test/storage/migrations.test.ts — the V1→V2 upgrade preserves node ids and their record_entities links, re-enables foreign keys with zero foreign_key_check violations, and admits the same node name in a different project afterward.

Confirmed as a true regression guard: reverting just the scoped SELECT (keeping the migration) fails the cross-project tests, since the unscoped lookup still returns the first project's node.

Graph `nodes` are project-scoped data, but uniqueness was enforced
globally on `name` and `upsertNode` fetched by `name` with no project
filter. When two projects produced a node with the same name — e.g. a
`main`/`master` branch node, which is the common case — the second
project's `INSERT OR IGNORE` was dropped and the unscoped `SELECT`
returned the first project's node id, so its records were linked to
another project's entity. A `reset` of the first project then deleted
the shared node, leaving the second project with dangling links.

Make node uniqueness per project:

- Schema migration V2 rebuilds `nodes` with `UNIQUE(project_id, name)`
  instead of the global `name UNIQUE`. Node ids (referenced by
  `record_entities`/`edges`) are preserved, so existing links survive.
- `upsertNode` scopes its lookup with `AND project_id = ?`. Both halves
  are required: the SELECT fix alone can't work while the global UNIQUE
  still blocks the second project's insert.

The rebuild drops the table that `record_entities`/`edges` reference, so
`applyMigrations` now disables foreign keys around the migration per the
SQLite schema-change procedure and re-checks integrity before re-enabling
them — `PRAGMA foreign_keys` is a no-op inside a transaction, and
`defer_foreign_keys` is insufficient because dropping a parent leaves a
deferred-constraint count a later rename can't clear.

Closes agentctxhq#78
- entities.test.ts: two projects sharing a node name get distinct ids,
  a record links to its own project's node, single-project upsert stays
  idempotent, and uniqueness holds per (project_id, name).
- migrations.test.ts: the V1->V2 rebuild preserves node ids and their
  record links, re-enables foreign keys with no violations, and admits
  the same node name in a different project afterward.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant