Skip to content

feat(parser): migrate db-backed providers#783

Draft
mariusvniekerk wants to merge 1 commit into
provider-antigravity-familyfrom
provider-db-backed-family
Draft

feat(parser): migrate db-backed providers#783
mariusvniekerk wants to merge 1 commit into
provider-antigravity-familyfrom
provider-db-backed-family

Conversation

@mariusvniekerk

Copy link
Copy Markdown
Collaborator

Forge, Piebald, and Warp now use concrete parser providers over their existing per-session SQLite virtual paths. The shared DB-backed source helper keeps discovery metadata-driven, watches the provider root shallowly for database and SQLite sidecar updates, maps changed DB events to current virtual session sources, and preserves force-replace parsing plus Piebald fork fan-out. Review should focus on the shared DB-backed source-set behavior and the parser-specific spec adapters.

@roborev-ci

roborev-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

roborev: Combined Review (8ab78fe)

Summary verdict: One medium-severity issue needs attention; no critical or high findings were reported.

Medium

  • Location: internal/parser/db_backed_provider.go:476
  • Problem: Piebald FindSource collapses any raw ID at the first -, so piebald:42-999 is treated as chat 42 and returns a source whenever that base chat exists, even if the requested fork session does not. Existing sync code explicitly validates fork IDs before treating them as found, so this provider path can report invalid fork sessions as found and parse unrelated results.
  • Fix: For suffixed Piebald IDs, validate the requested full session ID against parsed results or a lightweight branch lookup before returning a source; add provider tests for valid and unknown forks.

Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 5m30s), codex_security (codex/security, done, 1m51s) | Total: 7m27s

@roborev-ci

roborev-ci Bot commented Jun 20, 2026

Copy link
Copy Markdown

roborev: Combined Review (eabba4c)

Medium finding blocks clean approval.

Medium

  • internal/parser/db_backed_provider.go:467 - Piebald FindSource strips every raw ID at -, so a nonexistent fork such as piebald:42-999 resolves as chat 42 whenever the base chat exists. A caller can treat an invalid fork session as found and parse or write unrelated base-chat results.
    • Fix: Preserve the requested fork ID and validate it exists, for example by parsing/querying the chat and checking emitted session IDs before returning a source. Add a test for an unknown fork.

Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 3m53s), codex_security (codex/security, done, 2m39s) | Total: 6m38s

@roborev-ci

roborev-ci Bot commented Jun 20, 2026

Copy link
Copy Markdown

roborev: Combined Review (d94f7a4)

Summary verdict: one medium correctness issue should be fixed before merging.

Medium

  • internal/parser/db_backed_provider.go:117 - All sql.ErrNoRows parse errors are treated as SkipNoSession, but Piebald can return sql.ErrNoRows from nested part/tool-call lookups, not just from a missing chat row. That can silently convert a malformed or partially written chat into a tombstone instead of surfacing a parse error.

    Fix: Only map the top-level missing-session lookup to SkipNoSession, for example with a dedicated sentinel error or by verifying the session row is absent before suppressing the error.


Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 5m57s), codex_security (codex/security, done, 1m32s) | Total: 7m35s

@mariusvniekerk

mariusvniekerk commented Jun 20, 2026

Copy link
Copy Markdown
Collaborator Author

This change is part of the following stack:

Change managed by git-spice.

@roborev-ci

roborev-ci Bot commented Jun 20, 2026

Copy link
Copy Markdown

roborev: Combined Review (56a8a67)

The PR has 2 medium findings; no high or critical issues were reported.

Medium

  • internal/parser/db_backed_provider.go:267: FindSource resolves stored/fingerprint virtual paths with allowMissing=true regardless of RequireFreshSource, and Piebald fork IDs are normalized to the base chat before any freshness check. A caller that needs a current source can get a SourceRef for deleted DB rows/files or for a non-existent Piebald fork as long as the base chat exists.
    Fix: When RequireFreshSource is true, verify the DB file and logical row still exist before returning; for Piebald fork IDs, preserve the requested ID and verify the parsed result set contains that session.

  • internal/parser/db_backed_provider.go:202: DB-backed tombstone recovery depends on ChangedPathRequest.StoredSourcePaths, but the watch plan provides no signal that these roots require stored source hints. Generic changed-path sync will not know to populate StoredSourcePaths, so row/DB deletion tombstones only work in tests that manually pass hints.
    Fix: Add the documented stored-source hint mode to WatchRoot and set it on these DB-backed watch roots, then have the caller load scoped persisted paths for roots that request it.


Panel: ci_default_security | Synthesis: codex, 9s | Members: codex_default (codex/default, done, 6m37s), codex_security (codex/security, done, 2m11s) | Total: 8m57s

@roborev-ci

roborev-ci Bot commented Jun 20, 2026

Copy link
Copy Markdown

roborev: Combined Review (81b0cad)

Medium severity finding:

  • internal/parser/db_backed_provider.go:202 - DB-backed tombstone recovery depends on ChangedPathRequest.StoredSourcePaths, but returned watch roots do not provide a machine-readable hint requesting those paths, and WatchRoot still lacks the documented StoredSourcePathHints field. A generic changed-path caller cannot know to load persisted source paths without provider-specific inference, so deleted DB rows/files may not emit tombstone sources once current metadata no longer lists them.

    Suggested fix: add the stored-source hint mode to parser.WatchRoot, set it on these DB-backed watch roots, and cover it in provider/watch-plan tests.


Panel: ci_default_security | Synthesis: codex, 5s | Members: codex_default (codex/default, done, 5m5s), codex_security (codex/security, done, 2m33s) | Total: 7m43s

@mariusvniekerk mariusvniekerk force-pushed the provider-db-backed-family branch 4 times, most recently from 82f9e12 to e8149e5 Compare June 20, 2026 01:02
@mariusvniekerk mariusvniekerk force-pushed the provider-antigravity-family branch from cd3a876 to f23eac4 Compare June 20, 2026 01:42
@mariusvniekerk mariusvniekerk force-pushed the provider-db-backed-family branch from e8149e5 to 83f8b73 Compare June 20, 2026 01:42
@roborev-ci

roborev-ci Bot commented Jun 20, 2026

Copy link
Copy Markdown

roborev: Review Unavailable (83f8b73)

The review agent repeatedly failed to run (likely an agent or configuration error). roborev will try again on the next commit.

Last error: agent: claude-code failed stream: stream errors: You've hit your session limit · resets 5:50am (UTC): exit status 1

@mariusvniekerk mariusvniekerk force-pushed the provider-antigravity-family branch from f23eac4 to 20558cf Compare June 21, 2026 00:41
@mariusvniekerk mariusvniekerk force-pushed the provider-db-backed-family branch from 83f8b73 to 1f17524 Compare June 21, 2026 00:41
@roborev-ci

roborev-ci Bot commented Jun 21, 2026

Copy link
Copy Markdown

roborev: Combined Review (1f17524)

Medium: New DB-backed providers are not watched in normal server mode.

  • internal/parser/db_backed_provider.go:199, cmd/agentsview/main.go:591
  • The new Forge/Piebald/Warp providers expose WatchPlan and changed-path handling, but those agents remain FileBased: false, while the production watcher skips all non-file-based agents. As a result, these DB roots are never watched in normal server mode. The changed-path provider path only works for manual SyncPaths calls/tests, and live DB updates still wait for periodic full sync.
  • Fix: Wire provider WatchPlan roots into watcher root collection for these non-file-based providers, or add equivalent registry watch-root support, with a watcher/root collection test.

Panel: ci_default_security | Synthesis: codex, 10s | Members: codex_default (codex/default, done, 9m31s), codex_security (codex/security, done, 2m29s) | Total: 12m10s

@roborev-ci

roborev-ci Bot commented Jun 21, 2026

Copy link
Copy Markdown

roborev: Combined Review (5157aa7)

Medium severity finding remains.

  • Medium - internal/sync/engine.go:4559: Clean provider freshness skips for Forge/Warp are returned with cacheSkip: true, so collectAndBatch persists them into skipped_files. On the next run, the skip-cache fast path checks only fingerprint mtime and bypasses GetDataVersionByPath in shouldSkipProviderSource, so a parser data-version bump can leave these sessions stale until the source mtime changes.
    • Fix: Avoid writing clean persisted-freshness skips to the generic skip cache, e.g. return cacheSkip: false / noCacheSkip: true for this path, or make the skip-cache fast path verify the stored session row and current data version before skipping.

Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 6m35s), codex_security (codex/security, done, 3m27s) | Total: 10m9s

@mariusvniekerk mariusvniekerk force-pushed the provider-antigravity-family branch from 175c894 to 581c996 Compare June 21, 2026 01:47
@mariusvniekerk mariusvniekerk force-pushed the provider-db-backed-family branch from 5157aa7 to 058ca18 Compare June 21, 2026 01:47
@roborev-ci

roborev-ci Bot commented Jun 21, 2026

Copy link
Copy Markdown

roborev: Combined Review (4f27d46)

No Medium, High, or Critical findings were reported.

The only finding was Low severity and has been omitted per instructions.


Panel: ci_default_security | Synthesis: codex, 24s | Members: codex_default (codex/default, done, 10m20s), codex_security (codex/security, done, 2m3s) | Total: 12m47s

@mariusvniekerk mariusvniekerk force-pushed the provider-antigravity-family branch from 581c996 to 01ca28a Compare June 23, 2026 23:56
@mariusvniekerk mariusvniekerk force-pushed the provider-db-backed-family branch from 4f27d46 to 39a2870 Compare June 23, 2026 23:56
@roborev-ci

roborev-ci Bot commented Jun 24, 2026

Copy link
Copy Markdown

roborev: Combined Review (39a2870)

Medium issue found; no High or Critical findings.

Medium

  • internal/parser/sqlite_fanout_source_set.go:106 - SourcesForChangedPath compares req.WatchRoot to the configured root, but WatchPlan publishes filepath.Dir(canonicalDBPath(root)). When FindDB maps a root to a DB in a subdirectory, the engine passes the watch root from WatchPlan; this check rejects it, so DB/WAL changes produce no sources.
    Fix: Compare req.WatchRoot against filepath.Dir(s.canonicalDBPath(root)), or remove this precheck and let dbPathForEvent decide. Update the canonical-path test to use plan.Roots[0].Path.

Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 6m23s), codex_security (codex/security, done, 3m3s) | Total: 9m32s

Move Forge, Piebald, and Warp DB discovery and per-session parse ownership
onto their shared db-backed provider implementation, deleting the
package-level legacy entrypoints and per-agent engine sync dispatch. The
three become provider-authoritative.

Full-sync change detection runs through syncProviderDBBackedAgent, which
enumerates provider sources and skips those whose fingerprint mtime
matches the stored data-version mtime, so a repeat sync of unchanged data
stays a no-op. FindSourceFile, SourceMtime, and SyncSingleSession route
these agents through the provider facade, preserving Piebald's
chat-source-resolves-fork semantics including rejection of unknown forks.

Assert the provider-authoritative skip in the Piebald process test: an
unchanged chat skips on its per-chat updated_at fingerprint, matching the
legacy piebaldPendingSessionIDs skip and the Forge sibling. The prior test
asserted the opposite, a stale shadow-compare expectation that reparsed an
untouched session on every full sync.

refactor(parser): delete db-backed legacy whole-database parsers

The db-backed provider migration left the exported whole-database and
single-session parse free functions (ParseForgeDB, ParsePiebaldDB,
ParsePiebaldSession, ParseWarpDB) in place: the provider routes through
the lowercase per-session helpers (parseForgeSession,
parsePiebaldSessionResults, parseWarpSession), so these survived only as
dead production code kept alive by their own tests. ParsePiebaldDB had no
references at all.

Delete the four functions and the now-orphaned chain
(loadForgeConversations, loadWarpConversations, loadPiebaldChats, and the
ForgeSession/WarpSession/PiebaldSession bundle types). The retained parse
tests now drive the provider facade (Discover + Fingerprint + Parse) via a
shared parseDBBackedAll helper instead of the deleted free functions, so
they exercise the production path. Extend the db-backed deletion guard to
assert the four names stay gone.

fix(parser): honor sqlite fanout watch roots

SQLite fanout providers can publish a filesystem watch root that differs from the configured source root when FindDB resolves a canonical database under a subdirectory. Changed-path classification still compared WatchRoot to the configured root, so real DB/WAL/SHM events from the planned watch root produced no sources.\n\nAccept the emitted canonical DB directory as the matching watch root while keeping the configured-root compatibility path, and cover the FindDB subdirectory case with a WatchPlan-driven WAL event regression. The commit also removes two unused Codex fixture restats that blocked the existing staticcheck hook.\n\nValidation: go test -tags "fts5" ./internal/parser -run 'TestSQLiteFanoutSourceSet|TestDBBacked' -count=1 -v; go test -tags "fts5" ./internal/parser -count=1; go test -tags "fts5" ./internal/sync -run 'Test.*Codex' -count=1; go fmt ./...; go vet ./...

style(docs): mdformat provider facade design spec
@mariusvniekerk mariusvniekerk force-pushed the provider-antigravity-family branch from 01ca28a to 033752a Compare June 25, 2026 05:48
@mariusvniekerk mariusvniekerk force-pushed the provider-db-backed-family branch from 39a2870 to 7d46117 Compare June 25, 2026 05:49
@roborev-ci

roborev-ci Bot commented Jun 25, 2026

Copy link
Copy Markdown

roborev: Combined Review (7d46117)

Summary verdict: PR has two medium-risk regressions in DB-backed sync behavior; no security findings.

Medium

  • internal/sync/engine.go:2707 - Full sync now discovers all configured Forge/Piebald/Warp roots through one provider instance, and dbBackedSourceSet.Discover returns immediately on the first listMeta error. A single corrupt or temporarily unreadable DB in one configured root can prevent syncing healthy roots for the same agent. Previously, per-root loops logged the bad root and continued.

    • Fix: Preserve per-root isolation for DB-backed sync, either by iterating roots with one provider per root or by collecting successful roots while reporting per-root failures.
  • internal/parser/db_backed_provider.go:350 - Fingerprint reloads and scans all session metadata for the DB on every source. Since syncProviderDBBacked discovers all sources and then fingerprints each one, an unchanged DB with N sessions can perform roughly N full metadata scans, a significant regression from the legacy path that loaded metadata once per DB.

    • Fix: Reuse metadata from discovery or cache a per-DB sessionID -> mtime map for the sync pass so each DB is scanned once.

Panel: ci_default_security | Synthesis: codex, 9s | Members: codex_default (codex/default, done, 8m2s), codex_security (codex/security, done, 1m54s) | Total: 10m5s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant