Skip to content

feat(parser): migrate gemini copilot providers#776

Draft
mariusvniekerk wants to merge 1 commit into
provider-codexfrom
provider-gemini-copilot
Draft

feat(parser): migrate gemini copilot providers#776
mariusvniekerk wants to merge 1 commit into
provider-codexfrom
provider-gemini-copilot

Conversation

@mariusvniekerk

Copy link
Copy Markdown
Collaborator

Gemini and Copilot now have concrete parser providers. Gemini owns tmp//chats/session-* discovery, project hints, lookup, change classification, hashing, and parse output through the existing Gemini parser. Copilot owns session-state bare and directory layouts, directory-over-bare precedence, workspace.yaml classification/freshness, lookup, hashing, usage events, and parse output through the existing Copilot parser.

This keeps both providers on the shared provider interface without changing their parser normalization behavior.

@roborev-ci

roborev-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

roborev: Combined Review (8148c32)

Summary verdict: one medium correctness issue should be fixed before merge; no security issues were reported.

Medium

  • internal/parser/copilot_provider.go:104 - Copilot aggregate usage events are dropped from the provider result. ParseCopilotSession returns usage, but the new provider only assigns it to sess.UsageEvents; downstream provider consumers use ParseResult.UsageEvents, so shutdown token usage will not be persisted.

    Fix: Populate UsageEvents: usage in the returned ParseResult and add a provider test with a session.shutdown event.


Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 3m32s), codex_security (codex/security, done, 1m42s) | Total: 5m20s

@roborev-ci

roborev-ci Bot commented Jun 20, 2026

Copy link
Copy Markdown

roborev: Review Unavailable (ad6c624)

The review agent repeatedly failed to run (likely an agent or configuration error). roborev will try again on the next commit.

Last error: agent: claude-code failed stream: stream errors: You've hit your session limit · resets 5:50am (UTC): exit status 1

@roborev-ci

roborev-ci Bot commented Jun 21, 2026

Copy link
Copy Markdown

roborev: Combined Review (9c2ea96)

Summary verdict: one medium issue remains; no high or critical findings were reported.

Medium

  • internal/parser/gemini_provider.go:176
    Gemini metadata refresh only works when projects.json / trustedFolders.json is explicitly passed to SyncPaths. The live watcher still uses parser.Registry watch roots rather than the provider WatchPlan, and Gemini still only watches tmp. As a result, metadata-only project changes will not be observed, and full sync will skip unchanged session files.
    Fix: Wire provider watch plans into watcher setup, or add a Gemini shallow watch root for the Gemini config directory so metadata file changes are delivered to SyncPaths.

Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 5m14s), codex_security (codex/security, done, 1m45s) | Total: 7m5s

@mariusvniekerk

mariusvniekerk commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator Author

This change is part of the following stack:

Change managed by git-spice.

@roborev-ci

roborev-ci Bot commented Jun 21, 2026

Copy link
Copy Markdown

roborev: Combined Review (1f46f8a)

Medium findings:

  • internal/parser/gemini_provider.go:176 - Gemini project metadata changes are classified only if SyncPaths receives projects.json/trustedFolders.json, but the live watcher is still registry-driven and Gemini only watches tmp. Metadata-only edits will not trigger this path, so full sync can still skip unchanged session files.
    • Fix: Add the Gemini root as a shallow watched root or wire provider WatchPlan into watcher setup, and ensure metadata events force reparses.

Panel: ci_default_security | Synthesis: codex, 15s | Members: codex_default (codex/default, done, 5m41s), codex_security (codex/security, done, 1m17s) | Total: 7m13s

@roborev-ci

roborev-ci Bot commented Jun 24, 2026

Copy link
Copy Markdown

roborev: Combined Review (c2c6845)

Migration has two medium behavioral regressions; no security issues were found.

Medium

  • internal/parser/types.go:147 - Gemini provider classification now includes projects.json and trustedFolders.json, but normal watcher setup still uses the registry WatchSubdirs: []string{"tmp"} path and does not consume provider WatchPlan. As a result, root metadata file changes are not watched during normal server operation, so filesystem events will not trigger SyncPaths.

    • Fix: Wire provider WatchPlan into collectWatchRoots, or add Gemini-specific watch roots for those metadata files.
  • internal/parser/provider_migration.go:23 - Moving Copilot and Gemini to provider-authoritative mode drops their legacy DB freshness skips. The generic provider path only skips DB-fresh sources for Codex, and successful parses are not kept in the skip cache, so unchanged Copilot/Gemini sessions are reparsed and rewritten on every full sync, bumping sync metadata.

    • Fix: Extend provider DB freshness skipping to Copilot/Gemini using their provider fingerprints, preserving the legacy size/mtime behavior.

Panel: ci_default_security | Synthesis: codex, 8s | Members: codex_default (codex/default, done, 7m9s), codex_security (codex/security, done, 3m7s) | Total: 10m24s

Gemini and Copilot are direct local file sources, but each has source-shape details that were still coupled to the legacy adapter path. Moving them behind concrete providers keeps Gemini tmp/<project>/chats discovery and Copilot bare-vs-directory precedence explicit.

The providers preserve raw and full ID lookup, changed-path classification, source hashing, Gemini project hints, Copilot workspace.yaml freshness, aggregate usage events, and parser output normalization.

fix(parser): preserve gemini copilot provider freshness

Gemini and Copilot now advertise provider-owned watch classification, so remove and rename events need to map back to syntactic source refs even after the filesystem entry has disappeared. Without that fallback, watcher-driven sync can leave stale provider sessions until a wider resync happens.\n\nCopilot also exposes a composite fingerprint that includes workspace.yaml freshness and shutdown aggregate usage. The provider parse result has to carry that same file metadata and usage event slice because sync consumes ParseResult, not only ParsedSession.\n\nValidation: go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

fix(parser): include gemini project metadata freshness

Gemini project names can come from projects.json or trustedFolders.json, so treating only the transcript as the provider source leaves metadata-only changes stale. The provider now watches those root-level sidecars, classifies their changes back to discovered sessions, and folds their contents into the source fingerprint.\n\nValidation: go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

fix(parser): hash copilot workspace metadata

Copilot workspace.yaml can change the provider-visible title without changing the event stream. Size and mtime are useful freshness guards, but the provider hash should also include the workspace file contents so same-length title edits cannot be skipped.\n\nValidation: go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

fix(sync): bridge provider path classification

Concrete providers own source sidecars that legacy path classifiers do not know about. SyncPaths now falls back to provider changed-path classification after the legacy classifiers miss, and provider-classified files force a full parse so metadata-only events can refresh stored session state.\n\nLegacy classification remains authoritative when it recognizes a path, preserving existing project extraction and optimized sidecar filters while still letting migrated providers cover new sidecar surfaces.\n\nValidation: go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go vet ./...; git diff --check

fix(sync): preserve provider sidecar reparses

Provider sidecar events can map to the same session file as a legacy path event in one watcher batch. Keeping only the first classified file made the result order-dependent and could drop the force-parse signal that metadata-only changes rely on.

Per-file forced parses also need to bypass the generic skip cache, not just the agent-specific mtime checks, because sidecar updates may leave the transcript mtime untouched while still changing parsed session metadata.

Validation: go test -tags "fts5" ./internal/sync -run 'TestSyncPathsGeminiProjectMetadataEventRefreshesProject' -count=1; go test -tags "fts5" ./internal/sync -count=1; go test -tags "fts5" ./internal/parser -run 'Test(Gemini|Copilot|ProviderMigration)' -count=1; go vet ./...; git diff --check

fix(sync): skip removed provider source events

Provider changed-path classification can return syntactic source refs for deleted files so providers can model remove events. While legacy file processing is still authoritative, enqueueing an exact missing source path makes SyncPaths fail at the initial stat instead of treating the watcher remove as a no-op.

Keep sidecar fanout intact for existing sources, because metadata changes such as Gemini projects.json still need to force a reparse even when the transcript mtime is unchanged.

Validation: go test -tags "fts5" ./internal/sync -run 'TestEngine_ClassifyPathsProvider(RemoveSkipsMissingGeminiSource|SidecarKeepsExistingGeminiSources)|TestSyncPathsGeminiProjectMetadataEventRefreshesProject' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; ./custom-gcl run --config .golangci.nilaway.yml ./internal/parser/... ./internal/sync/...; git diff --check

test(sync): compare gemini copilot shadow parity

Gemini and Copilot are migrated through concrete providers on this branch, so reviewers need a sync-level parity check that exercises the provider observation contract rather than only parser-local behavior.

The fixtures cover their sidecar-sensitive source shapes: Gemini project metadata feeds the resolved project hint, and Copilot workspace.yaml participates in both title selection and the composite fingerprint.

Validation: go test -tags "fts5" ./internal/sync -run 'TestObserveProviderSourceMatches(Gemini|Copilot)LegacyParser' -count=1; go test -tags "fts5" ./internal/parser -run 'Test(Gemini|Copilot)Provider' -count=1; go vet ./...; git diff --check. Full go test -tags "fts5" ./internal/parser ./internal/sync -count=1 currently fails in existing TestSyncPathsCodexIndexEventRefreshesStoredDuplicate.

refactor(parser): fold gemini and copilot into providers

Move Gemini and Copilot source discovery, lookup, and parse ownership onto
the concrete geminiProvider and copilotProvider and delete the six
package-level legacy entrypoints: DiscoverGeminiSessions,
FindGeminiSourceFile, ParseGeminiSession, DiscoverCopilotSessions,
FindCopilotSourceFile, and ParseCopilotSession.

Discovery and find-source bodies now live as provider-owned source-set
helpers (discoverSessionPaths and findSourceFile on each source set), the
gemini confirmGeminiSessionID guard moves to the provider file, and the
parsers become the providers' parseSession methods. The copilot source set's
bare/dir precedence and dedup, and the gemini session-filename matching, are
reproduced on the provider exactly as before.

Gemini project resolution is preserved on the provider: sourceRef already
resolves the project via BuildGeminiProjectMap/ResolveGeminiProject for both
discovery and changed-path classification, so removing the engine's gemini
project-map plumbing loses no project names. BuildGeminiProjectMap and
ResolveGeminiProject stay exported package helpers used by the provider.

Make both Gemini and Copilot provider-authoritative and drop their legacy
sync dispatch: the classifyOnePath copilot and gemini blocks (and the now
unused geminiProjectsByDir parameter threaded through classifyOnePath and
classifyPaths), the processFile case arms, and the processGemini,
processCopilot, and shouldSkipCopilot methods. copilotEffectiveMtime stays as
a shared composite-mtime helper used by discoveredFileMtime.

Wire the provider facade into parse-diff: agents that dropped their
DiscoverFunc are now discovered through discoverProviderSources (filtered to
the resolved, provider-discoverable agents), and resolveParseDiffAgents
accepts file-based agents backed by a shadow-compare or
provider-authoritative provider. Without this, a provider-authoritative agent
would silently fall out of parse-diff once its DiscoverFunc was removed.

Drop the Gemini and Copilot AgentDef DiscoverFunc/FindSourceFunc hooks, remove
both files from the pending shim scan list, delete the shared shadow-baseline
test file, and replace it with provider-API coverage plus guards asserting the
legacy entrypoints stay gone. Package and engine tests route through the
provider methods via new test helpers.

test(sync): drop duplicate shadow source helper def

The canonical writeProviderShadowSourceFile now lives at the Codex fold,
so this redeclaration in provider_shadow_test.go conflicts with it. Drop
the local copy and its now-unused os/path filepath imports; callers use
the inherited shared helper.

test(sync): restore provider-aware classify tests at gemini fold

The original restack mis-merged engine_test.go on this branch, reverting
the OpenCode SQLite, OpenCode removed-file, Claude stat-error, and Vibe
meta-only classification tests to their stale pre-fold shapes (fake
opencode.db bytes instead of a seeded session, dropped
seedOpenCodeSQLiteSession helper) and re-adding a classify_vibe_test.go
that exists on no lower branch. Those stale tests asserted the legacy
direct-classification behavior and failed against the provider-routed
path. Restore the correct versions inherited from the codex branch, keep
this branch's two new Gemini provider classify tests, and drop the
spurious classify_vibe_test.go.

test(sync): restore gemini provider classify tests at gemini fold

Re-add the two Gemini changed-path classify tests
(TestEngine_ClassifyPathsProviderRemoveSkipsMissingGeminiSource and
TestEngine_ClassifyPathsProviderSidecarKeepsExistingGeminiSources) that
were dropped while restoring this branch's mis-merged engine_test.go to
its provider-aware shape.

fix(sync): skip fresh gemini copilot before hashing

Gemini and Copilot lost their legacy DB freshness gates when the provider-authoritative path took over. That made unchanged sessions reach provider fingerprinting and parsing during normal full syncs, which is unnecessary work and no longer matches the old processGemini/processCopilot behavior.\n\nRestore the cheap pre-fingerprint checks for those two agents: Gemini compares the stored file path size and mtime, while Copilot compares transcript size plus the workspace.yaml effective mtime. Force-parse paths still flow through the provider so sidecar-driven reparses and parse-diff are not suppressed.\n\nValidation: go test -tags "fts5" ./internal/sync -run 'TestProcessFileProviderAuthoritativeSkipsFresh(Gemini|Copilot)BeforeFingerprint|TestProcessCodexAppendedStaleProject(DoesFullReparse|CarriesForceReplace)' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go vet ./...; git diff --check

fix(sync): restore discover fields on shadowCallerProvider

The rebase onto origin/main dropped the discoverSources and discoverErr
fields from the shadowCallerProvider test struct while keeping the Discover
method that reads them, leaving this branch and every branch stacked above
it uncompilable. Restore the two fields so the Discover stub resolves.
@roborev-ci

roborev-ci Bot commented Jun 25, 2026

Copy link
Copy Markdown

roborev: Combined Review (80c61a9)

Code is not clean: two medium-severity sync correctness issues need attention before merge.

Medium

  • Location: internal/parser/types.go:147
    Problem: Gemini metadata changes rely on the new provider watch plan, but the production watcher is still driven by AgentDef.WatchSubdirs, and Gemini only watches tmp. Changes to projects.json or trustedFolders.json will not trigger SyncPaths in normal live use.
    Fix: Wire provider WatchPlan into watcher setup, or add a Gemini shallow watch root for the config directory alongside the recursive tmp watch.

  • Location: internal/sync/engine.go:4642
    Problem: Gemini fingerprints include projects.json and trustedFolders.json, but this fast skip checks only the session file size/mtime before computing that fingerprint. If metadata changes while agentsview is stopped or the watcher misses it, the next full sync skips the unchanged session file and leaves project names stale.
    Fix: Make Gemini freshness metadata-aware, or move Gemini skipping after a provider fingerprint comparison that includes the metadata files.


Panel: ci_default_security | Synthesis: codex, 8s | Members: codex_default (codex/default, done, 9m9s), codex_security (codex/security, done, 2m34s) | Total: 11m51s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant