
feat(parser): migrate copilot ide providers#778

Draft
mariusvniekerk wants to merge 1 commit into provider-gemini-copilot from provider-copilot-ides


@mariusvniekerk

Collaborator

VS Code Copilot and Visual Studio Copilot now have concrete parser providers. The VS Code provider owns workspaceStorage and globalStorage chat discovery, .jsonl-over-.json source selection, lookup, watch classification, hashing, project hints, and parse output through the existing VS Code parser. The Visual Studio provider owns top-level trace discovery, virtual per-conversation source paths, physical trace fan-out for watcher events, strict sibling-aware trace fingerprints, lookup, hashing, and force-replace parse output through the existing Visual Studio trace parser.

This keeps both Copilot IDE formats on the shared provider interface while preserving their existing parser normalization and freshness behavior.
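
The .jsonl-over-.json source selection mentioned above can be sketched roughly as follows. This is a minimal illustration, not the provider's code: `preferJSONL` is a hypothetical helper, and it assumes that two files belong to the same session when they share a path stem.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// preferJSONL is a hypothetical helper. Given candidate chat files, it keeps
// one path per session stem and prefers a .jsonl file over its .json sibling.
func preferJSONL(paths []string) []string {
	chosen := map[string]string{} // path without extension -> selected path
	for _, p := range paths {
		ext := filepath.Ext(p)
		if ext != ".json" && ext != ".jsonl" {
			continue // not a chat transcript candidate
		}
		stem := strings.TrimSuffix(p, ext)
		cur, seen := chosen[stem]
		if !seen || (ext == ".jsonl" && filepath.Ext(cur) == ".json") {
			chosen[stem] = p
		}
	}
	out := make([]string, 0, len(chosen))
	for _, p := range chosen {
		out = append(out, p)
	}
	sort.Strings(out) // deterministic order for callers and tests
	return out
}

func main() {
	fmt.Println(preferJSONL([]string{
		"chatSessions/a.json",
		"chatSessions/a.jsonl",
		"chatSessions/b.json",
	}))
}
```

The selection does not depend on input order: a .json file seen after its .jsonl sibling never replaces it.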

@roborev-ci

roborev-ci Bot commented Jun 19, 2026


roborev: Combined Review (8621003)

Medium severity issue remains before merge.

Medium

  • internal/parser/vscode_copilot_provider.go:107 - The provider advertises aggregate usage events, and ParseVSCodeCopilotSession populates sess.UsageEvents, but the returned ParseResult leaves UsageEvents empty. Provider-based sync paths that persist ParseResult.UsageEvents will drop VS Code Copilot usage/cost rows.
    • Fix: Set UsageEvents: sess.UsageEvents in the returned ParseResult and add a provider test with token metadata.
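
The suggested fix can be illustrated with a small sketch. The types below are simplified, hypothetical stand-ins for the parser types named in the finding (the real `ParseResult` and session types carry more fields), and `buildResult` is an illustrative name rather than a function from the PR.

```go
package main

import "fmt"

// Simplified, hypothetical stand-ins for the types named in the review.
type UsageEvent struct {
	Model        string
	InputTokens  int
	OutputTokens int
}

type ParsedSession struct {
	ID          string
	UsageEvents []UsageEvent
}

type ParseResult struct {
	Session     ParsedSession
	UsageEvents []UsageEvent
}

// buildResult forwards the usage events collected during parsing, so a sync
// path that persists ParseResult.UsageEvents does not write an empty set.
func buildResult(sess ParsedSession) ParseResult {
	return ParseResult{
		Session:     sess,
		UsageEvents: sess.UsageEvents, // the field the review found missing
	}
}

func main() {
	sess := ParsedSession{
		ID:          "s1",
		UsageEvents: []UsageEvent{{Model: "example-model", InputTokens: 10, OutputTokens: 4}},
	}
	fmt.Println(len(buildResult(sess).UsageEvents))
}
```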

Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 6m11s), codex_security (codex/security, done, 2m19s) | Total: 8m36s

@roborev-ci

roborev-ci Bot commented Jun 20, 2026


roborev: Review Unavailable (a20c95d)

The review agent repeatedly failed to run (likely an agent or configuration error). roborev will try again on the next commit.

Last error: agent: claude-code failed stream: stream errors: You've hit your session limit · resets 5:50am (UTC): exit status 1

mariusvniekerk force-pushed the provider-gemini-copilot branch from ad6c624 to 9c2ea96 on June 21, 2026 00:41
mariusvniekerk force-pushed the provider-copilot-ides branch from a20c95d to febe74c on June 21, 2026 00:41
@roborev-ci

roborev-ci Bot commented Jun 21, 2026


roborev: Combined Review (febe74c)

Medium issues remain in the VSCode Copilot provider; no high or critical findings were reported.

Medium

  • internal/parser/vscode_copilot_provider.go:107
    The VSCode Copilot provider drops aggregate usage events by returning ParseResult without UsageEvents, even though ParseVSCodeCopilotSession populates sess.UsageEvents and the legacy sync path writes them.
    Fix: Set UsageEvents: sess.UsageEvents in the provider ParseResult and add a fixture with token metadata to the shadow parity test.

  • internal/parser/vscode_copilot_provider.go:102
    The provider computes a composite fingerprint that includes workspace.json, but only copies the hash onto the parsed session. File.Size and File.Mtime remain the raw chat file values, diverging from the updated legacy path and breaking workspace-metadata freshness parity.
    Fix: Copy req.Fingerprint.Size and req.Fingerprint.MTimeNS into sess.File when present, matching processVSCodeCopilot.
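
A rough sketch of the freshness fix follows. The field names come from the review text, but the struct shapes are hypothetical, and modelling "when present" as a non-nil pointer is an assumption made only for this example.

```go
package main

import "fmt"

// Hypothetical, simplified shapes; only the field names follow the review.
type Fingerprint struct {
	Hash    string
	Size    int64
	MTimeNS int64
}

type FileInfo struct {
	Hash  string
	Size  int64
	Mtime int64
}

// applyFingerprint replaces the raw chat-file values with the composite
// fingerprint when one is present, so hash, size, and mtime describe the
// same thing. Copying only the hash leaves the other two fields stale.
func applyFingerprint(file FileInfo, fp *Fingerprint) FileInfo {
	if fp == nil {
		return file // no provider fingerprint: keep the raw values
	}
	file.Hash = fp.Hash
	file.Size = fp.Size
	file.Mtime = fp.MTimeNS
	return file
}

func main() {
	raw := FileInfo{Hash: "chat-only", Size: 100, Mtime: 1}
	fp := &Fingerprint{Hash: "chat+workspace", Size: 140, MTimeNS: 2}
	fmt.Printf("%+v\n", applyFingerprint(raw, fp))
}
```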


Panel: ci_default_security | Synthesis: codex, 8s | Members: codex_default (codex/default, done, 6m7s), codex_security (codex/security, done, 2m25s) | Total: 8m40s

mariusvniekerk force-pushed the provider-gemini-copilot branch from 9c2ea96 to 4a1ce3f on June 21, 2026 01:36
mariusvniekerk force-pushed the provider-copilot-ides branch from febe74c to 756b2c9 on June 21, 2026 01:36
@mariusvniekerk

mariusvniekerk commented Jun 21, 2026

Collaborator Author

This change is part of the following stack:

Change managed by git-spice.

@roborev-ci

roborev-ci Bot commented Jun 21, 2026


roborev: Combined Review (1476bd0)

Provider migration has two medium parity regressions to fix; no critical or high issues were reported.

Medium

  • internal/parser/vscode_copilot_provider.go:102: Parse only copies the composite fingerprint hash back onto the session, leaving File.Size and File.Mtime as the raw chat file values from ParseVSCodeCopilotSession. Since the fingerprint now includes workspace.json, shadow comparisons will differ from the updated legacy path and authoritative provider mode would store stale freshness metadata.

    • Fix: Copy req.Fingerprint.Size and req.Fingerprint.MTimeNS onto sess.File when the fingerprint is present.
  • internal/parser/vscode_copilot_provider.go:107: The provider drops VSCode Copilot usage events by returning only Session and Messages. The legacy sync path forwards sess.UsageEvents, so provider mode would stop writing usage-event rows for sessions with token metadata.

    • Fix: Set UsageEvents: sess.UsageEvents in the returned ParseResult and add a provider parity fixture that includes result.metadata token data.

Panel: ci_default_security | Synthesis: codex, 8s | Members: codex_default (codex/default, done, 6m9s), codex_security (codex/security, done, 2m24s) | Total: 8m41s

mariusvniekerk force-pushed the provider-gemini-copilot branch from 1f46f8a to c2c6845 on June 23, 2026 23:56
mariusvniekerk force-pushed the provider-copilot-ides branch from 1476bd0 to 2b66725 on June 23, 2026 23:56
@roborev-ci

roborev-ci Bot commented Jun 24, 2026


roborev: Combined Review (2b66725)

High-confidence regression found: VS Code Copilot usage events are dropped, and Visual Studio Copilot project metadata can be overwritten on resync.

High

  • internal/parser/vscode_copilot_provider.go:108
    The provider drops VS Code Copilot aggregate usage events. parseSession still populates sess.UsageEvents, but the new ParseResult only carries Session and Messages; the sync write path persists pr.UsageEvents, so the next sync replaces existing usage events with an empty set.
    Fix: Populate UsageEvents: sess.UsageEvents in the provider ParseResult and add a provider-level test that verifies usage events survive through the new provider path.

Medium

  • internal/parser/visualstudio_copilot_provider.go:108
    Visual Studio Copilot parsing now hardcodes the project to "visualstudio", ignoring req.Source.ProjectHint. SyncSingleSession still preserves the stored project on file.Project, and processProviderFile copies it into Source.ProjectHint, but this provider discards it and will overwrite preserved project metadata on resync.
    Fix: Use firstNonEmptyJSONLString(req.Source.ProjectHint, "visualstudio") and pass that value to parseConversation.
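
The project-hint fallback could look roughly like this. `firstNonEmpty` is a stand-in for the `firstNonEmptyJSONLString` helper named in the finding; its real signature and whitespace handling are not shown in this thread, so both are assumptions here, as is the `resolveProject` wrapper.

```go
package main

import (
	"fmt"
	"strings"
)

// firstNonEmpty is a stand-in for firstNonEmptyJSONLString. Treating
// whitespace-only values as empty is an assumption of this sketch.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if strings.TrimSpace(v) != "" {
			return v
		}
	}
	return ""
}

// resolveProject prefers the stored project carried in the source hint and
// falls back to the provider default only when no hint is available.
func resolveProject(projectHint string) string {
	return firstNonEmpty(projectHint, "visualstudio")
}

func main() {
	fmt.Println(resolveProject("my-solution")) // hint wins on resync
	fmt.Println(resolveProject(""))            // default on first discovery
}
```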

Panel: ci_default_security | Synthesis: codex, 11s | Members: codex_default (codex/default, done, 5m9s), codex_security (codex/security, done, 2m41s) | Total: 8m1s

VS Code Copilot and Visual Studio Copilot both needed concrete providers because their source identity is richer than a plain parser callback. VS Code needs workspace and global chat discovery with .jsonl preference, while Visual Studio needs virtual per-conversation trace sources with sibling-aware freshness.

The providers preserve raw and full ID lookup, watch classification, source hashing, VS Code project hints, Visual Studio physical trace fan-out, strict composite trace fingerprints, force-replace parse semantics, and parser output normalization.

fix(parser): classify copilot ide source changes

The Copilot IDE providers advertised changed-path classification, but the initial migration only accepted source paths that still existed. That dropped deletion and metadata-only events before the sync layer could make a refresh or removal decision.

Classify syntactically valid removed VS Code chat files and Visual Studio trace files, fan workspace.json changes out to current workspace chat sessions, and cover Visual Studio physical trace fan-out with multiple conversations.
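
One way to classify a changed path without requiring it to still exist is to decide from the path shape alone. The sketch below is illustrative only: the `chatSessions` directory name and the `changeKind` values are assumptions for the example, not identifiers taken from this change.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

type changeKind int

const (
	ignore changeKind = iota
	chatSource
	workspaceManifest
)

// classifyChangedPath looks only at the shape of the path and never stats
// the file, so delete events for valid source paths still classify.
func classifyChangedPath(p string) changeKind {
	p = path.Clean(strings.ReplaceAll(p, "\\", "/"))
	base := path.Base(p)
	if base == "workspace.json" && strings.Contains(p, "/workspaceStorage/") {
		return workspaceManifest // caller fans out to that workspace's chats
	}
	ext := path.Ext(base)
	if (ext == ".json" || ext == ".jsonl") && path.Base(path.Dir(p)) == "chatSessions" {
		return chatSource
	}
	return ignore
}

func main() {
	fmt.Println(classifyChangedPath("/u/workspaceStorage/abc/chatSessions/s1.jsonl") == chatSource)
	fmt.Println(classifyChangedPath("/u/workspaceStorage/abc/workspace.json") == workspaceManifest)
	fmt.Println(classifyChangedPath("/tmp/notes.txt") == ignore)
}
```

Whether the session is then refreshed or removed stays a sync-layer decision; the classifier only reports that the path is relevant.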

fix(parser): include vscode workspace metadata freshness

VS Code Copilot project names come from workspace.json, so classifying manifest writes is not enough if the source fingerprint still only reflects the chat transcript. An unchanged chat file could skip the parse that refreshes Session.Project.

Fold workspace.json size, mtime, and content hash into workspace chat fingerprints while leaving global chat fingerprints unchanged, and cover metadata-only freshness in the provider tests.
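
A composite fingerprint of this kind might be built as in the sketch below. How the provider actually combines the fields is not shown in this thread; summing sizes, taking the newer mtime, and hashing the two content hashes together is one illustrative choice, and the `Fingerprint` struct is a simplified stand-in.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Fingerprint is a simplified stand-in for the provider fingerprint.
type Fingerprint struct {
	Hash    string
	Size    int64
	MTimeNS int64
}

// composite folds an optional workspace.json fingerprint into the chat
// fingerprint. A nil manifest models a global chat, which keeps the plain
// transcript fingerprint.
func composite(chat Fingerprint, manifest *Fingerprint) Fingerprint {
	if manifest == nil {
		return chat
	}
	sum := sha256.Sum256([]byte(chat.Hash + "\x00" + manifest.Hash))
	out := Fingerprint{
		Hash:    hex.EncodeToString(sum[:]),
		Size:    chat.Size + manifest.Size,
		MTimeNS: chat.MTimeNS,
	}
	if manifest.MTimeNS > out.MTimeNS {
		out.MTimeNS = manifest.MTimeNS // a newer manifest marks the source fresh
	}
	return out
}

func main() {
	chat := Fingerprint{Hash: "chat", Size: 10, MTimeNS: 100}
	before := composite(chat, &Fingerprint{Hash: "name-a", Size: 5, MTimeNS: 100})
	after := composite(chat, &Fingerprint{Hash: "name-b", Size: 5, MTimeNS: 200})
	fmt.Println(before.Hash != after.Hash) // a manifest-only edit changes the fingerprint
}
```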

fix(sync): refresh vscode copilot workspace metadata

VS Code Copilot was provider-aware for workspace.json freshness, but this stack still runs legacy sync writes. Without mirroring that freshness in the legacy process path, metadata-only workspace renames could be classified but then skipped against the unchanged chat transcript.

Move the Copilot IDE providers into shadow compare on their migration branch, preserve .jsonl priority during provider changed-path classification, and store composite workspace freshness for VS Code Copilot sessions while both shapes run.

Validation: go test -tags "fts5" ./internal/sync -run 'TestSyncPathsVSCodeCopilot(JSONLPriority|WorkspaceMetadataRefreshesProject)' -count=1; go test -tags "fts5" ./internal/parser -run 'Test(VSCodeCopilotProvider|VisualStudioCopilotProvider|ProviderMigrationModes)' -count=1; go test -tags "fts5" ./internal/sync -count=1; go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

test(sync): compare copilot ide shadow parity

VS Code Copilot and Visual Studio Copilot are already opted into shadow comparison on this branch, but provider method tests alone do not prove the migration path still matches the legacy parser output consumed by sync.

Cover the workspace-backed VS Code JSONL source and Visual Studio virtual trace source through ObserveProviderSource so reviewers can see provider observation, data-version planning, and legacy parser parity in one place.

Validation: go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestObserveProviderSourceMatches(VSCodeCopilot|VisualStudioCopilot)LegacyParser|TestCopilotIDEProvider|Test(VSCodeCopilotProvider|VisualStudioCopilotProvider)' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; ./custom-gcl run --config .golangci.nilaway.yml ./internal/parser/... ./internal/sync/...; git diff --check

refactor(parser): fold copilot IDE providers

Move VSCode Copilot and Visual Studio Copilot source discovery, lookup, and
parse ownership onto their concrete providers and delete the seven legacy
package-level free functions: DiscoverVSCodeCopilotSessions,
FindVSCodeCopilotSourceFile, ParseVSCodeCopilotSession,
DiscoverVisualStudioCopilotSessions, FindVisualStudioCopilotSourceFile,
ParseVisualStudioCopilotConversation, and ParseVisualStudioCopilotVirtualPath.

VSCode Copilot: discoverSessionFiles and findSourceFile become source-set
helpers, parseSession becomes a provider method, and the shared
discoverVSCodeSessionFiles helper stays in discovery.go.

Visual Studio Copilot: discoverSessionFiles and findSourceFile become
source-set helpers (over the retained findVisualStudioCopilotTraceSourceFile
and discoverVisualStudioCopilotSessionFiles helpers), and parseConversation
becomes a provider method. The virtual-path resolution is reproduced on the
provider via the provider-neutral ParseVirtualSourcePath helper plus the
trace-file and conversation-ID predicates (splitVisualStudioCopilotVirtualPath),
replacing the deleted ParseVisualStudioCopilotVirtualPath. External callers
(session export, direct service, parsediff, engine skip-path checks) use the
new exported SplitVisualStudioCopilotVirtualPath, which wraps the same neutral
splitter. The provider's discovery now surfaces an unreadable physical trace
file as a source so the read failure is reported instead of being dropped.

Make both providers provider-authoritative and drop their legacy sync dispatch:
the classifyOnePath VSCode block, classifyVisualStudioCopilotPath and its call,
the processFile case arms, processVSCodeCopilot and its vscodeCopilot* helpers,
processVisualStudioCopilot, the vscodeJSONLSiblingExists helper, and the
now-dead legacy-preamble references to these agents.

Drop the AgentDef DiscoverFunc/FindSourceFunc hooks for both, remove both
provider files from the pending shim scan list, and replace the shadow-baseline
test with provider API coverage plus a guard asserting the legacy entrypoints
stay gone. Re-home the shared writeProviderShadowSourceFile test helper into
provider_shadow_test.go so the sync test package builds.

fix(parser): preserve copilot provider metadata

Provider-authoritative Copilot sync consumes ParseResult side channels, not only fields stored on ParsedSession. VS Code Copilot was parsing aggregate token usage but returning an empty ParseResult.UsageEvents slice, so a provider resync could erase usage rows.

Visual Studio Copilot single-session resyncs carry the stored project through Source.ProjectHint. Honoring that hint prevents the provider default from overwriting preserved project metadata, while VS Code now also carries the composite fingerprint size and mtime alongside the hash.

Validation: go test -tags "fts5" ./internal/parser -run 'Test(VSCodeCopilotProviderSourceMethods|VisualStudioCopilotProviderSourceMethods)' -count=1; go test -tags "fts5" ./internal/sync -run 'TestSyncPathsVSCodeCopilotPersistsUsageEvents|TestSyncSingleSessionContextVisualStudioCopilotPreservesProject' -count=1; go test -tags "fts5" ./internal/parser -run 'Test.*Copilot.*Provider|TestParseVSCodeCopilotSession_TokenUsage|TestParseVisualStudioCopilot' -count=1; go test -tags "fts5" ./internal/sync -run 'Test.*(VSCodeCopilot|VisualStudioCopilot).*' -count=1; go vet ./...; git diff --check

test(parser): guard visual studio copilot session fold

The Copilot IDE fold deleted ParseVisualStudioCopilotSession along with the other Visual Studio Copilot legacy entrypoints, but the regression guard did not name that symbol. Adding it prevents a future shim from reappearing unnoticed.

Validation: go test -tags "fts5" ./internal/parser -run 'TestCopilotIDEProvidersOwnLegacyEntrypoints|Test(VSCodeCopilotProviderSourceMethods|VisualStudioCopilotProviderSourceMethods)' -count=1; git diff --check
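
A guard of this kind can be approximated by scanning source for banned package-level function names. The sketch below uses the standard `go/parser` and `go/ast` packages; it is not the test from this change, and `findLegacyFuncs` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// findLegacyFuncs parses Go source and returns every package-level function
// whose name is in the banned set. Methods are skipped, because the fold
// moved this logic onto provider methods.
func findLegacyFuncs(src string, banned map[string]bool) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	var found []string
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if ok && fn.Recv == nil && banned[fn.Name.Name] {
			found = append(found, fn.Name.Name)
		}
	}
	return found, nil
}

func main() {
	src := "package parser\n" +
		"type provider struct{}\n" +
		"func (p provider) parseSession() {}\n" +
		"func ParseVisualStudioCopilotSession() {}\n"
	banned := map[string]bool{"ParseVisualStudioCopilotSession": true}
	found, err := findLegacyFuncs(src, banned)
	fmt.Println(found, err)
}
```

A real guard would walk the package's files and fail the test when the returned slice is non-empty.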
mariusvniekerk force-pushed the provider-gemini-copilot branch from c2c6845 to 80c61a9 on June 25, 2026 05:48
mariusvniekerk force-pushed the provider-copilot-ides branch from 2b66725 to 6349447 on June 25, 2026 05:48
@roborev-ci

roborev-ci Bot commented Jun 25, 2026


roborev: Combined Review (6349447)

Summary verdict: Changes need fixes before merge due to one failing test and one sync performance regression.

High

  • internal/parser/provider_shim_scan_test.go:54
    copilot_provider.go and gemini_provider.go are added to pendingShimProviderFiles, but the test requires pending files to still reference legacy entrypoints. These providers do not, so the test will fail.
    Fix: Remove those two entries from pendingShimProviderFiles.

Medium

  • internal/parser/provider_migration.go:33
    Moving VS Code and Visual Studio Copilot to provider-authoritative routing drops their previous DB freshness skip, so unchanged sessions are reparsed and rewritten on every full sync.
    Fix: Add provider DB-fingerprint skip support for these agents using their provider fingerprints before enabling provider-authoritative processing.
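
The requested freshness skip amounts to comparing the stored fingerprint with the current provider fingerprint before parsing. The sketch below assumes a simple in-memory map in place of the database lookup; `shouldSkip` and the `Fingerprint` shape are hypothetical.

```go
package main

import "fmt"

// Fingerprint is a simplified stand-in for the provider fingerprint.
type Fingerprint struct {
	Hash    string
	Size    int64
	MTimeNS int64
}

// shouldSkip reports whether a source can skip reparse and rewrite: a
// fingerprint was stored for it and it equals the current one. The map
// stands in for a database lookup keyed by source path.
func shouldSkip(stored map[string]Fingerprint, sourcePath string, current Fingerprint) bool {
	prev, ok := stored[sourcePath]
	return ok && prev == current
}

func main() {
	stored := map[string]Fingerprint{
		"chat/s1.jsonl": {Hash: "abc", Size: 10, MTimeNS: 100},
	}
	unchanged := Fingerprint{Hash: "abc", Size: 10, MTimeNS: 100}
	edited := Fingerprint{Hash: "def", Size: 12, MTimeNS: 200}
	fmt.Println(shouldSkip(stored, "chat/s1.jsonl", unchanged))
	fmt.Println(shouldSkip(stored, "chat/s1.jsonl", edited))
	fmt.Println(shouldSkip(stored, "chat/new.jsonl", unchanged))
}
```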

Panel: ci_default_security | Synthesis: codex, 9s | Members: codex_default (codex/default, done, 9m43s), codex_security (codex/security, done, 1m52s) | Total: 11m44s
