Skip to content

feat(parser): migrate cortex provider#763

Draft
mariusvniekerk wants to merge 1 commit into
provider-workbuddyfrom
provider-cortex
Draft

feat(parser): migrate cortex provider#763
mariusvniekerk wants to merge 1 commit into
provider-workbuddyfrom
provider-cortex

Conversation

@mariusvniekerk

@mariusvniekerk mariusvniekerk commented Jun 19, 2026

Copy link
Copy Markdown
Collaborator

Cortex has a shallow metadata-file source shape, with optional companion history JSONL handled inside the existing parser. Moving it behind a concrete provider keeps source discovery and lookup explicit without adding a new source abstraction.

The provider preserves the legacy Cortex session-file predicate, backup/history companion exclusions, symlinked file behavior, deleted-path classification, source fingerprinting, and parse normalization for session names, cwd, and tool content.

Split-history companion files are modeled as part of the same provider source: watch plans include .history.jsonl files, companion changes classify back to the metadata .json source, and fingerprints include both files while keeping the persisted source key on the metadata path.

@roborev-ci

roborev-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

roborev: Combined Review (4223cec)

Medium issue found: Cortex split-history sidecar updates can be missed by provider watch/classification/fingerprinting.

Medium

  • internal/parser/cortex_provider.go:111 - The provider source set only recognizes *.json, but ParseCortexSession reads <uuid>.history.jsonl for split-history sessions. Provider-driven watch/classification/fingerprinting can miss sidecar-only changes, leaving updated messages stale. Map .history.jsonl events to the sibling .json source and include the sidecar in effective fingerprint/mtime/hash coverage when present.

Panel: ci_default_security | Synthesis: codex, 5s | Members: codex_default (codex/default, done, 3m10s), codex_security (codex/security, done, 1m13s) | Total: 4m28s

@roborev-ci

roborev-ci Bot commented Jun 19, 2026

Copy link
Copy Markdown

roborev: Combined Review (7c2915e)

Summary verdict: One medium correctness issue remains; no high or critical findings were reported.

Medium

  • internal/parser/cortex_provider.go:157: Fingerprint includes the .history.jsonl companion size and mtime, but Parse only copies the composite hash into sess.File. The parsed session keeps ParseCortexSession's metadata-only File.Size and File.Mtime, so provider-backed skip checks that persist Session.File will not store the same freshness data they compare against for split-history sessions.

    Fix: When a fingerprint is supplied, also copy nonzero req.Fingerprint.Size and req.Fingerprint.MTimeNS into sess.File.Size and sess.File.Mtime, with a test covering split-history provider parse metadata.


Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 3m38s), codex_security (codex/security, done, 43s) | Total: 4m28s

@roborev-ci

roborev-ci Bot commented Jun 20, 2026

Copy link
Copy Markdown

roborev: Combined Review (b052ddc)

No issues found.


Panel: ci_default_security | Synthesis: codex | Members: codex_default (claude-code/default, done, 5m52s), codex_security (claude-code/security, done, 1m52s) | Total: 7m44s

@roborev-ci

roborev-ci Bot commented Jun 21, 2026

Copy link
Copy Markdown

roborev: Combined Review (9278aed)

Medium issue found; no Critical or High findings.

Medium

  • internal/parser/provider_migration.go:46 - Cortex is only opted into ProviderMigrationShadowCompare, so the new history-aware provider fingerprint is not used by normal full/periodic sync. processCortex still skips based on the .json metadata mtime, so a .history.jsonl-only change missed by the live watcher can remain unsynced.

    Fix: Either make the legacy Cortex path use an effective mtime/hash that includes the history companion, or move Cortex to provider-authoritative once the provider is ready. Add a sync test for history-only companion changes outside changed-path classification.


Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 5m35s), codex_security (codex/security, done, 1m11s) | Total: 6m52s

@mariusvniekerk

mariusvniekerk commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator Author

This change is part of the following stack:

Change managed by git-spice.

@roborev-ci

roborev-ci Bot commented Jun 21, 2026

Copy link
Copy Markdown

roborev: Combined Review (86cdee8)

Medium issue found; no Critical or High findings.

Medium

  • internal/parser/provider_migration.go:46 - Cortex is only moved to ProviderMigrationShadowCompare, so the new composite fingerprint for .history.jsonl is not used by the authoritative legacy sync path. For an already-synced split-history session, a history-only write/delete can still be skipped because processCortex checks only the .json metadata stat via shouldSkipByPath; the shadow provider is then not comparable because the legacy result is skip.

    Fix: Make the legacy Cortex effective file info/skip check include the history companion, or make Cortex provider-authoritative with matching persisted file metadata. Add a changed-path test that mutates/removes only .history.jsonl after the .json row is current.


Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 5m26s), codex_security (codex/security, done, 1m4s) | Total: 6m36s

@roborev-ci

roborev-ci Bot commented Jun 24, 2026

Copy link
Copy Markdown

roborev: Combined Review (67dba5e)

Summary verdict: Cortex provider migration has two medium-risk regression issues; no security findings were reported.

Medium

  • Location: internal/parser/types.go:442
    Problem: Removing Cortex's DiscoverFunc/FindSourceFunc regresses command paths that still depend on those registry hooks. parse-diff now rejects or omits Cortex, and session usage/token-use cannot resolve and on-demand sync an unsynced Cortex session from disk because their disk probes skip agents with nil FindSourceFunc.
    Fix: Migrate those command paths to use the provider facade for provider-authoritative agents, or keep the legacy hooks until all callers are provider-aware.

  • Location: internal/sync/engine.go:3314
    Problem: SyncAllSince filters provider-discovered Cortex sources by the metadata .json mtime before the new provider fingerprint can include <uuid>.history.jsonl. A split-session history-only update with unchanged metadata can be filtered out and never parsed.
    Fix: Add a Cortex effective mtime/size path that includes the history companion for since filtering and stored FileInfo, or apply provider fingerprints before the cutoff filter.


Panel: ci_default_security | Synthesis: codex, 9s | Members: codex_default (codex/default, done, 5m34s), codex_security (codex/security, done, 1m37s) | Total: 7m20s

Cortex has a shallow metadata-file source shape, with optional companion history JSONL handled inside the existing parser. Moving it behind a concrete provider keeps source discovery and lookup explicit without adding a new source abstraction.

The provider preserves the legacy Cortex session-file predicate, backup/history companion exclusions, symlinked file behavior, deleted-path classification, source fingerprinting, and parse normalization for session names, cwd, and tool content.

Validation: go fmt ./...; go test -tags "fts5" ./internal/parser -run TestCortexProvider -count=1; go test -tags "fts5" ./internal/parser -count=1; go vet ./...; make test-short; git diff --check

fix(parser): include cortex history companions

Cortex split-history sessions parse messages from a sibling .history.jsonl file, so provider-backed live sync has to treat that companion as part of the same source. Otherwise a history-only append can be watched but never mapped back to the metadata session, or can keep the same freshness identity and skip reparsing.

This keeps the persisted source key on the .json metadata file while adding companion watch classification and a composite fingerprint over the metadata and history files when the companion exists.

Validation: go fmt ./...; go test -tags "fts5" ./internal/parser -run TestCortexProviderClassifiesAndFingerprintsHistoryCompanion -count=1; go test -tags "fts5" ./internal/parser -run TestCortexProvider -count=1; go test -tags "fts5" ./internal/parser -count=1; go vet ./...; make test-short; git diff --check; make nilaway

test(parser): opt cortex into provider shadow

Cortex now has a concrete facade provider on this branch, so its migration mode should enter shadow comparison instead of staying legacy-only and additive.

Lower provider opt-ins stay inherited and later provider branches own their own manifest changes.

Validation: go test -tags "fts5" ./internal/parser -run TestProviderMigrationModes -count=1; go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

test(sync): compare cortex shadow parity

Cortex is shadow-compared on this branch, so add source-level migration coverage that compares provider observation with ParseCortexSession.

The fixture uses Cortex's split metadata/history format so the test proves the provider path preserves companion-history parse behavior while still planning the primary session ID.

Validation: go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestObserveProviderSourceMatchesCortexLegacyParser|TestCortexProvider|TestParseCortex' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check; ./custom-gcl run --config .golangci.nilaway.yml ./internal/parser/... ./internal/sync/...

refactor(parser): fold cortex into provider

Move Cortex parse ownership onto the concrete provider and remove the package-level discover/find/parse entrypoints. Route Cortex sync classification and processing through provider changed-path handling so this branch migrates the provider instead of adding another shim.

fix(sync): include cortex companion mtimes in quick sync

Provider-authoritative Cortex discovery emits the metadata JSON as the source, but its freshness identity also includes the split .history.jsonl companion. SyncAllSince was still filtering on the metadata file mtime before provider fingerprinting, so history-only updates could be dropped during quick sync.\n\nUse provider fingerprint mtimes for provider-process discovered files before applying the since cutoff, falling back to the existing per-agent stat logic when the provider has no mtime. Cortex full parses now replace messages as well, because split history rewrites can change existing ordinals rather than only append.\n\nValidation: go test -tags "fts5" ./internal/sync -run TestSyncAllSinceCortexHistoryUpdateTriggersResync -count=1; go test -tags "fts5" ./internal/parser -run Cortex -count=1; go test -tags "fts5" ./internal/sync -run 'Cortex|TestClassifyOnePath_Cortex|TestSyncAllSinceCortexHistoryUpdateTriggersResync' -count=1; go test -tags "fts5" ./internal/sync -count=1; go fmt ./...; go vet ./...; git diff --check

fix(parser): thread ctx through cortex source lookups
@roborev-ci

roborev-ci Bot commented Jun 25, 2026

Copy link
Copy Markdown

roborev: Combined Review (dea0767)

Summary verdict: two medium issues remain; no high or critical findings were reported.

Medium

  • internal/parser/types.go:442
    Cortex now drops DiscoverFunc / FindSourceFunc, but some callers still only use those legacy hooks. session usage / token-use disk resolution may no longer recognize unsynced Cortex sessions, and SSH remote resolve may skip Cortex dirs entirely.
    Fix: Update those callers to use provider discovery/lookup for provider-authoritative agents, or keep compatibility hooks until all consumers are migrated.

  • internal/parser/cortex_provider.go:165
    The Cortex fingerprint includes .history.jsonl size/mtime/hash, but Parse only copies the hash into the session. History-only updates can sync new messages while persisting the old metadata .json file_mtime / size, leaving modified-range queries and session watcher fallback stale.
    Fix: Copy fingerprint size/mtime into sess.File when present, and make Cortex SourceMtime use the same companion-aware mtime.


Panel: ci_default_security | Synthesis: codex, 8s | Members: codex_default (codex/default, done, 6m34s), codex_security (codex/security, done, 1m39s) | Total: 8m21s

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

1 participant