feat(parser): migrate claude provider by mariusvniekerk · Pull Request #772 · kenn-io/agentsview

mariusvniekerk · 2026-06-19T21:28:03Z

Claude now uses a concrete provider for regular project transcripts and nested subagent transcripts. The provider keeps recursive project discovery, symlinked project directories, standard and subagent lookup, changed-path classification, content hashing, project normalization, excluded-session reporting, and relationship inference.

The provider also exposes Claude's existing incremental append parser as an optional provider capability so linear JSONL growth can continue to avoid full reparses while full-parse fallback remains available for DAG or row-rewrite cases.

roborev-ci · 2026-06-19T21:33:32Z

roborev: Combined Review (`2a56766`)

The PR has two medium correctness issues in the Claude provider; no high or critical security findings were reported.

Medium

internal/parser/claude_provider.go:125
ParseIncremental reports IncrementalNoNewData when the current source size is smaller than the stored offset. A truncated or atomically replaced file is not an append-only no-op, so this can leave stale messages and file metadata instead of forcing a full parse.
Fix: Return IncrementalNeedsFullParse when Fingerprint.Size < Offset, and reserve IncrementalNoNewData for the equal-size unchanged case.
internal/parser/claude_provider.go:330
Changed-path classification rejects deleted or renamed Claude source paths because sourceRef requires IsRegularFile(path). Other JSONL providers classify missing remove/rename paths by shape so the watcher can still route the event; this provider silently drops those events despite advertising ClassifyChangedPath.
Fix: Add missing-path classification for remove/rename events that validates the path under the root and Claude source shape without requiring the file to still exist, plus a regression test.
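
The first finding comes down to a three-way comparison between the stored offset and the current file size. A minimal sketch of that decision follows. `IncrementalNoNewData` and `IncrementalNeedsFullParse` are the status names quoted in the review; `IncrementalAppended`, the `IncrementalStatus` type, and `classifyGrowth` are hypothetical names used only for illustration and are not taken from the repository.

```go
package main

import "fmt"

// IncrementalStatus is an illustrative stand-in for the provider's status type.
type IncrementalStatus int

const (
	IncrementalNoNewData      IncrementalStatus = iota // size equals stored offset
	IncrementalAppended                                // hypothetical name: bytes were appended
	IncrementalNeedsFullParse                          // file shrank, so append-only parsing is unsafe
)

func (s IncrementalStatus) String() string {
	switch s {
	case IncrementalNoNewData:
		return "IncrementalNoNewData"
	case IncrementalAppended:
		return "IncrementalAppended"
	case IncrementalNeedsFullParse:
		return "IncrementalNeedsFullParse"
	}
	return "unknown"
}

// classifyGrowth compares the current file size with the offset recorded
// after the previous parse. A smaller size means truncation or replacement,
// which must not be reported as "no new data".
func classifyGrowth(size, offset int64) IncrementalStatus {
	switch {
	case size < offset:
		return IncrementalNeedsFullParse
	case size == offset:
		return IncrementalNoNewData
	default:
		return IncrementalAppended
	}
}

func main() {
	cases := []struct{ size, offset int64 }{{100, 100}, {140, 100}, {60, 100}, {0, 100}}
	for _, c := range cases {
		fmt.Printf("size=%d offset=%d -> %s\n", c.size, c.offset, classifyGrowth(c.size, c.offset))
	}
}
```

A size comparison alone cannot detect a same-size rewrite; that case needs a file identity or content check in addition to this gate.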

Panel: ci_default_security | Synthesis: codex, 8s | Members: codex_default (codex/default, done, 4m55s), codex_security (codex/security, done, 2m6s) | Total: 7m9s

roborev-ci · 2026-06-20T02:06:32Z

roborev: Combined Review (`fc3595c`)

I've reviewed the Claude provider migration commit. This commit wires the existing Claude session parser into the Provider interface (factory, source discovery, fingerprinting, parse/incremental-parse), replacing the legacy adapter. The actual parsing, discovery, and lookup logic (ParseClaudeSessionWithExclusions, ParseClaudeSessionFrom, DiscoverClaudeProjects, FindClaudeSourceFile) lives in unchanged existing code.

I focused on the security-relevant surface — the path-handling and source-resolution paths, since FindSource/SourcesForChangedPath could in principle receive externally-influenced inputs (RawSessionID, stored paths, watcher paths):

RawSessionID → FindClaudeSourceFile: The only lookup that builds a filesystem path from a session-ID-shaped input is gated by IsValidSessionID, which restricts to [a-zA-Z0-9_-] (no ., /, or separators), so path traversal via the session ID is not reachable.
StoredFilePath/FingerprintKey/watcher Path → sourceForPath → sourceRef: Every candidate is run through claudeProjectHintFromPath, which uses filepath.Rel and explicitly rejects .. escapes, plus enforces a strict <project>/<session>.jsonl or <project>/.../subagents/.../agent-*.jsonl shape and an IsRegularFile check before producing a SourceRef. Paths that escape the configured root are dropped.
Symlink following (covered by TestClaudeProviderDiscoversSymlinkedProjectDirectory) and any TOCTOU between stat/hash/parse operate only on user-owned files under the configured roots — excluded by the project threat model (local same-user access, user-owned ~/.agentsview data).
The ref.Opaque.(claudeSource) unchecked type assertion at claude_provider.go:315 is reachable only with values built by sourceRef, which always sets Opaque: claudeSource{...}, so it cannot panic in practice and has no security impact.
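
The root-containment and shape checks described above can be sketched as follows. This is not the repository's `claudeProjectHintFromPath`; `projectHint` is a hypothetical helper that assumes Unix-style absolute paths and reproduces only the two shapes named in the review (a session file directly under a project, or an `agent-*.jsonl` file below a `subagents` directory). It deliberately performs no stat call, so it would also classify a path whose file has already been removed.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// projectHint returns the project directory name when path lies under root
// and matches one of the two accepted source shapes. It never touches the
// filesystem, so it works for deleted or renamed files too.
func projectHint(root, path string) (string, bool) {
	rel, err := filepath.Rel(root, path)
	if err != nil {
		return "", false
	}
	rel = filepath.ToSlash(rel)
	// Reject anything that climbs out of the configured root.
	if rel == ".." || strings.HasPrefix(rel, "../") {
		return "", false
	}
	parts := strings.Split(rel, "/")
	if len(parts) < 2 {
		return "", false // a file directly in root has no project
	}
	name := parts[len(parts)-1]
	if !strings.HasSuffix(name, ".jsonl") || len(name) == len(".jsonl") {
		return "", false
	}
	if len(parts) == 2 {
		return parts[0], true // project/session.jsonl
	}
	// Nested files must be agent-*.jsonl somewhere below a subagents directory.
	if !strings.HasPrefix(name, "agent-") {
		return "", false
	}
	for _, seg := range parts[1 : len(parts)-1] {
		if seg == "subagents" {
			return parts[0], true
		}
	}
	return "", false
}

func main() {
	root := "/home/u/.claude/projects"
	for _, p := range []string{
		root + "/proj/abc.jsonl",
		root + "/proj/abc/subagents/agent-1.jsonl",
		root + "/proj/../../etc/passwd.jsonl",
		root + "/proj/nested/abc.jsonl",
	} {
		hint, ok := projectHint(root, p)
		fmt.Printf("%s -> %q %v\n", p, hint, ok)
	}
}
```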

No SQL, command execution, HTML rendering, network calls, secret handling, or auth-boundary logic is introduced or modified. Input validation on the externally-influenceable path is present and adequate, and the behavior mirrors the legacy adapter it replaces.

No issues found.

Panel: ci_default_security | Synthesis: claude-code | Members: codex_default (claude-code/default, failed, 1s), codex_security (claude-code/security, done, 1m12s) | Total: 1m13s

roborev-ci · 2026-06-21T01:14:07Z

roborev: Combined Review (`59b1626`)

Medium confidence: one Medium issue should be fixed before merge; no Critical or High findings were reported.

Medium

internal/parser/claude_provider.go:298 - The new Claude provider fingerprint/parse path does not preserve file inode/device, while the legacy Claude sync path sets Session.File.Inode and Session.File.Device. With Claude in shadow-compare mode, Unix shadow comparisons can report mismatches for otherwise identical parses, and the provider result is not metadata-parity with legacy. Populate inode/device from the stat result and copy them onto each parsed session, matching processClaude.

Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 8m37s), codex_security (codex/security, done, 2m12s) | Total: 10m55s

mariusvniekerk · 2026-06-21T01:36:59Z

This change is part of the following stack:

Design parser provider facade layer #748
- Add parser provider facade core #751
  - Add JSONL source set helper #752
    - feat(parser): add directory JSONL source helper #756
      - feat(parser): migrate commandcode and iflow providers #757
        Migrate gptme to parser provider facade #753
        feat(parser): migrate deepseek tui provider #758
        feat(parser): migrate amp and zencoder providers #759
        feat(parser): migrate pi provider #760
        feat(parser): migrate qwen provider #761
        feat(parser): migrate workbuddy provider #762
        feat(parser): migrate cortex provider #763
        feat(parser): migrate kimi provider #764
        feat(parser): migrate claw providers #766
        feat(parser): migrate qwenpaw provider #767
        feat(parser): migrate openhands provider #768
        feat(parser): migrate cursor provider #769
        feat(parser): migrate vibe provider #770
        feat(parser): migrate hermes provider #771
        feat(parser): migrate claude provider #772 ◀
        feat(parser): migrate cowork provider #773
        feat(parser): migrate opencode-family providers #774
        feat(parser): migrate codex provider #775
        feat(parser): migrate gemini copilot providers #776
        feat(parser): migrate copilot ide providers #778
        feat(parser): migrate positron provider #779
        feat(parser): migrate zed shelley providers #780
        feat(parser): migrate kiro providers #781
        feat(parser): migrate antigravity providers #782
        feat(parser): migrate db-backed providers #783
        fix(parser): require explicit provider factories #784

Change managed by git-spice.

roborev-ci · 2026-06-21T02:08:05Z

roborev: Combined Review (`ea93a29`)

Medium-risk issue found: Claude provider shadow comparison can falsely report mismatches because file identity fields are not populated.

Medium

internal/parser/claude_provider.go:93
The Claude provider copies the hash onto parsed sessions but never populates file inode/device, while the legacy Claude sync path does before shadow comparison. On Unix this makes otherwise matching Claude shadow parses report session mismatches, undermining the new shadow-compare mode.
Fix: Populate inode/device in the provider fingerprint and copy them onto each parsed session, or enrich provider results in sync before comparison/writes.
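
For readers unfamiliar with the file identity fields in question, here is a Unix-only sketch of reading inode and device from a stat result and using them to tell an in-place rewrite from an atomic replacement. The `fileIdentity` type and both helper names are hypothetical and do not come from the repository; the type assertion on `syscall.Stat_t` does not compile on Windows.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// fileIdentity is an illustrative pair of the fields the review asks for.
type fileIdentity struct {
	Inode  uint64
	Device uint64
}

// identityOf reads inode and device from the platform stat data.
func identityOf(path string) (fileIdentity, error) {
	info, err := os.Stat(path)
	if err != nil {
		return fileIdentity{}, err
	}
	st, ok := info.Sys().(*syscall.Stat_t)
	if !ok {
		return fileIdentity{}, fmt.Errorf("no Unix stat data for %s", path)
	}
	return fileIdentity{Inode: uint64(st.Ino), Device: uint64(st.Dev)}, nil
}

// demoIdentity reports whether the identity stayed the same after an
// in-place rewrite and after a write-temp-then-rename replacement.
func demoIdentity() (bool, bool, error) {
	dir, err := os.MkdirTemp("", "identity-demo")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "session.jsonl")
	if err := os.WriteFile(path, []byte("{}\n"), 0o600); err != nil {
		return false, false, err
	}
	before, err := identityOf(path)
	if err != nil {
		return false, false, err
	}

	// In-place rewrite: the same file is truncated and rewritten.
	if err := os.WriteFile(path, []byte("{\"a\":1}\n"), 0o600); err != nil {
		return false, false, err
	}
	rewritten, err := identityOf(path)
	if err != nil {
		return false, false, err
	}

	// Atomic replacement: a new file is renamed over the old path.
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, []byte("{\"b\":2}\n"), 0o600); err != nil {
		return false, false, err
	}
	if err := os.Rename(tmp, path); err != nil {
		return false, false, err
	}
	replaced, err := identityOf(path)
	if err != nil {
		return false, false, err
	}
	return before == rewritten, before == replaced, nil
}

func main() {
	sameAfterRewrite, sameAfterRename, err := demoIdentity()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("identity kept after in-place rewrite:", sameAfterRewrite)
	fmt.Println("identity kept after rename replacement:", sameAfterRename)
}
```

If one side of a comparison leaves these fields at zero while the other fills them in, two otherwise identical parse results compare as different, which is the mismatch the finding describes.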

Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 5m10s), codex_security (codex/security, done, 2m17s) | Total: 7m33s

roborev-ci · 2026-06-24T00:07:40Z

roborev: Combined Review (`35a9f85`)

Summary verdict: Two medium issues remain; no high or critical findings were reported.

Medium

cmd/agentsview/token_use.go:95: resolveRawSessionID still only probes FindSourceFunc, but Claude now has no FindSourceFunc. Unsynced on-disk Claude sessions are treated as unknown, so token-use / session usage skips on-demand sync and returns not found.
- Fix: Add a provider-backed disk probe for migrated providers using ProviderFactoryByType(...).NewProvider(...).FindSource(...) in both canonical and raw lookup paths.
internal/sync/engine.go:4067: Claude provider sync computes a full file hash in Fingerprint before the DB freshness skip, so every unchanged Claude file is reread on full sync. Legacy Claude skipped by size/mtime before hashing.
- Fix: Perform the stat/data-version freshness check before hashing, or split Claude fingerprinting so the hash is computed only when the source will actually be parsed/written.
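
The second finding is about ordering: the cheap stat comparison should run before the expensive full read. A minimal sketch of that ordering follows, assuming the stored record holds size, modification time, and hash. `storedMeta`, `fingerprint`, `hashFile`, and the `hashCalls` counter are hypothetical names for illustration, and the sketch leaves out the data-version check the review also mentions.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// storedMeta is a hypothetical record of what an earlier sync saved.
type storedMeta struct {
	Size    int64
	ModTime int64 // UnixNano
	Hash    string
}

var hashCalls int // counts full-file reads so the demo can show skipped work

// hashFile streams the whole file through SHA-256.
func hashFile(path string) (string, error) {
	hashCalls++
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// fingerprint stats first and only hashes when size or mtime differ from
// the stored record. The bool result reports whether hashing was skipped.
func fingerprint(path string, prev *storedMeta) (storedMeta, bool, error) {
	info, err := os.Stat(path)
	if err != nil {
		return storedMeta{}, false, err
	}
	size, mtime := info.Size(), info.ModTime().UnixNano()
	if prev != nil && prev.Size == size && prev.ModTime == mtime {
		return *prev, true, nil // fresh: reuse the stored hash
	}
	sum, err := hashFile(path)
	if err != nil {
		return storedMeta{}, false, err
	}
	return storedMeta{Size: size, ModTime: mtime, Hash: sum}, false, nil
}

// demoFreshness returns how many full hashes three fingerprint calls cost.
func demoFreshness() (int, error) {
	dir, err := os.MkdirTemp("", "fresh-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "session.jsonl")
	if err := os.WriteFile(path, []byte("{}\n"), 0o600); err != nil {
		return 0, err
	}
	start := hashCalls
	first, _, err := fingerprint(path, nil) // no stored record: must hash
	if err != nil {
		return 0, err
	}
	_, skipped, err := fingerprint(path, &first) // unchanged: should skip
	if err != nil {
		return 0, err
	}
	if !skipped {
		return 0, errors.New("unchanged file was rehashed")
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY, 0o600)
	if err != nil {
		return 0, err
	}
	if _, err := f.WriteString("{\"more\":true}\n"); err != nil {
		f.Close()
		return 0, err
	}
	if err := f.Close(); err != nil {
		return 0, err
	}
	_, skipped, err = fingerprint(path, &first) // size grew: must hash again
	if err != nil {
		return 0, err
	}
	if skipped {
		return 0, errors.New("grown file was treated as fresh")
	}
	return hashCalls - start, nil
}

func main() {
	n, err := demoFreshness()
	fmt.Println("full hashes for three fingerprint calls:", n, "err:", err)
}
```

Size and mtime are only a heuristic. A same-size rewrite within the timestamp granularity would pass this gate, which is why the review pairs the stat check with a data-version check.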

Panel: ci_default_security | Synthesis: codex, 7s | Members: codex_default (codex/default, done, 5m17s), codex_security (codex/security, done, 2m11s) | Total: 7m35s

Claude has both regular project transcripts and nested subagent transcripts, plus an existing append-only incremental parser. Moving it behind a concrete provider keeps those source shapes and optional incremental capability explicit at the provider boundary.

The provider preserves recursive project discovery, symlinked project directories, standard and subagent raw-ID lookup, changed-path classification, content hashing, project-name normalization, excluded-session reporting, relationship inference, and incremental append parsing for linear JSONL growth.

fix(parser): preserve claude provider edge events

Claude provider sync must distinguish true append idleness from files that were truncated or replaced, and watcher classification must still identify deleted primary and subagent transcripts after the file is gone. Otherwise provider-path sync can retain stale messages or miss removals.

Return full-parse status for truncated incremental inputs, add missing-path classification for valid Claude source shapes, and make raw subagent lookup follow symlinked project directories like discovery does. This branch now opts Claude into shadow comparison.

Validation: go test -tags "fts5" ./internal/parser -run 'Test(ClaudeProvider|FindClaudeSourceFile|ProviderMigrationModes)' -count=1; go test -tags "fts5" ./internal/parser -count=1; go vet ./...; git diff --check

fix(sync): replace claude content after file rewrites

Claude incremental parsing is append-oriented, so any fallback caused by truncation or file replacement must replace persisted messages instead of flowing through the append-preserving write path. Otherwise stale higher ordinals or stale tool rows can survive a full parse fallback.

The provider now marks truncated incremental inputs as force-replace, and the legacy engine path carries forceReplace when file identity changes or the file shrinks before falling back to a full parse.

Validation: go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestClaudeProviderParseIncremental|TestIncrementalSync_Claude(FileReplaced|TruncatedFileReplacesStoredMessages|SameSizeFileReplaceUsesFullParse|MidStreamSplitFallsBackToFullParse|AgentIDFallbackUpdatesStoredToolCall)' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; ./custom-gcl run --config .golangci.nilaway.yml ./internal/parser/... ./internal/sync/...; git diff --check

fix(sync): replace claude same-size rewrites

A same-size rewrite can reach the full-parse fallback when the normal skip check did not skip the file, which means the content changed even though the byte count did not. That fallback must replace persisted rows, or stale higher ordinals and tool rows can survive the parse.

The regression rewrites a Claude file in place to the same byte length with fewer logical messages and verifies the stale assistant row is deleted.

Validation: go test -tags "fts5" ./internal/parser ./internal/sync -run 'TestObserveProviderSourceMatchesClaudeLegacyParser|TestClaudeProviderParseIncremental|TestIncrementalSync_Claude(FileReplaced|TruncatedFileReplacesStoredMessages|SameSizeFileReplaceUsesFullParse|SameSizeInPlaceRewriteClearsStaleRows|MidStreamSplitFallsBackToFullParse|AgentIDFallbackUpdatesStoredToolCall)' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go fmt ./...; go vet ./...; ./custom-gcl run --config .golangci.nilaway.yml ./internal/parser/... ./internal/sync/...; git diff --check

test(sync): compare claude shadow parity

Claude is shadow-compared on this branch, so add source-level migration coverage that compares provider observation with ParseClaudeSessionWithExclusions. The fixture exercises the project-directory source shape and verifies session, message, usage, exclusion, and data-version planning parity while preserving provider-computed file hashes.

Validation: go test -tags "fts5" ./internal/sync -run TestObserveProviderSourceMatchesClaudeLegacyParser -count=1

test(sync): cover claude provider usage exclusions

Roborev job 2721 caught that the Claude shadow parity fixture only compared a plain exchange, so it did not prove provider parity for per-message token usage or /usage-only session exclusions.

Add assistant message usage metadata to the normal fixture and a separate /usage-only source discovered by the provider, then assert non-empty token metadata and excluded IDs against the legacy parser.

Validation: go test -tags "fts5" ./internal/sync -run TestObserveProviderSourceMatchesClaudeLegacyParser -count=1; go fmt ./...; go vet ./...; git diff --check

refactor(parser): fold claude into provider

Move Claude source discovery, lookup, full parse, exclusion handling, and append-only incremental parse ownership onto the concrete claudeProvider and delete the package-level DiscoverClaudeProjects, FindClaudeSourceFile, ParseClaudeSessionFrom, and ParseClaudeSessionWithExclusions free functions. The discover and find-source bodies stay as provider-neutral helpers (ClaudeProjectSessionFiles, claudeFindSourceFile) and the parse bodies become claudeParseWithExclusions and claudeParseSessionFrom; the public ParseClaudeSession wrapper and the Cowork parser (which reuses the Claude transcript format) call the shared helper, so no provider file references a legacy Discover/Find/Parse entrypoint.

Make Claude provider-authoritative and drop its legacy sync dispatch: the classifyOnePath Claude block, the processFile case arm, and the processClaude method. Source classification, project resolution, and exclusion handling are reproduced through the provider's changed-path and parse paths. The provider's SourcesForChangedPath also reproduces the legacy "classify despite a transient stat error" behavior so a changed path under a momentarily unreadable parent is not dropped.

Wire the provider-authoritative engine path to preserve Claude's DB-aware single-file semantics, which a stateless provider cannot do alone:

- tryProviderIncrementalAppend drives the provider's ParseIncremental through the shared tryIncrementalJSONL bookkeeping (session lookup, data-version and inode/device identity guards, ordinal resume, cross-sync split detection, cumulative counters, and forceReplace fallback), so append-only syncs keep the stored file hash and append rows instead of recomputing and rewriting.
- providerSingleSessionFresh reproduces the shouldSkipFile gate so an unchanged, already-synced session is skipped instead of re-parsed every full sync and a single-session resync does not reapply a worktree project mapping to an unchanged file.
- stampProviderFileIdentity stamps inode/device on parsed results so the incremental path can later detect an atomic file replacement.
- processProviderFile honors a caller-supplied file.Project as the source ProjectHint when no explicit ProviderSource was given, so a SyncSingleSession does not revert a user's project override.

The engine's expandClaudeDuplicateCandidates and dedupeClaudeDiscoveredFiles stay as provider-neutral engine-level dedup plumbing; expansion now enumerates via ClaudeProjectSessionFiles. The duplicate-candidate expansion and session-ID dedup/precedence behavior is unchanged.

Because dropping the Claude DiscoverFunc would otherwise remove Claude from surfaces that gate on DiscoverFunc != nil, parse-diff (engine and CLI flag validation) and the SSH remote resolve script now also include file-based agents that have left legacy-only mode through the provider facade, restoring Claude (and the other already-folded agents) to those surfaces.

Drop the Claude AgentDef DiscoverFunc/FindSourceFunc hooks, set its provider migration mode to ProviderAuthoritative, remove claude_provider.go from the pending shim scan list, replace the shadow baseline test with provider-API coverage plus a guard asserting the four legacy entrypoints stay gone, and re-vehicle the generic shadow-mechanism caller tests onto the still-legacy Cowork agent since Claude no longer has a legacy process arm to observe in shadow.

refactor(parser): fold ParseClaudeSession onto the Claude provider

Delete the ParseClaudeSession free function and route its only production caller (the session upload handler) plus the test suite through the Claude provider's new ParseUploadedTranscript method, exposed via the ClaudeUploadParser interface. Uploads live outside any configured root, so the method parses the staged transcript directly under the caller-supplied project. That project stays authoritative rather than being overridden by the transcript's recorded cwd, matching the prior upload behavior and unlike the discovered-session Parse path.

Unexport ClassifyClaudeSystemMessage to classifyClaudeSystemMessage; it is a Claude-internal classifier with no callers outside the package. Both removals clear the last provider-specific legacy parse/classify entrypoints this branch owned.

fix(sync): skip fresh claude before fingerprinting

The Claude provider migration preserved DB freshness skipping, but only after provider fingerprinting had already hashed the whole transcript. That lost the legacy cheap size/mtime/data-version gate for unchanged files.

Run the single-session freshness check before provider fingerprinting, and pass the computed fingerprint into incremental parsing so truncation detection can distinguish appended files from zero-byte rewrites. Zero-byte truncation now forces a full replacement parse instead of reporting no new data.

Validation: go test -tags "fts5" ./internal/parser -run 'TestClaudeProviderParseIncremental(Truncated|EmptyTruncation)NeedsFullParse' -count=1; go test -tags "fts5" ./internal/sync -run 'TestIncrementalSync_ClaudeAppend|TestProcessFileProviderAuthoritativeSkipsFreshClaudeBeforeFingerprint' -count=1; go test -tags "fts5" ./internal/parser ./internal/sync -count=1; go vet ./...; git diff --check

roborev-ci · 2026-06-25T05:59:17Z

roborev: Combined Review (`7f39c89`)

Summary verdict: One medium issue remains; no critical or high findings were reported.

Medium

Location: internal/parser/types.go:99 / cmd/agentsview/token_use.go:91
Problem: Claude drops FindSourceFunc, but resolveRawSessionID still only probes on-disk sessions through def.FindSourceFunc. An unsynced Claude session ID that exists on disk now resolves as unknown, so session usage / token-use skips the on-demand SyncSingleSession and reports not found.
Fix: Make the resolver use provider FindSource for provider-authoritative file-based agents, or keep a Claude lookup hook until the CLI resolver is migrated.
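
The first option in the fix can be sketched as a resolver that prefers the legacy hook when one is present and otherwise asks the provider. Every type and signature below is an assumption made for illustration: the real `AgentDef`, `FindSourceFunc`, and provider `FindSource` signatures are not shown in this thread, and `agentDef`, `sourceFinder`, `mapFinder`, and `resolveOnDisk` are hypothetical names.

```go
package main

import "fmt"

// sourceFinder is a hypothetical slice of the provider interface.
type sourceFinder interface {
	FindSource(rawID string) (path string, ok bool)
}

// agentDef is an illustrative stand-in for an agent definition.
type agentDef struct {
	Name           string
	FindSourceFunc func(rawID string) string // legacy hook; nil once migrated
	Provider       sourceFinder              // nil for legacy-only agents
}

// resolveOnDisk probes each agent for an on-disk session. Migrated agents
// have no legacy hook, so the provider lookup keeps them resolvable.
func resolveOnDisk(defs []agentDef, rawID string) (agent, path string, ok bool) {
	for _, def := range defs {
		if def.FindSourceFunc != nil {
			if p := def.FindSourceFunc(rawID); p != "" {
				return def.Name, p, true
			}
			continue
		}
		if def.Provider != nil {
			if p, found := def.Provider.FindSource(rawID); found {
				return def.Name, p, true
			}
		}
	}
	return "", "", false
}

// mapFinder is a fake provider backed by a map, used only for the demo.
type mapFinder map[string]string

func (m mapFinder) FindSource(rawID string) (string, bool) {
	p, ok := m[rawID]
	return p, ok
}

// demoDefs builds one legacy-only agent and one migrated agent.
func demoDefs() []agentDef {
	return []agentDef{
		{Name: "cowork", FindSourceFunc: func(string) string { return "" }},
		{Name: "claude", Provider: mapFinder{"abc123": "/projects/demo/abc123.jsonl"}},
	}
}

func main() {
	agent, path, ok := resolveOnDisk(demoDefs(), "abc123")
	fmt.Println(agent, path, ok)
	_, _, ok = resolveOnDisk(demoDefs(), "missing")
	fmt.Println("missing resolved:", ok)
}
```

With a resolver shaped like this, an unsynced session that exists only on disk would still be found and could trigger the on-demand sync instead of being reported as not found.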

Panel: ci_default_security | Synthesis: codex, 6s | Members: codex_default (codex/default, done, 4m25s), codex_security (codex/security, done, 3m3s) | Total: 7m34s

This was referenced Jun 20, 2026

fix(parser): require explicit provider factories #784

Draft

feat(parser): migrate db-backed providers #783

Draft

feat(parser): migrate antigravity providers #782

Draft

mariusvniekerk force-pushed the provider-hermes branch from e64fb73 to 40a682f Compare June 20, 2026 01:42

mariusvniekerk force-pushed the provider-claude branch from 2a56766 to fc3595c Compare June 20, 2026 01:42

mariusvniekerk force-pushed the provider-hermes branch from 40a682f to eb4ee67 Compare June 21, 2026 00:41

mariusvniekerk force-pushed the provider-claude branch from fc3595c to 59b1626 Compare June 21, 2026 00:41

mariusvniekerk force-pushed the provider-hermes branch from eb4ee67 to 58b65ac Compare June 21, 2026 01:36

mariusvniekerk force-pushed the provider-claude branch from 59b1626 to dc6dcd6 Compare June 21, 2026 01:36

mariusvniekerk force-pushed the provider-hermes branch from 58b65ac to 384821f Compare June 21, 2026 01:47

mariusvniekerk force-pushed the provider-claude branch from dc6dcd6 to ea93a29 Compare June 21, 2026 01:47

mariusvniekerk force-pushed the provider-hermes branch from 384821f to 31dd18e Compare June 23, 2026 23:55

mariusvniekerk force-pushed the provider-claude branch from ea93a29 to 35a9f85 Compare June 23, 2026 23:55

mariusvniekerk force-pushed the provider-hermes branch from 31dd18e to e0d9089 Compare June 25, 2026 05:48

mariusvniekerk force-pushed the provider-claude branch from 35a9f85 to 7f39c89 Compare June 25, 2026 05:48

feat(parser): migrate claude provider #772
mariusvniekerk wants to merge 1 commit into provider-hermes from provider-claude
