feat: add repo source browser API #581
roborev: Combined Review
cc31215 to 7a7ebce
26b31c0 to 8c79567
8c79567 to 2a26169
2a26169 to f678969
This change is part of a stack managed by git-spice.
7e63b29 to c05a86b
756c000 to 3e03f54
Maintainers need the repo browser UI to read files from the local clone through provider-aware repository identity, including nested repo paths and non-default hosts. This creates the backend contract first so the later UI branches can reuse generated clients and shared route helpers instead of inventing ad hoc fetch paths. The API intentionally stays read-only and stateless: refs resolve from the refreshed clone, tree and history work is bounded, blob responses preserve binary and too-large states, and raw assets are served with explicit content headers for markdown previews. Validation: make test-short; go test ./internal/gitclone ./internal/server -run 'TestRepoBrowser' -shuffle=on; go test ./internal/github -run 'TestSyncMRDiffPreservesCloneContextCancellation' -shuffle=on; node node_modules/vite-plus/bin/vp lint packages/ui/src/api/provider-routes.ts --no-error-on-unmatched-pattern --threads=1; git diff --check. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Resolve mutable browser refs once per request and return stale metadata instead of echoing the requested SHA. Route repo browser requests through canonical owner/name identity when repo_path is absent, and restrict raw asset bytes to inert image media types so repository HTML, SVG, and scripts are not served as same-origin executable content.
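The inert-image restriction described above could look roughly like the following sketch. The allowlist members, the extension-based lookup, and the name `rawAssetContentType` are assumptions made for illustration; the PR does not list the exact media types or say whether it sniffs content.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// inertImageTypes is a hypothetical allowlist keyed by lowercase extension.
// SVG and HTML are deliberately absent because browsers can execute them.
var inertImageTypes = map[string]string{
	".png":  "image/png",
	".jpg":  "image/jpeg",
	".jpeg": "image/jpeg",
	".gif":  "image/gif",
	".webp": "image/webp",
}

// rawAssetContentType returns the content type to send for a repository path,
// or ok=false when the bytes should not be served as a raw asset.
func rawAssetContentType(repoPath string) (contentType string, ok bool) {
	contentType, ok = inertImageTypes[strings.ToLower(path.Ext(repoPath))]
	return contentType, ok
}

func main() {
	for _, p := range []string{"docs/logo.PNG", "docs/diagram.svg", "site/index.html"} {
		contentType, ok := rawAssetContentType(p)
		fmt.Printf("%-18s served=%-5v type=%q\n", p, ok, contentType)
	}
}
```

A handler built on this would return an explicit error state for anything outside the allowlist rather than falling back to a generic content type.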
Commit detail reads now prove the requested SHA appears in the selected path history at the resolved ref before returning details. This prevents arbitrary clone commits from rendering in a file-history detail view and documents resolved ref metadata in the generated API schema.
The repo browser API should not make shared clone fetches prune tags or let large repositories force unbounded tree and history reads from request handlers. This keeps the hot path bounded while preserving the middleman-owned clone state outside explicit maintenance work. Asset responses are raw image bytes, so the OpenAPI contract and generated clients need to model them as binary image content instead of JSON strings. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Repo browser ref enumeration and file history queries run on request paths, so they need bounded output and literal handling for caller-selected file names. This keeps large ref sets from bloating the refs response and prevents Git pathspec magic in file names from widening history or commit-detail scope. The hot clone fetch path still avoids tag pruning; deleted-tag cleanup remains outside this request-time flow. Validation: go test ./internal/gitclone -run 'TestRepoBrowser' -shuffle=on; go test ./internal/server -run 'TestRepoBrowser|TestDocsBlobOpenAPIResponseIsBinary' -shuffle=on; go test ./internal/apiclient/generated -shuffle=on; node node_modules/vite-plus/bin/vp run ui-package-check; git diff --check. Generated with Codex Co-authored-by: Codex <codex@openai.com>
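One way to get bounded, literal history queries is sketched below. It relies on Git's global `--literal-pathspecs` option and `git log --max-count`; the function name `historyArgs` and the chosen `--format` are hypothetical, and the PR may implement literal handling differently (for example with `:(literal)` pathspec magic).

```go
package main

import (
	"fmt"
	"strconv"
)

// historyArgs builds arguments for a bounded `git log` over one file.
// --literal-pathspecs makes Git treat the path as plain text, so a file
// name such as ":(glob)**/*.go" cannot widen the query.
func historyArgs(commit, filePath string, limit int) []string {
	return []string{
		"--literal-pathspecs",
		"log",
		"--max-count=" + strconv.Itoa(limit),
		"--format=%H",
		commit,
		"--", // everything after this is a path, never a revision or option
		filePath,
	}
}

func main() {
	args := historyArgs("8c79567", ":(glob)**/*.go", 50)
	fmt.Printf("git %q\n", args)
}
```

The slice would be handed to `exec.CommandContext(ctx, "git", args...)` by the caller, so no shell quoting is involved.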
Repo browser metadata needs to stay bounded without becoming misleading. Ref enumeration now excludes non-display refs before applying the cap, last-changed falls back per missing path after the bounded batch scan, and tag refresh uses non-pruning tag fetch semantics so new release tags appear without deleting cached tags during request-time refresh. The plan text now documents literal pathspec handling and partial-result semantics so future UI work does not infer stronger guarantees than the API provides. Validation: go test ./internal/gitclone -run 'TestRepoBrowser' -shuffle=on; go test ./internal/server -run 'TestRepoBrowser|TestDocsBlobOpenAPIResponseIsBinary' -shuffle=on; go test ./internal/github -run 'TestSyncMRDiffPreservesCloneContextCancellation' -shuffle=on; go test ./internal/gitclone -run 'TestRepoBrowserLastChangedFallsBackPastBatchLogLimit|TestRepoBrowserFetchDoesNotPruneTagsOnHotPath|TestRepoBrowserHistoryTreatsPathspecMagicAsLiteral' -shuffle=on; git diff --check. Generated with Codex Co-authored-by: Codex <codex@openai.com>
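The ordering matters: filtering after the cap could return a nearly empty list when hidden refs sort first. A minimal sketch of filter-then-cap follows; treating only `refs/heads/` and `refs/tags/` as display refs is an assumption, as is the name `displayRefs`.

```go
package main

import (
	"fmt"
	"strings"
)

// displayRefs keeps branch and tag refs only, then applies the cap.
// truncated reports whether at least one more display ref existed.
func displayRefs(all []string, limit int) (refs []string, truncated bool) {
	for _, ref := range all {
		if !strings.HasPrefix(ref, "refs/heads/") && !strings.HasPrefix(ref, "refs/tags/") {
			continue // non-display refs never count against the cap
		}
		if len(refs) == limit {
			return refs, true
		}
		refs = append(refs, ref)
	}
	return refs, false
}

func main() {
	all := []string{
		"refs/pull/1/head",
		"refs/heads/main",
		"refs/remotes/origin/HEAD",
		"refs/tags/v1",
		"refs/heads/dev",
	}
	refs, truncated := displayRefs(all, 2)
	fmt.Println(refs, truncated)
}
```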
The repo browser plan needs to state where expensive or stale metadata behavior is intentionally bounded. This documents the last-changed fallback process cap, the remaining deep-history cost tradeoff, visible-row caller expectation, and explicit ownership for deleted-tag cleanup outside hot fetch paths. Validation: git diff --check. Generated with Codex Co-authored-by: Codex <codex@openai.com>
The last-changed fallback changes the API-observable result for files older than the bounded batch scan. Covering it through the server route keeps clone fetch, ref resolution, repeated path query parsing, and JSON response shape tied to the contract. Validation: go test ./internal/server -run 'TestRepoBrowserLastChangedFallsBackPastBatchLogLimit|TestRepoBrowserTreeAssetLastChangedAndHistory' -shuffle=on; go test ./internal/gitclone -run 'TestRepoBrowserLastChangedFallsBackPastBatchLogLimit' -shuffle=on; git diff --check. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Repo browser reads must treat branch and tag names as exact refs, not revision expressions supplied by the caller. The clone cache also needs provider and repo_path in its namespace so repositories sharing host/owner/name cannot leak content across provider identities. Validation: go test ./internal/gitclone -run 'TestRepoBrowser' -shuffle=on; go test ./internal/server -run 'TestRepoBrowser' -shuffle=on; go test ./internal/ptyowner -run TestOwnerQuickExitRemainsAttachable -short -shuffle=on; git diff --check. Generated with Codex Co-authored-by: Codex <codex@openai.com>
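Treating names as exact refs could be approached as in this sketch: expand the caller's name to a fully qualified ref and reject anything that Git would parse as a revision expression. The function `exactRefName`, the `kind` values, and the rejected-character set are assumptions; the checks are a conservative subset of the rules enforced by `git check-ref-format`, not the PR's actual validation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errInvalidRefName = errors.New("invalid ref name")

// exactRefName maps a caller-supplied branch or tag name to a fully
// qualified ref, refusing revision syntax such as "main~1" or "a..b".
func exactRefName(kind, name string) (string, error) {
	var prefix string
	switch kind {
	case "branch":
		prefix = "refs/heads/"
	case "tag":
		prefix = "refs/tags/"
	default:
		return "", fmt.Errorf("unknown ref kind %q", kind)
	}
	if name == "" || name == "@" ||
		strings.HasPrefix(name, "-") || strings.HasPrefix(name, "/") ||
		strings.HasSuffix(name, "/") || strings.HasSuffix(name, ".") ||
		strings.HasSuffix(name, ".lock") ||
		strings.Contains(name, "..") || strings.Contains(name, "@{") ||
		strings.Contains(name, "//") ||
		strings.ContainsAny(name, " ~^:?*[\\") {
		return "", errInvalidRefName
	}
	for _, r := range name {
		if r < 0x20 || r == 0x7f { // ASCII control characters
			return "", errInvalidRefName
		}
	}
	return prefix + name, nil
}

func main() {
	for _, name := range []string{"feature/x", "main~1", "v1..v2"} {
		ref, err := exactRefName("branch", name)
		fmt.Printf("%-10s ref=%q err=%v\n", name, ref, err)
	}
}
```

The qualified name can then be resolved with a lookup that only accepts existing refs, so a missing branch surfaces as a ref miss rather than as a different commit.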
Repo browser API metadata is serialized across an API boundary, so commit author timestamps need to be normalized to UTC. Last-changed batch parsing also cannot use a textual commit marker that can collide with valid repository paths. The batch parser now consumes NUL-delimited git output and the regression test covers a commit:prefixed path plus a non-UTC author date. Validation: go test ./internal/gitclone -run 'TestRepoBrowser' -shuffle=on; go test ./internal/server -run 'TestRepoBrowser' -shuffle=on; git diff --check. Generated with Codex Co-authored-by: Codex <codex@openai.com>
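To illustrate why NUL delimiting removes the collision, here is a small parser over a hypothetical record layout: an empty token starts a commit record, followed by the SHA, an RFC 3339 author date, and then the paths that commit touched. This layout is an assumption for the sketch and is not claimed to match the exact git output the PR consumes. Because a path can be neither empty nor contain NUL, a file named `commit:notes.txt` can never be mistaken for a record header.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

type change struct {
	SHA        string
	AuthoredAt time.Time // always normalized to UTC
}

// parseLastChanged returns the newest change per path, assuming records
// are ordered newest first in the hypothetical layout described above.
func parseLastChanged(out string) (map[string]change, error) {
	tokens := strings.Split(out, "\x00")
	result := make(map[string]change)
	var cur change
	for i := 0; i < len(tokens); i++ {
		if tokens[i] == "" { // empty token: start of a commit record
			if i+2 >= len(tokens) {
				break // trailing terminator, no further record
			}
			when, err := time.Parse(time.RFC3339, tokens[i+2])
			if err != nil {
				return nil, fmt.Errorf("author date: %w", err)
			}
			cur = change{SHA: tokens[i+1], AuthoredAt: when.UTC()}
			i += 2
			continue
		}
		if cur.SHA == "" {
			return nil, errors.New("path before first commit record")
		}
		if _, seen := result[tokens[i]]; !seen {
			result[tokens[i]] = cur
		}
	}
	return result, nil
}

func main() {
	out := "\x00aaa\x002024-05-01T10:00:00+02:00\x00commit:notes.txt" +
		"\x00\x00bbb\x002024-04-01T00:00:00Z\x00commit:notes.txt\x00README.md"
	got, err := parseLastChanged(out)
	fmt.Println(got, err)
}
```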
Repo browser reads are served from mutable local clones, so request-time refresh and raw asset endpoints need stricter invariants than a normal file browser route. Moved remote tags should not poison the shared clone refresh, raw bytes should only be served for immutable commit refs, and file history should fail when the selected tree does not contain the requested file. This keeps the hot path from pruning tags while still tolerating retags, and makes the API return explicit errors instead of cacheable bytes or misleading empty history for mutable or missing inputs. Validation: go test ./internal/gitclone -run 'TestRepoBrowser' -shuffle=on; go test ./internal/server -run 'TestRepoBrowser' -shuffle=on; go test ./internal/github -run 'TestSyncMRDiffPreservesCloneContextCancellation' -shuffle=on; go test -tags integration ./internal/gitclone -run 'TestEnsureCloneToleratesMovedRemoteTags' -shuffle=on; git diff --check. Generated with Codex Co-authored-by: Codex <codex@openai.com>
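The immutable-ref requirement for raw bytes could be enforced with a check like the one below. It assumes "immutable commit ref" means a full lowercase hexadecimal object ID (40 characters for SHA-1, 64 for SHA-256); the PR does not spell out the accepted form, and `isImmutableCommitRef` is a hypothetical name.

```go
package main

import "fmt"

// isImmutableCommitRef reports whether ref looks like a full object ID.
// Branch names, tag names, and abbreviated SHAs are all treated as mutable.
func isImmutableCommitRef(ref string) bool {
	if len(ref) != 40 && len(ref) != 64 {
		return false
	}
	for i := 0; i < len(ref); i++ {
		c := ref[i]
		isDigit := c >= '0' && c <= '9'
		isHexLetter := c >= 'a' && c <= 'f'
		if !isDigit && !isHexLetter {
			return false
		}
	}
	return true
}

func main() {
	for _, ref := range []string{
		"0123456789abcdef0123456789abcdef01234567",
		"main",
		"8c79567",
	} {
		fmt.Printf("%-42s immutable=%v\n", ref, isImmutableCommitRef(ref))
	}
}
```

Rejecting mutable refs up front means a cached image response can never silently change meaning when a branch or tag moves.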
Repo browser API hardening left a few contract edges that reviewers could still trip over: last-changed parsing needed an unambiguous commit-record delimiter, exact ref lookup needed to distinguish user ref misses from operational git errors, and the raw asset endpoint needed its immutable-ref requirement documented in the generated API contract. This also records the intended deleted-tag behavior: hot fetches may force-update moved tags, but they do not prune tags from middleman-owned clones. Stale tag cleanup belongs in explicit cache maintenance, not normal sync, diff, or repo-browser refresh paths. Validation: make api-generate; go test ./internal/gitclone -run 'TestRepoBrowser|TestEnsureCloneToleratesMovedRemoteTags' -shuffle=on; go test ./internal/server -run 'TestRepoBrowser' -shuffle=on; git diff --check. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Shared clone refreshes run on normal sync and diff paths, so force-fetching every tag there couples hot-path work to remote tag namespace size and moved-tag failures. That is broader than the repo-browser requirement. Fetch shared clones with --no-tags and refresh tags only for the repo-browser namespaced clone. Repo-browser refs still see moved tags, while middleman-owned sync/diff clones avoid tag refresh as part of ordinary fetch. Validation: go test ./internal/github -run TestSyncMRDiffPreservesCloneContextCancellation -shuffle=on; go test ./internal/gitclone -shuffle=on; go test ./internal/server -run RepoBrowser -shuffle=on. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Repo-browser requests were still paying for network fetch work whenever an existing local clone was opened. That made tag refresh part of a UI hot path and kept the same risk profile as the earlier shared-clone tag fetch problem. Keep request-path ensure local-only for existing repo-browser clones, register opened repos, and let the server background loop refresh registered clones on the normal sync interval. The explicit refresh path still force-updates tags without pruning deleted tags, so moved tags update while stale deleted tags remain outside the hot path. Validation: go test ./internal/gitclone -shuffle=on; go test ./internal/server -run 'TestRepoBrowser|TestAPI.*RepoBrowser|TestNonExistent' -shuffle=on; git diff --check. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Scheduled repo-browser refreshes moved tag fetches out of the request path, but explicit refresh still needs the same stampede protection as clone ensure. Concurrent refresh callers for the same namespaced clone should share one branch-and-tag refresh rather than racing on FETCH_HEAD or ref locks. Wrap repo-browser refresh in its own singleflight slot while preserving caller cancellation and the bounded detached operation context used by clone ensure. Validation: go test ./internal/gitclone -shuffle=on; go test ./internal/server -run 'TestRepoBrowser|TestAPI.*RepoBrowser|TestNonExistent' -shuffle=on. Generated with Codex Co-authored-by: Codex <codex@openai.com>
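The per-clone coalescing described here can be pictured with a small standard-library stand-in. The PR may well use `golang.org/x/sync/singleflight`; the `refreshGroup` type below is a hypothetical simplification that omits cancellation and panic handling, and the key string is illustrative.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type call struct {
	done chan struct{}
	err  error
}

// refreshGroup lets concurrent callers for the same key share one refresh.
type refreshGroup struct {
	mu    sync.Mutex
	calls map[string]*call
}

// Do runs fn once per in-flight key. shared is true for callers that
// joined an existing refresh instead of starting their own.
func (g *refreshGroup) Do(key string, fn func() error) (shared bool, err error) {
	g.mu.Lock()
	if c, ok := g.calls[key]; ok {
		g.mu.Unlock()
		<-c.done // wait for the leader to finish
		return true, c.err
	}
	if g.calls == nil {
		g.calls = make(map[string]*call)
	}
	c := &call{done: make(chan struct{})}
	g.calls[key] = c
	g.mu.Unlock()

	c.err = fn()

	g.mu.Lock()
	delete(g.calls, key) // later callers start a fresh refresh
	g.mu.Unlock()
	close(c.done)
	return false, c.err
}

func main() {
	var g refreshGroup
	var wg sync.WaitGroup
	var mu sync.Mutex
	fetches := 0
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do("github/github.com/acme/widgets", func() error {
				mu.Lock()
				fetches++
				mu.Unlock()
				time.Sleep(50 * time.Millisecond) // stands in for git fetch
				return nil
			})
		}()
	}
	wg.Wait()
	fmt.Println("fetches:", fetches)
}
```

Results are not cached: once a refresh completes, the next caller triggers a new one, which keeps the slot purely a stampede guard.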
The repo-browser refresh loop is background network work, so controlled test and dev server runs that disable background monitors should not start it implicitly. Keep the scheduled refresh enabled for normal servers, but gate it with the same background-disable option already used for other server-owned loops. Validation: go test ./internal/server -run 'TestRepoBrowser|TestAPI.*RepoBrowser|TestNonExistent' -shuffle=on. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Existing repo-browser clones could remain stale after a server restart because the scheduled refresh loop only knew about repos opened in the current process. Seeding only clones already present on disk lets the immediate background refresh update those repos without cloning every configured repository or moving tag fetch back into the request path. The refresh interval is also captured at server construction so the background loop does not read mutable config while reload tests rewrite the in-memory config. That closes the race reported by the CI go test -race lane. Validation: go test ./internal/gitclone -run 'TestEnsureRepoBrowserCloneDoesNotFetchTagsForExistingClone|TestRefreshRepoBrowserClonesRefreshesRegisteredRepos|TestRefreshRepoBrowserClonesUsesSeededExistingClones|TestRepoBrowserRefreshFetchesTagsWithoutPruning' -shuffle=on; go test ./internal/server -run 'TestRepoBrowser|TestAPI.*RepoBrowser|TestNonExistent' -shuffle=on; go test -race ./internal/server -run 'TestAPISharedHostCloneFetchFollowsReloadedHostToken|TestSSHFleetWebSocketTerminalUsesAttachSpecCommand|TestSSHFleetWebSocketTerminalHonorsResizeActive' -shuffle=on; go test ./internal/server -run TestWorkspaceResponseProbesStoredRuntimeTmuxSessionWithoutBaseE2E -short -shuffle=on. A broader local go test -race ./... run reached an internal/server timeout/goroutine dump under local runner pressure, so it was not a clean validation signal. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Repo overview timelines still need fresh release tag targets even though shared clone fetches no longer pull the full tag namespace. Fetching only the requested release tag keeps the normal clone path bounded while allowing moved provider release tags to update before timeline calculation. Repo browser commit detail also should not reject valid older file commits just because they fall outside the paginated history response. Checking ancestry and the exact commit diff preserves the selected-root constraint without tying commit detail correctness to the UI history limit. Validation: go test ./internal/gitclone -run 'TestCommitTimelineSinceTag|TestRepoBrowserCommitDetail' -shuffle=on; go test ./internal/server -run 'TestRepoBrowser|TestAPI.*RepoBrowser|TestNonExistent' -shuffle=on; go test ./internal/gitclone -shuffle=on. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Roborev found that the restart refresh path was only covered at the clone-manager level, and commit detail still rejected merge commits that changed the selected file. These are user-visible repo-browser paths, so keep merge commits in scope and pin the startup refresh behavior through the real server/API harness. The release timeline test now also proves moved latest and timeline tags are refreshed by targeted tag fetches, without putting eager tag fetching back into the normal hot path. Validation: go test ./internal/gitclone -run 'TestRepoBrowserCommitDetailAcceptsMergeCommitTouchingPath|TestRepoBrowserCommitDetailAcceptsOlderFileHistory|TestCommitTimelineSinceTagFetchesMovedTag' -shuffle=on; go test ./internal/server -run 'TestRepoBrowserCommitAcceptsOlderFileHistoryThroughHTTP|TestRepoBrowserStartupRefreshSeedsExistingClone|TestRepoBrowserStartupRefreshHonorsDisabledBackgroundMonitors|TestAPIListRepoSummariesIncludesSyncedReleaseTimeline' -shuffle=on; go test ./internal/gitclone -shuffle=on; go test ./internal/server -run 'TestRepoBrowser|TestAPIListRepoSummariesIncludesSyncedReleaseTimeline' -shuffle=on; go test -race ./internal/server -run 'TestRepoBrowser|TestAPIListRepoSummariesIncludesSyncedReleaseTimeline' -shuffle=on. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Scheduled repo-browser refreshes run under the server background context, so their git work needs to observe shutdown cancellation. Detaching every refresh made the waiter return while the singleflight worker could keep running until the clone timeout. Keep explicit one-off refreshes protected from individual caller cancellation, but make scheduled refresh operations inherit their caller context so server shutdown can actually drain them. Validation: go test ./internal/gitclone -run 'TestRepoBrowserScheduledRefreshContextStaysCancelable|TestRefreshRepoBrowserClones|TestRepoBrowserRefreshFetchesTagsWithoutPruning|TestRepoBrowserCommitDetailAcceptsMergeCommitTouchingPath' -shuffle=on; go test ./internal/gitclone -shuffle=on. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Resolve commit-detail SHAs before ancestry checks so missing full-length object IDs map to the repo-browser not_found response instead of surfacing git merge-base failures as internal errors.
The repo browser commit detail schema exposes body text, but the detail path reused the history-list format that only captured subjects. That made every selected commit detail appear bodyless even when Git had a multi-line description. Use a detail-only Git format and parser so body text is available without changing the line-oriented history and last-changed parsing paths. Validation: go test ./internal/gitclone -run 'TestRepoBrowserCommitDetail' -shuffle=on. Generated with Codex Co-authored-by: Codex <codex@openai.com>
Repo browser read handlers should only create a clone when one is missing. Existing clone freshness now belongs to the scheduled refresh path, which avoids hot-path fetches and prevents disabled-monitor tests from racing clone cleanup. Refresh operations now run the cancellable clone/fetch implementation directly instead of delegating through the detached generic clone singleflight. HTTP tests cover merge commits and commit bodies through the repo browser API.
Repo-browser refresh singleflight should not let a canceled HTTP request abort shared clone work that another waiter, such as scheduled refresh, joined. Request-triggered missing-clone refreshes now detach the worker with the existing bounded timeout. Scheduled refreshes still use the caller context so server shutdown cancellation remains respected, preserving the earlier lifecycle fix while avoiding request-cancellation poisoning. Validation: go test ./internal/gitclone -run 'TestRepoBrowser(RequestRefreshWorkDetachesCallerCancellation|ScheduledRefreshContextStaysCancelable|RefreshFetchesTagsWithoutPruning|EnsureRepoBrowserCloneDoesNotRefreshExistingClone)' -shuffle=on; go test ./internal/gitclone -shuffle=on. Generated with Codex Co-authored-by: Codex <codex@openai.com>
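The two lifecycles can be expressed with the standard library alone, as in this sketch. It assumes Go 1.21 or newer for `context.WithoutCancel`; `operationContext`, the `scheduled` flag, and the timeout value are hypothetical names and numbers, not the PR's code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// operationContext picks the context the shared refresh worker runs under.
// Scheduled work inherits cancellation so server shutdown can drain it.
// Request-triggered work is detached from the caller but still bounded.
func operationContext(caller context.Context, scheduled bool, timeout time.Duration) (context.Context, context.CancelFunc) {
	if scheduled {
		return context.WithTimeout(caller, timeout)
	}
	return context.WithTimeout(context.WithoutCancel(caller), timeout)
}

// demo cancels the caller and reports what each worker context observes.
func demo() (requestErr, scheduledErr error) {
	caller, cancel := context.WithCancel(context.Background())
	reqCtx, stopReq := operationContext(caller, false, time.Minute)
	defer stopReq()
	schedCtx, stopSched := operationContext(caller, true, time.Minute)
	defer stopSched()

	cancel() // an HTTP client disconnects, or the server shuts down
	return reqCtx.Err(), schedCtx.Err()
}

func main() {
	requestErr, scheduledErr := demo()
	fmt.Println("request-triggered worker:", requestErr)
	fmt.Println("scheduled worker:", scheduledErr)
}
```

The detached variant keeps one impatient client from aborting a clone that other waiters joined, while the timeout still caps how long that work can outlive its requester.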
e4c044c to f484b82
The UI needs a repo-code API that preserves provider identity all the way to clone and cache selection; owner/name route placeholders are not enough for nested repos, self-hosted hosts, or default-host routing. This PR adds the read-only backend foundation so later UI branches can depend on generated, bounded contracts instead of inventing filesystem access in the frontend.
The API keeps risky behavior out of request hot paths: reads come from middleman-owned local clones, refs resolve with explicit stale-token metadata, large/binary/unsupported assets become typed states, and tag pruning is left to separate maintenance rather than source-browser refreshes.