diff --git a/README.md b/README.md index 4fc0095..e61ac15 100644 --- a/README.md +++ b/README.md @@ -163,6 +163,59 @@ end # removed old_file.ex ``` +### Working with the workspace (agent loop) + +`Exgit.Workspace` is a working tree on top of a ref. Reads pass +through to the ref until a write happens; each write produces a new +in-memory tree SHA, so every state of the workspace is a real git +tree object — snapshots are 20-byte values, branching is free, +commits are an O(1) hash-and-store. + +```elixir +ws = Exgit.Workspace.open(repo, "main") + +{:ok, ws} = Exgit.Workspace.write(ws, "lib/foo.ex", new_source) +{:ok, ws} = Exgit.Workspace.rm(ws, "lib/old.ex") + +{:ok, content, ws} = Exgit.Workspace.read(ws, "lib/foo.ex") +{:ok, [{:modified, "lib/foo.ex"}, {:deleted, "lib/old.ex"}], ws} = + Exgit.Workspace.diff(ws) + +{:ok, commit_sha, ws} = + Exgit.Workspace.commit(ws, + message: "agent: refactor", + author: %{name: "agent", email: "agent@example.com"}, + update_ref: "refs/heads/agent-turn-1") + +# Snapshot is an opaque value — persist it and replay later +saved = Exgit.Workspace.snapshot(ws) +ws = Exgit.Workspace.restore(ws, saved) +``` + +The struct is a plain value, so branching the agent's state for +parallel exploration is just `ws_b = ws_a` — both diverge +independently from there. + +### Mounting through `:vfs` + +If you depend on [`:vfs`](https://github.com/ivarvong/vfs), an +`Exgit.Workspace` ships a `VFS.Mountable` defimpl so it composes +with other backends (in-memory scratch, postgres, S3) under one +mount table. Capabilities: `[:read, :write, :lazy]`. + +```elixir +fs = + VFS.new() + |> VFS.mount("/repo", Exgit.Workspace.open(repo, "main")) + |> VFS.mount("/scratch", VFS.Memory.new()) + +{:ok, content, fs} = VFS.read_file(fs, "/repo/lib/foo.ex") +{:ok, fs} = VFS.write_file(fs, "/repo/lib/foo.ex", new_source) +``` + +`:vfs` is an optional dependency; `Exgit.Workspace` is fully +usable without it. 
+ ## Performance Every hot path emits [`:telemetry`](https://hexdocs.pm/telemetry/) diff --git a/docs/VFS.md b/docs/VFS.md new file mode 100644 index 0000000..d50015a --- /dev/null +++ b/docs/VFS.md @@ -0,0 +1,174 @@ +# VFS integration + +`:vfs` ([github.com/ivarvong/vfs](https://github.com/ivarvong/vfs)) is +the protocol-based virtual-filesystem layer that sits *above* exgit. +Agents in production rarely consume git in isolation — they compose a +read-write working tree (git), a scratch (in-memory), and a durable +per-tenant store (postgres/S3) under one tree. `:vfs` provides the +mount table and the `VFS.Mountable` protocol; exgit ships +`Exgit.Workspace`, a working-tree-on-top-of-a-ref, as a backend. + +## Concept + +The integration is a wrapper struct, `Exgit.Workspace`, that pairs +`(repository, base_ref, head_tree)`: + + * `:base_ref` — the starting point ("HEAD", a branch, a commit SHA). + * `:head_tree` — the working tree's SHA (20 bytes), or `nil` when + the workspace is pristine. Reads use `head_tree || base_ref`. + +Every state of the workspace is a real git tree object. That gives +us: + + * **Snapshot is free.** Just save the head_tree binary. + * **Branching is free.** `ws_b = ws_a` — both diverge independently. + * **Commit is instantaneous.** The work is already done; we have + the tree. + * **Diff is structured.** `Exgit.Diff.trees/4` against `base_ref`. + +Trade-off: empty directories don't exist (git doesn't store them) and +writes cost O(depth × log fanout) rather than O(1). For typical agent +workloads (depth <10, <100 entries per dir) writes are ~100µs each +in-memory. + +## Dependency direction + +`:exgit` takes `:vfs` as an **optional** dep. `:vfs` never depends on +`:exgit`. + +```elixir +# mix.exs in :exgit (already wired) +{:vfs, github: "ivarvong/vfs", ref: "...", optional: true, only: [:dev, :test]} +``` + +The `Exgit.Workspace.VFS` module wraps the entire defimpl in +`Code.ensure_loaded?(VFS.Mountable)`. 
Production builds without +`:vfs` resolved drop it cleanly; `Exgit.Workspace` itself is fully +usable as a standalone API. + +## Capabilities + +```elixir +MapSet.new([:read, :write, :lazy]) +``` + +Not `:mkdir` — git trees can't represent empty directories, so a +faithful `mkdir/3` has no honest semantics. `write_file/4` implicitly +creates parent directories (vfs explicitly supports this for +flat-keyed backends). + +## State threading + +`VFS.Mountable` requires every op to return the (possibly updated) +backend impl as the last element of its success tuple. The workspace +threads two pieces of state on each call: + + 1. `repo.object_store` — grows on lazy partial-clone fetches. + 2. `head_tree` — advances on every write. + +The conformance suite's "state threading" tests exercise this +explicitly: a write returns a workspace whose subsequent reads +reflect the write. + +## Path translation + +vfs paths are absolute with a leading `/`. `Exgit.FS` paths are +slash-tolerant but treat `""` as the root tree. The defimpl strips +the leading slash before calling FS: + +```elixir +defp strip_leading("/"), do: "" +defp strip_leading("/" <> rest), do: rest +``` + +vfs's mount-table dispatcher already normalizes paths and strips the +mount prefix before reaching the backend. + +## Materialize + +Calls `Exgit.Repository.materialize/2`, NOT `Exgit.FS.prefetch/3`. +The latter populates the cache without flipping `mode: :eager`, which +means streaming ops (`walk`, `grep`) still raise `ArgumentError` (see +`Exgit.FS.require_eager!/2` at `lib/exgit/fs.ex:1414-1423`). The +former does both in one step. + +## Walk + +`Exgit.FS.walk/2` requires the underlying repo to be `:eager`. After +a write, the head_tree is resident in the object store but the repo's +mode flag is unchanged, so `VFS.walk/3` still requires +`VFS.materialize/2` to be called first on lazy partial-clone repos. + +For an agent loop this is the natural sequence anyway: clone lazy → +materialize → search/edit. 
Loosening `walk/2` to allow walking a +fully-resident tree without `:eager` is tracked as a possible +follow-up in `Exgit.FS`. + +## Walk-emitted stat caveats + + * **`size` is 0.** Git tree entries don't carry blob size; only an + explicit `stat/2` per path resolves the blob and returns the + real number. + * **`mtime` is the epoch.** Git blobs aren't dated; only commits + are. Walking history per blob to invent an mtime is expensive + and rarely correct. + +## Git-aware ops live on the workspace, not the protocol + +`commit/2`, `snapshot/1`, `restore/2`, `diff/1`, `checkout/2`, and +`materialize/1` aren't part of `VFS.Mountable`. Agents reach for +them on the workspace struct directly: + +```elixir +ws = Exgit.Workspace.open(repo, "main") + +# Filesystem ops via vfs (interoperable with other mounts) +fs = VFS.new() |> VFS.mount("/repo", ws) +{:ok, content, fs} = VFS.read_file(fs, "/repo/lib/foo.ex") + +# Or directly on the workspace (when ws is the only thing you have) +{:ok, content, ws} = Exgit.Workspace.read(ws, "lib/foo.ex") + +# Git-aware: workspace API only +snapshot = Exgit.Workspace.snapshot(ws) +{:ok, sha, ws} = Exgit.Workspace.commit(ws, message: "...", author: %{...}) +``` + +## Conformance + +vfs ships `VFS.ConformanceCase` — a parametrized macro every backend +runs through. The exgit-side conformance test lives at +`test/exgit/workspace_vfs_test.exs` and is tagged `:vfs` so it's +skipped when the dep isn't resolved (e.g. on the Elixir 1.17 CI tier +where vfs requires ~> 1.18). + +A backend that ships without conformance is shipping with unverified +contract behavior — which is exactly how `VFS.Test.AppService` +silently ignored `:byte_range` / `:line_range` / `:chunk_size` until +the audit (vfs CHANGELOG, 2026-05-02). New behavior gets caught here. + +The harness currently lives in vfs's `test/support/`; we load it via +`Code.require_file/1` from `test_helper.exs`. 
Once vfs publishes +`VFS.ConformanceCase` in `lib/`, the require-file dance can drop and +`use VFS.ConformanceCase` works directly. + +## What this doesn't try to be + + * **Not an index.** No "staged vs working tree" distinction. The + workspace IS the working tree; commit takes everything-or-nothing. + * **Not a merger.** Single parent on commit. Multi-parent merge is + a future concern. + * **Not auto-committing.** Writes never produce a commit by + themselves. The agent decides when to checkpoint via + `Exgit.Workspace.commit/2`. + * **Not a sync layer.** Push/pull aren't workspace ops — they're + `Exgit.push/3` against the underlying repo. + +## References + + * vfs repo: + * vfs SPEC: vfs `SPEC.md` + * Working impl: `lib/exgit/workspace.ex` + * VFS defimpl: `lib/exgit/workspace/vfs.ex` + * Workspace tests: `test/exgit/workspace_test.exs` + * Conformance test: `test/exgit/workspace_vfs_test.exs` diff --git a/lib/exgit/fs.ex b/lib/exgit/fs.ex index 3de1080..ca0d2a1 100644 --- a/lib/exgit/fs.ex +++ b/lib/exgit/fs.ex @@ -1241,6 +1241,84 @@ defmodule Exgit.FS do {:ok, sha, %{repo | object_store: store}} end + @doc """ + Remove the entry at `path` from the tree at `reference`. Returns + `{:ok, new_tree_sha, repo}` — the new tree omits the entry; existing + blob/tree objects are left untouched (git is content-addressed; orphan + objects are GC'd separately). + + ## Options + + * `:recursive` — when `true`, removing a directory also removes its + contents. Default `false`; removing a directory without + `:recursive` returns `{:error, :eisdir}`. + + Errors: + + * `{:error, :not_found}` — `path` does not exist in the tree + * `{:error, :eisdir}` — `path` is a directory and `:recursive` is + not set + * `{:error, :cannot_rm_root}` — `path` is empty or `"/"` + + Mirrors `write_path/5`'s tree-rewrite shape so a workspace can chain + `rm_path` and `write_path` calls to assemble multi-file edits before + committing. 
+ """ + @spec rm_path(Repository.t(), ref(), path(), keyword()) :: + {:ok, binary(), Repository.t()} | {:error, term()} + def rm_path(%Repository{} = repo, reference, path, opts \\ []) do + recursive = Keyword.get(opts, :recursive, false) + segments = normalize_path(path) + + if segments == [] do + {:error, :cannot_rm_root} + else + with {:ok, tree_sha, repo} <- resolve_tree(repo, reference) do + remove_entry_from_tree(repo, tree_sha, segments, recursive) + end + end + end + + defp remove_entry_from_tree(repo, tree_sha, [name], recursive) do + with {:ok, %Tree{entries: entries}, repo} <- fetch_object(repo, tree_sha) do + case Enum.find(entries, fn {_, n, _} -> n == name end) do + nil -> + {:error, :not_found} + + {"40000", _, _} when not recursive -> + {:error, :eisdir} + + _ -> + new_entries = Enum.reject(entries, fn {_, n, _} -> n == name end) + new_tree = Tree.new(new_entries) + {:ok, sha, store} = ObjectStore.put(repo.object_store, new_tree) + {:ok, sha, %{repo | object_store: store}} + end + end + end + + defp remove_entry_from_tree(repo, tree_sha, [dir | rest], recursive) do + with {:ok, %Tree{entries: entries}, repo} <- fetch_object(repo, tree_sha) do + case Enum.find(entries, fn {m, n, _} -> n == dir and m == "40000" end) do + nil -> + {:error, :not_found} + + {_, _, child_sha} -> + case remove_entry_from_tree(repo, child_sha, rest, recursive) do + {:ok, new_child_sha, repo} -> + other_entries = Enum.reject(entries, fn {_, n, _} -> n == dir end) + new_entries = other_entries ++ [{"40000", dir, new_child_sha}] + new_tree = Tree.new(new_entries) + {:ok, sha, store} = ObjectStore.put(repo.object_store, new_tree) + {:ok, sha, %{repo | object_store: store}} + + {:error, _} = err -> + err + end + end + end + end + # ---------------------------------------------------------------------- # Internal: object fetch that threads the repo for Promisor-backed stores # ---------------------------------------------------------------------- diff --git 
a/lib/exgit/workspace.ex b/lib/exgit/workspace.ex new file mode 100644 index 0000000..74bfb3c --- /dev/null +++ b/lib/exgit/workspace.ex @@ -0,0 +1,448 @@ +defmodule Exgit.Workspace do + @moduledoc """ + An agent-loop working tree on top of a git ref. + + A workspace pairs `(repository, base_ref, head_tree)`: + + * `:base_ref` — the starting point ("HEAD", a branch, a commit SHA). + * `:head_tree` — the current working tree's SHA. `nil` when the + workspace is pristine; reads in that state pass through to + `base_ref`. Set to a 20-byte tree SHA after the first write. + + Every state of the workspace is a real git tree object. Snapshots + are 20-byte SHAs you can persist and replay; commits are an O(1) + hash-and-store on top of the head tree; branching the workspace + for parallel exploration is `ws_b = ws_a` — the struct is a value, + no copy needed. + + ## Lifecycle + + ws = Exgit.Workspace.open(repo, "main") + {:ok, ws} = Exgit.Workspace.write(ws, "lib/foo.ex", new_source) + {:ok, ws} = Exgit.Workspace.rm(ws, "lib/old.ex") + + {:ok, content, ws} = Exgit.Workspace.read(ws, "lib/foo.ex") + {:ok, [{:modified, "lib/foo.ex"}, {:deleted, "lib/old.ex"}], ws} + = Exgit.Workspace.diff(ws) + + {:ok, commit_sha, ws} = + Exgit.Workspace.commit(ws, + message: "agent: refactor", + author: %{name: "agent", email: "agent@example.com"}, + update_ref: "refs/heads/agent-turn-1") + + ## Snapshot / restore + + saved = Exgit.Workspace.snapshot(ws) # :pristine | <<20-byte sha>> + ws = Exgit.Workspace.restore(ws, saved) + + Snapshots are opaque values you can stash anywhere — a database, + another conversation, a Linear comment. To replay an agent's run + end-to-end, restore from the saved value. + + ## Branching + + Pass the same workspace to two parallel computations; each gets its + own threaded state. 
+ + ws_a = ws + ws_b = ws + + {:ok, ws_a} = Exgit.Workspace.write(ws_a, "lib/x.ex", "...") + {:ok, ws_b} = Exgit.Workspace.write(ws_b, "lib/x.ex", "different") + + `ws_a` and `ws_b` now diverge. The underlying object store is shared + (each write puts new blobs/trees) but neither workspace's `head_tree` + references the other's writes. + + ## VFS integration + + When `:vfs` is loaded, `Exgit.Workspace` implements `VFS.Mountable` + and can be mounted into a `%VFS{}` mount table. See + `Exgit.Workspace.VFS`. + """ + + alias Exgit.Diff + alias Exgit.FS + alias Exgit.Object.{Blob, Commit} + alias Exgit.{ObjectStore, RefStore, Repository} + + @enforce_keys [:repo, :base_ref] + defstruct [:repo, :base_ref, :head_tree] + + @type t :: %__MODULE__{ + repo: Repository.t(), + base_ref: String.t() | binary(), + head_tree: binary() | nil + } + + @typedoc """ + Identity for `commit/2`. Either a pre-formatted git identity string + (`"Name <email> ts +tz"`) used verbatim, or a `%{name:, email:}` map + which is rendered with the current timestamp at UTC. + """ + @type identity :: String.t() | %{required(:name) => String.t(), required(:email) => String.t()} + + @typedoc """ + An entry returned by `diff/1`. Path is relative to the repo root. + """ + @type change :: {:added | :modified | :deleted, String.t()} + + @typedoc """ + Opaque snapshot value. Either the sentinel `:pristine` or a 20-byte + tree SHA. + """ + @type snapshot :: :pristine | binary() + + # ────────────────────────────────────────────────────────────────── + # Construction + # ────────────────────────────────────────────────────────────────── + + @doc """ + Open a workspace over `repo` rooted at `ref` (default `"HEAD"`). + + The workspace starts pristine: `head_tree` is `nil`, and reads go + straight to `ref`. 
+ """ + @spec open(Repository.t(), String.t()) :: t() + def open(%Repository{} = repo, ref \\ "HEAD") when is_binary(ref) do + %__MODULE__{repo: repo, base_ref: ref, head_tree: nil} + end + + # ────────────────────────────────────────────────────────────────── + # Reads + # ────────────────────────────────────────────────────────────────── + + @doc """ + Read the file at `path`. Returns the blob bytes plus the threaded + workspace. + """ + @spec read(t(), String.t()) :: {:ok, binary(), t()} | {:error, term()} + def read(%__MODULE__{} = ws, path) do + case FS.read_path(ws.repo, effective_ref(ws), path) do + {:ok, {_mode, %Blob{data: data}}, repo} -> + {:ok, data, %{ws | repo: repo}} + + {:error, _} = err -> + err + end + end + + @doc """ + List names of entries directly under `path`. Sorted lexicographically. + """ + @spec ls(t(), String.t()) :: {:ok, [String.t()], t()} | {:error, term()} + def ls(%__MODULE__{} = ws, path) do + case FS.ls(ws.repo, effective_ref(ws), path) do + {:ok, entries, repo} -> + names = entries |> Enum.map(fn {_m, n, _s} -> n end) |> Enum.sort() + {:ok, names, %{ws | repo: repo}} + + {:error, _} = err -> + err + end + end + + @doc """ + Stat the entry at `path`. Returns `%{type: :blob | :tree, mode:, size:}`. + """ + @spec stat(t(), String.t()) :: {:ok, FS.stat(), t()} | {:error, term()} + def stat(%__MODULE__{} = ws, path) do + case FS.stat(ws.repo, effective_ref(ws), path) do + {:ok, stat, repo} -> {:ok, stat, %{ws | repo: repo}} + {:error, _} = err -> err + end + end + + @doc """ + Whether `path` exists in the current working state. + """ + @spec exists?(t(), String.t()) :: {boolean(), t()} + def exists?(%__MODULE__{} = ws, path) do + {FS.exists?(ws.repo, effective_ref(ws), path), ws} + end + + @doc """ + Stream every blob path under the workspace's working state. Like + `Exgit.FS.walk/2`, requires the underlying repo to be `:eager` — + call `materialize/1` first on lazy partial-clone repos. 
+ """ + @spec walk(t()) :: Enumerable.t() + def walk(%__MODULE__{} = ws) do + FS.walk(ws.repo, effective_ref(ws)) + end + + # ────────────────────────────────────────────────────────────────── + # Writes + # ────────────────────────────────────────────────────────────────── + + @doc """ + Write `content` to `path`. Creates intermediate directories + implicitly. Refuses to overwrite a directory. + """ + @spec write(t(), String.t(), binary()) :: {:ok, t()} | {:error, term()} + def write(%__MODULE__{} = ws, path, content) when is_binary(content) do + with :ok <- guard_not_directory(ws, path), + {:ok, new_tree, repo} <- FS.write_path(ws.repo, effective_ref(ws), path, content) do + {:ok, %{ws | repo: repo, head_tree: new_tree}} + end + end + + @doc """ + Remove the entry at `path`. + + Options: + + * `:recursive` — when `true`, removing a directory removes its + contents. Default `false`; without it, directory removal returns + `{:error, :eisdir}`. + """ + @spec rm(t(), String.t(), keyword()) :: {:ok, t()} | {:error, term()} + def rm(%__MODULE__{} = ws, path, opts \\ []) do + case FS.rm_path(ws.repo, effective_ref(ws), path, opts) do + {:ok, new_tree, repo} -> + {:ok, %{ws | repo: repo, head_tree: new_tree}} + + {:error, _} = err -> + err + end + end + + # ────────────────────────────────────────────────────────────────── + # Snapshot / restore / fork + # ────────────────────────────────────────────────────────────────── + + @doc """ + Capture the workspace's working state as an opaque snapshot value. + + Returns `:pristine` for a workspace with no writes, otherwise the + 20-byte tree SHA. Persistable, transferable, and replayable via + `restore/2`. + """ + @spec snapshot(t()) :: snapshot() + def snapshot(%__MODULE__{head_tree: nil}), do: :pristine + def snapshot(%__MODULE__{head_tree: tree}), do: tree + + @doc """ + Replace the workspace's head tree with a previously-captured snapshot. 
+ + The snapshot's referenced objects must already be in the underlying + repo's object store — this is the case when the snapshot was produced + by a workspace sharing the same store. + """ + @spec restore(t(), snapshot()) :: t() + def restore(%__MODULE__{} = ws, :pristine), do: %{ws | head_tree: nil} + + def restore(%__MODULE__{} = ws, tree) when is_binary(tree) and byte_size(tree) == 20 do + %{ws | head_tree: tree} + end + + # ────────────────────────────────────────────────────────────────── + # Diff + # ────────────────────────────────────────────────────────────────── + + @doc """ + Compare the workspace's working state against `base_ref`, returning + a list of `{:added | :modified | :deleted, path}` entries. + + A pristine workspace returns `{:ok, [], ws}` immediately. + """ + @spec diff(t()) :: {:ok, [change()], t()} | {:error, term()} + def diff(%__MODULE__{head_tree: nil} = ws), do: {:ok, [], ws} + + def diff(%__MODULE__{} = ws) do + with {:ok, base_tree, repo} <- resolve_ref_to_tree(ws.repo, ws.base_ref), + {:ok, changes} <- Diff.trees(repo, base_tree, ws.head_tree) do + {:ok, simplify_changes(changes), %{ws | repo: repo}} + end + end + + defp simplify_changes(changes) do + Enum.map(changes, fn + %{op: :added, path: p} -> {:added, p} + %{op: :removed, path: p} -> {:deleted, p} + %{op: :modified, path: p} -> {:modified, p} + %{op: :mode_changed, path: p} -> {:modified, p} + %{op: :type_changed, path: p} -> {:modified, p} + %{op: :submodule_change, path: p} -> {:modified, p} + end) + end + + # ────────────────────────────────────────────────────────────────── + # Commit + # ────────────────────────────────────────────────────────────────── + + @doc """ + Materialize the working tree as a commit object. + + Required options: + + * `:message` — commit message (a binary, with or without trailing newline). + * `:author` — `t:identity/0`. + + Optional: + + * `:committer` — defaults to `:author`. 
+ * `:update_ref` — `false` (default) leaves refs untouched; the + caller takes the returned commit SHA. A binary like + `"refs/heads/agent-turn-1"` writes that ref to point at the new + commit and tracks it as the workspace's new `base_ref`. The ref + may already exist; it is overwritten. + + After commit, `head_tree` is cleared and `base_ref` advances to + identify where the new commit lives: + + * `update_ref: false` → `base_ref` becomes the commit SHA (binary). + * `update_ref: "refs/heads/foo"` → `base_ref` becomes that string. + + Returns `{:error, :nothing_to_commit}` if the workspace is pristine. + """ + @spec commit(t(), keyword()) :: {:ok, binary(), t()} | {:error, term()} + def commit(%__MODULE__{head_tree: nil}, _opts), do: {:error, :nothing_to_commit} + + def commit(%__MODULE__{} = ws, opts) do + message = Keyword.fetch!(opts, :message) + author = format_identity(Keyword.fetch!(opts, :author)) + committer = opts |> Keyword.get(:committer, author) |> format_identity() + update_ref = Keyword.get(opts, :update_ref, false) + + with {:ok, parents, repo} <- parent_commit_shas(ws.repo, ws.base_ref) do + ws = %{ws | repo: repo} + + commit = + Commit.new( + tree: ws.head_tree, + parents: parents, + author: author, + committer: committer, + message: ensure_trailing_newline(message) + ) + + {:ok, commit_sha, store} = ObjectStore.put(ws.repo.object_store, commit) + repo = %{ws.repo | object_store: store} + + case advance_ref(repo, update_ref, commit_sha) do + {:ok, repo, new_base_ref} -> + {:ok, commit_sha, %{ws | repo: repo, head_tree: nil, base_ref: new_base_ref}} + + {:error, _} = err -> + err + end + end + end + + # ────────────────────────────────────────────────────────────────── + # Materialize / checkout + # ────────────────────────────────────────────────────────────────── + + @doc """ + Convert a lazy partial-clone repo to eager mode by prefetching every + object reachable from the workspace's effective ref. 
After this, + streaming ops (`walk/1`, `Exgit.FS.grep/4`) work without per-blob + network round-trips. + + No-op for already-eager repos. + """ + @spec materialize(t()) :: {:ok, t()} | {:error, term()} + def materialize(%__MODULE__{} = ws) do + case Repository.materialize(ws.repo, effective_ref(ws)) do + {:ok, repo} -> {:ok, %{ws | repo: repo}} + {:error, _} = err -> err + end + end + + @doc """ + Switch the workspace's `base_ref`. Discards any uncommitted writes + (`head_tree` is reset to `nil`). + """ + @spec checkout(t(), String.t()) :: {:ok, t()} + def checkout(%__MODULE__{} = ws, ref) when is_binary(ref) do + {:ok, %{ws | base_ref: ref, head_tree: nil}} + end + + # ────────────────────────────────────────────────────────────────── + # Internals + # ────────────────────────────────────────────────────────────────── + + defp effective_ref(%__MODULE__{head_tree: nil, base_ref: ref}), do: ref + defp effective_ref(%__MODULE__{head_tree: tree}), do: tree + + defp guard_not_directory(ws, path) do + case FS.stat(ws.repo, effective_ref(ws), path) do + {:ok, %{type: :tree}, _repo} -> {:error, :eisdir} + _ -> :ok + end + end + + # Resolve any ref/sha down to a tree-sha. `Exgit.FS.resolve_tree/2` + # is private; this mirrors its external contract for the cases the + # workspace produces (named refs and 20-byte commit/tree SHAs). + defp resolve_ref_to_tree(%Repository{} = repo, ref) when is_binary(ref) do + if byte_size(ref) == 20 do + resolve_sha_to_tree(repo, ref) + else + case RefStore.resolve(repo.ref_store, ref) do + {:ok, sha} -> resolve_sha_to_tree(repo, sha) + {:error, _} = err -> err + end + end + end + + defp resolve_sha_to_tree(repo, sha) do + case ObjectStore.get(repo.object_store, sha) do + {:ok, %Commit{} = c} -> {:ok, Commit.tree(c), repo} + {:ok, %Exgit.Object.Tree{}} -> {:ok, sha, repo} + {:ok, _} -> {:error, :not_a_commit_or_tree} + {:error, _} = err -> err + end + end + + # Resolve `base_ref` to the parent-commits list for a new commit. 
+ # Returns `{:ok, [parent_sha], repo}` when the ref points at a real + # commit, `{:ok, [], repo}` when there's no commit yet (initial + # commit case — base_ref is a bare tree-sha or doesn't resolve). + defp parent_commit_shas(%Repository{} = repo, ref) when is_binary(ref) do + cond do + byte_size(ref) == 20 -> + resolve_parent_sha(repo, ref) + + true -> + case RefStore.resolve(repo.ref_store, ref) do + {:ok, sha} -> resolve_parent_sha(repo, sha) + {:error, :not_found} -> {:ok, [], repo} + {:error, _} = err -> err + end + end + end + + defp resolve_parent_sha(repo, sha) do + case ObjectStore.get(repo.object_store, sha) do + {:ok, %Commit{}} -> {:ok, [sha], repo} + {:ok, _} -> {:ok, [], repo} + {:error, _} -> {:ok, [], repo} + end + end + + defp advance_ref(repo, false, commit_sha), do: {:ok, repo, commit_sha} + + defp advance_ref(repo, ref_name, commit_sha) when is_binary(ref_name) do + case RefStore.write(repo.ref_store, ref_name, commit_sha, []) do + {:ok, ref_store} -> {:ok, %{repo | ref_store: ref_store}, ref_name} + {:error, _} = err -> err + end + end + + defp format_identity(s) when is_binary(s), do: s + + defp format_identity(%{name: name, email: email}) + when is_binary(name) and is_binary(email) do + ts = System.os_time(:second) + "#{name} <#{email}> #{ts} +0000" + end + + defp ensure_trailing_newline(""), do: "\n" + + defp ensure_trailing_newline(msg) do + if String.ends_with?(msg, "\n"), do: msg, else: msg <> "\n" + end +end diff --git a/lib/exgit/workspace/vfs.ex b/lib/exgit/workspace/vfs.ex new file mode 100644 index 0000000..122acc5 --- /dev/null +++ b/lib/exgit/workspace/vfs.ex @@ -0,0 +1,301 @@ +if Code.ensure_loaded?(VFS.Mountable) do + defmodule Exgit.Workspace.VFS do + @moduledoc """ + `VFS.Mountable` defimpl for `Exgit.Workspace`. + + Loaded only when `:vfs` is available — `Exgit.Workspace` is + fully usable without `:vfs` for direct API access. 
Mounting + a workspace into a `%VFS{}` mount table makes it interoperable + with other backends (in-memory scratch, postgres, S3) under one + tree. + + ws = Exgit.Workspace.open(repo, "main") + fs = VFS.new() |> VFS.mount("/repo", ws) + + {:ok, content, fs} = VFS.read_file(fs, "/repo/lib/foo.ex") + {:ok, fs} = VFS.write_file(fs, "/repo/lib/foo.ex", new_source) + + The mount table threads workspace state through every op, so + cache growth from lazy fetches and `head_tree` advancement + from writes are both visible to subsequent calls. + + ## Capabilities + + `[:read, :write, :lazy]`. We do not claim `:mkdir`: git trees + cannot represent empty directories, so a faithful `mkdir/3` + has no honest semantics. Writes implicitly create parents (vfs + explicitly supports this for flat-keyed backends). + + ## Mutations through vfs vs git-aware ops + + File-shaped mutations (`write_file`, `rm`) flow through this + impl. Git-aware ops (`Exgit.Workspace.commit/2`, + `snapshot/1`, `restore/2`, `diff/1`, `checkout/2`) are not + part of `VFS.Mountable` — agents reach for them on the + workspace struct directly. + """ + + # The defimpl's own module: just a place to hang documentation. + # Protocol implementations don't have moduledoc support directly. 
+ end + + defimpl VFS.Mountable, for: Exgit.Workspace do + alias Exgit.Workspace + alias VFS.Error + alias VFS.Stat + + @epoch DateTime.from_unix!(0) + + # ── reads ────────────────────────────────────────────────────────── + + def exists?(%Workspace{} = ws, path) do + p = strip_leading(VFS.Path.normalize(path)) + {boolean, ws} = Workspace.exists?(ws, p) + {boolean, ws} + end + + def stat(%Workspace{} = ws, path) do + p = strip_leading(VFS.Path.normalize(path)) + + case Workspace.stat(ws, p) do + {:ok, %{type: :blob, size: size}, ws} -> + {:ok, %Stat{type: :regular, size: size, mtime: @epoch}, ws} + + {:ok, %{type: :tree}, ws} -> + {:ok, %Stat{type: :directory, size: 0, mtime: @epoch}, ws} + + {:error, reason} -> + {:error, error_for(reason, path)} + end + end + + def readdir(%Workspace{} = ws, path) do + p = strip_leading(VFS.Path.normalize(path)) + + case Workspace.ls(ws, p) do + {:ok, names, ws} -> {:ok, names, ws} + {:error, reason} -> {:error, error_for(reason, path)} + end + end + + def stream_read(%Workspace{} = ws, path, opts) do + p = strip_leading(VFS.Path.normalize(path)) + + case Workspace.read(ws, p) do + {:ok, data, ws} -> {:ok, slice(data, opts), ws} + {:error, reason} -> {:error, error_for(reason, path)} + end + end + + def walk(%Workspace{} = ws, root, opts) do + p = strip_leading(VFS.Path.normalize(root)) + max_depth = Keyword.get(opts, :max_depth, :infinity) + include_dirs = Keyword.get(opts, :include_dirs, false) + + ws + |> Workspace.walk() + |> Stream.flat_map(fn {file_path, _sha} -> + cond do + not under?(file_path, p) -> [] + beyond_depth?(file_path, p, max_depth) -> [] + true -> [{"/" <> file_path, blob_stat()}] + end + end) + |> maybe_with_dirs(p, include_dirs, max_depth) + end + + # ── eager prefetch ──────────────────────────────────────────────── + + def materialize(%Workspace{} = ws, _opts) do + case Workspace.materialize(ws) do + {:ok, ws} -> {:ok, ws} + {:error, reason} -> {:error, Error.new(:eio, message: inspect(reason))} + end + 
end + + # ── mutations ───────────────────────────────────────────────────── + + def write_file(%Workspace{} = ws, path, content, _opts) do + p = strip_leading(VFS.Path.normalize(path)) + + case Workspace.write(ws, p, content) do + {:ok, ws} -> {:ok, ws} + {:error, reason} -> {:error, error_for(reason, path)} + end + end + + def mkdir(_ws, path, _opts) do + {:error, + Error.new(:enotsup, path: path, message: "git trees cannot store empty directories")} + end + + def rm(%Workspace{} = ws, path, opts) do + p = strip_leading(VFS.Path.normalize(path)) + recursive = Keyword.get(opts, :recursive, false) + + case Workspace.rm(ws, p, recursive: recursive) do + {:ok, ws} -> {:ok, ws} + {:error, reason} -> {:error, error_for(reason, path)} + end + end + + # ── introspection ───────────────────────────────────────────────── + + def capabilities(_), do: MapSet.new([:read, :write, :lazy]) + + # ── helpers ─────────────────────────────────────────────────────── + + defp strip_leading("/"), do: "" + defp strip_leading("/" <> rest), do: rest + defp strip_leading(other), do: other + + defp blob_stat, do: %Stat{type: :regular, size: 0, mtime: @epoch} + + defp under?(_file_path, ""), do: true + + defp under?(file_path, prefix), + do: file_path == prefix or String.starts_with?(file_path, prefix <> "/") + + defp beyond_depth?(_file_path, _prefix, :infinity), do: false + + defp beyond_depth?(file_path, prefix, max_depth) when is_integer(max_depth) do + depth_under(file_path, prefix) > max_depth + end + + defp depth_under(file_path, ""), do: file_path |> String.split("/") |> length() + + defp depth_under(file_path, prefix) do + rest = String.replace_prefix(file_path, prefix <> "/", "") + rest |> String.split("/") |> length() + end + + defp maybe_with_dirs(stream, _prefix, false, _max_depth), do: stream + + defp maybe_with_dirs(stream, prefix, true, max_depth) do + # Materialize dirs by collecting unique parent paths from the + # file stream. 
This forces enumeration but `walk` semantics in + # vfs already permit a list-shaped result; the protocol only + # requires Enumerable.t/0. + Stream.transform(stream, MapSet.new(), fn + {file_path, _stat} = entry, seen -> + dirs = parent_dirs(file_path, prefix, max_depth) + new_dirs = Enum.reject(dirs, &MapSet.member?(seen, &1)) + new_seen = Enum.reduce(new_dirs, seen, &MapSet.put(&2, &1)) + dir_entries = Enum.map(new_dirs, &{&1, %Stat{type: :directory, size: 0, mtime: @epoch}}) + {dir_entries ++ [entry], new_seen} + end) + end + + defp parent_dirs(file_path, prefix, max_depth) do + "/" <> rest = file_path + parts = String.split(rest, "/") + # All ancestor dirs of the file (excluding the file itself). + ancestors = parts |> Enum.drop(-1) |> ancestors_paths() + strip_prefix = if prefix == "", do: "/", else: "/" <> prefix <> "/" + + Enum.filter(ancestors, fn dir_path -> + cond do + # Must be at-or-under the walk root + dir_path == "/" <> prefix -> false + not String.starts_with?(dir_path, strip_prefix) and prefix != "" -> false + true -> within_depth?(dir_path, prefix, max_depth) + end + end) + end + + defp ancestors_paths(parts) do + parts + |> Enum.scan([], fn p, acc -> acc ++ [p] end) + |> Enum.map(fn segs -> "/" <> Enum.join(segs, "/") end) + end + + defp within_depth?(_dir_path, _prefix, :infinity), do: true + + defp within_depth?(dir_path, prefix, max_depth) when is_integer(max_depth) do + "/" <> rest = dir_path + depth_under(rest, prefix) <= max_depth + end + + # Stream slicing per vfs's stream_read options. 
+ defp slice(data, opts) do + data + |> apply_byte_range(Keyword.get(opts, :byte_range)) + |> apply_line_range(Keyword.get(opts, :line_range)) + |> chunkify(Keyword.get(opts, :chunk_size, 64 * 1024)) + end + + defp apply_byte_range(data, nil), do: data + + defp apply_byte_range(data, {start, length}) when start >= 0 and length >= 0 do + size = byte_size(data) + + cond do + start >= size -> "" + start + length > size -> binary_part(data, start, size - start) + true -> binary_part(data, start, length) + end + end + + defp apply_line_range(data, nil), do: data + + defp apply_line_range(data, {first, last}) when first >= 1 do + lines = String.split(data, "\n") + total = length(lines) + ends_with_nl? = String.ends_with?(data, "\n") + + # If the data ends with "\n", String.split yields a trailing "". + # We want to operate on logical lines (1..N where N is the count + # of \n-terminated or final non-empty segments). + logical_lines = + if ends_with_nl? and total > 0 and List.last(lines) == "", + do: Enum.drop(lines, -1), + else: lines + + logical_count = length(logical_lines) + + last_idx = + case last do + :end -> logical_count + n when is_integer(n) -> min(n, logical_count) + end + + if first > logical_count do + "" + else + slice = Enum.slice(logical_lines, (first - 1)..(last_idx - 1)) + joined = Enum.join(slice, "\n") + if last == :end and ends_with_nl?, do: joined <> "\n", else: joined + end + end + + defp chunkify("", _chunk_size), do: [] + + defp chunkify(data, chunk_size) when is_integer(chunk_size) and chunk_size > 0 do + Stream.unfold(data, fn + "" -> + nil + + rest when byte_size(rest) <= chunk_size -> + {rest, ""} + + rest -> + <<chunk::binary-size(chunk_size), more::binary>> = rest + {chunk, more} + end) + end + + # Error mapping. `path` is the ORIGINAL (pre-stripping) path so the + # error surface to vfs callers contains absolute paths. 
+ defp error_for(:not_found, path), do: Error.new(:enoent, path: VFS.Path.normalize(path)) + defp error_for(:not_a_blob, path), do: Error.new(:eisdir, path: VFS.Path.normalize(path)) + defp error_for(:not_a_tree, path), do: Error.new(:enotdir, path: VFS.Path.normalize(path)) + defp error_for(:eisdir, path), do: Error.new(:eisdir, path: VFS.Path.normalize(path)) + defp error_for(:enotdir, path), do: Error.new(:enotdir, path: VFS.Path.normalize(path)) + + defp error_for(:cannot_rm_root, path), + do: Error.new(:einval, path: VFS.Path.normalize(path), message: "cannot rm /") + + defp error_for(reason, path), + do: Error.new(:eio, path: VFS.Path.normalize(path), message: inspect(reason)) + end +end diff --git a/mix.exs b/mix.exs index e5fc520..54724b0 100644 --- a/mix.exs +++ b/mix.exs @@ -82,6 +82,18 @@ defmodule Exgit.MixProject do # Telemetry: BEAM-wide standard for instrumentation. Emits events # consumers can attach to. Zero cost when no handler is attached. {:telemetry, "~> 1.0"}, + # Optional vfs integration: `Exgit.Workspace` ships a + # `VFS.Mountable` defimpl when `:vfs` is loaded. Pinned to a SHA + # because vfs has no hex release yet. `optional: true` means: + # - downstream consumers don't have to install :vfs to use exgit + # - if they DO add :vfs, Mix orders our build after vfs's so the + # defimpl compiles in and protocol consolidation picks it up. + # We deliberately do NOT scope this to :dev/:test only — that would + # remove vfs from our dep graph in :prod, breaking compile-ordering + # guarantees in downstream consumer builds. Requires Elixir ~> 1.18; + # the 1.17 CI tier skips the integration via `Code.ensure_loaded?`. + {:vfs, + github: "ivarvong/vfs", ref: "32d2ab618ec12c16fe4f675b5ee8b563c660dd69", optional: true}, {:stream_data, "~> 1.0", only: [:test, :dev]}, # Optional dev-only OpenTelemetry bridge: auto-converts :telemetry # events into OTel spans. 
Only loaded in dev/test; production users diff --git a/mix.lock b/mix.lock index dce8df9..9e06912 100644 --- a/mix.lock +++ b/mix.lock @@ -26,4 +26,5 @@ "stream_data": {:hex, :stream_data, "1.3.0", "bde37905530aff386dea1ddd86ecbf00e6642dc074ceffc10b7d4e41dfd6aac9", [:mix], [], "hexpm", "3cc552e286e817dca43c98044c706eec9318083a1480c52ae2688b08e2936e3c"}, "telemetry": {:hex, :telemetry, "1.4.1", "ab6de178e2b29b58e8256b92b382ea3f590a47152ca3651ea857a6cae05ac423", [:rebar3], [], "hexpm", "2172e05a27531d3d31dd9782841065c50dd5c3c7699d95266b2edd54c2dafa1c"}, "tls_certificate_check": {:hex, :tls_certificate_check, "1.32.1", "f90f9647668f7af804ec63c4e67f3799931166caf8264b81f37c1d863a36ae1d", [:rebar3], [{:ssl_verify_fun, "~> 1.1", [hex: :ssl_verify_fun, repo: "hexpm", optional: false]}], "hexpm", "e78a157966456b500a87a2fc29cffcd6dcfb5a26348c8372a2c5c0a8e5797f51"}, + "vfs": {:git, "https://github.com/ivarvong/vfs.git", "32d2ab618ec12c16fe4f675b5ee8b563c660dd69", [ref: "32d2ab618ec12c16fe4f675b5ee8b563c660dd69"]}, } diff --git a/test/exgit/fs_test.exs b/test/exgit/fs_test.exs index 73f211c..6063a4f 100644 --- a/test/exgit/fs_test.exs +++ b/test/exgit/fs_test.exs @@ -983,4 +983,56 @@ defmodule Exgit.FsTest do assert b.data =~ "defmodule B" end end + + describe "rm_path/4" do + test "removes a top-level file", %{repo: repo} do + assert {:ok, tree_sha, repo2} = FS.rm_path(repo, "HEAD", "README.md") + assert {:error, :not_found} = FS.read_path(repo2, tree_sha, "README.md") + + assert {:ok, {_, blob}, _} = FS.read_path(repo2, tree_sha, "src/a.ex") + assert blob.data =~ "defmodule A" + end + + test "removes a nested file, leaving siblings intact", %{repo: repo} do + assert {:ok, tree_sha, repo2} = FS.rm_path(repo, "HEAD", "src/a.ex") + assert {:error, :not_found} = FS.read_path(repo2, tree_sha, "src/a.ex") + + assert {:ok, {_, b}, _} = FS.read_path(repo2, tree_sha, "src/b.ex") + assert b.data =~ "defmodule B" + + assert {:ok, {_, c}, _} = FS.read_path(repo2, tree_sha, 
"src/nested/c.ex") + assert c.data =~ "defmodule C" + end + + test "removing a directory without :recursive returns :eisdir", %{repo: repo} do + assert {:error, :eisdir} = FS.rm_path(repo, "HEAD", "src") + end + + test "removing a directory with :recursive removes all contents", %{repo: repo} do + assert {:ok, tree_sha, repo2} = FS.rm_path(repo, "HEAD", "src", recursive: true) + assert {:error, :not_found} = FS.read_path(repo2, tree_sha, "src/a.ex") + assert {:error, :not_found} = FS.read_path(repo2, tree_sha, "src/nested/c.ex") + + assert {:ok, {_, readme}, _} = FS.read_path(repo2, tree_sha, "README.md") + assert readme.data == "hello\n" + end + + test "missing path returns :not_found", %{repo: repo} do + assert {:error, :not_found} = FS.rm_path(repo, "HEAD", "does/not/exist") + end + + test "rm of root path is rejected", %{repo: repo} do + assert {:error, :cannot_rm_root} = FS.rm_path(repo, "HEAD", "") + assert {:error, :cannot_rm_root} = FS.rm_path(repo, "HEAD", "/") + end + + test "rm followed by write yields a tree without the rm'd entry", %{repo: repo} do + {:ok, t1, repo2} = FS.rm_path(repo, "HEAD", "README.md") + {:ok, t2, repo3} = FS.write_path(repo2, t1, "NOTES.md", "new\n") + + assert {:error, :not_found} = FS.read_path(repo3, t2, "README.md") + assert {:ok, {_, blob}, _} = FS.read_path(repo3, t2, "NOTES.md") + assert blob.data == "new\n" + end + end end diff --git a/test/exgit/workspace_test.exs b/test/exgit/workspace_test.exs new file mode 100644 index 0000000..1d9e8ca --- /dev/null +++ b/test/exgit/workspace_test.exs @@ -0,0 +1,289 @@ +defmodule Exgit.WorkspaceTest do + use ExUnit.Case, async: true + + alias Exgit.Object.{Blob, Commit, Tree} + alias Exgit.{ObjectStore, RefStore, Repository, Workspace} + + # Build a tiny repo: README.md + src/{a.ex, b.ex} + src/nested/c.ex + setup do + store = ObjectStore.Memory.new() + + {:ok, readme_sha, store} = ObjectStore.put(store, Blob.new("hello\n")) + {:ok, a_sha, store} = ObjectStore.put(store, 
Blob.new("module A\n")) + {:ok, b_sha, store} = ObjectStore.put(store, Blob.new("module B\n")) + {:ok, c_sha, store} = ObjectStore.put(store, Blob.new("module C\n")) + + nested_tree = Tree.new([{"100644", "c.ex", c_sha}]) + {:ok, nested_sha, store} = ObjectStore.put(store, nested_tree) + + src_tree = + Tree.new([ + {"100644", "a.ex", a_sha}, + {"100644", "b.ex", b_sha}, + {"40000", "nested", nested_sha} + ]) + + {:ok, src_sha, store} = ObjectStore.put(store, src_tree) + + root_tree = + Tree.new([ + {"100644", "README.md", readme_sha}, + {"40000", "src", src_sha} + ]) + + {:ok, root_sha, store} = ObjectStore.put(store, root_tree) + + commit = + Commit.new( + tree: root_sha, + parents: [], + author: "T 1700000000 +0000", + committer: "T 1700000000 +0000", + message: "init\n" + ) + + {:ok, commit_sha, store} = ObjectStore.put(store, commit) + + {:ok, ref_store} = RefStore.write(RefStore.Memory.new(), "refs/heads/main", commit_sha, []) + {:ok, ref_store} = RefStore.write(ref_store, "HEAD", {:symbolic, "refs/heads/main"}, []) + + repo = Repository.new(store, ref_store) + + {:ok, repo: repo, commit_sha: commit_sha, root_sha: root_sha} + end + + describe "open/2 + reads on a pristine workspace" do + test "reads pass through to base_ref", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + assert {:ok, "hello\n", ws} = Workspace.read(ws, "README.md") + assert {:ok, "module A\n", _ws} = Workspace.read(ws, "src/a.ex") + end + + test "ls returns sorted names", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + assert {:ok, ["README.md", "src"], _ws} = Workspace.ls(ws, "") + assert {:ok, ["a.ex", "b.ex", "nested"], _ws} = Workspace.ls(ws, "src") + end + + test "stat distinguishes blob and tree", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + assert {:ok, %{type: :blob}, _} = Workspace.stat(ws, "README.md") + assert {:ok, %{type: :tree}, _} = Workspace.stat(ws, "src") + end + + test "exists?/2 threads state but never writes", 
%{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + assert {true, _ws} = Workspace.exists?(ws, "README.md") + assert {false, _ws} = Workspace.exists?(ws, "missing.ex") + end + + test "snapshot of pristine workspace is :pristine", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + assert Workspace.snapshot(ws) == :pristine + end + + test "diff of pristine workspace is empty", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + assert {:ok, [], _ws} = Workspace.diff(ws) + end + end + + describe "write/3" do + test "first write sets head_tree to a real tree-sha", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "new.txt", "fresh\n") + assert is_binary(ws.head_tree) + assert byte_size(ws.head_tree) == 20 + end + + test "subsequent reads see the write", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "lib/foo.ex", "new content\n") + assert {:ok, "new content\n", _ws} = Workspace.read(ws, "lib/foo.ex") + end + + test "writes preserve unmodified files", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "src/a.ex", "changed\n") + assert {:ok, "changed\n", ws} = Workspace.read(ws, "src/a.ex") + assert {:ok, "module B\n", _} = Workspace.read(ws, "src/b.ex") + assert {:ok, "hello\n", _} = Workspace.read(ws, "README.md") + end + + test "writing onto an existing directory returns :eisdir", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + assert {:error, :eisdir} = Workspace.write(ws, "src", "no") + end + + test "writes implicitly create parents", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "deep/path/here/file.txt", "deep\n") + assert {:ok, "deep\n", _} = Workspace.read(ws, "deep/path/here/file.txt") + end + end + + describe "rm/3" do + test "removes a top-level file", %{repo: repo} do + ws = Workspace.open(repo, 
"refs/heads/main") + {:ok, ws} = Workspace.rm(ws, "README.md") + assert {:error, :not_found} = Workspace.read(ws, "README.md") + end + + test "missing path returns :not_found", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + assert {:error, :not_found} = Workspace.rm(ws, "nope") + end + + test "directory without :recursive returns :eisdir", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + assert {:error, :eisdir} = Workspace.rm(ws, "src") + end + + test "directory with :recursive removes contents", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.rm(ws, "src", recursive: true) + assert {:error, :not_found} = Workspace.read(ws, "src/a.ex") + assert {:error, :not_found} = Workspace.read(ws, "src/nested/c.ex") + assert {:ok, "hello\n", _} = Workspace.read(ws, "README.md") + end + end + + describe "snapshot/1 + restore/2" do + test "snapshot of dirty workspace is the head_tree binary", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "x.txt", "x") + sha = Workspace.snapshot(ws) + assert is_binary(sha) + assert byte_size(sha) == 20 + end + + test "round-trip via restore preserves working state", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "x.txt", "v1") + sha_after_v1 = Workspace.snapshot(ws) + + {:ok, ws} = Workspace.write(ws, "x.txt", "v2") + assert {:ok, "v2", _} = Workspace.read(ws, "x.txt") + + ws = Workspace.restore(ws, sha_after_v1) + assert {:ok, "v1", _} = Workspace.read(ws, "x.txt") + end + + test "restore to :pristine resets head_tree", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "x.txt", "writes") + ws = Workspace.restore(ws, :pristine) + assert ws.head_tree == nil + assert {:error, :not_found} = Workspace.read(ws, "x.txt") + end + end + + describe "diff/1" do + test "lists added/modified/deleted entries", %{repo: repo} do + ws = 
Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "src/a.ex", "modified\n") + {:ok, ws} = Workspace.write(ws, "lib/new.ex", "new file\n") + {:ok, ws} = Workspace.rm(ws, "README.md") + + {:ok, changes, _ws} = Workspace.diff(ws) + changes = Enum.sort(changes) + + assert {:deleted, "README.md"} in changes + assert {:added, "lib/new.ex"} in changes + assert {:modified, "src/a.ex"} in changes + end + end + + describe "commit/2" do + test "fails with :nothing_to_commit on pristine workspace", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + + assert {:error, :nothing_to_commit} = + Workspace.commit(ws, message: "x", author: "a 0 +0000") + end + + test "creates a commit; returned ws reads from the new state", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "lib/foo.ex", "content\n") + + {:ok, commit_sha, ws} = + Workspace.commit(ws, + message: "agent: add foo", + author: %{name: "agent", email: "a@b"} + ) + + assert is_binary(commit_sha) + assert byte_size(commit_sha) == 20 + + # head_tree cleared + assert ws.head_tree == nil + # base_ref now points at the new commit (sha, since update_ref: false) + assert ws.base_ref == commit_sha + + # Reads from the new state + assert {:ok, "content\n", _} = Workspace.read(ws, "lib/foo.ex") + + # The old branch ref was NOT advanced (update_ref: false) + {:ok, old_main} = RefStore.resolve(repo.ref_store, "refs/heads/main") + assert old_main != commit_sha + end + + test "with update_ref: , advances the named ref", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "lib/foo.ex", "content\n") + + {:ok, commit_sha, ws} = + Workspace.commit(ws, + message: "agent: add foo", + author: %{name: "agent", email: "a@b"}, + update_ref: "refs/heads/agent-branch" + ) + + # The new ref exists in the workspace's repo + {:ok, advanced_sha} = RefStore.resolve(ws.repo.ref_store, "refs/heads/agent-branch") + assert 
advanced_sha == commit_sha + assert ws.base_ref == "refs/heads/agent-branch" + + # Subsequent reads work via the new base_ref + assert {:ok, "content\n", _} = Workspace.read(ws, "lib/foo.ex") + end + + test "commit then write builds on top of the committed tree", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "a.txt", "1") + {:ok, _sha1, ws} = Workspace.commit(ws, message: "first", author: "a 0 +0000") + + {:ok, ws} = Workspace.write(ws, "b.txt", "2") + {:ok, _sha2, ws} = Workspace.commit(ws, message: "second", author: "a 0 +0000") + + assert {:ok, "1", _} = Workspace.read(ws, "a.txt") + assert {:ok, "2", _} = Workspace.read(ws, "b.txt") + end + end + + describe "checkout/2" do + test "switches base_ref and discards uncommitted writes", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + {:ok, ws} = Workspace.write(ws, "scratch.txt", "draft") + + {:ok, ws} = Workspace.checkout(ws, "refs/heads/main") + assert ws.head_tree == nil + assert {:error, :not_found} = Workspace.read(ws, "scratch.txt") + end + end + + describe "branching" do + test "two workspaces from the same parent diverge independently", %{repo: repo} do + ws = Workspace.open(repo, "refs/heads/main") + + ws_a = ws + ws_b = ws + + {:ok, ws_a} = Workspace.write(ws_a, "x.txt", "A") + {:ok, ws_b} = Workspace.write(ws_b, "x.txt", "B") + + assert {:ok, "A", _} = Workspace.read(ws_a, "x.txt") + assert {:ok, "B", _} = Workspace.read(ws_b, "x.txt") + end + end +end diff --git a/test/exgit/workspace_vfs_test.exs b/test/exgit/workspace_vfs_test.exs new file mode 100644 index 0000000..b482856 --- /dev/null +++ b/test/exgit/workspace_vfs_test.exs @@ -0,0 +1,17 @@ +if Code.ensure_loaded?(VFS.Mountable) and Code.ensure_loaded?(VFS.ConformanceCase) do + defmodule Exgit.WorkspaceVFSTest do + @moduledoc """ + `VFS.Mountable` conformance for `Exgit.Workspace`. 
+ + Runs the standard vfs backend test set against a workspace built + from an in-memory `Exgit.Repository`. Excluded from the 1.17 CI + tier (where `:vfs` doesn't resolve) via the file-level guard. + """ + + use VFS.ConformanceCase, + backend: &Exgit.Test.WorkspaceVFSFixture.fresh/0, + capabilities: [:read, :write, :lazy] + + @moduletag :vfs + end +end diff --git a/test/support/workspace_vfs_fixture.ex b/test/support/workspace_vfs_fixture.ex new file mode 100644 index 0000000..5ed146a --- /dev/null +++ b/test/support/workspace_vfs_fixture.ex @@ -0,0 +1,36 @@ +defmodule Exgit.Test.WorkspaceVFSFixture do + @moduledoc false + # Backend factory for the VFS conformance test against `Exgit.Workspace`. + # Lives in test/support so it's compiled once and stays on the load path + # for every conformance run. + + alias Exgit.Object.{Commit, Tree} + alias Exgit.{ObjectStore, RefStore, Repository, Workspace} + + @doc """ + Build an empty in-memory repo (root tree with no entries) and open + a workspace over it. Conformance tests write into the fresh tree and + assert reads/walks reflect them — so the starting state is empty. 
+ """ + def fresh do + store = ObjectStore.Memory.new() + + {:ok, root_sha, store} = ObjectStore.put(store, Tree.new([])) + + commit = + Commit.new( + tree: root_sha, + parents: [], + author: "T 0 +0000", + committer: "T 0 +0000", + message: "init\n" + ) + + {:ok, commit_sha, store} = ObjectStore.put(store, commit) + {:ok, ref_store} = RefStore.write(RefStore.Memory.new(), "refs/heads/main", commit_sha, []) + {:ok, ref_store} = RefStore.write(ref_store, "HEAD", {:symbolic, "refs/heads/main"}, []) + + repo = Repository.new(store, ref_store) + Workspace.open(repo, "refs/heads/main") + end +end diff --git a/test/test_helper.exs b/test/test_helper.exs index cb17cd4..8ed250d 100644 --- a/test/test_helper.exs +++ b/test/test_helper.exs @@ -48,4 +48,15 @@ exclude = ] |> Enum.uniq() +# vfs ships VFS.ConformanceCase under test/support, which isn't on the +# load path of consumers — only its own test env. Until vfs publishes +# the harness in lib/, load it directly from the dep so our +# `use VFS.ConformanceCase` works. Skip silently if the file is missing +# (e.g. Elixir 1.17 CI tier where :vfs doesn't resolve). +conformance = Path.join([__DIR__, "..", "deps", "vfs", "test", "support", "conformance_case.ex"]) + +if File.exists?(conformance) and Code.ensure_loaded?(VFS.Mountable) do + Code.require_file(conformance) +end + ExUnit.start(exclude: exclude)
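
Review note: the chunking step that `stream_read` applies after byte-range and line-range slicing can be exercised in isolation. The following is a minimal sketch, assuming the intent is fixed-size binary chunks with a possibly shorter final chunk; `ChunkSketch` is a hypothetical module name used only for illustration and is not part of exgit or vfs.

```elixir
defmodule ChunkSketch do
  # Hypothetical standalone module; mirrors the shape of the private
  # chunkify/2 helper in the defimpl, but is not exgit or vfs API.

  # An empty payload yields no chunks at all.
  def chunkify("", _chunk_size), do: []

  def chunkify(data, chunk_size)
      when is_binary(data) and is_integer(chunk_size) and chunk_size > 0 do
    Stream.unfold(data, fn
      # Nothing left: stop the stream.
      "" ->
        nil

      # Remainder fits in one chunk: emit it and finish on the next step.
      rest when byte_size(rest) <= chunk_size ->
        {rest, ""}

      # Otherwise split off exactly chunk_size bytes and keep the tail.
      rest ->
        <<chunk::binary-size(chunk_size), more::binary>> = rest
        {chunk, more}
    end)
  end
end

# Usage: the stream is lazy, so force it with Enum.to_list/1.
chunks = "abcdefg" |> ChunkSketch.chunkify(3) |> Enum.to_list()
IO.inspect(chunks)
```

Note that the split is byte-based, so a multi-byte UTF-8 character can straddle a chunk boundary; callers that need valid strings per chunk would have to reassemble before decoding.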