A protocol-based virtual filesystem for Elixir. Mount git repos, in-memory
scratch space, and database-backed application state behind one value, then
thread that value through your agent loop the way Plug.Conn flows through
a request pipeline.
fs =
  VFS.new()
  |> VFS.mount("/", VFS.Memory.new(%{"/repo/README.md" => "hello\n", "/tmp/scratch" => ""}))

{:ok, "hello\n", fs} = VFS.read_file(fs, "/repo/README.md")
{:ok, fs} = VFS.write_file(fs, "/tmp/scratch", "world\n")

# Lazy traversal: composes with Stream.take/2 so an infinite-depth
# backend stays bounded by what the consumer asks for.
2 =
  fs
  |> VFS.walk("/")
  |> Stream.filter(fn {_, %VFS.Stat{type: t}} -> t == :regular end)
  |> Enum.count()
# Structured errors:
{:error, %VFS.Error{kind: :enoent}} = VFS.read_file(fs, "/nope")

Real-world FS-shaped backends (git repos, S3 buckets, Postgres-backed
application data) don't fit File. Shimming each one into File-like
calls means duplicating mount routing, error mapping, and cache threading
at every consumer.
VFS.Mountable is a single protocol; backends are plain structs that
defimpl it. Reads return {:ok, value, fs} so lazy backends (a
partial-clone git repo, a paginated S3 lister) populate caches in their
struct on read and the caller threads the updated struct forward. Mount
tables nest because %VFS{} itself implements the protocol — composing
multiple backends under one root is mount/3 calls, no router, no glue.
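
The shape described above can be sketched in miniature. This is not the VFS source: every `Sketch.*` name is hypothetical, errors are bare atoms rather than `%VFS.Error{}`, and only `read_file` is modelled. It shows a lazy backend filling its cache on read, the caller threading the updated value forward, and a mount table nesting inside another because the table implements the same protocol.

```elixir
# Hypothetical miniature of the pattern; Sketch.* is NOT the VFS API.
defprotocol Sketch.Mountable do
  @doc "Returns {:ok, data, updated_backend} or {:error, reason}."
  def read_file(backend, path)
end

defmodule Sketch.Lazy do
  # `source` stands in for a remote store; `cache` fills in as paths are read.
  defstruct source: %{}, cache: %{}
  def new(source), do: %__MODULE__{source: source}
end

defimpl Sketch.Mountable, for: Sketch.Lazy do
  def read_file(%Sketch.Lazy{cache: cache, source: source} = backend, path) do
    case cache do
      # Cache hit: the backend comes back unchanged.
      %{^path => data} ->
        {:ok, data, backend}

      _ ->
        case Map.fetch(source, path) do
          {:ok, data} -> {:ok, data, %{backend | cache: Map.put(cache, path, data)}}
          :error -> {:error, :enoent}
        end
    end
  end
end

defmodule Sketch.Table do
  # A mount table is a list of {prefix, backend}, longest prefix first.
  defstruct mounts: []
  def new, do: %__MODULE__{}

  def mount(%__MODULE__{mounts: mounts} = table, prefix, backend) do
    sorted = Enum.sort_by([{prefix, backend} | mounts], fn {p, _} -> byte_size(p) end, :desc)
    %{table | mounts: sorted}
  end
end

# The table implements the protocol too, which is what lets tables nest.
defimpl Sketch.Mountable, for: Sketch.Table do
  def read_file(%Sketch.Table{mounts: mounts} = table, path) do
    case Enum.find_index(mounts, fn {prefix, _} -> under?(path, prefix) end) do
      nil ->
        {:error, :enoent}

      i ->
        {prefix, backend} = Enum.at(mounts, i)

        case Sketch.Mountable.read_file(backend, relative(path, prefix)) do
          {:ok, data, backend} ->
            # Store the updated backend so its cache survives in the new table.
            {:ok, data, %{table | mounts: List.replace_at(mounts, i, {prefix, backend})}}

          {:error, _} = err ->
            err
        end
    end
  end

  defp under?(_path, "/"), do: true
  defp under?(path, prefix), do: path == prefix or String.starts_with?(path, prefix <> "/")

  defp relative(path, "/"), do: path

  defp relative(path, prefix) do
    case String.replace_prefix(path, prefix, "") do
      "" -> "/"
      rest -> rest
    end
  end
end

inner = Sketch.Table.mount(Sketch.Table.new(), "/repo", Sketch.Lazy.new(%{"/README.md" => "hello\n"}))
fs = Sketch.Table.mount(Sketch.Table.new(), "/work", inner)

{:ok, "hello\n", fs} = Sketch.Mountable.read_file(fs, "/work/repo/README.md")
{:error, :enoent} = Sketch.Mountable.read_file(fs, "/nope")
```

After the first read, the rebound `fs` carries the populated cache two tables deep; dropping the returned value and reusing the old one would simply repeat the fetch.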
Pure data, no processes, no global state. The whole FS is a value you hold; it works inside releases, on Nerves, in Lambda, across distributed BEAM nodes — anywhere a value travels.
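
What "a value you hold" buys can be shown with a stand-in. The map below is a hypothetical placeholder for a filesystem value, not a real `%VFS{}`, and the sketch assumes the mounted backends hold only plain data. It uses nothing beyond the Erlang/Elixir standard library.

```elixir
# Stand-in filesystem value: a plain map of path => contents.
# (Hypothetical; a real %VFS{} would carry mounted backend structs.)
fs_v1 = %{"/tmp/scratch" => ""}
fs_v2 = Map.put(fs_v1, "/tmp/scratch", "world\n")

# A "write" returns a new value; the old one is untouched and
# can serve as a snapshot.
"" = fs_v1["/tmp/scratch"]

# Plain terms round-trip through the external term format, which is
# how they cross node boundaries or get persisted.
restored = fs_v2 |> :erlang.term_to_binary() |> :erlang.binary_to_term()
^fs_v2 = restored

# They can also be captured by another process like any other term.
task = Task.async(fn -> Map.fetch!(restored, "/tmp/scratch") end)
"world\n" = Task.await(task)
```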
- Not a File replacement for the host OS. Use File for that.
- Not POSIX-complete. No symlinks, hard links, chmod, lstat in v0.1. The protocol is shaped to virtual-FS semantics (git blobs, S3 objects, DB rows), not OS files.
- Not a process-backed sandbox. No start_link, no supervision. Sandboxing a tool's view of the FS is a job for whatever spawns the tool, not for the FS abstraction.
For a guided walkthrough — every design decision demonstrated in
runnable Elixir, organized around three real agent-loop scenarios —
read test/showcase_test.exs:
mix test test/showcase_test.exs --trace                                # local sections
mix test test/showcase_test.exs --trace --include integration_network  # + real github clone

Four sections, eight tests, ~365 lines including prose:
- Solo agent: in-memory scratch only.
- Read-only codebase: real git clone via :exgit.
- App service backend: postgres-shaped (the VFS.Test.AppService stand-in maps directly to a real Postgrex/Ecto impl).
- The full loop: codebase + scratch + app service mounted under one %VFS{} and threaded through five agent steps.
Recommended starting point.
Real-network demos in examples/:
# Structured codesearch: clone anthropics/skills, parse YAML front-matter
# from every SKILL.md, return {name, description, license, path} records.
MIX_ENV=test mix run examples/list_skills.exs
# Regex grep across an arbitrary repo.
MIX_ENV=test mix run examples/grep.exs

The MIX_ENV=test is required because the :exgit backend wrapper
lives in test/support/. Production usage will move that defimpl into
:exgit itself.
Pre-1.0. The protocol shape is settled (per SPEC.md); the
API may adjust before 1.0.0 based on real-world consumer use.
Currently exercised against :exgit
in test/integration/exgit_test.exs and against a live GitHub clone in
test/integration/codesearch_smoke_test.exs. Planned consumers:
just_bash and
pyex.
Reference numbers in bench/baselines.md. On an
M3 Max under Elixir 1.20-rc.3:
- VFS.Path.normalize/1: 250 ns–1 µs (every public op normalizes once)
- VFS.Memory.read_file: 350 ns at 1 KB, 570 ns at 1 MB
- Mount-table dispatch tax: ~1–2 µs/op vs direct backend calls
- walk over 10k files: 77 ms (~7 µs/file, linear)
def deps do
  [{:vfs, github: "ivarvong/vfs"}]
end

(Not yet published to hex.pm.)
mix setup   # deps + dialyzer PLT
mix check   # format, compile -W, credo, dialyzer, 100% coverage
mix test    # fast loop

mix check is the gate. CI runs it on every push.
MIT — see LICENSE.