Working log for the Rust reimplementation of git-lfs: deferred items, open questions, and milestone tracking.
The original Go implementation lives at https://github.com/git-lfs/git-lfs. When behavior is ambiguous in the docs, that is the source of truth — grep there before guessing.
Useful entry points in the upstream tree:
- `commands/`: CLI surface (one file per subcommand). Drives the `--help` UX we want to improve on.
- `lfs/`: pointer file format, smudge/clean filters, scanner.
- `tq/`: transfer queue (concurrent up/download with retries).
- `lfsapi/`, `lfshttp/`: batch API client + HTTP plumbing.
- `git/`: git interop (config, refs, attributes, filter-process protocol).
- `locking/`: file locks (server-side state).
- `creds/`: credential helper integration.
- `ssh/`: SSH transfer protocol.
- `fs/`: content-addressable object store on disk.
- `tools/`, `subprocess/`, `filepathfilter/`: utility layers.
- `git-lfs_windows_*.go`: Windows-only variants. Defer.
- `docs/api/`: wire protocol (batch, basic transfers, locking, server discovery, authentication, JSON schemas). Authoritative.
- `docs/spec.md`: pointer file format. Authoritative.
- `docs/custom-transfers.md`: custom transfer agent protocol. Third-party contract; must match exactly.
- `docs/extensions.md`: extension protocol.
- `t/`: shell integration tests + fixtures + helpers. These drive the binary via its CLI, so they port for free if we keep CLI compatibility. Strongest safety net.
- `docs/proposals/`: historical, mostly superseded.
- `docs/howto/`: user-facing docs; we'll write our own.
- `docs/man/`: generated from the upstream CLI; copying locks us into their `--help` output, which is what we're trying to fix.
- `docs/l10n.md`: process doc tied to upstream workflow.
- All Go source: we're rewriting, not translating.
- Go unit tests (`*_test.go`): useful as behavioral references, but not portable. Reimplement alongside Rust modules.
- `t-usage.sh`: checks that the synopsis line reads `git lfs <command> [<args>]`. We own our help output and let clap render the default `Usage: git-lfs [COMMAND]`; matching the upstream wording would mean fighting clap on every subcommand. Stays a permanent failure.

Per-suite counts live in tests/SCOREBOARD.md. This section is
the categorical view: what's still broken and what would unlock
it. Used to triage which milestone to pick up next.
- Credentials family: t-credentials, t-askpass test 4. The basic 401-fill-retry loop ships, but multi-attempt auth (`wwwauth[]` / `state[]`), per-URL `credential.<url>.helper`, and NTLM / Negotiate are deferred.
- Custom transfer adapters + tus: t-custom-transfers, t-standalone-file, t-batch-storage-upload-tus, t-multiple-remotes. Real protocol surface; basic adapter only ships today.
- Pure-SSH transfer (Phase 4, SSH locks): t-batch-transfer 6-8, t-push-failures-local 2/4/6, t-lock 17, t-locks 4, t-unlock SSH variant. Transfer/download/upload already ship through Phase 3; what's left is SSH lock commands plus a pre-push lock-pool spawn that the test helper's connection-count check asserts. See the "Pure-SSH transfer" section below for the full Phase 4 scope.
- SSH URL injection (t-ssh.sh): both tests check that `ssh://-oProxyCommand=…` URLs get rejected. Defense lives in `transfer::sshtransfer::connection::build_argv` (the `--` / dash-strip handling), but the auth-side path in `creds::SshAuthClient::spawn` still passes user-and-host verbatim. Port the defense over for the git-lfs-authenticate flow too.
- ls-files long tail: `--include` / `--exclude` (needs filepathfilter) and the two-ref range form.
- Migrate import: 7 tests in t-migrate-import still fail, most around `--no-rewrite`, `--object-map`, and pattern-accumulation edge cases.
- Unshipped commands: `completion`, `dedup`.
- Push edge cases: `push (retry with expired actions)` needs the action-URL expiry + rebatch flow (companion to the t-expired cluster).
- Single-file holdouts: t-batch-error-handling, t-progress, t-batch-storage-encoding, t-batch-unknown-oids, t-clone (ClientCert tests).

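
For the SSH URL injection item, the auth-side guard could look roughly like the sketch below. It is not the existing `build_argv` code: `check_ssh_target`, `build_auth_argv`, and `SshTargetError` are hypothetical names, and it assumes an OpenSSH-style client that stops option parsing at `--`. Quoting of the repo path inside the remote command is left out.

```rust
/// Hypothetical error type for rejected SSH targets.
#[derive(Debug, PartialEq)]
pub enum SshTargetError {
    Empty,
    LeadingDash(String),
}

/// Reject any user or host that ssh would parse as a command-line option.
/// Returns the `user@host` (or bare `host`) target on success.
pub fn check_ssh_target(user: Option<&str>, host: &str) -> Result<String, SshTargetError> {
    if host.is_empty() {
        return Err(SshTargetError::Empty);
    }
    for part in user.into_iter().chain(std::iter::once(host)) {
        if part.starts_with('-') {
            return Err(SshTargetError::LeadingDash(part.to_string()));
        }
    }
    Ok(match user {
        Some(u) => format!("{u}@{host}"),
        None => host.to_string(),
    })
}

/// Build the argv for the authenticate call. The `--` is belt and braces:
/// even a target that slipped through is not treated as an option
/// (assumes an OpenSSH-compatible client).
pub fn build_auth_argv(target: &str, repo_path: &str, operation: &str) -> Vec<String> {
    vec![
        "ssh".to_string(),
        "--".to_string(),
        target.to_string(),
        format!("git-lfs-authenticate {repo_path} {operation}"),
    ]
}
```

Validating before building the argv keeps the rejection testable without spawning a process, which is what the two t-ssh.sh cases need.
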
Listed by the size of the cluster they unlock. Each entry says what's broken and where to start.
- Credential helper ecosystem. The basic 401 → `git credential fill` → retry → approve/reject loop ships, plus netrc, askpass, extra HTTP headers, content-type, and credential-protect. Still missing: per-URL `credential.<url>.helper` config, stateful multi-stage auth (`state[]` / `wwwauth[]` carried between fills), NTLM / Negotiate. See `creds/` deferral list.
- Custom transfer adapters + tus. Third-party protocol surface; basic adapter only ships today. Pure-SSH transfer (`git-lfs-transfer`) is mostly shipped through Phase 3: download, upload, batch, and `Backend` negotiate dispatch all work. Phase 4 (SSH lock commands + pre-push lock-pool spawn) is what's left; see the dedicated section below.
- ls-files long tail. `--include` / `--exclude` filters (needs filepathfilter) and the two-ref range form.
- Unshipped commands: `completion`, `dedup`.
- Push retry-with-expired-actions. Server hands back stale action URLs; client should rebatch and retry. Shares plumbing with the t-expired suite.

Loose ordering for the deferred work. Each milestone is independent enough to ship on its own; rough effort is small (1-3 days), medium (1-2 weeks), large (multi-week).
- Per-URL credential config + multi-stage auth: `credential.<url>.helper`, `state[]` / `wwwauth[]` carrying. Owns t-askpass test 4 plus the t-credentials tail.
- NTLM / Negotiate: heaviest; defer until a real Windows AD user surfaces.

Two independent adapters in transfer/:
- Custom transfer agent protocol: `docs/custom-transfers.md`. Third-party byte-for-byte contract.
- Tus resumable uploads: chunk + resume + finalize.

Phases 1-3 shipped (commits ab224, 7cca3, 1b265, 0099f):
pkt-line framing, Connection with version handshake and quit,
Pool (master + lazy-spawned multiplex clients), ssh::batch,
ssh::download, ssh::upload (put-object + verify-object),
Backend enum with negotiate dispatch (lfs.<url>.sshtransfer = always | negotiate | never, default negotiate for ssh://),
HTTP fallback per direction when pool spawn fails, and Drop
for Connection that reaps the child. Wired into
cli::fetcher::LfsFetcher. Lands t-filter-process test 8 and
t-push-failures-local test 8 once scutiger is vendored at
tests/scutiger/bin/git-lfs-transfer (via cargo install --root tests/scutiger scutiger-lfs).
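
The negotiate dispatch described above boils down to parsing one config value and picking a backend. A minimal sketch, assuming hypothetical names (`SshTransferMode`, `Backend` variants, `pick_backend`) rather than the real `Backend` enum, and assuming that non-`ssh://` URLs default to `never` and that the value is matched case-insensitively (neither is stated above):

```rust
use std::str::FromStr;

/// Value of `lfs.<url>.sshtransfer`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SshTransferMode {
    Always,
    Negotiate,
    Never,
}

impl FromStr for SshTransferMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Case-insensitive matching is an assumption of this sketch.
        match s.trim().to_ascii_lowercase().as_str() {
            "always" => Ok(Self::Always),
            "negotiate" => Ok(Self::Negotiate),
            "never" => Ok(Self::Never),
            other => Err(format!("unknown sshtransfer mode: {other}")),
        }
    }
}

/// Hypothetical stand-in for the real backend enum.
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    Ssh,
    SshWithHttpFallback,
    Http,
}

/// Pick a backend from the remote URL and the optional config value.
pub fn pick_backend(url: &str, configured: Option<&str>) -> Result<Backend, String> {
    let mode = match configured {
        Some(value) => value.parse::<SshTransferMode>()?,
        // Default is `negotiate` for ssh:// URLs (per the notes above).
        None if url.starts_with("ssh://") => SshTransferMode::Negotiate,
        // Assumption: anything else stays on HTTP unless configured.
        None => SshTransferMode::Never,
    };
    Ok(match mode {
        SshTransferMode::Always => Backend::Ssh,
        SshTransferMode::Negotiate => Backend::SshWithHttpFallback,
        SshTransferMode::Never => Backend::Http,
    })
}
```
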
Phase 4 — SSH lock commands (~700 LOC, recovers ~10 tests):
- New `lock` / `unlock` / `list-lock` protocol commands on `transfer::sshtransfer::adapter` (or new lock module). Spec at `docs/proposals/ssh_adapter.md` section "Locks".
- New `LockClient` trait + `HttpLockClient` (wraps the existing `api::Client::{create_lock, list_locks, delete_lock}`) + `SshLockClient` (uses the SSH pool + new protocol commands).
- Refactor `cli/src/lock.rs`, `unlock.rs`, `locks.rs` to call through `LockClient` instead of `api::Client` directly.
- `LfsFetcher` (or sibling type) constructs the right `LockClient` per endpoint and shares the SSH pool with the transfer backend.
- Pre-push lock-verify pool spawn: upstream's push always spawns a second `-oControlMaster=yes` SSH connection during upload for locking commands "and never shuts it down cleanly" (per the `assert_ssh_transfer_sessions` helper's comment in `tests/t-batch-transfer.sh`), even when no locks are involved. `expected_ctrl=2` on upload tests because of this. We need to spawn the lock pool master during push to match the count; the lock pool itself just sits there. Quirky upstream behavior that's load-bearing for the tests.

Test impact: `t-batch-transfer` tests 6-8, `t-push-failures-local` tests 2/4/6, `t-lock` test 17, `t-locks` test 4, and the `t-unlock` SSH variant. The actual transfer logic already works on those after Phase 3; only the connection-count signature is missing.

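
One possible shape for the `LockClient` seam, sketched with an in-memory implementation standing in for `HttpLockClient` / `SshLockClient`. Only the trait name and the three operation names come from the plan above; the method signatures, the `Lock` and `LockError` types, `MemoryLockClient`, and `lock_path` are assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical, trimmed-down lock record.
#[derive(Debug, Clone, PartialEq)]
pub struct Lock {
    pub id: String,
    pub path: String,
}

#[derive(Debug, PartialEq)]
pub enum LockError {
    AlreadyLocked(String),
    NotFound(String),
}

/// The seam the CLI commands would call through instead of `api::Client`.
pub trait LockClient {
    fn create_lock(&mut self, path: &str) -> Result<Lock, LockError>;
    fn list_locks(&self) -> Result<Vec<Lock>, LockError>;
    fn delete_lock(&mut self, id: &str, force: bool) -> Result<Lock, LockError>;
}

/// In-memory stand-in; a real implementation would talk HTTP or SSH.
#[derive(Default)]
pub struct MemoryLockClient {
    next_id: u64,
    locks: BTreeMap<String, Lock>,
}

impl LockClient for MemoryLockClient {
    fn create_lock(&mut self, path: &str) -> Result<Lock, LockError> {
        if self.locks.values().any(|l| l.path == path) {
            return Err(LockError::AlreadyLocked(path.to_string()));
        }
        self.next_id += 1;
        let lock = Lock { id: self.next_id.to_string(), path: path.to_string() };
        self.locks.insert(lock.id.clone(), lock.clone());
        Ok(lock)
    }

    fn list_locks(&self) -> Result<Vec<Lock>, LockError> {
        Ok(self.locks.values().cloned().collect())
    }

    fn delete_lock(&mut self, id: &str, _force: bool) -> Result<Lock, LockError> {
        self.locks.remove(id).ok_or_else(|| LockError::NotFound(id.to_string()))
    }
}

/// What a command body looks like once it only sees the trait.
pub fn lock_path(client: &mut dyn LockClient, path: &str) -> Result<String, LockError> {
    client.create_lock(path).map(|lock| lock.id)
}
```

Taking `&mut dyn LockClient` in the command bodies is one way to let the fetcher choose the transport per endpoint without the commands knowing which one they got.
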
completion, dedup. Each is small in isolation — bundle as one
focused pass.
ls-files (--include/--exclude/two-ref range), push retry-with-
expired-actions, checkout --to <path> [--ours|--theirs], fetch
--recent integration, install --manual, fsck <a>..<b> range.
Pluck individual items between bigger milestones rather than as a
single pass.
- Credential helper integration (keychain/wincred/git-credential) — what does the Rust ecosystem give us for free?
- Custom transfer agent protocol — third parties depend on it, must match byte-for-byte.
- Filter-process protocol with git itself — packet-line format, careful with framing.
- Concurrent transfer queue: defaults are CPU-scaled in upstream (commit `aa08c37f`). Worth understanding their tuning before picking ours.

Things we built minimally and need to come back to. Each entry says what's missing and why it was OK to skip for v0.
- Real crash-log integration. `git lfs logs` reads/writes `<lfs>/logs/` correctly (lands `t-logs.sh`), but no other command actually emits a log on push/fetch failure yet. Wire `Panic`-style log writing into the fetch/push error paths so users hitting intermittent server errors get a postmortem to share.
- Path encoding/decoding. Git escapes non-ASCII paths (octal `\NNN` sequences) when emitting. Belongs in `git/`, not `store/`: it is the working-tree path layer.
- Size-mismatch cleanup. When smudge sees an object on disk with the right OID but wrong size, it treats it as missing and re-fetches; we should also remove the corrupt local file before fetching.
- Smudge `--path` argument. Clean already wires the path through to `%f` substitution; smudge accepts it (`git-lfs smudge -- foo.bin`) but doesn't use it. Upstream uses it for progress/log messages and to stat the file for size.
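
For the path encoding/decoding item, a decoder for git's C-style quoted paths might look like the sketch below. `unquote_git_path` is a hypothetical name. It assumes input as git prints it by default (wrapped in double quotes when escaping was needed, with three-digit octal escapes) and returns raw bytes because the decoded path is not guaranteed to be UTF-8.

```rust
/// Decode a path as printed by git. Unquoted input is returned unchanged;
/// `None` means the quoted form was malformed.
pub fn unquote_git_path(raw: &str) -> Option<Vec<u8>> {
    let bytes = raw.as_bytes();
    if bytes.len() < 2 || bytes[0] != b'"' || bytes[bytes.len() - 1] != b'"' {
        return Some(bytes.to_vec());
    }
    let inner = &bytes[1..bytes.len() - 1];
    let mut out = Vec::with_capacity(inner.len());
    let mut i = 0;
    while i < inner.len() {
        if inner[i] != b'\\' {
            out.push(inner[i]);
            i += 1;
            continue;
        }
        // A backslash must be followed by at least one more byte.
        let next = *inner.get(i + 1)?;
        i += 2;
        let simple = match next {
            b'a' => Some(0x07),
            b'b' => Some(0x08),
            b'f' => Some(0x0c),
            b'n' => Some(b'\n'),
            b'r' => Some(b'\r'),
            b't' => Some(b'\t'),
            b'v' => Some(0x0b),
            b'\\' => Some(b'\\'),
            b'"' => Some(b'"'),
            _ => None,
        };
        if let Some(c) = simple {
            out.push(c);
            continue;
        }
        // Otherwise expect a three-digit octal escape, \000 through \377.
        if !(b'0'..=b'3').contains(&next) {
            return None;
        }
        let d2 = *inner.get(i)?;
        let d3 = *inner.get(i + 1)?;
        if !(b'0'..=b'7').contains(&d2) || !(b'0'..=b'7').contains(&d3) {
            return None;
        }
        out.push(((next - b'0') << 6) | ((d2 - b'0') << 3) | (d3 - b'0'));
        i += 2;
    }
    Some(out)
}
```

Callers that can pass `-z` to the git command avoid quoting altogether, so a decoder like this only matters for the line-oriented outputs.
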
- `lfs.url` discovery. `LfsFetcher` only reads `lfs.url` from the local scope. Upstream also reads `.lfsconfig` at the repo root and falls back to deriving the LFS URL from `remote.<name>.url` (server-discovery doc). Wire those once we have a callsite that needs them.
- Auth. Fetcher passes `Auth::None`, so anonymous only. Real auth needs `creds/` (git-credential bridge) wired in. Until then, only public LFS endpoints work for on-demand smudge.
- Multi-object download batching. Each smudge that misses triggers a one-object batch. The filter-process protocol's `delay` capability would let us defer multiple smudges, batch the downloads, then return: a big checkout speedup. Already on the deferred list under `filter-process`.
- `commitsOnly` scan mode (upstream's `ScanRefRangeByTree`). Walks trees per commit instead of letting rev-list's `--objects` flatten the graph; visits the same blob multiple times but in a tree context. Used by upstream's `ls-files`-style commands.
- `--recent` semantics (upstream's `fetchRecent` / `lfs.fetchrecentrefsdays` / `lfs.fetchrecentcommitsdays`). Walks recent refs + recent commits on each ref. Layered on top of `scan_pointers`, not a change to it.
- Unified rev-walk filter object (mode + skip-deleted-blobs + skipped-refs). Upstream's `ScanRefsOptions` carries several flags; v0 only exposes plain include/exclude. Add fields as commands need them.
- Tus, custom, ssh transfer adapters. Basic only for v0. Tus is upload-only (resumable PUT chunks); custom is the third-party plugin protocol (`docs/custom-transfers.md`); ssh is the `git-lfs-transfer` over SSH protocol. Each is a separate adapter file alongside `basic.rs`.
- Range requests / resume. A failed download starts over from byte 0. HTTP `Range:` resume needs the partial tempfile to survive across attempts and the server to advertise `Accept-Ranges`. Big-file users will care; small/typical users won't.
- Concurrency auto-tuning. Upstream picks `concurrency` from CPU count (commit `aa08c37f`); we hard-code 8. Revisit when we have benchmarks.
- Smarter retry classification. `is_retryable` on `TransferError::Http` treats anything that's not a decode/builder error as retryable. We could be more precise (e.g. don't retry obvious DNS failures). Punt until we see real failure modes.
- Per-attempt jitter. Backoff is pure `min(prev*2, max)`; no jitter to spread thundering herds. Add when we have many concurrent clients.
- Cancellation. No way for a caller to cancel an in-flight batch short of dropping the future. Add a `CancellationToken` once a CLI command has a Ctrl-C handler.
- Single-object download helper. `smudge` on a missing object will want to download exactly one OID without going through the batch-list API. Trivial wrapper over `download(vec![spec])`; add when filter wires up to transfer.
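
For the per-attempt jitter item, one common option is "full jitter": keep the `min(prev*2, max)` ceiling but sleep a random duration between zero and that ceiling. The sketch below is one way to do it, not the current queue code; `next_ceiling`, `jittered`, and the tiny `XorShift64` generator are hypothetical and exist only to keep the example dependency-free (a real implementation would more likely use a proper RNG crate).

```rust
use std::time::Duration;

/// Minimal xorshift generator so the sketch needs no external crate.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from zero.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// The deterministic schedule described above: min(prev * 2, max).
pub fn next_ceiling(prev: Duration, max: Duration) -> Duration {
    prev.saturating_mul(2).min(max)
}

/// Full jitter: pick a delay in [0, ceiling] at millisecond granularity.
/// The modulo introduces a slight bias, which is fine for spreading retries.
pub fn jittered(ceiling: Duration, rng: &mut XorShift64) -> Duration {
    let ms = ceiling.as_millis() as u64;
    Duration::from_millis(rng.next_u64() % ms.saturating_add(1))
}
```

Keeping the ceiling and the jitter as separate functions means the existing backoff state does not change; only the sleep duration does.
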
- HTTP client cert (`http.sslCert` / `http.sslKey`). The CA-pin path lands via `cli/src/http_client.rs` (clears `t-clone::cloneSSL`), but mTLS (encrypted private keys, the `cert` credential helper protocol) is still missing; `t-clone::clone ClientCert` (×2) is blocked on it.
- `LFS-Authenticate`-driven access mode. We surface the header on 401s but don't act on it (e.g. promoting to NTLM/Negotiate). Basic-auth retry via `creds/` is implemented; everything else is deferred.
- Multi-stage auth (`state[]`, `wwwauth[]`). Upstream forwards these between credential fills for stateful helpers (e.g. token providers). Our retry loop is single-stage.
- Per-storage-URL auth. Only the batch endpoint goes through the retry loop. Pre-signed action URLs (S3 etc) typically don't need creds, but custom storage that 401s on the action would need its own pass.
- Typed timestamps. `Lock.locked_at` and `Action.expires_at` are carried as `String`. Parsing into a typed datetime needs a date crate (chrono / jiff / time); defer until a caller actually needs to compare.
- Retry / backoff. `is_retryable()` is a hint; the `transfer/` queue will own the actual retry loop with jitter/backoff.
- Tus + custom + ssh transfer adapters. Out of scope for `api/` (it only models the batch negotiation). Adapters live in `transfer/`.
- `remote.<name>.pushurl`. Upstream honors a separate push URL for the same remote; we only read `remote.<name>.url`. Minor accuracy gap for users with split read/write URLs.
- `remote.<name>.lfspushurl`. Per-remote push-only LFS URL. Skipped.
- `lfs.<url>.access`. Force an access mode (basic/ntlm/negotiate) per endpoint. Relevant once NTLM/Negotiate land.
- FETCH_HEAD fallback. Upstream falls back to the remote URL in `.git/FETCH_HEAD` when no other source resolves. Edge case; rarely matters given our `origin` default.
- SSH connection multiplexing / retries. `creds::SshAuthClient` ships the basic spawn + cache flow but doesn't honor `lfs.ssh.automultiplex` (`-oControlMaster=yes -oControlPath=...` to re-use a single SSH connection across calls), `lfs.ssh.retries` (upstream retries the SSH command up to 5 times by default), or `lfs.activitytimeout`. We also don't fall back to HTTP Basic when `git-lfs-authenticate` fails; upstream does, after the retry budget is exhausted. `core.sshCommand` git config is also not honored (we read `GIT_SSH_COMMAND` / `GIT_SSH` env vars only). Owns t-batch-transfer tests beyond the basic auth flow once we start exercising connection reuse.
- `lfs.defaulttokenttl` fallback. Upstream falls back to this config value when `git-lfs-authenticate` returns no `expires_at` / `expires_in`. We treat "no expiry" as "never expires until process exit", which is fine for the MVP test but loose for long-running daemons.
- NTLM / Negotiate (Kerberos). Upstream supports both via separate access modes. Out of scope until a real user hits a Windows AD deployment.
- URL-pattern config. `credential.<url>.helper` / `credential.<url>.useHttpPath` per-host overrides: git-credential does half of this for us already, and our `has_credential_helper` honors the host-prefix form (`credential.<scheme>://<host>.helper`) for askpass selection. The full URL pattern matching upstream does (longest-prefix wins, including path) is not yet wired into `useHttpPath` or general per-key lookup.
- Multi-attempt auth retry. `Client::send_with_auth_retry_response` does one fill+retry per request. Upstream's `DoWithAuth` loops up to 3-4 times and emits `api: too many authentication attempts` when the budget is exhausted. Owns t-askpass test 4 plus several t-credentials tests. Bundle with the wwwauth / state slice, since they share the loop machinery.
- Path-scoped queries. [`Query::from_url`] populates path; the `Client::with_use_http_path` builder now wires the global `credential.useHttpPath` config through. URL-scoped `credential.<url>.useHttpPath` overrides land with the URL-pattern matching above.
- Path bytes vs UTF-8. `Query.path` is `String`, so our percent-decoder maps invalid UTF-8 byte sequences to `U+FFFD`. Upstream Go passes raw bytes through (Go strings hold arbitrary bytes). Real-world LFS paths are ASCII so no current test trips this, but the divergence is real. Fix: change `Query.path: String` → `Query.path: Vec<u8>` (or `bstr::BString`) and propagate through the `Helper` trait + `git_helper::write_input`. Defer until the whole-codebase audit shakes out other non-UTF-8 path handling.
- Approve/reject async safety. A `git credential approve` failure is swallowed (best-effort). If we ever target a flaky keystore that needs retry, surface it.
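
The "path bytes vs UTF-8" fix amounts to a percent-decoder that returns bytes instead of a `String`. A minimal sketch, assuming a hypothetical `percent_decode_bytes` helper and assuming that malformed escapes (`%zz`, a trailing `%`) pass through literally rather than erroring:

```rust
/// Decode `%XX` escapes into raw bytes without forcing the result to UTF-8.
pub fn percent_decode_bytes(input: &str) -> Vec<u8> {
    fn hex(b: u8) -> Option<u8> {
        match b {
            b'0'..=b'9' => Some(b - b'0'),
            b'a'..=b'f' => Some(b - b'a' + 10),
            b'A'..=b'F' => Some(b - b'A' + 10),
            _ => None,
        }
    }

    let src = input.as_bytes();
    let mut out = Vec::with_capacity(src.len());
    let mut i = 0;
    while i < src.len() {
        // Only treat '%' as an escape when two hex digits follow it.
        if src[i] == b'%' && i + 2 < src.len() {
            if let (Some(hi), Some(lo)) = (hex(src[i + 1]), hex(src[i + 2])) {
                out.push((hi << 4) | lo);
                i += 3;
                continue;
            }
        }
        out.push(src[i]);
        i += 1;
    }
    out
}
```

Decoding `dir/%FF.bin` this way keeps the `0xFF` byte intact, whereas converting the same bytes with `String::from_utf8_lossy` substitutes `U+FFFD`, which is the divergence described above.
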
- `--json` action capture for non-dry-run. `--json` works for `--dry-run` (the batch runs, URLs captured, emitted as `actions`). For non-dry-run we currently emit transfers without action URLs; this needs the transfer queue to surface the batch response back to the caller.
- Progress events. v0 prints a one-line summary; we already have `Event::Progress` flowing through `transfer/`, just need a renderer (e.g. `indicatif`-based bar) wired up.
- End-to-end test against real `git push`. Our e2e tests drive pre-push directly with hand-built stdin. Worth a separate test that spawns `git push` against a wiremock-backed remote to catch hook invocation bugs (PATH, exit codes propagating), but real `git push` needs an SSH or HTTP git remote, so the setup is heavier.
- Action-URL expiry retry. `t-push.sh::push (retry with expired actions)`: server hands back an `expires_at` in the past, expecting the client to re-batch and pick up a fresh action URL on retry. We currently retry but don't re-issue the batch to refresh the URL. Shares plumbing with the `t-expired` suite.
- Don't read every tracked file. `pull` currently walks every tracked working-tree file and tries to parse it as a pointer (skipping anything ≥ `MAX_POINTER_SIZE`). Cheap enough for v0; for huge non-LFS repos we could intersect with `git ls-files :(attr:filter=lfs)` or query the scanner's HEAD-snapshot result first.
- Conflict / dirty working-tree handling. v0 happily overwrites any pointer-shaped file it can resolve from the store. Probably want a guard ("file has uncommitted edits → skip with warning") once users start trusting this in serious workflows.
- `--system` scope. Trivial: just another `ConfigScope` variant.
- `--worktree` scope. Requires git ≥ 2.20 and worktree-feature config.
- `--file <path>`. Write to an arbitrary config file.
- `--manual`. Print instructions instead of installing.
- `--skip-smudge`. Different filter set (smudge gets `--skip` flag, so pointers stay as pointers in the working tree).
- Upgradeable old hook contents. Upstream tracks several historical hook script versions and rewrites them silently. We require exact match with current content (or `--force`). Migrating users from upstream Go will hit the conflict path; mention this once we care about that audience.
- `--filename`. Escape glob characters in a literal filename so `[foo]bar.txt` matches the literal file rather than the glob. `t-track.sh::track: escaped glob pattern …` (×2) and the second invocation of `track: verbose logging` exercise it.
- `--no-modify-attrs`. Display-only mode that skips the `.gitattributes` write entirely (we already have `--dry-run`, which also skips the re-stage).
- Cwd-relative pattern normalization. When run from a subdirectory, upstream rewrites bare patterns relative to the repo root (so `cd a; git lfs track test.file` records `a/test.file`). We pass patterns through verbatim. `t-track.sh::track representation` covers this.
- `core.attributesfile` global gitattributes. `list_lfs_patterns` walks per-directory `.gitattributes` + `.git/info/attributes`, but doesn't read the file pointed at by `core.attributesfile`. `t-track.sh::track (global gitattributes)` covers this.
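
The `--filename` and cwd-relative items are both small string transforms. A rough sketch with hypothetical helper names; the exact set of characters upstream escapes and the exact form it writes to `.gitattributes` are not confirmed here, so the set below (`*`, `?`, `[`, `]`, `\`) is an assumption to check against the t-track expectations:

```rust
/// Backslash-escape glob metacharacters so the pattern matches the
/// literal filename. The character set is an assumption of this sketch.
pub fn escape_glob_literal(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    for ch in name.chars() {
        if matches!(ch, '*' | '?' | '[' | ']' | '\\') {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}

/// Rewrite a bare pattern relative to the repo root, given the current
/// directory's path relative to that root ("" when run at the root).
pub fn repo_relative_pattern(cwd_prefix: &str, pattern: &str) -> String {
    let prefix = cwd_prefix.trim_matches('/');
    if prefix.is_empty() {
        pattern.to_string()
    } else {
        format!("{prefix}/{pattern}")
    }
}
```
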
- Native `cargo test` port of the upstream `t-*.sh` suite. The current setup vendors upstream's Go helpers and runs the shell tests via `prove`. Long-term goal: rewrite as native Rust integration tests so `cargo test` runs them, no `make` step, no Go toolchain. Big undertaking (~100 test files, ~200 assertions); handle one test file at a time as we touch each command.
- Two upstream helpers excluded because they import internal upstream Go packages (`lfsapi`, `tools`, `config`): `lfstest-customadapter` and `lfstest-standalonecustomadapter`. Referenced only by `t-custom-transfers.sh` and `t-standalone-file.sh`; the rest of the suite doesn't need them.
- `lfstest-testutils` (the `addcommits` helper used by ~11 t-*.sh files for fixture-building) is reimplemented in Rust at `cli/src/bin/lfstest-testutils.rs`.
- `delay` capability. v0 handshake doesn't advertise it. Once `transfer/` exists, supporting delay lets us defer multiple smudges, batch the download, then return. Big checkout speedup; not required for correctness.
- `list_available_blobs` command. Pairs with `delay`.
- `--skip` flag. Pointer-passthrough mode for smudge (working tree keeps pointers literal). Useful for `git lfs install --skip-smudge` workflows.
- Pathname-based include/exclude filter (`lfs.fetchinclude` / `lfs.fetchexclude`). Lets users opt out of fetching certain large paths.
- Malformed-pointer accumulator + final stderr summary. Upstream prints an "Encountered N files that should have been pointers" report at end of session if any per-file `clean` / `smudge` calls hit malformed pointers.
- `--system` / `--worktree` / `--file`: only `--global` (default) and `--local` wired up so far. Mirrors the install gap.
- `uninstall hooks` subcommand: upstream exposes hook-only removal as a nested subcommand. We collapse into `--skip-repo` inversion, but a dedicated subcommand may be worth adding for parity.
- `escapeAttrPattern` / `unescapeAttrPattern` parity: upstream escapes `#`, spaces, and a handful of glob characters when comparing patterns, so `git lfs untrack 'foo bar.bin'` matches the escaped form written by `track`. We currently do exact-string match. Not an issue for typical patterns (`*.jpg`, `data/*.bin`); revisit if a test hits it.
- `locks --local` and `--cached`. Both rely on an on-disk lock cache upstream maintains under `.git/lfs/cache/locks/<remote>/`; we don't have that cache yet. Adding it is mostly a JSON-on-disk shim around `Client::list_locks` results.
- `unlock --force` path fallback. When `resolve_lock_path` fails (e.g. file is gone), we currently do a minimal `\\` → `/` + strip `./`. Upstream canonicalizes more carefully. Revisit if tests hit it.
- `--cached` / `--local` for `locks` (require an on-disk lock cache we don't have). Tracked alongside the rest of the cache work.
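
The minimal `unlock --force` fallback described above is roughly the following. `normalize_lock_path` is a hypothetical name, and stripping repeated leading `./` segments (rather than just one) is an assumption of this sketch.

```rust
/// Minimal path cleanup for when the file cannot be resolved on disk:
/// convert backslashes to forward slashes, then drop leading "./".
pub fn normalize_lock_path(raw: &str) -> String {
    let slashed = raw.replace('\\', "/");
    let mut rest = slashed.as_str();
    while let Some(stripped) = rest.strip_prefix("./") {
        rest = stripped;
    }
    rest.to_string()
}
```

Interior `.` and `..` segments are left alone, which is part of what "upstream canonicalizes more carefully" would have to cover.
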
- `--include` / `--exclude` path filters. Upstream filters output by working-tree pattern. Builds on a filepathfilter-style glob matcher we haven't ported yet (see also `cli fetch`).
- Two-ref range form: `git lfs ls-files <a> <b>` walks pointers added between two refs. Maps onto `rev_list(include=[b], exclude=[a])` but the CLI parsing must distinguish "second arg is a ref" from "second arg is a path".
- Trimmed output fields. Upstream emits `LocalGitStorageDir`, `LocalReferenceDirs`, `ConcurrentTransfers`, `TusTransfers`, `BasicTransfersOnly`, `SkipDownloadErrors`, `FetchRecentAlways`, `FetchRecentRefsDays`, `FetchRecentCommitsDays`, `FetchRecentRemoteRefs`, `PruneOffsetDays`, `PruneVerifyRemoteAlways`, `PruneRemoteName`, `LfsExtensions`, `GitProtocol`, …. We skip these for now because most refer to config knobs we don't honor yet; adding stubs would lie. Add each as the corresponding feature lands.
- `auth=<mode>` annotation. Upstream prints `Endpoint=… (auth=basic)` / `(auth=none)` / etc. We don't track access mode per endpoint.
- `--help` content. Upstream's `env` is also where users go to copy a bug report. We could format ours as a fenced markdown block for paste-friendliness once the surface stabilizes.
- "Objects to be pushed to /" section. Upstream prefixes its output with the LFS pointers reachable from HEAD but not the upstream tracking ref. Skipped for v0 because it requires resolving the upstream tracking ref + a separate `scan_pointers` range walk per invocation. Useful but not core.
- Symlinked working dir. Upstream resolves symlinks in `cwd` before computing relative paths so the displayed paths look right when the user `cd`'d via a symlink. We just print repo-relative paths.

All three phases shipped: info, import, export. Subprocess
plumbing (fast-export → transform → fast-import + working-tree
refresh + dirty-tree refusal) lives in migrate/pipeline.rs so
import and export share it.
Phase 1 deferrals (info):
- `--include-ref` / `--exclude-ref`. v0 only honors positional branch args + `--everything`. Append-style refspec flags are a small follow-on; left out so the first cut keeps the CLI surface tight.
- `--unit <unit>`. v0 always prints with auto-scaling KB/MB/GB.
- `--object-map`. Records old→new commit SHAs.

Phase 2 deferrals (import):
- First-commit-wins for shared blobs. If the same blob OID appears at two paths with conflicting filter outcomes, the first commit's decision wins. Real-world impact is low (typical filters either match or don't match by extension) but documented for clarity.
- In-memory blob buffering. `--full-tree` emits every blob before any commit; we buffer them all in RAM until commits drain them. Massive repos may hit memory pressure. v2 fix: a streaming convert that decides without knowing the path.
- No automatic ref backup. We print pre-migrate ref SHAs so the user can roll back manually. Upstream doesn't auto-backup either.
- `--object-map <file>`. Same gap as info: emit old→new SHA mapping for downstream tooling.
- `--verbose` per-commit progress. v0 prints a one-line summary.
- Working-copy-clean prompt. v0 errors out on a dirty tree; upstream prompts. The friendly prompt requires TTY interaction.
- Pattern accumulation timing. Patterns visible to commit N reflect only what was discovered in commits ≤ N (matches upstream). An ambitious v2 could two-pass the stream so every commit's `.gitattributes` shows the full eventual pattern set.

Phase 3 deferrals (export):
- Pre-download missing objects. Upstream's `migrate export` runs a download queue against the configured remote first, so any pointer whose object isn't local gets fetched before the rewrite. We skip this: pointers without local content pass through unchanged (no truncation), and the user is expected to `git lfs fetch` first if they care.
- `--remote <name>`. Picks which remote to pre-download from. Tied to the deferral above.
- Post-export `prune`. Upstream prunes the now-orphaned LFS objects automatically; ours leaves them, and running `git lfs prune` manually does the job.
- First-reference-wins. Same caveat as import: if the same git blob OID lives at two paths with different filter outcomes, the first-encountered M directive's path decides.
- Diff-tree optimization. All three hooks currently call `enforce_workdir`, which `git ls-files`-walks the entire index and chmods every lockable match. Upstream optimizes by diffing the before/after tree (post-checkout/post-merge) or the index (post-commit) and only re-stating changed paths. Worth doing once we hit a large-repo perf complaint; correctness is the same either way.
- `--to <path> [--ours|--theirs|--base]` conflict-resolution form. Used during merges to extract one stage of a conflicted LFS file. Needs index-stage parsing (`git ls-files -s` reports stage 1/2/3 for conflicted entries, plus the blob shas at each stage). v0 only ships the bulk re-smudge mode.
- Glob / wildcard path patterns. v0 supports exact paths and trailing-slash directory prefixes only. Shells handle `*.bin` and `data/*.bin` for the common case (expanded against cwd before invocation), so the gap mostly bites recursive globs and patterns intended to match files that aren't in the user's cwd.
- Progress meter. Upstream emits a TQ-style "checking out N files" meter. We just print a one-line summary at the end.
- `filepathfilter` parity. Upstream uses gitignore-syntax matching (negative patterns, comments, escapes). v0's matcher is straight literal/prefix. When wiring this up, reach for `globset` (compile patterns, match strings); `ignore` is overkill for our use case because we don't need its directory walker or hierarchical `.gitignore` traversal.
- `<a>..<b>` range form. Upstream parses a single arg as either a ref (e.g. `HEAD`) or a range (e.g. `main..HEAD`); we only accept a single ref. Wire the splitter once we have a range parser worth reusing.
- Index scanning when invoked bare. With no args, upstream scans HEAD's history and the index (so newly-staged-but-uncommitted pointers fail fsck if their object isn't in the store). We only scan the named ref's history. Implementation: pair our scan with a `git ls-files -s` index walk.
- `unexpectedGitObject` detection. Upstream's `--pointers` mode flags blobs that should be pointers (per `.gitattributes`) but don't parse. Shipped: fsck loads `AttrSet::from_workdir`, walks every blob via `scan_tree_blobs`, and flags any LFS-tracked path whose blob fails to parse as a canonical pointer (or is too big).
- `lfs.fetchexclude` honor. Skip pointers whose paths match the configured exclude pattern, otherwise users who fetched a subset see false-positive "missing" reports.
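
The range splitter for the `<a>..<b>` form could start as small as the sketch below. `RevSpec` and `parse_rev_spec` are hypothetical names. Two behaviors are assumptions rather than confirmed upstream behavior: an omitted side defaults to `HEAD` (as in git's own revision syntax), and the three-dot symmetric form is rejected.

```rust
/// A single ref, or a two-dot range mapped onto include/exclude sets.
#[derive(Debug, PartialEq, Eq)]
pub enum RevSpec {
    Single(String),
    Range { exclude: String, include: String },
}

pub fn parse_rev_spec(arg: &str) -> Result<RevSpec, String> {
    if arg.is_empty() {
        return Err("empty revision".to_string());
    }
    // Check the three-dot form first; otherwise "a...b" would split as
    // ("a", ".b") on the two-dot separator.
    if arg.contains("...") {
        return Err(format!("symmetric ranges are not supported: {arg}"));
    }
    match arg.split_once("..") {
        None => Ok(RevSpec::Single(arg.to_string())),
        Some((from, to)) => {
            // Assumption: an omitted side means HEAD, as in git itself.
            let or_head = |s: &str| if s.is_empty() { "HEAD".to_string() } else { s.to_string() };
            Ok(RevSpec::Range { exclude: or_head(from), include: or_head(to) })
        }
    }
}
```

A `Range` maps directly onto the same include/exclude call shape noted for the ls-files two-ref form, so both commands could share this parser.
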
- Hook-conflict UI. When a custom hook exists, upstream prints `Hook already exists: pre-push\n\n\t<contents>\n\nTo resolve …` with the merge / `--force` / `--manual` advisory. We currently surface the install-error message inline. Owns t-update test 1.
- Leading-space hook migration. Upstream rewrites old templates whose body lines have leading TAB characters (the pre-2.6 form); ours treats those as a custom hook and refuses. Owns t-update test 2.
- `lfs.<url>.access` migration. Upstream rewrites `private` → `basic` and prunes invalid values during `update`. Tracked but no test currently asserts it after our 0.3 cleanups (t-update test 3 was a no-op assertion).
- `--manual` mode. Print the install-by-hand instructions instead of writing the hook files.
- Compare via `git hash-object`. Upstream computes git blob OIDs for both pointer texts and compares those. We compare raw byte equality of our canonical encoding against the supplied bytes, which is semantically identical for any real input but a small fidelity gap worth flagging.
- Remaining commands: `dedup`, `standalone-file`, `update`. All niche; mostly polish.