Input::getAccessor(): Try substitution before fetching#380
Conversation
📝 WalkthroughWalkthroughAdds a cache-first fast path and a Changes
Sequence Diagram(s)sequenceDiagram
participant Caller as Caller
participant Fetchers as Fetcher System
participant Scheme as Input Scheme (git/github/path/tarball)
participant Store as Store/Cache
Caller->>Fetchers: getAccessor(input, fastOnly=true)
Fetchers->>Scheme: getAccessor(settings, store, input, fastOnly=true)
alt Scheme returns accessor (fast/cache hit)
Scheme-->>Fetchers: {{accessor, input}}
Fetchers->>Fetchers: fixupAccessor (fingerprint/provenance)
Fetchers-->>Caller: accessor (fast result)
else Scheme returns nullopt (fast miss)
Scheme-->>Fetchers: nullopt
Fetchers->>Store: attempt substitution (makeStoreAccessor)
alt substitution succeeds
Store-->>Fetchers: substituted accessor
Fetchers->>Fetchers: fixupAccessor
Fetchers-->>Caller: substituted accessor
else substitution fails
Fetchers->>Scheme: getAccessor(settings, store, input, fastOnly=false)
Scheme-->>Fetchers: {{accessor, input}} (full fetch)
Fetchers->>Fetchers: fixupAccessor
Fetchers-->>Caller: fetched accessor
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
0c1d355 to
83c7503
Compare
Substitution is generally faster. However, in case the underlying fetcher already has the input locally (e.g. in the tarball/Git cache, or if it's a local Git repo), we don't want to substitute. So InputScheme::getAccessor() now has a `fastOnly` argument to return an accessor only if it doesn't involve any slow, remote fetching. So Input::getAccessor() now does the following: 1. InputScheme::getAccessor(fastOnly=true) 2. Try substitution 3. InputScheme::getAccessor(fastOnly=false)
83c7503 to
a6af221
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/libfetchers/github.cc (1)
282-287:⚠️ Potential issue | 🟠 Major
fastOnlystill performs remote ref resolution.When
revis absent,getRevFromRef()runs before thefastOnlyreturn at Line 310. That means the "fast" probe can still hit the forge API and then repeat the same lookup on the slow path. Returnstd::nulloptbefore resolving floating refs here, or consult only a local ref-to-rev cache.Also applies to: 310-311
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libfetchers/github.cc` around lines 282 - 287, The fast-only probe is doing remote ref resolution because when input.getRev() is empty you call getRevFromRef(settings, store, input) (setting rev and upstreamTreeHash) before checking the fastOnly flag; move the fastOnly early-return so you return std::nullopt (or consult a local ref-to-rev cache) immediately when fastOnly is set and rev is not provided, instead of invoking getRevFromRef(); update the logic around input.getRev(), getRevFromRef(), and the fastOnly check so the fast path never calls the forge API (also apply the same change to the similar code at the 310-311 site).src/libfetchers/git.cc (1)
859-862:⚠️ Potential issue | 🟠 MajorDefer default-HEAD resolution in
fastOnlymode.For remote inputs without an explicit
ref,getDefaultRef()can invokereadHeadCached(), which falls back togit ls-remote, before thefastOnlyexit at Line 921. That reintroduces network I/O into the local-only phase and can repeat the lookup on the slow path. InfastOnlymode this should use cached HEAD metadata only, or returnstd::nulloptand let the slow path refresh it.Also applies to: 920-923
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libfetchers/git.cc` around lines 859 - 862, When running in fastOnly mode and the input has no explicit ref (input.getRef() is empty), avoid calling getDefaultRef(repoInfo, shallow) because it may call readHeadCached() which can fall back to network I/O; instead, if fastOnly is true and the repo is remote, do not compute/insert a default "ref" (leave input.attrs without "ref" or set it to nullopt) so the slow path can resolve HEAD later. Update the block that currently does auto ref = originalRef ? *originalRef : getDefaultRef(...) and the similar logic at the later site (around the fastOnly exit) to short-circuit: if fastOnly && !originalRef then skip getDefaultRef/readHeadCached and preserve the absence of ref. Ensure you reference input.getRef(), canDoShallow(), getDefaultRef(...), readHeadCached(), input.attrs.insert_or_assign("ref", ...) and the fastOnly flag when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/libfetchers/fetchers.cc`:
- Around line 371-375: The try/catch around the fast-path accessor lookup in
fetchers.cc is swallowing all exceptions from scheme->getAccessor and hiding
real failures; remove the catch-all so exceptions propagate, or replace it with
a narrow catch for the specific cache-miss/optional-empty signal returned by
getAccessor (i.e., only handle the explicit "no accessor" case via the
std::nullopt/return path), and otherwise let errors from scheme->getAccessor
propagate so fixupAccessor(res->first, std::move(res->second)) is only reached
when a valid accessor is returned.
In `@src/libfetchers/include/nix/fetchers/fetchers.hh`:
- Around line 243-255: The two getAccessor overloads are mutually recursive and
can recurse forever if neither is overridden; change the non-fast overload to
fail explicitly instead of calling the bool overload. Replace the body of
std::pair<ref<SourceAccessor>, Input> getAccessor(const Settings&, Store&, const
Input&) to throw a clear exception (e.g. std::runtime_error("getAccessor not
implemented for this scheme")) or call std::terminate, so missing
implementations are detected deterministically; keep the bool fastOnly overload
returning std::nullopt as the default behavior. Ensure the thrown message
references getAccessor so it's easy to locate in logs.
---
Outside diff comments:
In `@src/libfetchers/git.cc`:
- Around line 859-862: When running in fastOnly mode and the input has no
explicit ref (input.getRef() is empty), avoid calling getDefaultRef(repoInfo,
shallow) because it may call readHeadCached() which can fall back to network
I/O; instead, if fastOnly is true and the repo is remote, do not compute/insert
a default "ref" (leave input.attrs without "ref" or set it to nullopt) so the
slow path can resolve HEAD later. Update the block that currently does auto ref
= originalRef ? *originalRef : getDefaultRef(...) and the similar logic at the
later site (around the fastOnly exit) to short-circuit: if fastOnly &&
!originalRef then skip getDefaultRef/readHeadCached and preserve the absence of
ref. Ensure you reference input.getRef(), canDoShallow(), getDefaultRef(...),
readHeadCached(), input.attrs.insert_or_assign("ref", ...) and the fastOnly flag
when making the change.
In `@src/libfetchers/github.cc`:
- Around line 282-287: The fast-only probe is doing remote ref resolution
because when input.getRev() is empty you call getRevFromRef(settings, store,
input) (setting rev and upstreamTreeHash) before checking the fastOnly flag;
move the fastOnly early-return so you return std::nullopt (or consult a local
ref-to-rev cache) immediately when fastOnly is set and rev is not provided,
instead of invoking getRevFromRef(); update the logic around input.getRev(),
getRevFromRef(), and the fastOnly check so the fast path never calls the forge
API (also apply the same change to the similar code at the 310-311 site).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a52ffb75-7edb-4994-b956-b4677ddc3ea2
📒 Files selected for processing (4)
src/libfetchers/fetchers.ccsrc/libfetchers/git.ccsrc/libfetchers/github.ccsrc/libfetchers/include/nix/fetchers/fetchers.hh
There was a problem hiding this comment.
♻️ Duplicate comments (3)
src/libfetchers/include/nix/fetchers/fetchers.hh (1)
243-255:⚠️ Potential issue | 🟠 MajorMutual recursion between default overloads can cause infinite recursion.
If a derived
InputSchemeoverrides neithergetAccessoroverload:
getAccessor(settings, store, input, false)callsgetAccessor(settings, store, input)(line 248)getAccessor(settings, store, input)callsgetAccessor(settings, store, input, false).value()(line 254)This results in infinite recursion and a stack overflow. Break the cycle by having the 3-parameter overload throw an explicit error instead of delegating.
,
🔧 Proposed fix
virtual std::pair<ref<SourceAccessor>, Input> getAccessor(const Settings & settings, Store & store, const Input & input) const { - return getAccessor(settings, store, input, false).value(); + throw Error("getAccessor not implemented for input scheme '%s'", schemeName()); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libfetchers/include/nix/fetchers/fetchers.hh` around lines 243 - 255, The two default overloads of getAccessor are mutually recursive and can infinite-loop if a derived InputScheme overrides neither; change the 3-parameter overload getAccessor(const Settings&, Store&, const Input&) to not call the 4-parameter version but instead throw an explicit exception (e.g., std::runtime_error) with a clear message that this overload must be implemented by the derived class (or that fastOnly behavior is unsupported), so callers get a deterministic failure instead of stack overflow; update the function body of getAccessor(const Settings&, Store&, const Input&) accordingly and keep the existing 4-parameter default that returns std::nullopt for fastOnly.src/libfetchers/fetchers.cc (2)
370-375:⚠️ Potential issue | 🟠 MajorDon't swallow all exceptions from the fast-path.
The
catch (...) {}block silently hides any exception from the fast-path accessor lookup. Sincestd::nulloptalready signals "no cached accessor available," real errors (permission failures, corrupt cache, OOM, etc.) should propagate rather than being silently ignored.,
🔧 Proposed fix — remove the try/catch or catch only expected signals
/* See if the input is in the fetcher cache. */ - try { - if (auto res = scheme->getAccessor(settings, store, *this, true)) - return fixupAccessor(res->first, std::move(res->second)); - } catch (...) { - } + if (auto res = scheme->getAccessor(settings, store, *this, true)) + return fixupAccessor(res->first, std::move(res->second));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libfetchers/fetchers.cc` around lines 370 - 375, The empty catch-all is hiding real errors from the fast-path accessor lookup; remove the surrounding try { ... } catch (...) { } around the call to scheme->getAccessor(settings, store, *this, true) so that genuine exceptions propagate, or replace it with a narrow catch that only handles the specific expected, non-fatal signal (e.g., a custom "not found" exception) if one exists; ensure the code still returns std::nullopt when getAccessor returns no cached accessor and that fixupAccessor(res->first, std::move(res->second)) is unchanged.
384-387:⚠️ Potential issue | 🟡 MinorBare
catch (...)also swallows unexpected exceptions.While catching
Error &for substitution failures is reasonable, the subsequentcatch (...) {}can hide non-Errorexceptions (e.g.,std::bad_alloc). Consider removing or logging it.🔧 Proposed fix
catch (Error & e2) { debug("substitution of input '%s' failed: %s", to_string(), e2.info().msg); - } catch (...) { }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libfetchers/fetchers.cc` around lines 384 - 387, The bare catch(...) after the Error & e2 handler swallows non-Error exceptions (e.g., std::bad_alloc); in the function containing the substitution code that calls to_string() and uses debug(...) with Error & e2, remove the empty catch(...) or replace it with a handler that logs the unexpected exception (using debug or another logger) and then rethrows (or std::terminate) so fatal errors aren't silently ignored; ensure the Error & e2 block remains to handle expected substitution failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/libfetchers/fetchers.cc`:
- Around line 370-375: The empty catch-all is hiding real errors from the
fast-path accessor lookup; remove the surrounding try { ... } catch (...) { }
around the call to scheme->getAccessor(settings, store, *this, true) so that
genuine exceptions propagate, or replace it with a narrow catch that only
handles the specific expected, non-fatal signal (e.g., a custom "not found"
exception) if one exists; ensure the code still returns std::nullopt when
getAccessor returns no cached accessor and that fixupAccessor(res->first,
std::move(res->second)) is unchanged.
- Around line 384-387: The bare catch(...) after the Error & e2 handler swallows
non-Error exceptions (e.g., std::bad_alloc); in the function containing the
substitution code that calls to_string() and uses debug(...) with Error & e2,
remove the empty catch(...) or replace it with a handler that logs the
unexpected exception (using debug or another logger) and then rethrows (or
std::terminate) so fatal errors aren't silently ignored; ensure the Error & e2
block remains to handle expected substitution failures.
In `@src/libfetchers/include/nix/fetchers/fetchers.hh`:
- Around line 243-255: The two default overloads of getAccessor are mutually
recursive and can infinite-loop if a derived InputScheme overrides neither;
change the 3-parameter overload getAccessor(const Settings&, Store&, const
Input&) to not call the 4-parameter version but instead throw an explicit
exception (e.g., std::runtime_error) with a clear message that this overload
must be implemented by the derived class (or that fastOnly behavior is
unsupported), so callers get a deterministic failure instead of stack overflow;
update the function body of getAccessor(const Settings&, Store&, const Input&)
accordingly and keep the existing 4-parameter default that returns std::nullopt
for fastOnly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 234a975e-7aa4-4d44-a8c8-28281172b695
📒 Files selected for processing (5)
src/libfetchers/fetchers.ccsrc/libfetchers/git.ccsrc/libfetchers/github.ccsrc/libfetchers/include/nix/fetchers/fetchers.hhsrc/libfetchers/path.cc
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/libfetchers/fetchers.cc (1)
371-375:⚠️ Potential issue | 🟠 MajorDo not hide real fast-path failures.
std::nulloptalready covers the "no local accessor" case. Catching...here also suppresses genuine fast-path errors and can silently fall through to substitution or a remote fetch instead of surfacing the real failure.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/libfetchers/fetchers.cc` around lines 371 - 375, The catch-all block around the fast-path call masks real errors: remove the try { ... } catch(...) { } wrapper so exceptions from scheme->getAccessor(...) and fixupAccessor(...) propagate instead of being swallowed; keep the existing optional handling (std::nullopt) to represent "no local accessor" and only add targeted exception handling if you intend to handle a specific, documented exception type from getAccessor or fixupAccessor.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/libfetchers/fetchers.cc`:
- Around line 371-375: The catch-all block around the fast-path call masks real
errors: remove the try { ... } catch(...) { } wrapper so exceptions from
scheme->getAccessor(...) and fixupAccessor(...) propagate instead of being
swallowed; keep the existing optional handling (std::nullopt) to represent "no
local accessor" and only add targeted exception handling if you intend to handle
a specific, documented exception type from getAccessor or fixupAccessor.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f8fc8252-99be-4c44-8015-b49bdaa00cb4
📒 Files selected for processing (2)
src/libfetchers/fetchers.ccsrc/libfetchers/tarball.cc
Motivation
Substitution is generally faster than fetching. However, in case the underlying fetcher already has the input locally (e.g. in the tarball/Git cache, or if it's a local Git repo), we don't want to substitute. So
InputScheme::getAccessor()now has afastOnlyargument to return an accessor only if it doesn't involve any slow, remote fetching.So
Input::getAccessor()now does the following:InputScheme::getAccessor(fastOnly=true)InputScheme::getAccessor(fastOnly=false)Context
Summary by CodeRabbit
Improvements
Refactor