fix(build): stop build-time projection from caching an empty corpus on Windows #11

Merged
phil-scott-78 merged 1 commit into main from fix/build-empty-search-llms-startup-ordering
May 29, 2026
@phil-scott-78

Problem

On Windows, dotnet run -- build could produce a site whose search index and llms.txt are empty even though every HTML page rendered correctly, and the build still reported success. Every route logged "SiteProjection: failed to project /<route>/, skipping" with "InvalidOperationException: The server has not been started or no web application was configured." Reproduced on a vanilla AddDocSite project; the same build works on Linux CI and in dev/serve mode.

Root cause (instrumented + confirmed)

AuditRunner.StartAsync (a hosted service) launches its initial audit pass fire-and-forget. In build mode that pass resolves LinkAuditor and self-fetches every route through ISiteProjectionRenderedHtmlFetcher → HttpDispatcher.CreateClient(). Startup instrumentation showed this runs before the in-process TestServer has started (its Application is still null), so CreateHandler() throws.

SiteProjection.RenderOneAsync's broad catch treated that infrastructure failure identically to a per-page content error: it returned null, SeedAsync completed with an empty corpus, and AsyncLazy (which only evicts on fault) cached the poison. The search/llms emitters then consumed the empty projection. Linux/dev win the startup race (server starts first); Windows loses it deterministically.
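
The caching behavior described above can be sketched in isolation. This is a minimal model under stated assumptions, not the project's code: EvictOnFaultLazy is a hypothetical stand-in for AsyncLazy, and Fetch and SeedAsync are simplified placeholders for the self-fetch and seed steps. It shows why a swallowed failure gets cached as a "successful" empty corpus while a propagated failure is evicted and retried.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for AsyncLazy: caches the task and clears the
// cache only when the factory faults.
public sealed class EvictOnFaultLazy<T>
{
    private readonly Func<Task<T>> _factory;
    private readonly object _gate = new();
    private Task<T>? _cached;

    public EvictOnFaultLazy(Func<Task<T>> factory) => _factory = factory;

    public Task<T> GetAsync()
    {
        lock (_gate) { return _cached ??= RunAsync(); }
    }

    private async Task<T> RunAsync()
    {
        try
        {
            // Yield first so the task is stored in _cached before the factory runs.
            await Task.Yield();
            return await _factory();
        }
        catch
        {
            lock (_gate) { _cached = null; } // evict so the next call retries
            throw;
        }
    }
}

public static class PoisonDemo
{
    public static async Task<(int poisoned, bool faulted, int retried)> RunAsync()
    {
        var serverUp = false;

        // Simulated self-fetch: fails while the server is down.
        Task<string> Fetch(string route) => serverUp
            ? Task.FromResult($"<html>{route}</html>")
            : Task.FromException<string>(
                new InvalidOperationException("The server has not been started."));

        // Simulated seed: with swallow=true the failure becomes a per-page skip.
        async Task<List<string>> SeedAsync(bool swallow)
        {
            var pages = new List<string>();
            foreach (var route in new[] { "/a/", "/b/" })
            {
                try { pages.Add(await Fetch(route)); }
                catch (InvalidOperationException) when (swallow) { /* skipped */ }
            }
            return pages;
        }

        var swallowing = new EvictOnFaultLazy<List<string>>(() => SeedAsync(swallow: true));
        var failing = new EvictOnFaultLazy<List<string>>(() => SeedAsync(swallow: false));

        await swallowing.GetAsync();              // completes "successfully" with 0 pages
        var faulted = false;
        try { await failing.GetAsync(); }
        catch (InvalidOperationException) { faulted = true; }

        serverUp = true;                          // the server comes up afterwards
        var poisoned = (await swallowing.GetAsync()).Count; // cached empty list is reused
        var retried = (await failing.GetAsync()).Count;     // evicted, so the seed reruns
        return (poisoned, faulted, retried);
    }
}
```

In this model the swallowing path never recovers even after the server is up, because nothing ever faulted and so nothing was evicted.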

Fix

  1. Ordering: AuditRunner runs its initial pass on IHostApplicationLifetime.ApplicationStarted instead of inside StartAsync. That fires only after every hosted service (the web server included) has started, so the self-fetch never races server start. Build-time link-audit coverage is preserved.
  2. Fail loud, never cache poison: added SelfFetchUnavailableException; HttpDispatcher.CreateClient throws it for both the un-started-TestServer and no-Kestrel-address cases. RenderOneAsync deliberately does not catch it (catch (Exception ex) when (ex is not SelfFetchUnavailableException)), so an infra failure faults SeedAsync, AsyncLazy evicts, and the next access retries instead of baking an empty corpus.
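
The ordering half of the fix relies on ApplicationStarted being a CancellationToken that the host triggers once startup has completed. A rough sketch of that mechanism, using only the base class library: FakeLifetime is a hypothetical stand-in for IHostApplicationLifetime, and the log strings are illustrative rather than taken from the project.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for IHostApplicationLifetime. The real interface also
// exposes ApplicationStarted as a CancellationToken.
public sealed class FakeLifetime
{
    private readonly CancellationTokenSource _started = new();
    public CancellationToken ApplicationStarted => _started.Token;

    // The host would call this after all hosted services have started.
    public void NotifyStarted() => _started.Cancel();
}

public static class OrderingDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        var lifetime = new FakeLifetime();
        var serverStarted = false;

        // Simulated AuditRunner.StartAsync: register the initial pass
        // instead of running it immediately.
        lifetime.ApplicationStarted.Register(() =>
            log.Add(serverStarted ? "audit: server ready" : "audit: server NOT ready"));
        log.Add("audit runner started");

        // The web server starts later in the startup sequence.
        serverStarted = true;
        log.Add("server started");

        // The host signals that startup is complete; the callback runs here.
        lifetime.NotifyStarted();
        return log;
    }
}
```

Because the callback is deferred until the token fires, the audit pass observes a started server regardless of the order in which the hosted services themselves were started.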

The report also suggested an "abort the build if 0-of-N routes project" guardrail; intentionally skipped to keep the change surgical, since fixes 1 and 2 above already make this failure mode either work or fail loud.

Verification

  • Full solution builds clean; 880 unit + 48 integration tests pass (integration suite exercises the real TestServer self-fetch path).
  • DocSiteScaffoldExample build: search index n=2 with both docs, llms.txt lists both pages — 3/3 runs (previously a deterministic empty build on Windows).
  • BareHostSearchExample (~100-doc corpus): 85 pages, 484 heading-level search docs.

Tests added/updated

  • HttpDispatcherTests (new) — un-started TestServer and no-address server both surface SelfFetchUnavailableException.
  • SiteProjectionTests — infra failure during seed faults rather than caching empty, then retries to a populated corpus once the server is up.
  • AuditRunnerTests — updated for the ApplicationStarted gate.

…n Windows

On Windows, `dotnet run -- build` could ship an empty search index and
header-only llms.txt while reporting success. AuditRunner.StartAsync fired its
initial pass fire-and-forget; in build mode that pass self-fetched every route
through the SiteProjection before the in-process TestServer had started. The
self-fetch threw "server has not been started", RenderOneAsync swallowed it as a
per-page skip, SeedAsync completed with an empty corpus, and AsyncLazy cached the
poison that the search/llms emitters then consumed. Linux/dev won the startup
race; Windows lost it deterministically.

Two fixes:

1. Ordering: AuditRunner now runs its initial pass on
   IHostApplicationLifetime.ApplicationStarted instead of inside StartAsync, so
   the web server is guaranteed up before any self-fetch. Preserves build-time
   link-audit coverage.

2. Fail loud: add SelfFetchUnavailableException; HttpDispatcher.CreateClient
   throws it when the TestServer isn't started or Kestrel has no address.
   SiteProjection.RenderOneAsync no longer catches it, so an infrastructure
   failure faults SeedAsync (AsyncLazy evicts and retries) rather than caching an
   empty corpus as if the crawl had completed.
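
The fail-loud behavior in item 2 comes down to a C# exception filter. A minimal sketch, assuming a simplified RenderOneAsync that takes the fetch as a delegate: the exception name and the filter expression come from this PR, while the exception's constructor and the method signature here are assumptions.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;

// Name taken from the PR; this minimal definition is an assumption.
public sealed class SelfFetchUnavailableException : Exception
{
    public SelfFetchUnavailableException(string message) : base(message) { }
}

public static class FilterDemo
{
    // Hypothetical simplified RenderOneAsync: null means "skip this page".
    public static async Task<string?> RenderOneAsync(Func<Task<string>> fetch)
    {
        try
        {
            return await fetch();
        }
        catch (Exception ex) when (ex is not SelfFetchUnavailableException)
        {
            // Per-page content error: skip the page and let the crawl continue.
            return null;
        }
        // SelfFetchUnavailableException is not caught, so it faults the caller.
    }
}
```

With this shape, a content error such as a FormatException yields null for that page, while SelfFetchUnavailableException escapes to the seed and faults it.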

Tests: HttpDispatcherTests pins the wrapped-exception behavior; SiteProjection
faults-then-retries to a populated corpus instead of caching empty; AuditRunner
tests updated for the ApplicationStarted gate.
phil-scott-78 merged commit ba78518 into main on May 29, 2026
5 checks passed
phil-scott-78 deleted the fix/build-empty-search-llms-startup-ordering branch on May 29, 2026 14:05