Internal links on the published docs site break regularly. Root cause: the docs are 13 separately-built Astro/Starlight sites (docs/ + 12 lexicons) that get stitched into one tree at .docs-dist/chant/ by scripts/build-docs.sh. Each Starlight build only validates intra-site slug links — cross-site links written as raw paths (link: '/lexicons/k8s/gke-composites/' in docs/astro.config.mjs, hand-written [text](/chant/foo) in MDX) are opaque href strings to the per-site builds. Nothing catches typos, renames, or base-prefix mistakes until users hit a 404 in prod.
The unified .docs-dist/chant/ tree is the only place where cross-site links resolve against real files, so any validation must run against that assembled output — not against individual lexicon builds.
Part 1: Link checker (done)
just docs-check-links runs lychee in offline mode against the unified output. Lychee crawls every emitted HTML file and resolves both relative and root-relative links against .docs-dist/, matching how GitHub Pages serves the site. Asset references (CSS/JS/fonts/etc.) and Starlight's pagefind/ index are excluded — focus is on page-to-page navigation.
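The resolution rule described above can be sketched in a few lines. This is not lychee's implementation: the function name `candidateFiles`, the dummy origin, and the fallback order (`foo`, `foo.html`, `foo/index.html`) are assumptions made for illustration. Paths are relative to the assembled output root (.docs-dist/).

```typescript
// Sketch only: maps an href found in an emitted page to the files that would
// have to exist under the output root for the link to resolve.
// The fallback order is an assumption, not lychee's or GitHub Pages' exact rule.
export function candidateFiles(pagePath: string, href: string): string[] {
  // Resolve against a dummy origin so the URL parser handles "..", "?" and "#".
  const origin = "http://docs.invalid/";
  const resolved = new URL(href, origin + pagePath);

  // External links are out of scope for an offline check.
  if (resolved.host !== "docs.invalid") return [];

  const path = decodeURIComponent(resolved.pathname).replace(/^\/+/, "");
  if (path === "" || path.endsWith("/")) return [path + "index.html"];
  return [path, path + ".html", path + "/index.html"];
}
```

A checker built on this would flag an href as broken when none of the returned candidates exists on disk. For a page at chant/guide/intro/index.html, a root-relative /api/foo/ maps to api/foo/index.html, which sits outside chant/ and is why such links fail against the unified tree.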
Part 2: Systemic base-prefix bug (root cause of most breakage)
First lychee run reported ~603 broken links. Initial guess (lexicon sidebars missing the /chant/ base prefix) was wrong. Audit of docs/src/content/docs/** reveals the actual breakdown:
- 485 broken root-relative links across 120 source files (the dominant class)
  - 483 are /api/* references in docs/src/content/docs/api/*, auto-generated by TypeDoc on every build, so hand-fixes are useless
  - 2 are hand-written /lexicons/* links: the Introduction page (getting-started/introduction.mdx:16, 11 lexicon links on one line) and lexicon-authoring/observation.mdx:260
- 1 is a hardcoded literal in packages/core/src/codegen/docs-sections.ts:153 that gets embedded in every generated lexicon overview page ([Serialization](/serialization/output-formats))
- 262 links already use the correct /chant/... form elsewhere; style is mixed across the codebase today
- The remaining ~115 of the 603 lychee errors are cross-doc bugs unrelated to base-prefixing (e.g. chant/guide/multi-lexicon from two tutorials, mis-prefixed temporal links, the temporal ops/worker-profiles confusion)
Root cause: the classic Astro/Starlight gotcha — root-relative markdown links inside .md/.mdx content do not get the configured base: '/chant' prepended at build time. Astro only base-prefixes its own internal navigation (sidebar link: entries, Starlight slug: entries) — never link hrefs that appear inside markdown body content. The TypeDoc-generated API docs and any hand-written [foo](/bar/) link both fall through this gap.
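To make the gap concrete, here is a small sketch of how a browser resolves such an href once the site is served under /chant. The host name and page URL are hypothetical, and `resolvedPath` and `escapesBase` are illustrative helpers, not part of the codebase.

```typescript
// Hypothetical published page; only the /chant base comes from the source.
const PAGE_URL = "https://example.github.io/chant/guide/intro/";

// Standard URL resolution: a root-relative href replaces the whole path,
// so nothing re-adds the site's base.
export function resolvedPath(href: string, pageUrl: string = PAGE_URL): string {
  return new URL(href, pageUrl).pathname;
}

// True when the resolved link lands outside the configured base.
export function escapesBase(href: string, base: string, pageUrl: string = PAGE_URL): boolean {
  const path = resolvedPath(href, pageUrl);
  return !(path === base || path.startsWith(base + "/"));
}
```

Under these assumptions, `/api/foo/` resolves to `/api/foo/` and escapes the base, while `/chant/api/foo/` and the relative `../other/` both stay inside `/chant`.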
Fix: shared rehype-base-url plugin
Add a small rehype plugin (packages/core/src/codegen/rehype-base-url.ts) that walks the HAST tree and rewrites <a href> attributes starting with / to start with <base>/ instead. Register it in all 13 Astro configs (main + 12 lexicons). The lexicon configs are emitted from a shared template in packages/core/src/codegen/docs.ts:225-241, so the codegen patch propagates the wiring to all 12 lexicons on next regenerate. The hardcoded /serialization/... literal in docs-sections.ts:153 gets fixed at source (write /chant/serialization/output-formats) since the plugin in a lexicon site would otherwise mis-prefix it to /chant/lexicons/<name>/serialization/....
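A minimal sketch of what such a plugin could look like, assuming a hand-rolled tree walk instead of a helper library. The option name `base`, the helper `prefixHref`, and the local `HastNode` type are assumptions for illustration; only `projectBase` and the file name come from this description.

```typescript
// Minimal local HAST shape so the sketch has no dependencies.
export interface HastNode {
  type: string;
  tagName?: string;
  properties?: Record<string, unknown>;
  children?: HastNode[];
}

export interface RehypeBaseUrlOptions {
  base: string;         // assumed option: the base of the site being built
  projectBase?: string; // links already under this prefix are left alone
}

// Pure rewrite rule, kept separate so it can be unit-tested without a tree.
export function prefixHref(href: string, options: RehypeBaseUrlOptions): string {
  const base = options.base.replace(/\/+$/, "");
  const projectBase = (options.projectBase ?? options.base).replace(/\/+$/, "");
  // Only root-relative links qualify; "//host/..." is protocol-relative.
  if (!href.startsWith("/") || href.startsWith("//")) return href;
  // Already prefixed: skipping here is what makes a second pass a no-op.
  if (href === projectBase || href.startsWith(projectBase + "/")) return href;
  return base + href;
}

// rehype-style plugin: returns a transformer that mutates the tree in place.
export function rehypeBaseUrl(options: RehypeBaseUrlOptions) {
  return (tree: HastNode): void => {
    const stack: HastNode[] = [tree];
    while (stack.length > 0) {
      const node = stack.pop()!;
      if (node.type === "element" && node.tagName === "a" && node.properties) {
        const href = node.properties.href;
        if (typeof href === "string") {
          node.properties.href = prefixHref(href, options);
        }
      }
      if (node.children) stack.push(...node.children);
    }
  };
}
```

With `base: '/chant/lexicons/k8s'` (a hypothetical lexicon site), the sketch turns `/serialization/output-formats` into `/chant/lexicons/k8s/serialization/output-formats`, which is the mis-prefix that motivates fixing the literal in docs-sections.ts at source.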
Plugin honors a projectBase: '/chant' option so it skips already-prefixed /chant/... links across all 13 sites, making it fully idempotent against the 262 already-correct links.
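For orientation, the wiring in one lexicon config might look roughly like the fragment below. `markdown.rehypePlugins` and `base` are standard Astro config keys; the import path, the `base` plugin option, the lexicon name, and the title are assumptions, and the generated template in docs.ts may differ.

```typescript
// Config sketch for a single lexicon site (hypothetical values throughout).
import { defineConfig } from "astro/config";
import starlight from "@astrojs/starlight";
// Assumed import path; the description only names the file
// packages/core/src/codegen/rehype-base-url.ts.
import { rehypeBaseUrl } from "../../packages/core/src/codegen/rehype-base-url";

const siteBase = "/chant/lexicons/k8s"; // this site's own base

export default defineConfig({
  base: siteBase,
  markdown: {
    // [plugin, options] tuple; projectBase makes /chant/... links pass through.
    rehypePlugins: [[rehypeBaseUrl, { base: siteBase, projectBase: "/chant" }]],
  },
  integrations: [starlight({ title: "k8s lexicon" })],
});
```

For the main docs site, both values would be `/chant`, so the skip rule and the prefix coincide.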
Tasks
- just docs-check-links target (lychee, offline mode, excludes assets + pagefind)
- rehype-base-url plugin + unit test
- Register the plugin in docs/astro.config.mjs + docs/package.json
- Patch the packages/core/src/codegen/docs.ts template to emit plugin wiring in all 12 lexicon configs
- Fix the /serialization/output-formats link in docs-sections.ts:153
- Add a note to lexicon-authoring/docs-site.mdx documenting that cross-site links must use the full /chant/... path
Done when
- just docs-check-links exits 0 on a clean main (or with only residual cross-doc bugs tracked separately)
- CI blocks PRs that introduce broken internal doc links
- The systemic base-prefix bug is fixed at the plugin + codegen level so adding a new lexicon or regenerating TypeDoc API docs doesn't re-introduce it
Out of scope
- External link checking (HTTP/HTTPS): separate, flakier concern
- Migrating to a single mono-Astro site or a typed URL module: possible follow-up