fix(docs): strip .md extension from internal link targets #69
Mintlify routes URLs without the `.md` extension. A markdown link like `[index](./index/index.md)` 404s when clicked, even when the target file exists on disk and its extensionless URL form (`/.../index/index`) returns 200.

Add a post-process step (TS + Rust templates) that rewrites all internal markdown link targets:

- `[label](dir/index.md)` → `[label](dir)`
- `[label](leaf.md)` → `[label](leaf)`
- `[label](path.md#frag)` → `[label](path#frag)`

Skip URL-scheme targets (`http://`, `mailto:`, `data:`) and fragment-only targets starting with `#`.

The `/index.md` → parent-dir collapse aligns with the splice's strip-`/index` registration so URLs match Mintlify's native dir-page resolution.

Symptom (TS): clicking any module link from a package landing (e.g. `[index](./index/index.md)` on `/sdks/typescript/api/logger`) navigated to `.../index/index.md`, which Mintlify treats as a literal path and 404s. After the strip, it navigates to `.../index` (200), which auto-resolves to the landing.

Live `sdks/typescript/api/` and `sdks/rust/api/` rewritten in-place so the fix applies immediately on pull. 207 TS files updated, 0 Rust files (cargo-doc-md uses in-page anchors so no rewrite needed; the step is defensive for embedded README content).
Review skipped: too many files. This PR contains 209 files, which is 59 over the limit of 150.
Code Review
This pull request introduces a post-processing step in the documentation generation workflows for both Rust and TypeScript to strip .md extensions from internal Markdown links, aligning with Mintlify's routing requirements. Feedback identifies several technical limitations in the Python script's regex logic, including its failure to ignore links within code spans, incorrect handling of links with titles, and inconsistent transformations for index.md files. Additionally, the duplication of this complex logic across multiple workflow files was flagged as a maintenance risk, with a recommendation to extract the script into a shared, reusable component.

    link_re = re.compile(r'(?<!\!)\[([^\]]*)\]\(([^)]+)\)')


    def strip_md(target: str) -> str:
        first_seg = target.split("/", 1)[0]
        if ":" in first_seg:
            return target
        path, _, frag = target.partition("#")
        frag = ("#" + frag) if frag else ""
        if path.endswith("/index.md"):
            path = path[: -len("/index.md")] or "."
        elif path.endswith(".md"):
            path = path[:-3]
        return path + frag


    for f in pathlib.Path(".").rglob("*.md"):
        text = f.read_text(encoding="utf-8")
        def fix(m: re.Match) -> str:
            return f"[{m.group(1)}]({strip_md(m.group(2))})"
        new = link_re.sub(fix, text)

The regex-based link stripping logic has several limitations that should be addressed:
- Code Span Safety: The current implementation does not exclude links within backticks or code blocks (e.g., `[label](file.md)` written inside an inline code span). This violates the general rule requiring parsing logic to respect CommonMark inline code span rules to avoid modifying content inside code snippets.
- Links with Titles: Markdown links can include titles (e.g., `[label](url.md "title")`). The regex group `([^)]+)` captures the title as part of the target, causing `strip_md` to fail to remove the extension because the string ends with a quote instead of `.md`.
- Index Inconsistency: `path.endswith("/index.md")` does not match a link that is exactly `index.md`. In that case, it falls through to the `.md` strip logic and becomes `index`, whereas `./index.md` becomes `.`. This inconsistency may affect how Mintlify resolves directory-level pages.
Consider using a more robust parsing approach that identifies code spans first and separates the URL from any optional title.
References
- When processing Markdown/MDX files to escape curly braces for JSX compatibility, ensure the logic ignores content within inline code spans (backticks) to prevent breaking code snippets.
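
One way to address the three points above is to match code regions before links and to capture the title separately. The following is a minimal sketch under stated assumptions, not the PR's implementation: `SKIP_OR_LINK` and `rewrite_links` are hypothetical names, the regex covers only unindented fenced blocks, single-line inline code spans, and quoted titles rather than full CommonMark, and mapping a bare `index.md` to `.` is an assumption chosen to match what the existing code already does for `./index.md`.

```python
import re

# Alternatives are tried in order at each position, so a fenced block or an
# inline code span is consumed whole before any link inside it can match.
# Assumption: simplified patterns, not a full CommonMark parser.
SKIP_OR_LINK = re.compile(
    r"(?P<fence>^(?P<fm>`{3,}|~{3,})[^\n]*\n[\s\S]*?^(?P=fm)[ \t]*$)"
    r"|(?P<code>(?P<bt>`+)[^\n]+?(?P=bt))"
    r"|(?<!!)\[(?P<label>[^\]]*)\]"
    r"\((?P<target>[^)\s]+)(?P<title>\s+(?:\"[^\"]*\"|'[^']*'))?\)",
    re.MULTILINE,
)


def strip_md(target: str) -> str:
    # A colon in the first path segment indicates a URL scheme (http:, mailto:, data:).
    if ":" in target.split("/", 1)[0]:
        return target
    path, sep, frag = target.partition("#")
    # Assumption: bare "index.md" is treated like "./index.md" and collapses to ".".
    if path == "index.md" or path.endswith("/index.md"):
        path = path[: -len("index.md")].rstrip("/") or "."
    elif path.endswith(".md"):
        path = path[:-3]
    return path + sep + frag


def rewrite_links(text: str) -> str:
    def fix(m: re.Match) -> str:
        if m.group("label") is None:
            # Matched a fenced block or inline code span: return it unchanged.
            return m.group(0)
        title = m.group("title") or ""
        return f"[{m.group('label')}]({strip_md(m.group('target'))}{title})"

    return SKIP_OR_LINK.sub(fix, text)
```

With this sketch, a link inside backticks is left alone, `[b](y.md "T")` becomes `[b](y "T")`, and `[c](index.md)` becomes `[c](.)`. Image links and URL-scheme targets pass through unchanged, as in the original script.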

    python3 - <<'PY'
    import pathlib
    import re


    link_re = re.compile(r'(?<!\!)\[([^\]]*)\]\(([^)]+)\)')


    def strip_md(target: str) -> str:
        first_seg = target.split("/", 1)[0]
        if ":" in first_seg:
            return target
        path, _, frag = target.partition("#")
        frag = ("#" + frag) if frag else ""
        if path.endswith("/index.md"):
            path = path[: -len("/index.md")] or "."
        elif path.endswith(".md"):
            path = path[:-3]
        return path + frag


    for f in pathlib.Path(".").rglob("*.md"):
        text = f.read_text(encoding="utf-8")
        def fix(m: re.Match) -> str:
            label, target = m.group(1), m.group(2)
            return f"[{label}]({strip_md(target)})"
        new = link_re.sub(fix, text)
        if new != text:
            f.write_text(new, encoding="utf-8")
    PY

This Python script is duplicated from the Rust documentation template. Duplicating complex post-processing logic across multiple workflow files increases maintenance overhead and the risk of inconsistent behavior. Consider extracting this logic into a shared script or a reusable local action to ensure that fixes (such as handling code spans or link titles) are applied consistently across all SDK documentation.
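
As an illustration of the extraction suggested here, the inline heredoc could become a small standalone module that both templates invoke. This is only a sketch: the function names `rewrite_tree` and `main` and the command-line shape are assumptions, not part of this PR, and the link logic is copied unchanged from the quoted script, so it keeps the limitations noted in the other review comment.

```python
import argparse
import pathlib
import re

# Same pattern as the PR's inline script: markdown links that are not images.
LINK_RE = re.compile(r'(?<!\!)\[([^\]]*)\]\(([^)]+)\)')


def strip_md(target: str) -> str:
    # Leave URL-scheme targets (http:, mailto:, data:) untouched.
    if ":" in target.split("/", 1)[0]:
        return target
    path, _, frag = target.partition("#")
    frag = ("#" + frag) if frag else ""
    if path.endswith("/index.md"):
        path = path[: -len("/index.md")] or "."
    elif path.endswith(".md"):
        path = path[:-3]
    return path + frag


def rewrite_tree(root: pathlib.Path) -> int:
    """Rewrite links in every *.md file under root; return the number of files changed."""
    changed = 0
    for f in sorted(root.rglob("*.md")):
        text = f.read_text(encoding="utf-8")
        new = LINK_RE.sub(lambda m: f"[{m.group(1)}]({strip_md(m.group(2))})", text)
        if new != text:
            f.write_text(new, encoding="utf-8")
            changed += 1
    return changed


def main(argv=None) -> int:
    # Hypothetical CLI: one or more documentation roots as positional arguments.
    parser = argparse.ArgumentParser(description="Strip .md from internal markdown links")
    parser.add_argument("roots", nargs="+", type=pathlib.Path)
    args = parser.parse_args(argv)
    total = sum(rewrite_tree(root) for root in args.roots)
    print(f"{total} file(s) updated")
    return 0
```

Each workflow would then call `main()` with its own output directory (for example via the usual `if __name__ == "__main__"` guard), so a later fix for code spans or titles lands in one place.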
Symptom
Clicking any module link from a package landing in the TypeScript SDK (e.g. `[index](./index/index.md)` on `/sdks/typescript/api/logger`) 404s because Mintlify treats the link target literally: `.../index/index.md` is not a routed URL even though the file exists on disk and its extensionless form `.../index/index` returns 200.
Fix
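New post-process step (TS + Rust templates) that rewrites every internal markdown link target:
- `[label](dir/index.md)` → `[label](dir)`
- `[label](leaf.md)` → `[label](leaf)`
- `[label](path.md#frag)` → `[label](path#frag)`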
The `/index.md` → parent-dir collapse aligns with the splice's strip-`/index` registration so URLs match Mintlify's native dir-page resolution.
External URLs (`http://`, `mailto:`, `data:`) and fragment-only links pass through unchanged.
Live state
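`sdks/typescript/api/` and `sdks/rust/api/` rewritten in-place so the fix applies immediately when pulled:
- TypeScript: 207 files updated
- Rust: 0 files updated (cargo-doc-md uses in-page anchors so no rewrite is needed; the step is defensive for embedded README content)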
Verified
- `curl -sI http://localhost:3340/sdks/typescript/api/logger/index` → `200 OK`
- `curl -sI http://localhost:3340/sdks/typescript/api/logger/logger` → `307` (Mintlify auto-resolves to first child of `logger/logger/`)
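
The curl checks above cover routing for two URLs. As a complementary check, a scan along the following lines could flag any internal link target that still ends in `.md` after the post-process step has run. It is a sketch only: `leftover_md_targets` is a hypothetical helper that is not part of this PR, and because it reuses the PR's link regex it shares the same blind spots (links with titles, links inside code spans).

```python
import re
from typing import List

# Same link pattern as the post-process script in this PR.
LINK_RE = re.compile(r'(?<!\!)\[([^\]]*)\]\(([^)]+)\)')


def leftover_md_targets(text: str) -> List[str]:
    """Return internal link targets in text whose path part still ends in .md."""
    leftovers = []
    for m in LINK_RE.finditer(text):
        target = m.group(2)
        # Skip URL-scheme targets such as http:, mailto:, data:.
        if ":" in target.split("/", 1)[0]:
            continue
        # Ignore the fragment when checking the extension.
        path = target.partition("#")[0]
        if path.endswith(".md"):
            leftovers.append(target)
    return leftovers
```

Running this over each generated file and failing when the returned list is non-empty would catch a regression before it reaches the live docs.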