
Fix internal links involving README files resulting in 404 errors #3013

Open
CCJK123 wants to merge 2 commits into rust-lang:master from CCJK123:master


CCJK123 commented Jan 30, 2026

Fixes #984.

I did my best to address the concerns raised in #1921 by opting to:

  • Use the mdbook-markdown/pulldown-cmark and url crates instead of regex to identify and correct the relevant links (see the comment advising against regex usage)
  • Modify the index preprocessor instead of fix_link in mdbook-html (previously adjust_links), to maintain a clear separation between the preprocessor and the rest of the codebase (see the comment mentioning this consideration)

I also duplicated some code from HtmlRenderOptions to get the Markdown options needed to create a parser via mdbook-markdown; I'm not sure whether this should be abstracted out elsewhere. The test cases I wrote just use the default options.

Hopefully this sufficiently addresses the issue, but do let me know if there are any changes/improvements I should make!

`README` files

Previously, the `index` preprocessor only renamed `README` files to
`index.md`, but did not update internal links pointing to those files
to point to the renamed file instead. As a result, internal links to
`README` files were broken and gave 404 errors.

This commit rectifies this by using the
`mdbook-markdown`/`pulldown-cmark` crates to find links within the
book, and the `url` crate to pick out internal links to `README`
files, which are then adjusted accordingly.
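
As a rough illustration of the link adjustment described above, here is a minimal std-only sketch. It is not the code in this PR, which uses the `url` crate for classification. The names `has_scheme` and `rewrite_readme_link` are hypothetical, and matching `README.md` case-insensitively is an assumption.

```rust
/// Hypothetical helper: true if `dest` starts with a URL scheme such as
/// `https:` or `mailto:`, i.e. a ':' appears before any '/', '?' or '#'.
fn has_scheme(dest: &str) -> bool {
    match dest.find(|c: char| matches!(c, ':' | '/' | '?' | '#')) {
        Some(i) => i > 0 && dest.as_bytes()[i] == b':',
        None => false,
    }
}

/// Hypothetical helper: returns the rewritten destination if `dest` is an
/// internal link to a README file, or `None` if it should be left untouched.
fn rewrite_readme_link(dest: &str) -> Option<String> {
    // External links, protocol-relative links and bare fragments are skipped.
    if has_scheme(dest) || dest.starts_with("//") || dest.starts_with('#') {
        return None;
    }
    // Keep any query string or fragment so it can be re-attached afterwards.
    let split_at = dest
        .find(|c: char| c == '?' || c == '#')
        .unwrap_or(dest.len());
    let (path, suffix) = dest.split_at(split_at);
    // Separate the directory part (including the trailing '/') from the file name.
    let (dir, file) = match path.rfind('/') {
        Some(i) => path.split_at(i + 1),
        None => ("", path),
    };
    // Assumption: README file names are matched case-insensitively.
    if file.eq_ignore_ascii_case("README.md") {
        Some(format!("{}index.md{}", dir, suffix))
    } else {
        None
    }
}
```

Under these assumptions, `rewrite_readme_link("guide/README.md#setup")` yields `Some("guide/index.md#setup")`, while an external link such as `https://example.com/README.md` yields `None`.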

Fixes rust-lang#984.
rustbot added the S-waiting-on-review (Status: waiting on a review) label Jan 30, 2026

CCJK123 commented Feb 2, 2026

I realised that this approach is lossy because of the way pulldown-cmark and pulldown-cmark-to-cmark currently work: converting the Markdown to events and back can lose some information, which possibly caused the following test failures:

  • markdown::basic_markdown: A " gets converted to , though there are other "s in the same test that aren't converted; disabling smart_punctuation does not resolve the issue
  • markdown::definition_lists: The <p> tag is not preserved within <dd> for some reason
  • preprocessor::extension_compatibility and renderer::backends_receive_render_context_via_stdin: Newlines at the end of markdown content get removed (see "Empty lines and whitespace are not preserved", Byron/pulldown-cmark-to-cmark#94)
  • rendering::header_links: Nesting order of <strong> and <em> tags was reversed, inconsequential in this case but not necessarily so in other cases

Perhaps regex is still a viable approach, though it would need to account for the various types of links. Tolerating some information loss is also an option. Otherwise, this issue will remain unresolved, since it is blocked on pulldown-cmark/pulldown-cmark-to-cmark.
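
One further possibility, sketched here only as an illustration and not something this PR implements, is to avoid re-serializing the Markdown altogether and instead splice replacements into the original text by byte range, so that everything outside the edited link destinations stays byte-for-byte identical. The helper name `splice_replacements` is hypothetical. The sketch assumes the byte ranges of the link destinations are already known (pulldown-cmark's `Parser::into_offset_iter` reports source ranges for events, but narrowing a link's range down to its destination is not shown here).

```rust
use std::ops::Range;

/// Hypothetical helper: applies non-overlapping replacements to `source`,
/// copying every byte outside the given ranges unchanged.
fn splice_replacements(source: &str, mut edits: Vec<(Range<usize>, String)>) -> String {
    // Process edits from left to right.
    edits.sort_by_key(|(range, _)| range.start);
    let mut out = String::with_capacity(source.len());
    let mut cursor = 0;
    for (range, replacement) in edits {
        // Skip edits that are inverted, overlap an earlier edit, run past the
        // end of the source, or do not fall on character boundaries.
        let valid = range.start <= range.end
            && range.start >= cursor
            && range.end <= source.len()
            && source.is_char_boundary(range.start)
            && source.is_char_boundary(range.end);
        if !valid {
            continue;
        }
        // Copy the untouched text before the edit, then the replacement.
        out.push_str(&source[cursor..range.start]);
        out.push_str(&replacement);
        cursor = range.end;
    }
    // Copy whatever remains after the last edit, including trailing newlines.
    out.push_str(&source[cursor..]);
    out
}
```

With this kind of splicing, quotes, trailing newlines and inline HTML would pass through untouched, because only the bytes inside the given ranges are rewritten.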

Would appreciate input on how best to proceed, thanks!



Development

Successfully merging this pull request may close these issues.

Internal links to README.md are broken

2 participants