feat: per-page canonical via mathlib3→mathlib4 map #181
FordUniver wants to merge 1 commit into leanprover-community:deprecate-banner-and-canonical
Replaces "every mathlib3 page canonicalises to the mathlib4 docs root" with a per-page lookup against `mathlib4_canonical_map.yaml`. Pointing every page at the mathlib4 root is the many-to-one canonical pattern that Google explicitly rejects (see "5 common mistakes with rel=canonical"), so the signal is silently dropped.

The map (~3000 entries, covering ~93% of ported mathlib3 modules) was built from the wiki-maintained mathlib4-port-status YAML, combined with mathlib4 git rename chasing, Defs/Basic split handling, deprecated_module shim filtering, and a curated dictionary of directory renames that git similarity detection cannot follow (GroupCat → Grp, IsROrC → RCLike, etc.). mathlib3 is frozen, so a one-off snapshot is sufficient; the generator script is kept separately and is not included here.

Modules without a mapping fall back to the existing self-canonical behavior, so course-hosted mirror copies are still de-duplicated.
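The lookup-with-fallback behavior described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the function name `canonical_url`, the shape of the map (mathlib3 module path to mathlib4 module path), the single example entry, and both URL roots are assumptions made for the sketch.

```python
from typing import Mapping

# Assumed URL roots for the two doc sites; adjust to the real deployment.
MATHLIB3_DOCS_ROOT = "https://leanprover-community.github.io/mathlib_docs/"
MATHLIB4_DOCS_ROOT = "https://leanprover-community.github.io/mathlib4_docs/"

# Hypothetical excerpt of the map: mathlib3 module path -> mathlib4 module path.
# The real mathlib4_canonical_map.yaml may use a different key/value format.
EXAMPLE_MAP = {
    "algebra/group/basic": "Mathlib/Algebra/Group/Basic",
}


def canonical_url(module: str, mapping: Mapping[str, str]) -> str:
    """Return the canonical URL for one mathlib3 docs page.

    Mapped modules point at their mathlib4 counterpart; unmapped modules
    fall back to the mathlib3 page itself (self-canonical), which still
    lets mirrored copies de-duplicate against the official mathlib3 docs.
    """
    target = mapping.get(module)
    if target is not None:
        return f"{MATHLIB4_DOCS_ROOT}{target}.html"
    return f"{MATHLIB3_DOCS_ROOT}{module}.html"
```

The point of the design is that no page ever canonicalises to a bare docs root: each page either names one specific mathlib4 page or names itself.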
beee52d to 3fb42c3
Thanks! You may have answered this somewhere else, but what happens if some of the mathlib4 files get moved / renamed? I guess then the canonical link will be broken. Is that going to be a problem?
I'm not an SEO expert, and I assume this also depends on the search engine, but I am fairly certain that a broken canonical link will simply be ignored. Here is an example of a Django docs page with a broken canonical link. If you search Google for verbatim content from that page, it lists the deprecated 1.11 page and not the broken 6.0 link, which is what I would expect. So a reasonable practice seems to be to do a "best effort" mapping and assume that, by the time the links go stale, the newer version has already established itself well enough in the ranking for it not to matter. I also checked if
bryangingechen left a comment
OK, makes sense! @kim-em: what do you think?
We should check a deployed version of this, btw, to double-check that the canonical links are set as expected for the right pages but not for the root page.
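One possible way to spot-check a deployed page is to pull the `rel="canonical"` link out of its HTML and compare it with the expected target. The sketch below uses only Python's standard `html.parser`; the helper name `find_canonical` is hypothetical, and fetching the page from the deployment is left out.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags are routed here as well.
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        # rel is a space-separated token list; attribute values may be None.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.href = attributes.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical href declared in `html`, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.href
```

Running this over a handful of mapped pages, a handful of unmapped pages, and the root page should show whether each one carries the canonical link it is supposed to.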