diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 04e481ba..db09d805 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -200,9 +200,47 @@ The Plugin API is the primary way to implement additional functionality in Harpe
 
 For inline config option annotations inside list items, plain text `(Added in: vX.Y.Z)` is fine — using the component mid-sentence is awkward. Reserve `` for standalone placement after headings.
 
-## Verifying Redirects
+## Redirects
 
-The site has a large set of legacy URLs configured in [redirects.ts](redirects.ts) (non-versioned `/docs/*` paths) and [historic-redirects.ts](historic-redirects.ts) (versioned `/docs/4.X/*` paths). To make sure none of them silently start 404'ing — for example after a Reference page is renamed or removed — there is a verification script that checks every `from` path against the live docs site.
+The site is hosted as a static build on GitHub Pages, so there is no server-side router that can rewrite incoming URLs. All redirect handling is therefore done **client-side** by the [`@docusaurus/plugin-client-redirects`](https://docusaurus.io/docs/api/plugins/@docusaurus/plugin-client-redirects) plugin: at build time it emits a small HTML page for every `from` path that immediately navigates the browser to the `to` path. Because the redirect HTML is generated per-path, every redirect must be enumerated explicitly — there is no pattern, prefix, or wildcard support.
+
+### `redirects.ts` vs. `historic-redirects.ts`
+
+The redirect rules are split across two files to keep the two distinct populations of legacy URLs visually and logically separated:
+
+| File | Source paths | Targets | When to edit |
+| ---------------------------------------------- | ------------------------------------------------------------------ | ------------------------------- | -------------------------------------------------------------------- |
+| [redirects.ts](redirects.ts) | Non-versioned `/docs/*` paths (e.g. `/docs/developers/rest`) | Mostly `/reference/v5/*` | Whenever a current Reference or Learn page is renamed or removed. |
+| [historic-redirects.ts](historic-redirects.ts) | Versioned `/docs/4.X/*` paths and the GitBook `/docs/v/4.X/*` form | Almost always `/reference/v4/*` | Rarely — only when new analytics data surfaces a missed legacy path. |
+
+The split exists because `historic-redirects.ts` is treated as a frozen, append-only artifact derived from historical analytics, while `redirects.ts` is part of the day-to-day editing surface. Both files export `RedirectRule[]`; `redirects.ts` simply re-exports the concatenation as `redirects` for the plugin config to consume.
+
+### How historic redirects were populated
+
+The full set of versioned and GitBook-era paths in `historic-redirects.ts` was generated from real traffic, not guesswork:
+
+1. Export pageview data from Google Analytics 4 (property `harper.fast - GA4`) for the past several months. The current snapshot is checked in as [scripts/harper-docs-analytics.csv](scripts/harper-docs-analytics.csv) (Oct 2025 – Feb 2026).
+2. Filter to paths under `/docs/4.X/*` and `/docs/v/4.X/*` that the modern site no longer serves.
+3. For each path, hand-map it to the closest equivalent under `/reference/v4/*` (or `/learn`, `/release-notes`, etc. when the page has no direct successor) and group multiple sources onto a single target.
+4. Add the result as a new rule in `historic-redirects.ts`.
+
+To regenerate this list when fresh analytics become available, re-export the CSV, run [scripts/verify-redirects.mjs](scripts/verify-redirects.mjs) to confirm the existing rules still resolve, and then diff the CSV against the configured `from` paths to find the new omissions.
+
+### Paths intentionally not redirected
+
+A small set of paths show up in analytics but are deliberately not given redirects, because they are not real pages the user meant to visit:
+
+- `/docs/4.X/~gitbook/pdf` and `/docs/v/4.X/~gitbook/pdf` — GitBook's PDF export endpoint, not a content page.
+- `/docs/4.X/4.X/...` (double version prefix) — malformed URLs from broken inbound links.
+- `/docs/4.4./getting-started/` — typo path with an extra dot in the version segment.
+- `/robots.txt`, `/404.html`, `/search` — site infrastructure, served directly by Docusaurus.
+- One-off junk in analytics (`/learnjira`, `/view///`, etc.) — unrelated to docs traffic.
+
+A handful of very-low-traffic non-versioned strays (e.g. `/docs/developers/`, `/docs/foundations/`, `/docs/administration/edge`, `/docs/5.0/migration-guide`) are also not currently redirected. They are candidates for future cleanup but each accounts for only one or two pageviews per quarter, so the maintenance cost outweighs the win.
+
+### Verifying redirects
+
+To make sure no configured redirect silently starts 404'ing — for example after a Reference page is renamed or removed — there is a verification script that checks every `from` path against the live docs site:
 
 ```bash
 node scripts/verify-redirects.mjs
@@ -218,6 +256,22 @@ Common flags:
 
 Run it after editing either redirect file, after large Reference reorganizations, or as part of a periodic check.
 
+### Future work
+
+The current setup is the most we can do under static hosting: every redirect is an enumerated rule that compiles to its own HTML stub. This has two notable limitations:
+
+- **No wildcard or prefix support.** Patterns like "redirect every `/docs/v/4.X/*` URL by stripping the `/v/` segment" cannot be expressed as a single rule — each path must be listed individually. The full `/v/` GitBook rewrite alone required listing 100+ paths explicitly.
+- **Build-time only.** Adding a redirect requires a new build and deploy. There is no way to author a redirect quickly without going through CI.
+
+If the site is eventually re-hosted on a platform with a real request lifecycle — Harper Fabric is the natural candidate, but any Node.js host (or even a CDN edge-rewrite layer) would do — we should migrate the redirect handling to a server-side router with pattern support. At that point we could:
+
+- Replace the entire GitBook `/v/` block with a single regex rule (`/^\/docs\/v\/(.*)$/` → `/docs/$1`).
+- Normalize trailing slashes once in the router instead of as duplicated rules.
+- Author redirects without needing a full Docusaurus rebuild.
+- Optionally emit redirect metrics (which legacy paths still receive traffic) directly from the router rather than relying on GA exports.
+
+Until then, every new redirect goes in `redirects.ts` (or, for historical paths discovered later, `historic-redirects.ts`) as an explicit rule.
+
 ## Known Issues
 
 ### `docusaurus serve` 404s on `/docs/4.X` paths
diff --git a/historic-redirects.ts b/historic-redirects.ts
index 61681eb1..e257ae3c 100644
--- a/historic-redirects.ts
+++ b/historic-redirects.ts
@@ -20,9 +20,17 @@ type RedirectRule = {
 
 // Paths that are junk/artifacts we intentionally skip (no redirect):
 // /~gitbook/pdf — GitBook PDF export URL, not a real page
+// /docs/v/4.X/~gitbook/pdf — same, GitBook /v/ form
 // /docs/4.X/4.X/... — malformed double-version paths
 // /docs/4.4./getting-started/ — typo path with extra dot
 
+// GitBook /v/ URL pattern:
+// When Harper docs were hosted on GitBook, the version selector produced URLs
+// of the form /docs/v/4.X/. After migrating off GitBook these became
+// dead links, so they are redirected to the same targets as their prefix-less
+// /docs/4.X/ siblings. See the "GitBook /v/ versioned URL prefix"
+// section at the bottom of this file.
+
 export const historicRedirects: RedirectRule[] = [
   // ── Version roots ──────────────────────────────────────────────────────────
   { from: ['/docs/4.1', '/docs/4.2', '/docs/4.3', '/docs/4.4', '/docs/4.5', '/docs/4.6'], to: '/reference/v4' },
@@ -1808,4 +1816,162 @@ export const historicRedirects: RedirectRule[] = [
     from: ['/docs/4.1/release-notes/1.alby', '/docs/4.3/technical-details/release-notes/4.tucker/3.2.1'],
     to: '/release-notes',
   },
+
+  // ── GitBook /v/ versioned URL prefix ──────────────────────────────────────
+  // Legacy URL pattern from when Harper docs were hosted on GitBook. Every
+  // `/docs/v/4.X/` is the GitBook-equivalent of `/docs/4.X/` and
+  // maps to the same target as its prefix-less sibling above.
+  // Source: GA pageview data (Oct 2025 – Feb 2026), 102 unique paths.
+  // Only `/docs/v/4.1/~gitbook/pdf` is intentionally skipped (PDF export URL).
+  {
+    from: ['/docs/v/4.1/getting-started', '/docs/v/4.2/getting-started'],
+    to: '/learn',
+  },
+  {
+    from: ['/docs/v/4.1/install-harperdb/linux', '/docs/v/4.2/deployments/upgrade-hdb-instance'],
+    to: '/learn/getting-started/install-and-connect-harper',
+  },
+  {
+    from: ['/docs/v/4.1/support', '/docs/v/4.2', '/docs/v/4.2/technical-details/reference/architecture', '/docs/v/4.4'],
+    to: '/reference/v4',
+  },
+  { from: '/docs/v/4.2/technical-details/reference/analytics', to: '/reference/v4/analytics/overview' },
+  { from: '/docs/v/4.2/developers/components/writing-extensions', to: '/reference/v4/components/extension-api' },
+  { from: '/docs/v/4.2/technical-details/reference/globals', to: '/reference/v4/components/javascript-environment' },
+  {
+    from: [
+      '/docs/v/4.1/add-ons-and-sdks',
+      '/docs/v/4.2/developers/applications',
+      '/docs/v/4.2/developers/applications/debugging',
+      '/docs/v/4.2/developers/components/drivers',
+      '/docs/v/4.2/developers/components/sdks',
+    ],
+    to: '/reference/v4/components/overview',
+  },
+  { from: '/docs/v/4.1/configuration', to: '/reference/v4/configuration/overview' },
+  { from: '/docs/v/4.1/jobs', to: '/reference/v4/database/jobs' },
+  { from: '/docs/v/4.1/reference/limits', to: '/reference/v4/database/schema' },
+  { from: '/docs/v/4.1/reference/storage-algorithm', to: '/reference/v4/database/storage-algorithm' },
+  { from: '/docs/v/4.2/technical-details/reference/transactions', to: '/reference/v4/database/transaction' },
+  {
+    from: [
+      '/docs/v/4.1/harperdb-cloud',
+      '/docs/v/4.1/harperdb-cloud/alarms',
+      '/docs/v/4.1/harperdb-cloud/iops-impact',
+      '/docs/v/4.2/deployments/harperdb-cloud/instance-size-hardware-specs',
+    ],
+    to: '/reference/v4/legacy/cloud',
+  },
+  {
+    from: [
+      '/docs/v/4.1/custom-functions/create-project',
+      '/docs/v/4.1/custom-functions/custom-functions-operations',
+      '/docs/v/4.1/custom-functions/define-helpers',
+      '/docs/v/4.1/custom-functions/example-projects',
+      '/docs/v/4.1/custom-functions/host-static',
+      '/docs/v/4.1/custom-functions/requirements-definitions',
+      '/docs/v/4.1/custom-functions/templates',
+      '/docs/v/4.1/custom-functions/using-npm-git',
+    ],
+    to: '/reference/v4/legacy/custom-functions',
+  },
+  {
+    from: [
+      '/docs/v/4.1/audit-logging',
+      '/docs/v/4.1/logging',
+      '/docs/v/4.1/transaction-logging',
+      '/docs/v/4.2/administration/logging/transaction-logging',
+    ],
+    to: '/reference/v4/logging/overview',
+  },
+  {
+    from: [
+      '/docs/v/4.1/sql-guide/insert',
+      '/docs/v/4.1/sql-guide/reserved-word',
+      '/docs/v/4.1/sql-guide/select',
+      '/docs/v/4.1/sql-guide/sql-geospatial-functions',
+      '/docs/v/4.1/sql-guide/sql-geospatial-functions/geoarea',
+      '/docs/v/4.1/sql-guide/sql-geospatial-functions/geodifference',
+      '/docs/v/4.1/sql-guide/sql-geospatial-functions/geodistance',
+      '/docs/v/4.1/sql-guide/sql-geospatial-functions/geolength',
+      '/docs/v/4.1/sql-guide/sql-geospatial-functions/geonear',
+      '/docs/v/4.1/sql-guide/update',
+      '/docs/v/4.2/developers/sql-guide/date-functions',
+      '/docs/v/4.2/developers/sql-guide/features-matrix',
+      '/docs/v/4.2/developers/sql-guide/reserved-word',
+      '/docs/v/4.2/developers/sql-guide/sql-geospatial-functions',
+    ],
+    to: '/reference/v4/operations-api/sql',
+  },
+  {
+    from: [
+      '/docs/v/4.1/clustering',
+      '/docs/v/4.1/clustering/enabling-clustering',
+      '/docs/v/4.1/clustering/managing-subscriptions',
+      '/docs/v/4.1/clustering/naming-a-node',
+      '/docs/v/4.1/clustering/requirements-and-definitions',
+      '/docs/v/4.2/developers/clustering/creating-a-cluster-user',
+      '/docs/v/4.2/developers/clustering/enabling-clustering',
+      '/docs/v/4.2/developers/clustering/things-worth-knowing',
+      '/docs/v/4.4/developers/clustering/enabling-clustering',
+    ],
+    to: '/reference/v4/replication/clustering',
+  },
+  { from: '/docs/v/4.2/administration/cloning', to: '/reference/v4/replication/overview' },
+  { from: '/docs/v/4.1/reference/content-types', to: '/reference/v4/rest/content-types' },
+  { from: '/docs/v/4.2/technical-details/reference/headers', to: '/reference/v4/rest/headers' },
+  { from: '/docs/v/4.2/developers/real-time', to: '/reference/v4/rest/websockets' },
+  { from: '/docs/v/4.1/security/basic-auth', to: '/reference/v4/security/basic-authentication' },
+  {
+    from: '/docs/v/4.2/developers/clustering/certificate-management',
+    to: '/reference/v4/security/certificate-management',
+  },
+  { from: '/docs/v/4.1/security/jwt-auth', to: '/reference/v4/security/jwt-authentication' },
+  {
+    from: [
+      '/docs/v/4.1/harperdb-studio/create-account',
+      '/docs/v/4.1/harperdb-studio/enable-mixed-content',
+      '/docs/v/4.1/harperdb-studio/instance-example-code',
+      '/docs/v/4.1/harperdb-studio/manage-schemas-browse-data',
+      '/docs/v/4.1/harperdb-studio/organizations',
+      '/docs/v/4.1/harperdb-studio/resources',
+      '/docs/v/4.2/administration/harperdb-studio/create-account',
+      '/docs/v/4.2/administration/harperdb-studio/manage-charts',
+      '/docs/v/4.2/administration/harperdb-studio/manage-functions',
+    ],
+    to: '/reference/v4/studio/overview',
+  },
+  { from: '/docs/v/4.2/developers/operations-api/users-and-roles', to: '/reference/v4/users-and-roles/operations' },
+  {
+    from: ['/docs/v/4.1/security/users-and-roles', '/docs/v/4.2/developers/security/users-and-roles'],
+    to: '/reference/v4/users-and-roles/overview',
+  },
+  {
+    from: [
+      '/docs/v/4.1/harperdb-4.2-pre-release/release-notes/2.penny',
+      '/docs/v/4.1/release-notes',
+      '/docs/v/4.1/release-notes/1.alby/1.1.0',
+      '/docs/v/4.1/release-notes/1.alby/1.3.0',
+      '/docs/v/4.1/release-notes/3.monkey/3.1.0',
+      '/docs/v/4.1/release-notes/3.monkey/3.1.1',
+      '/docs/v/4.1/release-notes/3.monkey/3.1.4',
+      '/docs/v/4.1/release-notes/4.tucker/4.0.0',
+      '/docs/v/4.1/release-notes/4.tucker/4.0.1',
+      '/docs/v/4.1/release-notes/4.tucker/4.0.2',
+      '/docs/v/4.1/release-notes/4.tucker/4.1.0',
+      '/docs/v/4.1/technical-details/release-notes/4.tucker/2.2.3',
+      '/docs/v/4.2/technical-details/release-notes/1.alby/1.3.0',
+      '/docs/v/4.2/technical-details/release-notes/2.penny/2.2.3',
+      '/docs/v/4.2/technical-details/release-notes/3.monkey/3.1.5',
+      '/docs/v/4.2/technical-details/release-notes/4.tucker/3.1.4',
+      '/docs/v/4.2/technical-details/release-notes/4.tucker/4.0.2',
+      '/docs/v/4.2/technical-details/release-notes/4.tucker/4.2.4',
+      '/docs/v/4.2/technical-details/release-notes/4.tucker/4.3.17',
+      '/docs/v/4.4/technical-details/release-notes/4.tucker/4.0.7',
+    ],
+    to: '/release-notes',
+  },
+  { from: '/docs/v/4.2/technical-details/release-notes/4.tucker/1.3.1', to: '/release-notes/v1-alby/1.3.1' },
+  { from: '/docs/v/4.2/release-notes/2.penny/2.1.1', to: '/release-notes/v2-penny/2.1.1' },
+  { from: '/docs/v/4.1/release-notes/3.monkey/3.0.0', to: '/release-notes/v3-monkey/3.0.0' },
 ];
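
Note on regenerating the list: the CONTRIBUTING.md changes describe diffing the analytics CSV against the configured `from` paths, but this diff does not include a script for that step. Below is a minimal sketch of the comparison, assuming the CSV has already been reduced to a plain list of path strings. The `RedirectRule` shape is inferred from the rules shown in this diff, and `normalize`, `findUnredirected`, and the `/docs/v/4.3/example` path are hypothetical names used only for illustration.

```typescript
// Local stand-in for the RedirectRule type in historic-redirects.ts.
// Assumption: `from` is a string or string[], `to` is a string (as the rules above suggest).
type RedirectRule = { from: string | string[]; to: string };

// Assumption: the analytics export may contain both '/path' and '/path/',
// so strip one trailing slash before comparing.
function normalize(path: string): string {
  const last = path.charAt(path.length - 1);
  return path.length > 1 && last === '/' ? path.slice(0, -1) : path;
}

// Hypothetical helper: return the analytics paths that no rule lists as a `from`.
function findUnredirected(analyticsPaths: string[], rules: RedirectRule[]): string[] {
  // Collect every configured source path, flattening string | string[].
  const covered: Record<string, boolean> = {};
  for (const rule of rules) {
    const sources = Array.isArray(rule.from) ? rule.from : [rule.from];
    for (const source of sources) {
      covered[normalize(source)] = true;
    }
  }

  // Keep each uncovered path once, then sort for a stable report.
  const missing: Record<string, boolean> = {};
  for (const path of analyticsPaths) {
    const key = normalize(path);
    if (!covered[key]) {
      missing[key] = true;
    }
  }
  return Object.keys(missing).sort();
}

// Two rules copied from the diff above, plus one made-up path with no rule.
const sampleRules: RedirectRule[] = [
  { from: ['/docs/v/4.1/getting-started', '/docs/v/4.2/getting-started'], to: '/learn' },
  { from: '/docs/v/4.1/jobs', to: '/reference/v4/database/jobs' },
];
const samplePaths = ['/docs/v/4.1/jobs/', '/docs/v/4.1/getting-started', '/docs/v/4.3/example'];

console.log(findUnredirected(samplePaths, sampleRules));
```

A real run would still need to drop the intentionally skipped paths (the `~gitbook/pdf`, double-version, and typo forms) before anything is hand-mapped to a target.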