Commit 45a5326

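fix(seo): 301 old leetcode URLs with Chinese slugs to the new pinyin paths

GSC reports 41 404s for /docs/CommunityShare/Leetcode/<Chinese filename>. Root cause: the transformer in lib/source.ts converts filenames containing Chinese under the leetcode directory into pinyin slugs, but the wildcard in next.config.mjs only replaces the prefix and does not convert the slug to pinyin, so the redirect lands on a URL whose slug is still Chinese and the page 404s.

Changes:
- scripts/generate-leetcode-slug-map.mjs scans the directory at build time and writes generated/leetcode-slug-map.json, a literal "Chinese stem → pinyin stem" map
- proxy.ts adds redirectLeetcodeIfNeeded: an O(1) table lookup at the Edge with a single-hop 301
- next.config.mjs drops the leetcode wildcard; middleware handles it instead, which avoids a second hop
- prebuild now runs the generation script, so the JSON always matches the latest file list
- adds dev_docs/leetcode_slug_redirect.md to document the approach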
1 parent 244f806 commit 45a5326

6 files changed

Lines changed: 271 additions & 7 deletions

dev_docs/leetcode_slug_redirect.md

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
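+# Leetcode Chinese-slug 301 redirects
+
+## Why this exists
+
+The transformer in `lib/source.ts` converts filenames containing Chinese under `app/docs/career/interview-prep/leetcode/`
+into pinyin slugs (for example, `2309兼具大小写的最好英文字母_translated.md` is served as
+`/docs/career/interview-prep/leetcode/2309-jian-ju-...-translated`).
+
+Historically, however, Google Search Console has indexed **two batches** of old URLs:
+
+1. `/docs/CommunityShare/Leetcode/<original Chinese filename>`: the old path from before the Option C IA restructuring
+2. `/docs/CommunityShare/Leetcode/<pinyin slug>`: discovered by Google after pinyin slugs shipped, but not yet indexed
+
+`next.config.mjs` had only a single wildcard, `/docs/CommunityShare/Leetcode/:path*`
+→ `/docs/career/interview-prep/leetcode/:path*`, which **only replaces the prefix and does not convert the slug to pinyin**,
+so batch 1 URLs still carry a Chinese slug after the redirect and the target page still 404s.
+
+All 41 404s measured in GSC are this problem.
+
+## Current approach
+
+**Generate a literal mapping table at build time, then do an O(1) lookup and 301 in middleware.** Options considered:
+
+- ❌ List all 41 entries directly in `next.config.mjs`: hand-written and brittle, and nobody keeps it in sync as files are added or removed
+- ❌ Pass the slug through a path-to-regexp wildcard: spaces, `[]`, and Chinese characters are unreliable in Next.js route matching
+- ❌ Run `pinyin-pro` dynamically in middleware: the full dictionary (~1MB+) is too large for the Edge bundle
+- ✅ **Scan the directory at build time and emit JSON; middleware imports the JSON and looks it up**: the Edge bundle grows by only a few KB
+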
+## Components
+
+| File | Purpose |
+| ---------------------------------------- | -------------------------------------------------------------------------------------------------------- |
+| `scripts/generate-leetcode-slug-map.mjs` | Scans the leetcode directory, runs the same pinyin algorithm as `lib/source.ts` on filenames containing Chinese, and writes the JSON |
+| `generated/leetcode-slug-map.json` | Build artifact: a literal `Chinese stem → pinyin stem` map (currently 32 entries) |
+| `proxy.ts` (`redirectLeetcodeIfNeeded`) | Middleware: looks up requests under `/docs/CommunityShare/Leetcode/*` and `/docs/career/interview-prep/leetcode/*` and issues the 301 |
+| `package.json` `prebuild` | Runs the generation script before each build so the JSON is always current |
+
+## Request shapes covered
+
+| Input pathname | Lookup result | Final 301 → |
+| --------------------------------------------------- | -------- | ---------------------------------------------------------------- |
+| `/docs/CommunityShare/Leetcode/<Chinese slug>` | hit | `/docs/career/interview-prep/leetcode/<pinyin slug>` |
+| `/docs/CommunityShare/Leetcode/<ASCII slug>` | miss | `/docs/career/interview-prep/leetcode/<ASCII slug>` (slug unchanged) |
+| `/docs/career/interview-prep/leetcode/<Chinese slug>` | hit | pinyin slug in the same directory |
+| `/docs/career/interview-prep/leetcode/<ASCII slug>` | miss | passed through (untouched) |
+
+## When adding or renaming leetcode files
+
+Nothing to do. The prebuild step of `pnpm build` reruns the script and the JSON syncs automatically.
+
+But if **the pinyin rules themselves change** (for example tone handling or the separator), you must change two places **together**:
+
+1. `convertSlugToPinyin` in `lib/source.ts` (generates page slugs at runtime)
+2. `convertSlugToPinyin` in `scripts/generate-leetcode-slug-map.mjs` (generates redirect slugs at build time)
+
+If the two algorithms drift apart, the 301 lands on a 404 again.
+
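+## Local verification
+
+```bash
+pnpm build
+pnpm start # port 3000 by default
+# Old URL containing Chinese characters
+curl -I 'http://localhost:3000/docs/CommunityShare/Leetcode/2309%E5%85%BC%E5%85%B7%E5%A4%A7%E5%B0%8F%E5%86%99%E7%9A%84%E6%9C%80%E5%A5%BD%E8%8B%B1%E6%96%87%E5%AD%97%E6%AF%8D_translated'
+# Expected: 301, with Location pointing at the pinyin slug
+```
+
+## GSC follow-up
+
+Once these 301s ship, GSC needs time to recrawl:
+
+- The 41 "Not found (404)" entries: click "Validate fix"; GSC recrawls, sees the 301, and clears them from the 404 list
+- The 36 "Discovered - currently not indexed" entries: these were pinyin URLs to begin with, so the paths are correct and Google simply has not crawled them yet. You can manually "Request indexing" to speed things up, but nothing needs to change in code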
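
The curl check in the doc requests the old URL in percent-encoded form. As a rough sketch for testing other entries from the map, the TypeScript below builds that request URL from a raw filename stem and confirms that decoding the last segment recovers the stem. `buildLegacyUrl` and `LEGACY_BASE` are hypothetical names introduced here, not part of the commit.

```typescript
// Hypothetical helper (not part of the commit): builds the legacy URL that
// the curl check requests, starting from a raw, unencoded filename stem.
const LEGACY_BASE = "/docs/CommunityShare/Leetcode";

function buildLegacyUrl(origin: string, stem: string): string {
  // encodeURIComponent percent-encodes the UTF-8 bytes of non-ASCII
  // characters, as well as spaces and square brackets, and leaves ASCII
  // letters, digits, "-", "_" and "." untouched.
  return `${origin}${LEGACY_BASE}/${encodeURIComponent(stem)}`;
}

const sampleStem = "2309兼具大小写的最好英文字母_translated";
const sampleUrl = buildLegacyUrl("http://localhost:3000", sampleStem);
console.log(sampleUrl);

// Decoding the last path segment gives back the stem, which is the key
// looked up in generated/leetcode-slug-map.json.
const lastSegment = sampleUrl.slice(sampleUrl.lastIndexOf("/") + 1);
console.log(decodeURIComponent(lastSegment) === sampleStem);
```

Stems that contain spaces or brackets, such as `[146]LRU 缓存_translated`, can be passed through the same helper to produce a URL suitable for `curl -I`.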

generated/leetcode-slug-map.json

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,34 @@
+{
+  "1234. 替换子串得到平衡字符串_translated": "1234-ti-huan-zi-chuan-de-dao-ping-heng-zi-fu-chuan-translated",
+  "142.环形链表II_translated": "142-huan-xing-lian-biao-iitranslated",
+  "1653. 使字符串平衡的最少删除次数_translated": "1653-shi-zi-fu-chuan-ping-heng-de-zui-shao-shan-chu-ci-shu-translated",
+  "1664生成平衡数组的方案数_translated": "1664-sheng-cheng-ping-heng-shu-zu-de-fang-an-shu-translated",
+  "1825求出 MK 平均值_translated": "1825-qiu-chu-mk-ping-jun-zhi-translated",
+  "1828统计一个圆中点的数目_translated": "1828-tong-ji-yi-ge-yuan-zhong-dian-de-shu-mu-translated",
+  "2131. 连接两字母单词得到的最长回文串": "2131-lian-jie-liang-zi-mu-dan-ci-de-dao-de-zui-chang-hui-wen-chuan",
+  "2299强密码检验器II_translated": "2299-qiang-mi-ma-jian-yan-qi-iitranslated",
+  "2309兼具大小写的最好英文字母_translated": "2309-jian-ju-da-xiao-xie-de-zui-hao-ying-wen-zi-mu-translated",
+  "2335. 装满杯子需要的最短总时长_translated": "2335-zhuang-man-bei-zi-xu-yao-de-zui-duan-zong-shi-chang-translated",
+  "2341. 数组能形成多少数对_translated": "2341-shu-zu-neng-xing-cheng-duo-shao-shu-dui-translated",
+  "2639. 查询网格图中每一列的宽度_translated": "2639-cha-xun-wang-ge-tu-zhong-mei-yi-lie-de-kuan-du-translated",
+  "2679.矩阵中的和_translated": "2679-ju-zhen-zhong-de-he-translated",
+  "2894. 分类求和并作差": "2894-fen-lei-qiu-he-bing-zuo-cha",
+  "3072. 将元素分配到两个数组中 II_translated": "3072-jiang-yuan-su-fen-pei-dao-liang-ge-shu-zu-zhong-iitranslated",
+  "345. 反转字符串中的元音字母_translated": "345-fan-zhuan-zi-fu-chuan-zhong-de-yuan-yin-zi-mu-translated",
+  "46.全排列": "46-quan-pai-lie",
+  "538.把二叉搜索树转换为累加树_translated": "538-ba-er-cha-sou-suo-shu-zhuan-huan-wei-lei-jia-shu-translated",
+  "6323. 将钱分给最多的儿童_translated": "6323-jiang-qian-fen-gei-zui-duo-de-er-tong-translated",
+  "76最小覆盖子串_translated": "76-zui-xiao-fu-gai-zi-chuan-translated",
+  "93复原Ip地址": "93-fu-yuan-ip-di-zhi",
+  "994.腐烂的橘子_translated": "994-fu-lan-de-ju-zi-translated",
+  "[121]买卖股票的最佳时期_translated": "121-mai-mai-gu-piao-de-zui-jia-shi-qi-translated",
+  "[1333]餐厅过滤器_translated": "1333-can-ting-guo-l-qi-translated",
+  "[146]LRU 缓存_translated": "146lru-huan-cun-translated",
+  "[1545]找出第 N 个二进制字符串中的第 K 位": "1545-zhao-chu-di-n-ge-er-jin-zhi-zi-fu-chuan-zhong-de-di-k-wei",
+  "[213]打家劫舍 II_translated": "213-da-jia-jie-she-iitranslated",
+  "[2490]回环句_translated": "2490-hui-huan-ju-translated",
+  "[2562]找出数组的串联值_translated": "2562-zhao-chu-shu-zu-de-chuan-lian-zhi-translated",
+  "[2582]递枕头_translated": "2582-di-zhen-tou-translated",
+  "brief_alternate 作业帮忙_translated": "briefalternate-zuo-ye-bang-mang-translated",
+  "剑指 Offer II 021. 删除链表的倒数第 n 个结点_translated": "jian-zhi-offerii021-shan-chu-lian-biao-de-dao-shu-di-n-ge-jie-dian-translated"
+}
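
Some values in this map look odd at first (`iitranslated`, `146lru-...`, `guo-l-qi`). They appear to follow from the cleaning step that the generator applies to each pinyin token. The sketch below replays only that step in TypeScript. The token arrays are assumptions about what `pinyin-pro` returns with `type: "array"` and `nonZh: "consecutive"`, not captured output, and `joinTokens` is a hypothetical name.

```typescript
// Replays only the post-processing used in convertSlugToPinyin:
// lowercase each token, drop every character outside [a-z0-9],
// discard tokens that end up empty, then join with "-".
function joinTokens(tokens: string[]): string {
  return tokens
    .map((t) => t.toLowerCase().replace(/[^a-z0-9]/g, ""))
    .filter(Boolean)
    .join("-");
}

// ASSUMED tokenization of "142.环形链表II_translated".
const ringList = joinTokens(["142.", "huan", "xing", "lian", "biao", "II_translated"]);
console.log(ringList); // 142-huan-xing-lian-biao-iitranslated

// ASSUMED tokenization of "[1333]餐厅过滤器_translated", with 滤 read as "lü".
const restaurant = joinTokens(["[1333]", "can", "ting", "guo", "lü", "qi", "_translated"]);
console.log(restaurant); // 1333-can-ting-guo-l-qi-translated
```

Under these assumptions, `_` and `.` disappear without leaving a separator, so `II_translated` collapses to `iitranslated`, and `ü` is stripped, so 滤 contributes only `l`. Changing either behavior would change live URLs, so it would have to happen in both copies of `convertSlugToPinyin` at once.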

next.config.mjs

Lines changed: 4 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -187,12 +187,10 @@ const config = {
       destination: "/docs/projects/:path*",
       statusCode: 301,
     },
-    // CommunityShare split: Leetcode goes under career interview prep, the rest go to community/* by topic
-    {
-      source: "/docs/CommunityShare/Leetcode/:path*",
-      destination: "/docs/career/interview-prep/leetcode/:path*",
-      statusCode: 301,
-    },
+    // CommunityShare split: the rest go to community/* by topic
+    // ↑ The Leetcode 301 moved from here to proxy.ts, because lib/source.ts converts Chinese filenames to pinyin
+    // and the prefix-replacing wildcard produced URLs whose slug was still Chinese, so the target page still 404ed.
+    // proxy.ts instead matches literally against the slug map generated at build time and 301s in one hop to the correct pinyin URL.
     {
       source: "/docs/CommunityShare/Language/:path*",
       destination: "/docs/community/language/:path*",

package.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@
   "private": true,
   "scripts": {
     "dev": "next dev",
-    "prebuild": "node scripts/escape-angles.mjs && tsx scripts/generate-leaderboard.mjs",
+    "prebuild": "node scripts/escape-angles.mjs && tsx scripts/generate-leaderboard.mjs && node scripts/generate-leetcode-slug-map.mjs",
     "build": "next build",
     "start": "next start -p 3000",
     "test": "vitest run",

proxy.ts

Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,63 @@
 import { NextResponse, type NextRequest } from "next/server";
+import leetcodeSlugMap from "@/generated/leetcode-slug-map.json";
+
+/**
+ * 301 old Leetcode URLs and Chinese slugs to the new pinyin-slug paths.
+ *
+ * Background:
+ *   The transformer in lib/source.ts converts filenames containing Chinese under career/interview-prep/leetcode/ into pinyin slugs.
+ *   But GSC's old index still holds /docs/CommunityShare/Leetcode/<original Chinese filename> URLs,
+ *   and the wildcard in next.config.mjs only replaced the prefix without converting the slug, so the redirect still hit a 404.
+ *   Here we do an O(1) lookup in the slug map generated at build time and 301 in a single hop to the correct pinyin URL.
+ *
+ * Request shapes covered:
+ *   1. /docs/CommunityShare/Leetcode/<Chinese slug> → new pinyin path
+ *   2. /docs/CommunityShare/Leetcode/<pinyin or pure-ASCII slug> → new path, same slug (keeps old bookmarks working)
+ *   3. /docs/career/interview-prep/leetcode/<Chinese slug> → pinyin slug in the same directory (guards against hand-typed URLs)
+ *
+ * Why not use next.config redirects:
+ *   path-to-regexp handles square brackets, spaces, and Chinese characters unreliably; literal matching in middleware is more dependable.
+ */
+const SLUG_MAP = leetcodeSlugMap as Record<string, string>;
+const LEETCODE_NEW_BASE = "/docs/career/interview-prep/leetcode";
+const LEETCODE_OLD_BASE = "/docs/CommunityShare/Leetcode";
+
+function redirectLeetcodeIfNeeded(req: NextRequest): NextResponse | null {
+  const { pathname } = req.nextUrl;
+
+  let baseMatched: "old" | "new" | null = null;
+  let rest = "";
+  if (pathname.startsWith(LEETCODE_OLD_BASE + "/")) {
+    baseMatched = "old";
+    rest = pathname.slice(LEETCODE_OLD_BASE.length + 1);
+  } else if (pathname.startsWith(LEETCODE_NEW_BASE + "/")) {
+    baseMatched = "new";
+    rest = pathname.slice(LEETCODE_NEW_BASE.length + 1);
+  } else {
+    return null;
+  }
+
+  if (!rest) return null;
+
+  // The Next.js pathname is already decoded, but decode once more to be safe, in case a crawler sends a double-encoded URL
+  let rawSlug: string;
+  try {
+    rawSlug = decodeURIComponent(rest);
+  } catch {
+    rawSlug = rest;
+  }
+
+  const mapped = SLUG_MAP[rawSlug];
+  const targetSlug = mapped ?? rawSlug;
+
+  // New path with an ASCII slug that has no mapping: pass through, no redirect loop
+  if (baseMatched === "new" && !mapped) return null;
+
+  // New path with a Chinese slug, or old path with any slug: 301 to the new path with the pinyin (or original ASCII) slug
+  const url = req.nextUrl.clone();
+  url.pathname = `${LEETCODE_NEW_BASE}/${targetSlug}`;
+  return NextResponse.redirect(url, 301);
+}
 
 /**
  * Decide the default locale from IP geo and write it to a cookie for Server Components to read.
@@ -12,6 +71,10 @@ import { NextResponse, type NextRequest } from "next/server";
  * The cookie is valid for 1 year; switching language on the /settings page overwrites it.
  */
 export function proxy(req: NextRequest) {
+  // 301 old Leetcode URLs and Chinese slugs first, so the locale logic below does not set a cookie on a 404 page
+  const leetcodeRedirect = redirectLeetcodeIfNeeded(req);
+  if (leetcodeRedirect) return leetcodeRedirect;
+
   // The user has already chosen a language; respect it and do not overwrite
   if (req.cookies.get("locale")) {
     return NextResponse.next();
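
To exercise the four request shapes from the doc without starting Next.js, the decision in `redirectLeetcodeIfNeeded` can be pulled into a pure function. This is a minimal sketch, not the committed code: `resolveLeetcodeTarget` and `SAMPLE_MAP` are hypothetical names, and the sample map holds a single entry copied from `generated/leetcode-slug-map.json`.

```typescript
// Framework-free sketch of the lookup decision. Names are hypothetical.
const NEW_BASE = "/docs/career/interview-prep/leetcode";
const OLD_BASE = "/docs/CommunityShare/Leetcode";

// One entry copied from generated/leetcode-slug-map.json.
const SAMPLE_MAP: Record<string, string> = {
  "46.全排列": "46-quan-pai-lie",
};

// Returns the pathname to 301 to, or null to let the request through.
function resolveLeetcodeTarget(
  pathname: string,
  slugMap: Record<string, string>,
): string | null {
  const base = [OLD_BASE, NEW_BASE].find((b) => pathname.startsWith(b + "/"));
  if (!base) return null;

  const rest = pathname.slice(base.length + 1);
  if (!rest) return null;

  let rawSlug: string;
  try {
    rawSlug = decodeURIComponent(rest);
  } catch {
    rawSlug = rest; // malformed escape sequence: fall back to the raw text
  }

  // Own-property check, so inherited keys such as "constructor" never count as a hit.
  const mapped = Object.prototype.hasOwnProperty.call(slugMap, rawSlug)
    ? slugMap[rawSlug]
    : undefined;

  // Already on the new path and nothing to map: no redirect.
  if (base === NEW_BASE && mapped === undefined) return null;

  return `${NEW_BASE}/${mapped ?? rawSlug}`;
}

const encoded = encodeURIComponent("46.全排列");
console.log(resolveLeetcodeTarget(`${OLD_BASE}/${encoded}`, SAMPLE_MAP));
console.log(resolveLeetcodeTarget(`${OLD_BASE}/two-sum`, SAMPLE_MAP));
console.log(resolveLeetcodeTarget(`${NEW_BASE}/${encoded}`, SAMPLE_MAP));
console.log(resolveLeetcodeTarget(`${NEW_BASE}/two-sum`, SAMPLE_MAP));
```

One deliberate difference from the committed function: the sketch checks own properties before reading the map, because a plain `SLUG_MAP[rawSlug]` on an ordinary object also resolves inherited names like `constructor`. Whether that edge case matters in production is a judgment call for the maintainers.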

scripts/generate-leetcode-slug-map.mjs

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
+/**
+ * At build time, scan app/docs/career/interview-prep/leetcode/*.md(x) and write the
+ * "filename with Chinese or special characters" → "pinyin slug" mapping to generated/leetcode-slug-map.json.
+ *
+ * Why this map is needed:
+ *   The transformer in lib/source.ts converts filenames containing Chinese under the leetcode directory into pinyin slugs (the public URLs).
+ *   GSC's old index still holds URLs like /docs/CommunityShare/Leetcode/<original Chinese filename>,
+ *   and next.config.mjs only had a prefix-replacing wildcard, so the slug was not converted and the redirect still hit a 404.
+ *   proxy.ts (Next 16 middleware) needs an O(1) lookup at the edge to 301 old URLs to the correct pinyin path,
+ *   but cannot ship the full pinyin-pro dictionary in the edge bundle, so the mapping is frozen into JSON at build time.
+ *
+ * The generation rules must match convertSlugToPinyin in lib/source.ts exactly, otherwise the links will not line up.
+ */
+import fs from "node:fs";
+import path from "node:path";
+import { fileURLToPath } from "node:url";
+import { pinyin } from "pinyin-pro";
+
+const __filename = fileURLToPath(import.meta.url);
+const __dirname = path.dirname(__filename);
+const PROJECT_ROOT = path.resolve(__dirname, "..");
+const LEETCODE_DIR = path.join(
+  PROJECT_ROOT,
+  "app/docs/career/interview-prep/leetcode",
+);
+const OUTPUT_FILE = path.join(PROJECT_ROOT, "generated/leetcode-slug-map.json");
+
+/**
+ * Keep in sync with convertSlugToPinyin in lib/source.ts.
+ * Input: a single slug segment (usually a filename stem).
+ * Returns the input unchanged when it has no Chinese; otherwise converts to pinyin, strips non-alphanumerics, and joins with hyphens.
+ */
+function convertSlugToPinyin(text) {
+  const decodedText = decodeURIComponent(text);
+  if (!/[\u4e00-\u9fa5]/.test(decodedText)) return text;
+  return pinyin(decodedText, {
+    toneType: "none",
+    type: "array",
+    nonZh: "consecutive",
+  })
+    .map((t) => t.toLowerCase().replace(/[^a-z0-9]/g, ""))
+    .filter(Boolean)
+    .join("-");
+}
+
+/**
+ * Strip the locale and extension suffixes from a filename to recover the stem that Fumadocs uses as the slug.
+ *   2309兼具大小写的最好英文字母_translated.md → 2309兼具大小写的最好英文字母_translated
+ *   2241-design-an-atm-machine.zh.md → 2241-design-an-atm-machine
+ *   [146]LRU 缓存_translated.md → [146]LRU 缓存_translated
+ */
+function stripSuffix(filename) {
+  let stem = filename.replace(/\.(md|mdx)$/i, "");
+  stem = stem.replace(/\.(en|zh)$/i, "");
+  return stem;
+}
+
+function main() {
+  if (!fs.existsSync(LEETCODE_DIR)) {
+    console.error(`[leetcode-slug-map] directory not found: ${LEETCODE_DIR}`);
+    process.exit(1);
+  }
+
+  const files = fs
+    .readdirSync(LEETCODE_DIR)
+    .filter((f) => /\.(md|mdx)$/i.test(f));
+
+  const map = {};
+  const collisions = [];
+
+  for (const file of files) {
+    const stem = stripSuffix(file);
+    const pinyinSlug = convertSlugToPinyin(stem);
+    if (pinyinSlug === stem) continue; // no Chinese, no mapping needed
+    if (map[stem] && map[stem] !== pinyinSlug) {
+      collisions.push({ stem, existing: map[stem], incoming: pinyinSlug });
+    }
+    map[stem] = pinyinSlug;
+  }
+
+  if (collisions.length) {
+    console.warn(
+      `[leetcode-slug-map] detected ${collisions.length} slug collision(s):`,
+      collisions,
+    );
+  }
+
+  fs.mkdirSync(path.dirname(OUTPUT_FILE), { recursive: true });
+  fs.writeFileSync(OUTPUT_FILE, JSON.stringify(map, null, 2) + "\n", "utf8");
+
+  console.log(
+    `[leetcode-slug-map] generated ${Object.keys(map).length} mapping(s) → ${path.relative(PROJECT_ROOT, OUTPUT_FILE)}`,
+  );
+}
+
+main();
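
The collision check in `main()` is keyed on the stem, and the same stem always yields the same pinyin slug, so as written it would not seem to flag two different filenames that clean down to one slug. A possible value-side check is sketched below. `findSlugCollisions` is a hypothetical helper, not part of the commit, and the second key in the sample map is a made-up filename used only to force a clash.

```typescript
// Hypothetical helper: groups stems by their generated slug and reports
// every slug that is claimed by more than one stem.
function findSlugCollisions(map: Record<string, string>): Map<string, string[]> {
  const bySlug = new Map<string, string[]>();
  for (const stem of Object.keys(map)) {
    const slug = map[stem];
    const stems = bySlug.get(slug) ?? [];
    stems.push(stem);
    bySlug.set(slug, stems);
  }

  const collisions = new Map<string, string[]>();
  bySlug.forEach((stems, slug) => {
    if (stems.length > 1) collisions.set(slug, stems);
  });
  return collisions;
}

// Sample data: the first and third entries come from the generated JSON;
// the second is an invented filename that differs only in punctuation.
const sampleSlugMap: Record<string, string> = {
  "142.环形链表II_translated": "142-huan-xing-lian-biao-iitranslated",
  "142环形链表II_translated": "142-huan-xing-lian-biao-iitranslated",
  "46.全排列": "46-quan-pai-lie",
};

const slugCollisions = findSlugCollisions(sampleSlugMap);
slugCollisions.forEach((stems, slug) => {
  console.warn(`slug "${slug}" is shared by: ${stems.join(", ")}`);
});
```

For the redirect map itself a shared slug is harmless, since both old URLs would 301 to the same page. The clash matters on the page side, where two source files would compete for one URL, so a check like this may belong next to the runtime `convertSlugToPinyin` as well.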
