refactor: split site (Node) and automation (Python) responsibilities#63
kimchanhyung98 wants to merge 4 commits into develop
Conversation
Move scripts/markdown-link-utils.mjs and the translation-only validators into .github/docs-updater so the update-docs workflow can fail before committing translated output. structure_validator now runs at the end of main.py with the same JS semantics (anchor / heading / internal-link diff). Debugging CLIs (find_link_context, find_missing_links) move with their dependency. Co-Authored-By: Claude <noreply@anthropic.com>
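The "same JS semantics" check described above boils down to a set diff over each structural dimension (anchors, headings, internal links) between source and translation. A minimal sketch of that idea, not the actual `structure_validator` code:

```python
def structure_diff(
    source_items: list[str], translated_items: list[str]
) -> dict[str, list[str]]:
    """Diff one structural dimension (anchors, headings, or internal links)
    between a source document and its translation. Items only in the source
    are 'missing' from the translation; items only in the translation are 'extra'."""
    src, dst = set(source_items), set(translated_items)
    return {"missing": sorted(src - dst), "extra": sorted(dst - src)}
```

Running this once per dimension and failing on any non-empty result is enough to block a commit of structurally divergent translated output.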
build_redirect_generator generates the unversioned -> latest stable redirect HTML for each locale (both /docs/<slug>/index.html and /docs/<slug>.html shapes). build_anchor_validator checks that every markdown #fragment in versioned_docs/ resolves to an actual id in the built HTML, including the .html cleanUrls variant. deploy.yml now sets up uv alongside Node and runs both tools after npm run build, while update-docs.yml is Node-free and Python only. Also drops the matching scripts/*.mjs and the prebuild/postbuild hook in package.json. Co-Authored-By: Claude <noreply@anthropic.com>
Drop prebuild/postbuild hooks, sync:versions, and validate-anchors from package.json so site tooling no longer reaches into the automation domain. serve still ships its own static server (handles the dotted 13.x directory and .html cleanUrls fallback) and serve:docusaurus stays as the upstream baseline. playwright.config.ts now boots npm run start for the e2e webServer; tests that depend on prod-build behaviour either verify anchor mapping directly (docs-rendering) or wait for hydration (homepage), and the build-output existence assertions move out (build exit code is the source of truth). Co-Authored-By: Claude <noreply@anthropic.com>
workflow.md spells out the responsibility boundary between the Node site (hosting + landing page) and the Python automation (.github/docs-updater + workflows). README mirrors the deploy/update split so contributors hit the right pipeline. test_project_boundaries fails if any npm script ever calls into .github/ or python so the divide cannot regress silently. Co-Authored-By: Claude <noreply@anthropic.com>
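The boundary enforcement can be as simple as scanning `package.json` scripts for forbidden substrings. A hedged sketch of the idea (the real `test_project_boundaries` may be stricter):

```python
def check_npm_scripts(scripts: dict[str, str]) -> list[str]:
    """Return the names of npm scripts that cross the site/automation boundary
    by referencing .github/ or invoking python (illustrative substring check)."""
    return [
        name
        for name, cmd in scripts.items()
        if ".github/" in cmd or "python" in cmd
    ]
```

A pytest wrapper would load `package.json` and assert the returned list is empty, naming any offending script in the failure message.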
Code Review
This pull request refactors the documentation update and deployment workflow by migrating legacy Node.js scripts to a Python-based automation suite. The changes introduce robust tools for anchor validation, redirect generation, and structural consistency checks between source and translated documents. Review feedback highlights opportunities to improve Markdown parsing for nested brackets, refine anchor and heading extraction using regular expressions to prevent false positives, and broaden external URL detection to support additional protocols.
```python
def extract_markdown_links(text: str) -> list[MarkdownLink]:
    """Extract every `[label](url)` link. Ignores fenced/inline code."""
    stripped = strip_code(text)
    links: list[MarkdownLink] = []
    i = 0
    length = len(stripped)
    while i < length:
        label_start = stripped.find("[", i)
        if label_start < 0:
            break
        label_end = stripped.find("]", label_start + 1)
        if label_end < 0:
            break
        if label_end + 1 >= length or stripped[label_end + 1] != "(":
            i = label_end + 1
            continue
        url_end = stripped.find(")", label_end + 2)
        if url_end < 0:
            break
        url = _strip_title_suffix(stripped[label_end + 2 : url_end])
        if url:
            links.append(
                MarkdownLink(
                    text=stripped[label_start + 1 : label_end],
                    url=url,
                )
            )
        i = url_end + 1
    return links
```
The current loop-based parsing logic does not correctly handle links whose label contains square brackets, such as `[label with [nested] brackets](url)` (the label is treated as ending at the first `]`). A regular expression can extract links, including one level of nested brackets, more reliably.
```python
def extract_markdown_links(text: str) -> list[MarkdownLink]:
    """Extract every `[label](url)` link. Ignores fenced/inline code."""
    stripped = strip_code(text)
    links: list[MarkdownLink] = []
    # Regex that allows one level of nested square brackets in the label
    pattern = r'\[((?:[^\[\]]|\[[^\[\]]*\])*)\]\(([^)]+)\)'
    for match in re.finditer(pattern, stripped):
        label, raw_url = match.groups()
        url = _strip_title_suffix(raw_url)
        if url:
            links.append(MarkdownLink(text=label, url=url))
    return links
```

```python
def extract_anchors(text: str) -> list[str]:
    """Extract explicit `<a name="...">` anchors, excluding code regions."""
    anchors: list[str] = []
    stripped = strip_code(text)
    index = 0
    length = len(stripped)
    while index < length:
        tag_start = stripped.find("<a", index)
        if tag_start < 0:
            break
        tag_end = stripped.find(">", tag_start + 2)
        if tag_end < 0:
            break
        tag = stripped[tag_start : tag_end + 1]
        name_pos = tag.find("name=")
        if name_pos >= 0:
            quote_index = name_pos + len("name=")
            if quote_index < len(tag):
                quote = tag[quote_index]
                if quote in ('"', "'"):
                    value_start = quote_index + 1
                    value_end = tag.find(quote, value_start)
                    if value_end >= 0:
                        anchors.append(tag[value_start:value_end])
        index = tag_end + 1
    return anchors
```
In `extract_anchors`, `stripped.find("<a", index)` can also match elements that are not `<a>` tags, such as `<area>` or `<address>`. Likewise, searching for `name=` risks matching unintended attributes such as `data-name=`. It is safer to use a regular expression that extracts only the `name` attribute of actual `<a>` tags.
Suggested change:

```python
def extract_anchors(text: str) -> list[str]:
    """Extract explicit `<a name="...">` anchors, excluding code regions."""
    stripped = strip_code(text)
    # `<a\s` rules out <area>/<address>; (?<![-\w]) keeps data-name= from matching
    return re.findall(
        r'<a\s[^>]*?(?<![-\w])name=["\']([^"\']+)["\']', stripped, re.IGNORECASE
    )
```
```python
def extract_headings(text: str) -> list[Heading]:
    """Extract only ATX headings starting with `#`. Setext headings are not handled."""
    stripped = strip_code(text)
    headings: list[Heading] = []
    for line in stripped.split("\n"):
        level = 0
        while level < len(line) and line[level] == "#":
            level += 1
        if level < 1 or level > 6:
            continue
        if level >= len(line):
            continue
        if line[level] not in (" ", "\t"):
            continue
        headings.append(Heading(level=level, text=line[level + 1 :].strip()))
    return headings
```
`extract_headings` does not account for leading whitespace at the start of a line. The Markdown spec (CommonMark) allows up to three spaces before a heading, so this should be handled with `lstrip()`. The level check is also reinforced to prevent index-out-of-range errors.
Suggested change:

```python
def extract_headings(text: str) -> list[Heading]:
    """Extract only ATX headings starting with `#`. Setext headings are not handled."""
    stripped = strip_code(text)
    headings: list[Heading] = []
    for line in stripped.split("\n"):
        line = line.lstrip()
        level = 0
        while level < len(line) and line[level] == "#":
            level += 1
        if level < 1 or level > 6:
            continue
        if level >= len(line) or line[level] not in (" ", "\t"):
            continue
        headings.append(Heading(level=level, text=line[level + 1 :].strip()))
    return headings
```
```python
def _is_external_url(href: str) -> bool:
    lower = href.lower()
    return (
        lower.startswith("http://")
        or lower.startswith("https://")
        or lower.startswith("mailto:")
    )
```
```python
        html_path,
        html_path.read_text(encoding="utf-8"),
    )
if f'id="{anchor}"' in html:
```
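The cleanUrls variant mentioned in the commit description means each URL path can map to two files on disk. A hypothetical helper (names are illustrative) showing the lookup the validator has to perform before checking `id="…"` in the HTML:

```python
from pathlib import Path


def candidate_html_paths(build_root: Path, url_path: str) -> list[Path]:
    """Hypothetical helper: with cleanUrls, /docs/13.x/requests may be built
    as either .../requests/index.html or .../requests.html."""
    rel = url_path.strip("/")
    return [build_root / rel / "index.html", build_root / (rel + ".html")]
```

The validator would read whichever candidate exists and only then test for the anchor's `id`.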
🤖 Augment PR Summary

Summary: Refactors the repo so the Docusaurus site (Node) and the docs automation pipeline (Python) have a strict responsibility boundary. Key changes:

Technical notes: Introduces a boundary test to prevent npm scripts from invoking Python or reaching into `.github/`.
```python
if not url.startswith(prefix):
    return None
end = url.find("/", len(prefix))
return url[len(prefix) : end] if end >= 0 else None
```
`docs_version_from_url()` returns `None` for URLs like `/docs/13.x` (no trailing slash), which is exactly what `to_url_path()` produces for `installation.md`. That makes `src_version` `None` and skips `{{version}}` replacement / relative-link version prefixing, which can mis-resolve targets and cause incorrect anchor-validation failures.
Severity: high
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR refactors the docs site and automation tooling by moving translation structure checks and build-artifact validations from Node scripts into a dedicated Python pipeline under .github/docs-updater, and simplifying the site’s npm scripts/e2e setup accordingly.
Changes:
- Migrates translation structure validation, anchor validation, and redirect generation from `scripts/*.mjs` into Python modules with pytest coverage.
- Simplifies `package.json` scripts and updates workflows so `update-docs` becomes Python-only while `deploy` runs Node build + Python post-processing/validation.
- Adjusts Playwright to run against the dev server (`npm run start`) and updates e2e specs for the new responsibility split.
Reviewed changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| scripts/validate-translation-structure.mjs | Removed (validation moved to Python). |
| scripts/validate-anchors.mjs | Removed (validation moved to Python). |
| scripts/sync-versioned-links.mjs | Removed (logic covered in Python main.py). |
| scripts/serve-build.mjs | Minor arg parsing rename for clarity. |
| scripts/markdown-link-utils.mjs | Removed (ported to Python utils). |
| scripts/find-missing-links.mjs | Removed (ported to Python CLI). |
| scripts/find-link-context.mjs | Removed (ported to Python CLI). |
| scripts/create-latest-doc-redirects.mjs | Removed (redirect generation moved to Python). |
| playwright.config.ts | Switches e2e webServer from static build to dev server. |
| package.json | Removes automation hooks; keeps Docusaurus/site-only scripts. |
| e2e/homepage.spec.ts | Adds wait to avoid hydration timing flakiness. |
| e2e/docs-rendering.spec.ts | Replaces hash-scroll assertion with rendered heading presence check. |
| e2e/build.spec.ts | Removed (build existence asserted elsewhere). |
| README.md | Updates workflow responsibility description to match refactor. |
| .github/workflows/update-docs.yml | Removes Node steps; runs Python pipeline only. |
| .github/workflows/deploy.yml | Adds uv/Python steps; runs Python redirect + anchor validation post-build. |
| .github/docs-updater/tests/test_structure_validator.py | New pytest coverage for structure validation parity. |
| .github/docs-updater/tests/test_project_boundaries.py | New test enforcing no Python/.github calls from npm scripts. |
| .github/docs-updater/tests/test_markdown_link_utils.py | New pytest coverage for markdown parsing utilities. |
| .github/docs-updater/tests/test_main.py | Adds regression tests for latest-stable API link handling. |
| .github/docs-updater/tests/test_build_redirect_generator.py | New pytest coverage for redirect generation output. |
| .github/docs-updater/tests/test_build_anchor_validator.py | New pytest coverage for built-anchor validation. |
| .github/docs-updater/structure_validator.py | New Python implementation of translation structure validation + reporting. |
| .github/docs-updater/markdown_link_utils.py | New Python markdown parsing utilities (ported from Node). |
| .github/docs-updater/main.py | Integrates structure validation + latest-stable sidebar behavior. |
| .github/docs-updater/find_missing_links.py | New Python CLI for missing/extra link debugging. |
| .github/docs-updater/find_link_context.py | New Python CLI for finding missing link context. |
| .github/docs-updater/build_redirect_generator.py | New Python build redirect generator for latest stable docs. |
| .github/docs-updater/build_anchor_validator.py | New Python validator for markdown anchors vs built HTML ids. |
| .github/docs-updater/.ai-context/workflow.md | Expanded documentation describing the new responsibility split. |
```python
def to_url_path(docs_root: Path, md_path: Path) -> str:
    parts = md_path.relative_to(docs_root).parts
    version = parts[0].removeprefix("version-")
    tail = "/".join(parts[1:])[:-3]
    if tail == "installation":
        return f"/docs/{version}"
    return f"/docs/{version}/{tail}"
```
```python
if not url.startswith(prefix):
    return None
end = url.find("/", len(prefix))
return url[len(prefix) : end] if end >= 0 else None
```
```python
def test_to_url_path_treats_installation_as_version_root(tmp_path: Path):
    docs_root = tmp_path / "versioned_docs"
    md = docs_root / "version-13.x" / "installation.md"
    md.parent.mkdir(parents=True)
    md.write_text("# Installation\n", encoding="utf-8")
    assert bav.to_url_path(docs_root, md) == "/docs/13.x"
```
```python
if part.startswith("version-"):
    return part[len("version-") :]
```
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e4dde103da
```python
if not url.startswith(prefix):
    return None
end = url.find("/", len(prefix))
return url[len(prefix) : end] if end >= 0 else None
```
Handle version root URLs in docs version parser
`docs_version_from_url` returns `None` for URLs like `/docs/13.x` because it only succeeds when another `/` exists after the version segment. In this commit `to_url_path` now emits exactly `/docs/<version>` for `installation.md`, so any relative anchor link from that page (for example `requests#...`) is resolved without a version prefix and is reported as missing HTML even when `build/docs/<version>/requests(.html|/index.html)` exists. This creates false failures in deploy-time anchor validation for installation-page links.
Summary

Splits the translation-automation and site-build tooling, previously mixed between `scripts/*.mjs` and `package.json` hooks, into two responsibility domains.

- `package.json` removes the prebuild/postbuild, `validate-anchors`, and `sync:versions` scripts.
- Automation (`.github/docs-updater`, Python): owns translation, sidebars, translation structure validation, built-artifact anchor validation, and redirect HTML generation for unversioned paths.
- `update-docs.yml` is Python-only; `deploy.yml` runs Node typecheck/build, then Python redirect generation / anchor validation → Pages upload / deploy.

Key changes

- `scripts/markdown-link-utils.mjs`, `validate-translation-structure.mjs`, `find-link-context.mjs`, `find-missing-links.mjs` → ported to Python. The final step of `main.py` calls `structure_validator`.
- `scripts/create-latest-doc-redirects.mjs`, `validate-anchors.mjs`, `sync-versioned-links.mjs` → moved to `build_redirect_generator.py` / `build_anchor_validator.py`. Master-sidebar API-link normalization is absorbed into the `latest_stable` handling of `parse_documentation_md`.
- `scripts/serve-build.mjs` handles the dotted `13.x` directory, trailing slashes, and the `.html` cleanUrls fallback.
- `playwright.config.ts`: the e2e webServer now runs on top of `npm run start` (dev server).
- `e2e/build.spec.ts` removed (the existence of `build/` is guaranteed by the build command itself). Hash-scroll verification became anchor-mapping verification due to dev-server limitations.
- `.github/docs-updater/.ai-context/workflow.md` spells out the responsibility-split model, the mermaid flow, and local validation commands.
- `tests/test_project_boundaries.py` fails if any npm script contains a `.github/` or `python` call.

Verification results

- `cd .github/docs-updater && uv run pytest -q` → 85 passed
- `npm run typecheck -- --pretty false` → 0 errors
- `npm run build` → 0 warnings, 0 errors
- `cd .github/docs-updater && uv run python build_redirect_generator.py` → 101 redirects
- `cd .github/docs-updater && uv run python build_anchor_validator.py` → 23250/23250 OK
- `npm run test:e2e` → 70 passed
- `/docs/13.x/`, `/docs/13.x/upgrade#upgrade-13.0`, `/docs/pulse/`, `/en/docs/pulse`: zero console errors on all; anchors and redirects resolve correctly.

Test plan

- `cd .github/docs-updater && uv run pytest -q`
- `npm run typecheck -- --pretty false`
- `npm run build`
- `cd .github/docs-updater && uv run python build_redirect_generator.py`
- `cd .github/docs-updater && uv run python build_anchor_validator.py`
- `npm run test:e2e`
cd .github/docs-updater && uv run pytest -qnpm run typecheck -- --pretty falsenpm run buildcd .github/docs-updater && uv run python build_redirect_generator.pycd .github/docs-updater && uv run python build_anchor_validator.pynpm run test:e2e