Add docs-lint workflow and fix table issues across the docs tree #22

Open
chrisdp wants to merge 11 commits into rokudev:v2.0 from chrisdp:fix-table-issues

@chrisdp chrisdp commented May 5, 2026

Summary

Adds a markdown table linter that runs on PRs as a GitHub Actions workflow, and fixes every issue it surfaces across the existing docs tree. Brings the repo to 0 lint issues across 664 files.

What the linter catches

Severity  Rule                     What it catches
error     pipe-col-count           row column count mismatch
error     pipe-no-blank-above      pipe table without blank line above (won't render as a table)
error     escaped-html-in-cell     cell content like \<ul> rendering as literal text
warning   adjacent-tables          two pipe tables with same column count and no heading between
warning   html-blank-between-tags  blank between sibling <tr>/<td>s (breaks GitHub/IDE preview)
warning   html-blank-in-cell       blank in cell content (breaks GitHub/IDE preview)

Tooling lives at package.json (repo root) and .github/scripts/docs-lint/. The workflow runs only on changed .md files in synced paths (docs/, reference/, custom_pages/, custom_blocks/) and emits ::error/::warning workflow commands so issues appear inline on the PR diff. Errors fail the check; warnings don't.
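
As a rough illustration of the annotation output described above, here is a minimal sketch of how a reporter could format issues as GitHub Actions workflow commands. The names `formatAnnotation` and `report` and the issue shape (`file`, `line`, `severity`, `rule`, `message`) are assumptions for this example, not the contents of lib/report.mjs. The escaping follows the documented workflow-command rules.

```javascript
// Sketch only: function names and the issue shape are hypothetical.

// Escape message data for a workflow command.
function escapeData(value) {
  return String(value)
    .replace(/%/g, "%25")
    .replace(/\r/g, "%0D")
    .replace(/\n/g, "%0A");
}

// Property values additionally need ':' and ',' escaped.
function escapeProperty(value) {
  return escapeData(value).replace(/:/g, "%3A").replace(/,/g, "%2C");
}

// Build one "::error ..." or "::warning ..." line for an issue.
function formatAnnotation(issue) {
  const level = issue.severity === "error" ? "error" : "warning";
  const props = [
    `file=${escapeProperty(issue.file)}`,
    `line=${issue.line}`,
    `title=${escapeProperty(issue.rule)}`,
  ].join(",");
  return `::${level} ${props}::${escapeData(issue.message)}`;
}

// Print issues and return an exit code: errors fail the check, warnings don't.
function report(issues, inActions = Boolean(process.env.GITHUB_ACTIONS)) {
  for (const issue of issues) {
    const plain = `${issue.file}:${issue.line} ${issue.severity} ${issue.rule} ${issue.message}`;
    console.log(inActions ? formatAnnotation(issue) : plain);
  }
  return issues.some((i) => i.severity === "error") ? 1 : 0;
}
```

A caller would pass the value returned by `report` to `process.exit` so that only errors fail the job.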

See this PR for an example of lint issues being surfaced.

Doc fixes the linter surfaced

  • 5 real visible render bugs: escaped \<ul>/\<script> rendering as literal text in cells, and a pipe table missing the blank line above it that it needs in order to render as a table.
  • 11 missing subsection headings in stacked tables (Channel types / Track assignments labels in roku-originals specs, an XPath label in the Spanish ingest spec).
  • 1,381 blank lines between sibling HTML tags removed across 89 files. ReadMe renders these fine; GitHub and IDE markdown previews end the HTML block at the first blank line and break the table from there.
  • 79 in-cell content blanks converted to portable HTML: <br /><br /> for paragraph breaks, <ul>/<li> for bulleted lists, <dl>/<dt>/<dd> for term-definition pairs, <pre><code class="language-X"> for embedded code so Prism still highlights.
  • A handful of inline grammar typos fixed in-place where they were already in flight (e.g. "may be on the following" → "may be one of the following"; one mangled markdown link in continue-watching-cloud.md rewritten).

Markdown cross-doc links ([text](doc:slug)) and external links were preserved in markdown form to match the existing repo convention.

Why now

Many of these patterns render correctly on the published ReadMe site but break in stricter renderers — GitHub's markdown preview, in-IDE markdown previews, and downstream tooling. Surfacing them on PRs prevents new ones from landing.

chrisdp added 11 commits May 5, 2026 17:45
Adds a node-based linter for table issues across the synced docs paths
(docs/, reference/, custom_pages/, custom_blocks/). Runs via npm:

    npm run lint:tables                    # all files
    npm run lint:tables -- file [file...]  # specific files

Rules and severity:
  error    pipe-col-count        row column count mismatch
  error    pipe-no-blank-above   table won't render without blank line above
  error    pipe-split            single table split by stray blank line
  error    escaped-html-in-cell  cell content like \<ul> renders as literal text
  warning  html-blank-row        blank line after </tr> breaks GitHub/IDE preview
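
To make the first rule concrete, here is a minimal plain-text sketch of a column-count check. It is an approximation for illustration: the linter itself reads tables from mdast, and `splitRow` and `checkColCount` are hypothetical names. It assumes the caller has already isolated the lines of one pipe table.

```javascript
// Sketch only: a source-text approximation of pipe-col-count.

// Split one pipe-table row into cells, treating "\|" as an escaped pipe.
function splitRow(line) {
  let row = line.trim();
  if (row.startsWith("|")) row = row.slice(1);
  if (row.endsWith("|") && !row.endsWith("\\|")) row = row.slice(0, -1);
  return row.split(/(?<!\\)\|/).map((cell) => cell.trim());
}

// tableLines: header row, delimiter row, then body rows.
// startLine: 1-based line number of the header row in the file.
function checkColCount(tableLines, startLine) {
  const expected = splitRow(tableLines[0]).length;
  const issues = [];
  tableLines.forEach((line, i) => {
    const got = splitRow(line).length;
    if (got !== expected) {
      issues.push({
        line: startLine + i,
        severity: "error",
        rule: "pipe-col-count",
        message: `row has ${got} cells, expected ${expected}`,
      });
    }
  });
  return issues;
}
```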

Implementation uses unified+remark-parse+remark-gfm to find pipe tables
in mdast; HTML-table checks scan source text directly because remark
fragments multi-line HTML blocks on blank lines.

Runs the table linter on PRs that touch markdown in synced doc paths or
the linter itself. Steps:

  1. Diff base..head to find changed .md files in docs/, reference/,
     custom_pages/, custom_blocks/.
  2. Run lint-tables.mjs on those files.
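
Step 1 can be sketched as a filter over a list of changed paths. This assumes the list arrives as newline-separated text (for example from a name-only git diff); the exact command the workflow uses is not shown here, and `selectSyncedMarkdown` is a hypothetical name.

```javascript
// Sketch only: keeps changed .md files that live under the synced doc paths.
const SYNCED_DIRS = ["docs/", "reference/", "custom_pages/", "custom_blocks/"];

function selectSyncedMarkdown(diffOutput) {
  return diffOutput
    .split("\n")
    .map((path) => path.trim())
    .filter((path) => path.endsWith(".md"))
    .filter((path) => SYNCED_DIRS.some((dir) => path.startsWith(dir)));
}
```

The resulting array would be passed to the linter as its file arguments; an empty array means there is nothing to lint on that PR.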

The script emits ::error / ::warning workflow commands when GITHUB_ACTIONS
is set, so issues appear inline as PR annotations on changed lines.
Errors fail the check; warnings don't.

Restructures the table linter as a generic docs-lint orchestrator so
future rules (link checks, smart quotes, etc.) can be added as drop-in
modules without spawning new workflows or npm scripts.

Layout:
  .github/scripts/docs-lint/
    index.mjs          orchestrator
    lib/report.mjs     shared reporter + GitHub annotation emit
    rules/tables.mjs   table rules (extracted from lint-tables.mjs)

Adding a new rule module:
  1. Drop a file in rules/ exporting `id` and `check({...})`.
  2. Import + register it in index.mjs's RULES list.
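
A rough sketch of what such a drop-in module might look like, using the smart-quotes idea mentioned above. The argument shape passed to `check` (`file`, `text`), the issue fields, and `runRules` are assumptions for illustration; only `id`, `check`, and the RULES list come from the layout described here. The rule is inlined as an object so the sketch runs as one file; in rules/ it would use `export`.

```javascript
// Sketch only: a hypothetical rule, not one that exists in this PR.
const smartQuotes = {
  id: "smart-quotes",
  // Assumed signature: receives the file path and its text, returns issues.
  check({ file, text }) {
    const issues = [];
    text.split("\n").forEach((line, i) => {
      if (/[\u2018\u2019\u201C\u201D]/.test(line)) {
        issues.push({
          file,
          line: i + 1,
          severity: "warning",
          rule: "smart-quotes",
          message: "curly quote found",
        });
      }
    });
    return issues;
  },
};

// Registering a rule is one more entry in this list.
const RULES = [smartQuotes];

// Run every registered rule against one file and collect the issues.
function runRules(file, text) {
  return RULES.flatMap((rule) => rule.check({ file, text }));
}
```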

Workflow renamed: lint-tables.yml -> docs-lint.yml.
npm script renamed: lint:tables -> lint:docs.

The previous refactor commit renamed the workflow file but didn't update
its contents. This updates the workflow name, job name, paths filter, and
script invocation to match the new docs-lint orchestrator layout.

Five fixes flagged by docs-lint as errors:

- ifscreen.md: add blank line above the methods table so it renders.
- static-analysis-tool/index.md: convert pipe table with escaped \<ul>/\<li>
  /\<strong> to HTML so list markup renders instead of literal text.
- channel-manifest.md: replace \<script> escape with inline-code `<script>`
  in the rsg_version description cell.
- scenegraph/xml-elements/component.md: replace \<children\> and \<script\>
  escapes with inline-code `<children>` / `<script>` in the name attribute
  description cell.
- The Roku Channel/video-on-demand/delivery/ingest-specifications-spanish.md:
  remove stray backslashes in 223 \<br/> separators in the genre table row.

Rename and downgrade the rule that flagged consecutive pipe tables.

Renamed: pipe-split -> adjacent-tables.
Severity: error -> warning.

The original rule fired for any two adjacent pipe tables with the same
column count. In practice most cases were not accidental splits but
distinct tables that share shape — the right fix is usually a heading
between them, not a structural repair. Reframe accordingly.

Six files had two or more adjacent pipe tables stacked with no heading or
label between them, leaving readers to infer what each table covered:

- The Roku Channel/video-on-demand/roku-originals/{acquisitions-specs,
  features-specs, post-alternative-specs, post-scripted-specs,
  post-spanish-specs}.md: add **Channel types** and **Track assignments**
  bold labels above the second and third tables in the master audio
  deliverables section.
- The Roku Channel/video-on-demand/delivery/ingest-specifications-spanish.md:
  add **Atributo XPath** label above the cuePoint XPath table.

Bold paragraph labels match the existing convention in these files (e.g.
"**Atributo de tipo de cuePoint**") and don't disturb the heading hierarchy.

Generalize the html-blank-row rule into two rules:

  warning  html-blank-between-tags  blank between two HTML tags inside a <table> —
                                    safe to delete (auto-fixable).
  warning  html-blank-in-cell       blank between cell content (text, blockquote,
                                    list) — needs editorial judgment to convert
                                    to <br /> or restructure.

Both rules report the same underlying portability issue: CommonMark/GFM ends an
HTML block at the first blank line, so a blank inside <table>...</table> breaks
the rest of the table in stricter renderers (GitHub, IDE previews) regardless
of what surrounds it. ReadMe's renderer is more permissive, so published pages
look fine; this is a portability/preview cleanup, not a production fix.
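
The split between the two rules can be sketched as a source-text scan. This is a simplified illustration, not the code in rules/tables.mjs: `classifyTableBlanks` is a hypothetical name, and the sketch ignores code fences for brevity.

```javascript
// Sketch only: classifies each blank line found inside a <table> block.
const countMatches = (line, re) => (line.match(re) || []).length;

// Return the nearest non-blank line in the given direction, or "" if none.
function nearestNonBlank(lines, from, step) {
  for (let i = from + step; i >= 0 && i < lines.length; i += step) {
    if (lines[i].trim() !== "") return lines[i];
  }
  return "";
}

function classifyTableBlanks(text) {
  const lines = text.split("\n");
  const issues = [];
  let depth = 0; // nesting depth of open <table> elements
  lines.forEach((line, i) => {
    if (line.trim() === "") {
      if (depth === 0) return; // blank outside any table: not our concern
      const prev = nearestNonBlank(lines, i, -1).trimEnd();
      const next = nearestNonBlank(lines, i, 1).trimStart();
      const betweenTags = prev.endsWith(">") && next.startsWith("<");
      issues.push({
        line: i + 1,
        severity: "warning",
        rule: betweenTags ? "html-blank-between-tags" : "html-blank-in-cell",
      });
      return;
    }
    depth += countMatches(line, /<table[\s>]/gi);
    depth -= countMatches(line, /<\/table>/gi);
  });
  return issues;
}
```

Because a single-line `<table>...</table>` opens and closes on the same line, it never leaves the depth above zero and produces no issues.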

Add an auto-fixer for the easy case:

    npm run fix:html-blank-rows

The fixer mirrors html-blank-between-tags: removes blanks where the previous
non-blank line ends with `>` and the next non-blank starts with `<`. Single-
line tables (`<table>...</table>` on one line) are skipped. Code-fence aware.

Bulk auto-fix from `npm run fix:html-blank-rows`. Removes blank lines
inside <table> blocks where the previous non-blank line ends with `>`
and the next non-blank starts with `<` (i.e., between sibling <tr>s,
<td>s, <th>s, <thead>/<tbody>, etc.).

These blanks render fine on ReadMe but break GitHub and in-IDE markdown
previews because CommonMark/GFM ends an HTML block at the first blank
line. Pure deletions — no content changed.
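
For reference, the deletion rule described above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the fixer that was run: `removeBlanksBetweenTags` is a hypothetical name, and fence detection is limited to lines that start with three backticks or tildes.

```javascript
// Sketch only: drops blank lines inside <table> blocks when the previous
// non-blank line ends with ">" and the next non-blank line starts with "<".
const FENCE_RE = new RegExp("^\\s*(`{3}|~{3})");

function removeBlanksBetweenTags(text) {
  const lines = text.split("\n");
  const out = [];
  let depth = 0; // nesting depth of open <table> elements
  let inFence = false; // inside a markdown code fence
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    if (FENCE_RE.test(line)) inFence = !inFence;
    if (!inFence && depth > 0 && line.trim() === "") {
      let p = i - 1;
      while (p >= 0 && lines[p].trim() === "") p--;
      let n = i + 1;
      while (n < lines.length && lines[n].trim() === "") n++;
      const prev = p >= 0 ? lines[p].trimEnd() : "";
      const next = n < lines.length ? lines[n].trimStart() : "";
      if (prev.endsWith(">") && next.startsWith("<")) continue; // delete it
    }
    if (!inFence) {
      depth += (line.match(/<table[\s>]/gi) || []).length;
      depth -= (line.match(/<\/table>/gi) || []).length;
    }
    out.push(line);
  }
  return out.join("\n");
}
```

Blank lines that sit next to cell text are left alone, which matches the split between the auto-fixable rule and the one that needs editorial judgment.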

89 files, 3128 lines removed. Includes one manual fix in
roevpcipher.md (Blowfish cell, blank between text and blockquote
note → replaced with <br /><br />).

Fixes 79 html-blank-in-cell warnings across 23 files. Each was a blank
line inside an HTML <table> cell that surrounded markdown content
(bulleted list, code fence, blockquote, term/definition pair, or two
paragraphs of prose). CommonMark/GFM ends an HTML block at the first
blank line, so the table breaks in stricter renderers — ReadMe is fine,
GitHub and IDE previews are not.

Patterns applied per cell content type:

  intro: + bulleted list   ->  intro:<br /><br /><ul><li>...</li></ul>
  intro: + nested table    ->  intro:<br /><br /><table>...</table>
  intro: + code fence      ->  intro:<br /><br /><pre><code class="language-X">...</code></pre>
  two paragraphs of prose  ->  para1<br /><br />para2
  term + definition pairs  ->  <dl><dt>term</dt><dd>def</dd>...</dl>
  trailing blank in cell   ->  removed entirely

Markdown-inside-cell elements were converted to their HTML equivalents
where applicable: **bold** -> <strong>, `code` -> <code>, *italic* -> <em>.
Code fences inside cells were converted to <pre><code class="language-X">...
</code></pre> with HTML entities for angle brackets so the snippets still
highlight via Prism.js. Cross-doc links kept their existing markdown form
[text](doc:slug) — that's the established convention in this repo.
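
The code-fence pattern above can be sketched as a small transform. The conversions in this pass were applied by hand, so this is only an illustration: `escapeHtml`, `fenceToPre`, and `cellToHtml` are hypothetical names, and the sketch handles only backtick fences with an optional language tag.

```javascript
// Sketch only: turns "intro: + code fence" cell content into portable HTML.
const FENCE = "`".repeat(3);
const FENCE_BLOCK = new RegExp(FENCE + "(\\w+)?\\n([\\s\\S]*?)\\n" + FENCE, "g");

// Replace the characters that would otherwise be parsed as markup.
function escapeHtml(code) {
  return code.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Convert each fenced block to <pre><code class="language-X">...</code></pre>.
function fenceToPre(markdown) {
  return markdown.replace(FENCE_BLOCK, (_match, lang, code) => {
    const cls = lang ? ` class="language-${lang}"` : "";
    return `<pre><code${cls}>${escapeHtml(code)}</code></pre>`;
  });
}

// Also replace the blank line before the block with <br /><br />.
function cellToHtml(cell) {
  return fenceToPre(cell).replace(/\n\s*\n(?=<pre>)/g, "<br /><br />");
}
```

Escaping `&` first keeps entities that are already in the snippet from being double-decoded by the browser.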

A few inline grammar typos surfaced during this pass were fixed in place
(e.g., "may be on the following" -> "may be one of the following";
"Contains an a" -> "Contains a"; one badly-mangled markdown link in
continue-watching-cloud.md was rewritten to use plain inline code).

The fixer was scaffolding for the initial bulk cleanup pass. Going
forward, the docs-lint workflow flags new violations on PR and
contributors fix them inline. Keeping the linter as the durable artifact
and dropping the standalone fixer + its npm script.

chrisdp commented May 5, 2026

Here is an example of the GitHub rich diff view working after the changes. Before, it rendered the raw HTML for tables as literal text. Now it renders them as tables:
image

@chrisdp chrisdp marked this pull request as ready for review May 5, 2026 22:41