23 changes: 21 additions & 2 deletions CHANGELOG.md
@@ -1,17 +1,36 @@
# Changelog

All notable changes to HiHTML are documented in this file, which is (mostly) AI-generated and (always) human-edited. Dependency updates may or may not be called out specifically.
All notable changes to hihtml are documented in this file, which is (mostly) AI-generated and (always) human-edited. Dependency updates may or may not be called out specifically.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and the project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [1.3.0-beta] - 2026-05-13

### Added

* Added string-based functions to programmatic API:
  - `checkCodeString(content, options?)` validates an HTML string and checks it for deprecated markup, mirroring `checkCode` for string-based pipelines
  - `checkLinksString(content, options?)` checks all external http/https URLs found in an HTML string, mirroring `checkLinks` for string-based pipelines
  - `minifyString(content, options?)` minifies an HTML string and returns it, without any file I/O—useful in content-pipeline contexts such as Eleventy transforms, middleware, and SSR handlers
* Extended URL extraction in link checking to also detect URLs in unquoted attributes (e.g., `href=https://example.com`, which is valid HTML)

### Changed

* Improved performance across several areas:
  - Directory traversal now fans out subdirectories in parallel (`Promise.all`)
  - `HtmlValidate` instances are cached per preset, avoiding re-initialization across calls to `validate()`/`checkCode()`
  - URL-extraction regexes in the link checker are compiled once at module load instead of per-call; extraction now uses `matchAll`
  - HTML Minifier Next import and preset resolution are cached per preset, avoiding repeated work across calls to `minifyString()`
  - Ignore-list entries are pre-classified into hostnames (Set) and prefix entries once per `checkLinks()` call, enabling O(1) exact-hostname lookup in the hot path
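
For illustration only, the per-preset caching mentioned in the list above can be pictured as memoizing an expensive factory by preset name. This is a minimal sketch, not hihtml's actual code: `createCachedFactory`, `getValidator`, and the `'another-preset'` name are hypothetical, and the factory is a stand-in for whatever is costly to construct.

```javascript
// Hypothetical helper: memoize an expensive factory by preset name.
// Illustrates the caching idea only; not taken from hihtml's source.
function createCachedFactory(factory) {
  const cache = new Map();
  return function getInstance(preset) {
    if (!cache.has(preset)) {
      cache.set(preset, factory(preset));
    }
    return cache.get(preset);
  };
}

// Stand-in for an expensive constructor, such as a validator instance.
let initializations = 0;
const getValidator = createCachedFactory((preset) => {
  initializations += 1;
  return { preset };
});

const first = getValidator('standard');
const second = getValidator('standard'); // served from the cache
const other = getValidator('another-preset'); // placeholder preset name
```

Repeated calls with the same preset return the same instance, so the factory runs once per distinct preset rather than once per call.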

## [1.2.0-beta] - 2026-05-11

### Added

* Added `validation.ignore`, a list of HTML-validate rule IDs to suppress, mirroring `links.ignore`
  - Ignored messages appear in validation output (marked as ignored) but are not counted as errors and do not block minification when using `--all`/`-a`
  - Supported in configuration (`.hihtml.json`/`package.json`) and programmatically via `checkCode(files, { ignore: […] })`
  - `ValidationResult` now includes `countIgnored`; `ValidationMessage` now includes `ignored?: boolean`
  - `ResultCodeValidation` now includes `countIgnored`; `MessageValidation` now includes `ignored?: boolean`
* Added `-s`/`--settings <file>` flag to load configuration from a specific JSON file, overriding the default CWD config lookup
  - Accepts any JSON file, reading the `"hihtml"` key if present (same convention as `package.json`), otherwise using the root object
  - `loadConfig()` now accepts an optional `filePath` parameter for the same behavior programmatically
60 changes: 48 additions & 12 deletions README.md
@@ -1,8 +1,8 @@
# HiHTML, the HTML Processing Supertool (Beta)
# hihtml, the HTML Processing Supertool (Beta)

[![npm version](https://img.shields.io/npm/v/hihtml.svg)](https://www.npmjs.com/package/hihtml) [![Build status](https://github.com/j9t/hihtml/workflows/Tests/badge.svg)](https://github.com/j9t/hihtml/actions) [![Socket](https://badge.socket.dev/npm/package/hihtml)](https://socket.dev/npm/package/hihtml) [![GitHub Sponsors](https://badgen.net/static/Support/Open%20Source/cyan)](https://github.com/j9t/hihtml?sponsor=1)

HiHTML—“High Quality HTML”—bundles key HTML tools into one, making HTML validation and semantics control, link checking, and minification as easy as it gets: [HTML-validate](https://html-validate.org/) for validation, [ObsoHTML](https://github.com/j9t/obsohtml) for deprecated markup detection, Node’s built-in `http`/`https` for link checking, and [HTML Minifier Next](https://github.com/j9t/html-minifier-next) for minification. HiHTML provides a CLI and a programmatic API, and comes with strong defaults but is still highly configurable.
hihtml—“high-quality HTML”—bundles several key HTML tools into one, making HTML validation and semantics control, link checking, and minification as easy as it gets: [HTML-validate](https://html-validate.org/) for validation, [ObsoHTML](https://github.com/j9t/obsohtml) for deprecated markup detection, Node’s built-in `http`/`https` for link checking, and [HTML Minifier Next](https://github.com/j9t/html-minifier-next) for minification. hihtml provides a CLI and a programmatic API, and comes with strong defaults but is still highly configurable.

## Usage

@@ -14,11 +14,11 @@ HiHTML—“High Quality HTML”—bundles key HTML tools into one, making HTML
npm i hihtml
```

Recommended: Just run HiHTML via `npx hihtml`.
Recommended: Just run hihtml via `npx hihtml`.

#### Execution

Without options, HiHTML validates HTML files and checks for deprecated markup in the current directory. Use flags to control behavior:
Without options, hihtml validates HTML files and checks for deprecated markup in the current directory. Use flags to control behavior:

| Flag | Description |
|---|---|
@@ -103,7 +103,7 @@ npx hihtml -q -a -i src -o dist
### 2. Programmatic API

```js
import { checkCode, checkLinks, minify, collect } from 'hihtml';
import { checkCode, checkCodeString, checkLinks, checkLinksString, minify, minifyString, collect } from 'hihtml';

const files = await collect('./src');

@@ -115,6 +115,11 @@ const links = await checkLinks(files);

const minification = await minify(files, files); // in-place
// { files: [{ path, sizeOriginal, sizeMinified }], saved }

// String variants—same result types, no file I/O
const minified = await minifyString('<p>Hello world</p>');
const codeGate = await checkCodeString('<p><div>Nope</div></p>');
const linksCleaned = await checkLinksString('<a href=https://example.com/>Example</a>');
```

#### `collect(dir, extensions?, excludedDirs?)`
@@ -126,37 +131,68 @@ Recursively collects HTML files from `dir`. Returns `Promise<string[]>`.

#### `checkCode(filePaths, options?)`

Validates HTML files and checks for deprecated markup. Returns `Promise<CheckResult>` with `validation` (HTML-validate result) and `deprecation` (ObsoHTML result) properties.
Validates HTML files and checks for deprecated markup. Returns `Promise<ResultCode>` with `validation` (HTML-validate result) and `deprecation` (ObsoHTML result) properties.

* `options.preset`: HTML-validate preset name (default: `'standard'`)
* `options.ignore`: List of [HTML-validate rule IDs](https://html-validate.org/rules/index.html) to suppress (default: `[]`)

#### `checkCodeString(content, options?)`

Validates an HTML string and checks for deprecated markup. Returns `Promise<ResultCode>`—same shape as `checkCode`. Useful in content-pipeline contexts (Eleventy transforms, middleware, SSR) where HTML is available as a string rather than a file.

* `options.preset`: HTML-validate preset name (default: `'standard'`)
* `options.ignore`: List of HTML-validate rule IDs to suppress (default: `[]`)

Note: `result.validation.files[0].path` and `result.deprecation.files[0].path` will be `'(string input)'`, not a real file path.

#### `checkLinks(filePaths, options?)`

Checks all external http/https URLs (`href`, `src`, `srcset`, `action` attributes) found in the given HTML files. Each unique URL is checked once; results are mapped back to every file it appears in. Returns `Promise<LinkCheckResult>`.
Checks all external http/https URLs (`href`, `src`, `srcset`, `action` attributes) found in the given HTML files. Each unique URL is checked once; results are mapped back to every file it appears in. Returns `Promise<ResultLinks>`.

* `options.timeout`: Request timeout in milliseconds (default: `10000`)
* `options.concurrency`: Maximum concurrent requests (default: `8`)
* `options.warnOnPermanentRedirects`: Warn on 301/308 permanent redirects (default: `false`)
* `options.ignore`: List of hostnames or URL prefixes to skip (default: `[]`)
* `options.onStart`: Called once with the total number of URLs to check
* `options.onProgress`: Called after each URL is checked

Links are checked via HEAD request, falling back to GET on 405. 4xx and 5xx responses are reported as broken. Skipped URLs (from the ignore list) appear in results with `skipped: true` and are never counted as broken.
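
As a rough sketch of how an ignore list of hostnames and URL prefixes could be applied before any request is made: entries are classified once, then each URL is tested with a Set lookup followed by a prefix scan. The helper names `classifyIgnoreList` and `isIgnored` are hypothetical, and the rule that entries containing `://` count as prefixes is an assumption made for this example, not documented hihtml behavior.

```javascript
// Hypothetical sketch: classify ignore entries once, then test URLs cheaply.
// Assumption: entries containing "://" are URL prefixes; all others are hostnames.
function classifyIgnoreList(entries) {
  const hostnames = new Set();
  const prefixes = [];
  for (const entry of entries) {
    if (entry.includes('://')) {
      prefixes.push(entry);
    } else {
      hostnames.add(entry.toLowerCase());
    }
  }
  return { hostnames, prefixes };
}

function isIgnored(url, { hostnames, prefixes }) {
  let hostname;
  try {
    hostname = new URL(url).hostname; // the URL parser lowercases the hostname
  } catch {
    return false; // not a parseable absolute URL, so nothing can match
  }
  // Exact-hostname lookup first; the prefix scan only runs on a miss.
  return hostnames.has(hostname) || prefixes.some((prefix) => url.startsWith(prefix));
}

const ignore = classifyIgnoreList(['example.com', 'https://example.org/private/']);
const skippedByHost = isIgnored('https://example.com/page', ignore);
const skippedByPrefix = isIgnored('https://example.org/private/a', ignore);
const checked = isIgnored('https://example.org/public', ignore);
```

In this sketch, hostname matching is exact, so `sub.example.com` would not be skipped by an `example.com` entry.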

#### `checkLinksString(content, options?)`

Checks all external http/https URLs found in an HTML string. Returns `Promise<ResultLinks>`—same shape as `checkLinks`. Useful when HTML is available as a string rather than a file, e.g., to check links in a fetched document or API response.

* `options.timeout`: Request timeout in milliseconds (default: `10000`)
* `options.concurrency`: Maximum concurrent requests (default: `8`)
* `options.warnOnPermanentRedirects`: Warn on 301/308 permanent redirects (default: `false`)
* `options.ignore`: List of hostnames or URL prefixes to skip (default: `[]`)
* `options.onStart`: Called once with the total number of URLs to check
* `options.onProgress`: Called after each URL is checked

Note: `result.files[0].path` will be `'(string input)'`, not a real file path. `result.countFileErrors` will always be `0`.
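
To give a feel for what extracting external URLs from an HTML string involves, including unquoted attribute values such as `href=https://example.com/`, here is a minimal regex-based sketch. It is not hihtml's implementation: `URL_ATTRIBUTE` and `extractExternalUrls` are hypothetical names, `srcset` is left out for brevity, and a regex is only an approximation of real HTML parsing.

```javascript
// Hypothetical sketch, not hihtml's code. The regex is compiled once at module
// level and matches href/src/action with double-quoted, single-quoted, or
// unquoted values. The lookbehind avoids matching names like "data-href".
const URL_ATTRIBUTE = /(?<![\w-])(?:href|src|action)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'=<>]+))/gi;

function extractExternalUrls(html) {
  const urls = new Set(); // each unique URL is collected once
  for (const match of html.matchAll(URL_ATTRIBUTE)) {
    const value = (match[1] ?? match[2] ?? match[3]).trim();
    if (/^https?:\/\//i.test(value)) {
      urls.add(value);
    }
  }
  return [...urls];
}

const sample =
  '<a href=https://example.com/>A</a> ' +
  '<img src="https://example.org/a.png"> ' +
  "<a href='/local'>B</a> " +
  '<a href="https://example.com/">C</a>';

const found = extractExternalUrls(sample);
```

For the sample above, `found` contains the two distinct external URLs; the relative `/local` link is dropped and the duplicate `https://example.com/` is collapsed.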

#### `minify(filePaths, outputPaths, options?)`

Minifies HTML files using HTML Minifier Next. Returns `Promise<MinificationResult>`.
Minifies HTML files using HTML Minifier Next. Returns `Promise<ResultMinification>`.

* `outputPaths`: Parallel array of output paths; pass the same value as `filePaths` for in-place minification
* `options.preset`: HTML Minifier Next preset name (default: `'comprehensive'`)
* `options.options`: Additional HTML Minifier Next options to merge with the preset

#### `minifyString(content, options?)`

Minifies an HTML string using HTML Minifier Next. Returns `Promise<string>`. Useful in content-pipeline contexts (Eleventy transforms, middleware, SSR) where HTML is available as a string rather than a file.

* `options.preset`: HTML Minifier Next preset name (default: `'comprehensive'`)
* `options.options`: Additional HTML Minifier Next options to merge with the preset

#### `loadConfig(cwd?, filePath?)`

Loads HiHTML configuration. When `filePath` is given, only that file is read (no CWD fallback); if it contains a `"hihtml"` key that value is used, otherwise the root object is used. Without `filePath`, reads `.hihtml.json` or the `"hihtml"` key in `package.json` from `cwd`. Returns `Promise<HiHTMLConfig>`.
Loads hihtml configuration. When `filePath` is given, only that file is read (no CWD fallback); if it contains a `"hihtml"` key that value is used, otherwise the root object is used. Without `filePath`, reads `.hihtml.json` or the `"hihtml"` key in `package.json` from `cwd`. Returns `Promise<HihtmlConfig>`.
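
The `"hihtml"` key rule described above can be sketched as a small function over already-parsed JSON. `resolveConfig` is a hypothetical name used only for this illustration, and the sample values are made up; the sketch shows the selection rule, not how `loadConfig()` is actually implemented.

```javascript
// Hypothetical helper mirroring the documented lookup rule:
// use the "hihtml" key when the parsed JSON has one, else the root object.
function resolveConfig(parsed) {
  const isPlainObject =
    parsed !== null && typeof parsed === 'object' && !Array.isArray(parsed);
  if (isPlainObject && 'hihtml' in parsed) {
    return parsed.hihtml;
  }
  return parsed;
}

// A package.json-style file: only the nested "hihtml" value is used.
const fromPackageJson = resolveConfig(
  JSON.parse('{"name":"site","hihtml":{"links":{"ignore":["example.com"]}}}')
);

// A dedicated settings file: the root object is the configuration.
const fromSettingsFile = resolveConfig(
  JSON.parse('{"links":{"ignore":["example.org"]}}')
);
```

Both calls yield an object with a `links.ignore` list; in the first case the surrounding `name` field is not part of the result.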

## Configuration

Create a .hihtml.json file in your project root, or add a `"hihtml"` key to package.json. Both use the same format (here showing HiHTML’s defaults):
Create a .hihtml.json file in your project root, or add a `"hihtml"` key to package.json. Both use the same format (here showing hihtml’s defaults):

```json
{
@@ -200,12 +236,12 @@ If in doubt or in a hurry, [report issues here](https://github.com/j9t/hihtml/is

### What does ObsoHTML do here when HTML-validate already reports on deprecated markup?

At the moment, ObsoHTML catches some elements and attributes that HTML-validate doesn’t. Once HTML-validate covers everything ObsoHTML covers, ObsoHTML is going to be removed from HiHTML. Note that ObsoHTML is purely informational—it doesn’t prevent minification when used with the `--all`/`-a` flag.
At the moment, ObsoHTML catches some elements and attributes that HTML-validate doesn’t. Once HTML-validate covers everything ObsoHTML covers, ObsoHTML is going to be removed from hihtml. Note that ObsoHTML is purely informational—it doesn’t prevent minification when used with the `--all`/`-a` flag.

***

You might like some of my other work:

* Optimization tools: HiHTML (including [HTML Minifier Next](https://github.com/j9t/html-minifier-next) + [ObsoHTML](https://github.com/j9t/obsohtml)) · [Image Guard](https://github.com/j9t/image-guard) · [Compressor.js Next](https://github.com/j9t/compressorjs-next) · [.htaccess Punk](https://github.com/j9t/htaccess-punk)
* Optimization tools: hihtml (including [HTML Minifier Next](https://github.com/j9t/html-minifier-next) + [ObsoHTML](https://github.com/j9t/obsohtml)) · [Image Guard](https://github.com/j9t/image-guard) · [Compressor.js Next](https://github.com/j9t/compressorjs-next) · [.htaccess Punk](https://github.com/j9t/htaccess-punk)
* Defense tools: [IA Defensa](https://iadefensa.com/solutions/)
* Resources for quality web development: [Articles](https://meiert.com/topics/development/) · [Books](https://meiert.com/topics/books/) (including [_On Web Development_](https://meiert.com/blog/on-web-development-2/)) · [News](https://frontenddogma.com/) · [Terminology](https://webglossary.info/)
2 changes: 1 addition & 1 deletion SECURITY.md
@@ -2,7 +2,7 @@

## Supported Versions

Only the latest and therefore current version of HiHTML is supported. It’s advised to update older versions to the latest version.
Only the latest and therefore current version of hihtml is supported. It’s advised to update older versions to the latest version.

## Reporting a Vulnerability

2 changes: 1 addition & 1 deletion bin/hihtml.js
100644 → 100755
@@ -267,4 +267,4 @@ async function saveReport(report, fileOpt) {
  const reportPath = typeof fileOpt === 'string' ? fileOpt : 'hihtml-report.json';
  await fs.promises.writeFile(reportPath, JSON.stringify(report, null, 2), 'utf8');
  console.log(`\nReport saved to ${styleText('bold', reportPath)}`);
}
}