HardGuard25 Specification

Version: 1.3.2 Date: April 2026 Author: Sam Rogers -- Snap Synapse Spec License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Abstract

Identifiers used by humans and machines fail for predictable reasons. Most collisions and transcription errors trace back to ambiguous characters, inconsistent casing, and context-insensitive formatting. HardGuard25 defines a 25-character alphabet optimized for human readability and low error rates while retaining the full 0-9 digit set for natural versioning. This document specifies the character set, normalization rules, recommended lengths by risk profile, an optional check digit, and formatting guidelines. It does not define a global ID format or distributed generation protocol.

Alphabet

0 1 2 3 4 5 6 7 8 9 A C D F G H J K M N P R U W Y

25 characters total: 10 digits + 15 letters.

Canonical Form

Uppercase only
ASCII only
No whitespace, no punctuation

Regex

^[0-9ACDFGHJKMNPRUWY]+$

Rationale for Exclusions

Removed	Reason
O	Confusable with 0 (zero)
I	Confusable with 1 (one) and lowercase L
L	Confusable with 1 (one) and uppercase I
B	Confusable with 8; dyslexia mirror pair with D
S	Confusable with 5
Z	Confusable with 2
E	Confusable with 3 for dyslexic readers; digits take priority
V	Confusable with U in many typefaces
T	Resembles a plus sign in some contexts
X	Collides with multiplication operator; varies by locale
Q	Confusable with O in some typefaces; dyslexia mirror pair with P

Design principle: when a letter and a digit compete for the same visual slot, the digit always wins.

Comparison Matrix

Name	Chars	Bits/char	Key Exclusions
HardGuard25	25	4.64	I, L, O, B, S, Z, E, T, V, X, Q
Canadian Postal	30	4.91	D, F, I, O, Q, U
Nintendo Base-31	31	4.95	0, 1, B, I, O
Crockford Base32	32	5.00	I, L, O, U
z-base-32	32	5.00	Custom reorder, drops 0/2
RFC 4648 Base32	32	5.00	None (uses A-Z + 2-7)

Entropy and Recommended Lengths

Bits per character = log2(25) = 4.64.

Length	Bits	Unique IDs	Typical Use
4	18.6	390,625	Small inventory, tickets
5	23.2	9,765,625	Small business
6	27.9	244,140,625	Medium businesses
7	32.5	6.1 billion	Large catalogs
8	37.2	152.6 billion	Large systems
12	55.7	5.96 x 10^16	Internal tokens
16	74.2	3.55 x 10^22	Cross-system IDs
20	92.8	2.11 x 10^27	Public tokens
22	102.1	1.32 x 10^30	Internet-scale

Recommended defaults:

16 for internal systems up to millions of IDs
20 for public tokens or cross-org use
22 for long-lived, internet-scale identifiers

Collision Guidance

Birthday-bound approximation: for alphabet size N and length L, the probability of any collision after generating k IDs is approximately p = 1 - exp(-k^2 / (2 * N^L)).

Length	Space Size	Max IDs before 1e-9 collision	Max IDs before 1e-12 collision
12	5.96e16	10,918	345
14	3.72e20	689,957	21,823
16	2.33e22	6,823,938	215,584
18	1.46e25	67,627,923	2,136,731
20	9.09e27	4,264,961,375	134,875,910
22	5.68e30	106,624,034,378	3,371,897,753

Normalization Rules

Trim leading and trailing whitespace
Collapse any separator characters (hyphens, underscores, dots, and any Unicode whitespace)
Uppercase fold all letters
Reject any character outside the alphabet

A normalizer must be idempotent: normalize(normalize(x)) === normalize(x).

API Contract

Reference implementations in JavaScript, Python, and Go must expose the same behavioral contract:

normalize(x) returns canonical uppercase form or errors on invalid characters
validate(x) applies normalization first and returns a boolean
checkDigit(x) accepts canonical or lowercase input and computes the checksum after uppercase folding
verification helpers accept canonical, lowercase, and grouped forms, normalize first, then compare the trailing check character

Optional Check Digit

Use a Mod-25 weighted checksum appended as the last character of the ID.

Algorithm

Assign each character in the alphabet an index 0-24 based on its position in the ordered alphabet: 0=0, 1=1, ..., 9=9, A=10, C=11, D=12, F=13, G=14, H=15, J=16, K=17, M=18, N=19, P=20, R=21, U=22, W=23, Y=24.
Compute a weighted sum: sum = sum of (index[i] * (i + 1)) for each character position i (0-indexed).
The check digit is the alphabet character at index sum % 25.
Append the check digit as the last character.

Verification

To verify, normalize the full input, strip the last character, recompute the check digit, and compare. A valid ID with check digit has length >= 2 after normalization.

Properties

Catches all single-character substitution errors
Catches most adjacent transpositions
Adapted from ISO 7064 principles

Human Formatting

Group into chunks of 4 or 5 for display: ACDF-0G7H-J2KM-NP3R
Store and transmit without separators
Never overload O for 0 or I/L for 1
Use monospaced or semi-monospaced typefaces for printed IDs
Provide high contrast when displayed on screen or paper

Implementation Guidance

Generation

Use a cryptographically secure pseudorandom number generator (CSPRNG)
Map random bytes to the alphabet with unbiased rejection sampling
Do not use modulo 25 directly on raw random bytes; this introduces bias because 256 is not divisible by 25
Rejection sampling: accept byte values less than 225 (25 x 9), discard the rest, then compute value % 25

Validation

Apply normalization
Test against regex
If check digit is enabled, verify checksum
Accept lowercase and separator-filled input only if it normalizes cleanly into canonical form; reject anything outside the alphabet after normalization

Property-Based Tests

Generator never emits excluded characters (B, E, I, L, O, Q, S, T, V, X, Z)
Distribution across symbols is uniform within statistical tolerance
Normalizer is idempotent
Check digit catches all single-character substitutions

Test Vectors

Use these to verify cross-language consistency. These are 16-character IDs generated from the same seed sequence then mapped to HardGuard25 via rejection sampling.

Input bytes (hex): 6a 9f 12 88 58 31 d2 44 9b 07 6c 48 2f 5e e1 b3 ...
HardGuard25-16:    AC0G7HJK2MNPR3UW

Input bytes (hex): 3e 1a 77 c4 9d 20 11 55 cb 4a e0 f8 5c 13 a2 6d ...
HardGuard25-16:    D7H2GJ0KMW1PRUYA

Input bytes (hex): b0 45 f2 1c 8e 09 33 4c 6a 70 56 11 d1 a7 2e 90 ...
HardGuard25-16:    0CNG5J7KM2PRUWAH

Additional conformance vectors for normalization, validation, checksum calculation, and checksum verification live in conformance/vectors.json. All runtime implementations should consume the same fixture data in tests.

Accessibility Notes

Designed for dyslexia-sensitive contexts by avoiding common confusable pairs
Works well in OCR contexts with OCR-A and similar monospaced fonts
Avoid thin, condensed, or highly stylized typefaces for printed IDs
Provide chunking and high contrast when displayed

References

RFC 4122: A Universally Unique Identifier (UUID) URN Namespace
RFC 4648: The Base16, Base32, and Base64 Data Encodings
Crockford, Douglas: Base32 Encoding
Zooko Wilcox-O'Hearn: z-base-32 encoding
Nintendo Switch friend code documentation (Base-31)
Canada Post: Postal Code format guidelines
Dyslexia and human factors research on alphanumeric confusability
OCR-A and machine-readable typeface standards

flowchart TD
    A[Need human-readable IDs?] --> B{What matters most?}
    B -->|Error resistance| C[HardGuard25\n25 chars, 4.64 bits/char]
    B -->|Institutional legitimacy| D[Canadian Postal\n30 chars]
    B -->|Shorter strings, same entropy| E[Base32 family\n32 chars]
    C --> F[Small biz: 4-5 chars\nMedium: 6-7 chars\nLarge: 8+ chars\nPublic tokens: 16-22 chars]
    D --> F
    E --> F

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

HardGuard25 Specification

Abstract

Alphabet

Canonical Form

Regex

Rationale for Exclusions

Comparison Matrix

Entropy and Recommended Lengths

Collision Guidance

Normalization Rules

API Contract

Optional Check Digit

Algorithm

Verification

Properties

Human Formatting

Implementation Guidance

Generation

Validation

Property-Based Tests

Test Vectors

Accessibility Notes

References

FilesExpand file tree

SPEC.md

Latest commit

History

SPEC.md

File metadata and controls

HardGuard25 Specification

Abstract

Alphabet

Canonical Form

Regex

Rationale for Exclusions

Comparison Matrix

Entropy and Recommended Lengths

Collision Guidance

Normalization Rules

API Contract

Optional Check Digit

Algorithm

Verification

Properties

Human Formatting

Implementation Guidance

Generation

Validation

Property-Based Tests

Test Vectors

Accessibility Notes

References