Summary
The character-type documentation, examples, and TextMate grammar describe seven character widths (char8, char16, char32, char64, char128, char256, char512). The compiler only implements three: char8, char16, char32, plus char as an alias for char32. Code copied from the docs that uses char64/char128/char256/char512 fails at name resolution.
There are no extended-width character-literal prefixes either: only c8'…', c16'…', and c32'…' exist. The docs do not currently invent prefixes for the wider types, but the wider type names themselves appear throughout.
Compiler reality (source of truth)
Rux/Include/Rux/Type.h:17-50 — TypeRef::Kind defines exactly:
Char8, Char16, Char32,
...
Char = Char32, // alias
No Char64 / Char128 / Char256 / Char512 exist anywhere in the compiler.
Type-name resolution accepts only "char", "char8", "char16", "char32":
- Rux/Source/Hir.cpp:224-226, 690-692
- (parallel entries exist in Sema.cpp for the same three widths)
Char-literal/string-prefix parsing accepts only c8, c16, c32:
- Rux/Source/Hir.cpp:387-389 (string prefixes)
- Rux/Source/Hir.cpp:397-399 (char-literal prefixes)
Repro
let a: char64 = 'A'; // unknown type
let b: char128 = 'A'; // unknown type
let c: char256 = 'A'; // unknown type
let d: char512 = 'A'; // unknown type
Per the docs, all four declarations should compile; each one instead fails at name resolution with an unknown-type error.
Every doc/asset that needs editing
Web/src/docs/types/char.md
- L8 — overview: "from narrow ASCII-range characters up to extended private-use and future Unicode planes".
- L19-27 — char-types table; rows for char64, char128, char256, char512.
- L65-82 — entire ### char64, char128, char256, char512 section, including:
  - L65 heading.
  - L67 introduction "Extended-width character types with no defined Unicode semantics".
  - L69-72 use-case bullets.
  - L74-75 "behave as unsigned integers… implicit conversion to u64, u128, etc. is not permitted." (Note: u128 is also not a real type; that is covered separately in the integer-types issue.)
  - L77-79 example let raw = 0x0001f600 as char64;.
  - L81-82 warning callout "compiler may emit a warning when assigning a plain integer literal to char64–char512…".
- L94 — recommendation aside: "use the qualified type names (char8, char16, etc.)" — etc. implies wider types exist; can stay if rephrased to be explicit about the three real names.
- L213-215 — surrogates note: "Wider types (char64 and above) may hold surrogate values as raw integers, but they carry no Unicode meaning in that context."
- L267-268 — recommendations bullet: "Avoid char64–char512…".
- L283-284 — FFI note: "Wider types have no standard C equivalent and are passed by pointer in generated C interop headers."
Web/src/examples/Char.rux
- L5-8 — char64, char128, char256, char512 declarations.
(L1-4, L9-18 are correct and should stay.)
Web/.vitepress/grammars/rux.tmLanguage.json
- L557 — char type-name regex: \b(char|char8|char16|char32|char64|char128|char256|char512)\b — should be \b(char|char8|char16|char32)\b.
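As a quick sanity check on the proposed pattern, the following Python sketch compares the current and corrected alternations against all seven names. It assumes Python's `re` module treats `\b` and plain alternation the same way the grammar's regex engine does for these ASCII identifiers; the variable and function names are illustrative and do not come from the grammar file.

```python
import re

# Patterns copied from the issue text: the current grammar regex and the proposed fix.
CURRENT = re.compile(r"\b(char|char8|char16|char32|char64|char128|char256|char512)\b")
PROPOSED = re.compile(r"\b(char|char8|char16|char32)\b")

REAL = ["char", "char8", "char16", "char32"]          # implemented by the compiler
BOGUS = ["char64", "char128", "char256", "char512"]   # documented only

def highlighted(pattern, name):
    """Return True if the pattern matches the whole identifier."""
    return pattern.fullmatch(name) is not None

current_hits = [n for n in REAL + BOGUS if highlighted(CURRENT, n)]
proposed_hits = [n for n in REAL + BOGUS if highlighted(PROPOSED, n)]

print("current :", current_hits)
print("proposed:", proposed_hits)
```

With the proposed pattern only the four real names match. `char64` is not partially matched as `char` either, because there is no word boundary between `r` and `6`.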
Suggested fix
Trim docs/examples/grammar to match the compiler — char, char8, char16, char32 only. If extended-width character types are on the roadmap, gate the section behind a "Planned" / "Not yet implemented" callout instead of presenting them as available.
The L65-82 section is interesting in its own right: rather than describing real Unicode/text-encoding semantics, it positions the wide types as "unsigned integers that aren't integers" for "raw code units from non-Unicode encodings" and "alignment padding" — which is a curious feature to document. If they're not coming, the simplest fix is to delete the section entirely; the type table already covers the three real widths.
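To confirm that the trim is complete, a small scan for the four unsupported names can be run over the edited files. The sketch below is one hypothetical way to do that: `find_unsupported` and the sample text are made up for illustration, and no repository paths are wired in.

```python
import re

# Type names the issue reports as documented but not implemented by the compiler.
UNSUPPORTED = re.compile(r"\bchar(?:64|128|256|512)\b")

def find_unsupported(text):
    """Return (line_number, name) pairs for each unsupported type name in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in UNSUPPORTED.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits

# Hypothetical sample, loosely modelled on the repro lines in this issue.
sample = (
    "let a: char32 = 'A';\n"
    "let b: char64 = 'A';\n"
    "let c: char8 = 'A'; // not char512\n"
)

for lineno, name in find_unsupported(sample):
    print(f"line {lineno}: {name}")
```

In practice the function would be fed the contents of char.md, Char.rux, and rux.tmLanguage.json. An empty result for all three would suggest that no references to the wider types remain.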