Summary
Web/src/docs/types/char.md describes a char-handling story that disagrees with the compiler in roughly a dozen specific places. The defects fall into three categories with different fix paths (a fourth bucket, Category 4, holds one item that is blocked on a Category 3 design call):
- Pure doc bugs: compiler is internally consistent and intentional; doc is the outlier. Fix in this repo, no input needed elsewhere.
- Roadmap-dependent: doc describes coherent features that don't exist anywhere in the codebase. A maintainer needs to decide implement vs. delete before docs can move.
- Tracked as compiler bugs: doc is correct; compiler is the outlier. Filed separately under rux-lang/Rux; this issue links to them so a doc-fixer doesn't accidentally rewrite text that's about to become true.
Category 1 — pure doc bugs (fixable here)
1a. Nonexistent char types: char64, char128, char256, char512
TypeRef::Kind (Rux/Include/Rux/Type.h:17-50) defines exactly Char8, Char16, Char32, with Char = Char32 as alias. No wider char types exist in the compiler. Type-name resolution at Hir.cpp:687-692 and Sema.cpp:1008-1011 accepts only char, char8, char16, char32.
Lines affected:
- L8: overview alludes to "extended private-use and future Unicode planes."
- L19-27: types table; rows for char64, char128, char256, char512.
- L65-82: entire ### char64, char128, char256, char512 section.
- L77-79: example let raw = 0x0001f600 as char64;.
- L94: "use the qualified type names (char8, char16, etc.)"; the "etc." implies wider types exist.
- L213-215: surrogates note about "wider types (char64 and above)."
- L267-268: recommendations bullet about avoiding char64–char512.
- L283-284: FFI note about "wider types have no standard C equivalent."
Fix: trim to char8 / char16 / char32 / char throughout. If extended-width chars are on the roadmap, gate behind a "Planned" callout instead.
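A quick way to re-check the line list before and after the trim is a small scan for the four nonexistent type names. The sketch below is only an illustration: the helper name find_wide_char_mentions and the sample lines are made up for this issue, not taken from char.md or from any tooling in the repo.

```python
import re

# Type names this issue reports as documented but absent from the compiler.
WIDE_CHAR_RE = re.compile(r"\bchar(?:64|128|256|512)\b")


def find_wide_char_mentions(text):
    """Return (line_number, stripped_line) pairs for lines naming a wide char type."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if WIDE_CHAR_RE.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical sample standing in for the real doc text.
sample = "\n".join([
    "| char32 | 32 bits |",
    "| char64 | 64 bits |",
    "let raw = 0x0001f600 as char64;",
    "Avoid char64–char512 unless required.",
])

wide_hits = find_wide_char_mentions(sample)
for number, line in wide_hits:
    print(f"{number}: {line}")
```

A scan like this only catches explicit type names. Prose that merely alludes to wider types (L8, L94, L283-284 above) still needs a manual read.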
1b. String interpolation "{c}" (L187)
The line:
let t: String = "{c}"; // "€" (interpolation)
is wrong on three independent points:
- Bare "…" literals are typed Slice, not String (Hir.cpp:386-394, Sema.cpp:763-768). Assignment errors with cannot assign 'Slice' to 'String'.
- There is no string interpolation in the language. The lexer doesn't scan for { inside strings, the parser has no interpolation segment branch, and no InterpolatedString AST node exists. "{c}" lexes as the literal three-character string {, c, }.
- Even if interpolation existed, the surrounding example at L186 calls String.From(c) where c: char32, but Std/Src/String.rux:33-44 only provides From(*const char8, uint) and From(char8[]) — no char overload.
A grep across Web/src/docs/ shows L187 is the only place rendered docs use "{ident}"-style syntax, so cleanup is narrow.
Fix: delete L187 entirely. Possibly also rewrite L181-186 to point users at whatever the supported "char to one-char String" path actually is today (see Category 2d for the underlying gap).
1c. Typo at L156
"the program trow exception"
Should read "throws an exception." Incidental; fold into whatever PR addresses L153-162 (which itself falls under Category 2; see below).
Category 2 — roadmap-dependent (needs maintainer decision)
These describe coherent features that simply don't exist. Each needs a call: implement, or delete from the docs.
2a. The as? operator (L161, L177, L252-258, L269)
Rux/Source/Parser.cpp:1364-1385 parses cast expressions and accepts only as and is. There is no Question-token branch after the AsKeyword match. x as? char16 lexes to x, as, ?, char16, and the ? that follows as, sitting where the target type is expected, fails to parse.
2b. Nullable type syntax T? (L161)
? is lexed as TokenKind::Question (Lexer.cpp:552) and used in exactly one place: the ternary at Parser.cpp:1237. No type parser admits a postfix ?, and no Optional / Nullable variant exists in TypeRef::Kind (Type.h:17-50).
If 2a and 2b were both implemented, the existing null literal still wouldn't fit T? for value types — null is currently the C-style null pointer literal, assignable only to pointer types via the special-case rule at Sema.cpp:99-101.
2c. let s: String = "..." pattern (L186, plus likely elsewhere)
Bare string literals are Slice<char8>. Assigning one to a String binding errors. Either IsAssignableTo needs a Slice<char8> → String coercion rule, or the docs need to stop using this pattern. (A grep would show how widespread the misuse is — worth a sweep before the fix lands.)
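The sweep mentioned above could look roughly like the following. This is a sketch under assumptions: the regex only catches the single-line let name: String = "..." shape, the helper name sweep_docs is hypothetical, and the demo runs against a throwaway directory rather than the real Web/src/docs tree.

```python
import re
import tempfile
from pathlib import Path

# Matches a String-typed binding initialised directly from a string literal.
STRING_FROM_LITERAL_RE = re.compile(r'\blet\s+\w+\s*:\s*String\s*=\s*"')


def sweep_docs(root):
    """Return (relative_path, line_number, stripped_line) for each match under root."""
    root = Path(root)
    results = []
    for path in sorted(root.rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if STRING_FROM_LITERAL_RE.search(line):
                rel = path.relative_to(root).as_posix()
                results.append((rel, number, line.strip()))
    return results


# Demo on a temporary tree with made-up file contents.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "types"
    docs.mkdir()
    (docs / "char.md").write_text(
        'let c = \'x\';\nlet s: String = "abc";\n', encoding="utf-8"
    )
    (docs / "int.md").write_text("let n: int = 1;\n", encoding="utf-8")
    demo_hits = sweep_docs(tmp)

for rel, number, line in demo_hits:
    print(f"{rel}:{number}: {line}")
```

Pointing sweep_docs at Web/src/docs would give the list of files to touch if the docs-side fix is chosen. Bindings split across lines or initialised through an intermediate variable would be missed.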
2d. String::From(charN) overload (L186)
Std/Src/String.rux:33-44 provides only the two existing From overloads. A From(char32) (or From(charN) for each width) overload would need to be added before the doc example can stand. This is a stdlib gap, not a compiler gap — could be filed against rux-lang/Std if the maintainer wants to implement.
Per-feature call needed: for each of 2a-2d, either
- (a) implement and leave doc text in place,
- (b) gate behind a "Planned" callout linking a tracking issue, or
- (c) delete the doc text entirely.
I'd default to (c) for 2a/2b (large language features, no infrastructure in place) and (a) for 2c/2d (small stdlib/compiler tweaks that would close real ergonomic gaps). But that's a maintainer call.
Category 3 — compiler bugs (filed separately, do not rewrite the doc)
The doc is correct on these points; the compiler doesn't match. Filed under rux-lang/Rux:
3a. Char widening is not implicit (#)
char.md:143-151 correctly describes implicit widening for char8 → char16 → char32. Sema.cpp:81-103 is missing the rule. Filed at rux-lang/Rux#<widening-issue> (placeholder). Don't edit L143-151; they're the spec.
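For anyone triaging the linked issue, the documented rule can be modelled in a few lines. This is a Python model of the behaviour described above, not the compiler's C++ code: the function name is_implicitly_assignable is hypothetical, and treating char8 → char32 as allowed in one step is an assumption read off the char8 → char16 → char32 chain.

```python
# Widths of the char types the compiler defines; char is an alias for char32.
CHAR_WIDTH_BITS = {"char8": 8, "char16": 16, "char32": 32}
CHAR_WIDTH_BITS["char"] = CHAR_WIDTH_BITS["char32"]


def is_implicitly_assignable(source, target):
    """Model of the documented rule: a char value converts implicitly
    to a char type of the same or greater width, never to a narrower one."""
    if source not in CHAR_WIDTH_BITS or target not in CHAR_WIDTH_BITS:
        return False  # Non-char types are outside this model.
    return CHAR_WIDTH_BITS[source] <= CHAR_WIDTH_BITS[target]


for pair in [("char8", "char16"), ("char16", "char32"), ("char32", "char8")]:
    print(pair, is_implicitly_assignable(*pair))
```

Under this model only the narrowing direction needs an explicit as cast, which is the behaviour the doc text at L143-151 specifies.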
3b. Cast validation skips range / surrogate / runtime panic (#)
char.md:32-39, :153-162, :175-177, :249-258 all describe validation behavior on as casts (constant out-of-range error, runtime panic, surrogate rejection). The compiler does none of it. Filed at rux-lang/Rux#<cast-validation-issue> (placeholder). Don't rewrite these sections; the maintainer has a design call to make on that issue about whether docs or compiler should win.
Category 4 — bucket awaiting Category 3b's resolution
4a. let a: char8 = 'A'; doesn't compile (L36, L48-50, L130-138)
Unprefixed char literals are always char32 (Hir.cpp:396-401, Sema.cpp:771-776). The "minimum-width inference" claim at L130-138 doesn't match implementation; the L36 example fails with type mismatch.
This isn't filed as a separate compiler issue because it's directly entangled with 3b's design call:
- If the language adopts context-driven coercion or minimum-width inference, L36 / L130-138 are correct as-is and the compiler needs the fix.
- If the language insists on prefixed literals (c8'A', c16'字'), L36 / L48-50 / L130-138 all need rewriting to use prefixed forms.