Integrate Unicode escape handling into lexer/parser and remove unicode std dependency from JSON #101

Merged
DanexCodr merged 5 commits into main from copilot/remove-unicode-std-update-json-library on Apr 13, 2026

Copilot AI commented Apr 13, 2026

This change moves full \uXXXX escape handling into the compiler's string-literal parsing and eliminates the JSON library’s dependency on the Unicode std module. JSON now parses and normalizes Unicode escapes through its own local helpers, and the now-obsolete Unicode std package is removed.

  • Lexer/parser Unicode handling

    • Added full Unicode escape decoding for string literals in the Java lexer/parser paths.
    • Supports surrogate-pair validation (high + required low surrogate) and rejects malformed/lone surrogate escapes.
    • Keeps escape semantics centralized in language parsing rather than stdlib post-processing.
  • JSON stdlib decoupling

    • Removed use {unicode} from json stdlib.
    • Replaced all Unicode.* calls with local JSON helpers (isDigit, isHexDigit, hexValue, hex4, isHighSurrogate, isLowSurrogate).
    • Preserves JSON \u escape normalization behavior without external stdlib coupling.
  • Stdlib cleanup

    • Removed std/unicode/Unicode.cod.
    • Removed the dedicated Unicode stdlib comprehensive test.
    • Added ignore entries for generated demo runtime artifacts under src/main/cod/demo/src/bin/ and src/main/cod/demo/src/idx/.
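
For readers unfamiliar with the helper names listed above, here is a rough Java analogue of what such helpers typically do. The real helpers live in the JSON stdlib and are written in the project's own language; the class name, signatures, and the choice of -1 as a failure value below are assumptions for illustration only, not the PR's code.

```java
// Hypothetical Java analogue of the local JSON helpers named in the PR.
// Signatures and error conventions are assumed, not taken from the source.
final class JsonUnicodeHelpers {
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    static boolean isHexDigit(char c) {
        return isDigit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Value of one hex digit, or -1 if c is not an ASCII hex digit.
    static int hexValue(char c) {
        if (isDigit(c)) return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Parses exactly four hex digits starting at 'start'.
    // Returns the 16-bit code unit, or -1 if the input is short or non-hex.
    static int hex4(String s, int start) {
        if (start < 0 || start + 4 > s.length()) return -1;
        int value = 0;
        for (int i = start; i < start + 4; i++) {
            int digit = hexValue(s.charAt(i));
            if (digit < 0) return -1;
            value = (value << 4) | digit;
        }
        return value;
    }

    // UTF-16 high (leading) surrogate range.
    static boolean isHighSurrogate(int codeUnit) {
        return codeUnit >= 0xD800 && codeUnit <= 0xDBFF;
    }

    // UTF-16 low (trailing) surrogate range.
    static boolean isLowSurrogate(int codeUnit) {
        return codeUnit >= 0xDC00 && codeUnit <= 0xDFFF;
    }
}
```

Keeping these checks ASCII-only is deliberate in this sketch: JSON escapes only permit the digits 0-9 and the letters a-f/A-F, so broader Unicode digit classes are not accepted.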

Example of the new parser-side behavior:

// now decoded by lexer/parser (with surrogate validation), not stdlib
if (escaped == 'u') {
    UnicodeEscapeResult unicodeResult = decodeUnicodeEscape();
    escapedStr = unicodeResult.text;
}
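
The snippet above only shows the call site. As a minimal sketch of what a decoder with surrogate validation could look like, the following standalone Java class mirrors the described behavior: a high surrogate must be followed by an escaped low surrogate, and lone surrogates are rejected. The class name, the `Result` type, the `(src, pos)` parameters, and the use of `IllegalArgumentException` are all assumptions; the actual `decodeUnicodeEscape()` in the PR takes no arguments and returns a `UnicodeEscapeResult` whose full shape is not shown here.

```java
// Illustrative sketch only; not the PR's implementation.
final class UnicodeEscapeSketch {
    /** Hypothetical result: decoded text plus the index just past the escape. */
    static final class Result {
        final String text;
        final int nextIndex;

        Result(String text, int nextIndex) {
            this.text = text;
            this.nextIndex = nextIndex;
        }
    }

    /** Decodes the escape whose 'u' sits at index pos in src. */
    static Result decodeUnicodeEscape(String src, int pos) {
        int first = readHex4(src, pos + 1);
        int next = pos + 5; // index just past the four hex digits

        if (first >= 0xDC00 && first <= 0xDFFF) {
            throw new IllegalArgumentException("lone low surrogate escape");
        }
        if (first >= 0xD800 && first <= 0xDBFF) {
            // A high surrogate must be followed by a second escape.
            if (next + 1 >= src.length()
                    || src.charAt(next) != '\\' || src.charAt(next + 1) != 'u') {
                throw new IllegalArgumentException("high surrogate without low surrogate");
            }
            int low = readHex4(src, next + 2);
            if (low < 0xDC00 || low > 0xDFFF) {
                throw new IllegalArgumentException("invalid low surrogate escape");
            }
            int codePoint = Character.toCodePoint((char) first, (char) low);
            return new Result(new String(Character.toChars(codePoint)), next + 6);
        }
        return new Result(String.valueOf((char) first), next);
    }

    /** Reads exactly four ASCII hex digits starting at start. */
    private static int readHex4(String src, int start) {
        if (start + 4 > src.length()) {
            throw new IllegalArgumentException("truncated unicode escape");
        }
        int value = 0;
        for (int i = start; i < start + 4; i++) {
            char c = src.charAt(i);
            int digit;
            if (c >= '0' && c <= '9') digit = c - '0';
            else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
            else throw new IllegalArgumentException("invalid hex digit in unicode escape");
            value = (value << 4) | digit;
        }
        return value;
    }
}
```

In this sketch the caller has already consumed the backslash and passes the index of the `u`; the returned `nextIndex` lets a lexer resume scanning after one escape or after a full surrogate pair.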

Copilot AI and others added 5 commits April 13, 2026 09:46
Agent-Logs-Url: https://github.com/DanexCodr/Coderive/sessions/055bd537-c2da-45f9-80b8-e022af247aa2

Co-authored-by: DanexCodr <216312766+DanexCodr@users.noreply.github.com>
DanexCodr marked this pull request as ready for review April 13, 2026 10:09
DanexCodr merged commit 44b2c15 into main Apr 13, 2026
DanexCodr deleted the copilot/remove-unicode-std-update-json-library branch April 13, 2026 10:09