fix: std.manifestYamlDoc passes through Unicode characters natively #1017

Open
He-Pin wants to merge 2 commits into databricks:master from He-Pin:fix/yaml-unicode-escape

@He-Pin He-Pin commented Jun 24, 2026

Summary

YamlRenderer hardcoded escapeUnicode = true, causing all non-ASCII characters to be escaped as \uXXXX sequences in YAML output. C++ jsonnet, go-jsonnet, and jrsonnet all pass through Unicode characters directly.

Changed escapeUnicode from true to false in both key and value visitors.
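To illustrate what the flag controls, here is a minimal sketch of a string-quoting helper with an `escapeUnicode` switch. The object `EscapeSketch` and the function `quote` are hypothetical names made up for this example; this is not the actual `YamlRenderer` code, only an approximation of the behavior described above.

```scala
object EscapeSketch {
  // Hypothetical helper: quotes a string and, if escapeUnicode is set,
  // replaces every non-ASCII UTF-16 code unit with a backslash-u escape.
  def quote(s: String, escapeUnicode: Boolean): String = {
    val sb = new StringBuilder
    sb.append('"')
    s.foreach { c =>
      if (c == '"') sb.append('\\').append('"')
      else if (c == '\\') sb.append('\\').append('\\')
      else if (c < ' ' || (escapeUnicode && c > '~')) {
        // Control characters are always escaped; non-ASCII only on request.
        sb.append('\\').append('u').append("%04x".format(c.toInt))
      } else sb.append(c)
    }
    sb.append('"')
    sb.toString
  }

  def main(args: Array[String]): Unit = {
    // Old behavior (escaped) followed by new behavior (passed through).
    println(quote("café", escapeUnicode = true))
    println(quote("café", escapeUnicode = false))
  }
}
```

Because the sketch iterates over UTF-16 code units, a character outside the Basic Multilingual Plane such as 🌍 becomes two escapes (a surrogate pair) when `escapeUnicode` is true, which is consistent with the "before" column in the comparison.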

Cross-implementation comparison

| Expression | cpp-jsonnet 0.21.0 | go-jsonnet 0.22.0 | jrsonnet 0.5.0-pre99 | sjsonnet (before) | sjsonnet (after) |
|---|---|---|---|---|---|
| std.manifestYamlDoc("世界") | "世界" | "世界" | "世界" | "\u4e16\u754c" | "世界" |
| std.manifestYamlDoc("café") | "café" | "café" | "café" | "caf\u00e9" | "café" |
| std.manifestYamlDoc("🌍") | "🌍" | "🌍" | "🌍" | "\ud83c\udf0d" | "🌍" |
| std.manifestYamlDoc({name: "世界"}) | name: "世界" | name: "世界" | name: "世界" | name: "\u4e16\u754c" | name: "世界" |

Test plan

  • ./mill sjsonnet.jvm[3.3.7].test — all suites green
  • New regression test: yaml_unicode_native_output.jsonnet (golden verified against C++ jsonnet)

Motivation:
YamlRenderer hardcoded escapeUnicode = true, causing all non-ASCII
characters to be escaped as \uXXXX sequences. Both C++ jsonnet and
jrsonnet pass through Unicode characters directly in YAML output.

Modification:
- YamlRenderer.scala: Change escapeUnicode from true to false in
  both the key visitor (line 26) and value visitor (line 68).

Result:
std.manifestYamlDoc("世界") now outputs "世界" instead of
"\u4e16\u754c", matching C++ jsonnet and jrsonnet.

Cross-implementation comparison:
| Expression                           | C++ jsonnet  | sjsonnet (before)      | sjsonnet (after) |
|--------------------------------------|-------------|------------------------|-------------------|
| std.manifestYamlDoc("世界")          | "世界"      | "\u4e16\u754c" ❌     | "世界" ✅         |
| std.manifestYamlDoc("café")          | "café"      | "caf\u00e9" ❌        | "café" ✅         |
| std.manifestYamlDoc("🌍")            | "🌍"        | "\ud83c\udf0d" ❌     | "🌍" ✅           |
| std.manifestYamlDoc({name: "世界"})  | name: 世界  | name: \u4e16\u754c ❌ | name: 世界 ✅     |