emit_value wrote string values and object keys as raw bytes between quotes, so any value containing a quote, backslash, or control byte produced invalid JSON. Add emit_escaped: escape '"' and '\', and emit control bytes 0x00-0x1F as the short escapes \b \t \n \f \r where one exists, else as \u00xx; all other bytes (including valid UTF-8) pass through verbatim per RFC 8259. This is the minimal structural escaping, not the #338 ensure-ascii policy. Document the parse/emit representation asymmetry (parse yields raw wire bytes, emit treats str_val as logical bytes) on both sides; the deeper contract question is tracked in #340. Closes #337
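The escaping rules in this commit message can be modelled in a few lines. The following is a minimal Python sketch, not the project's code: the function borrows the name emit_escaped only for readability, the lowercase hex digits in \u00xx are an assumption (the message does not state the case), and json.loads is used merely as an independent check that the output is valid JSON.

```python
import json

# Control bytes that have a named short escape in RFC 8259.
_SHORT_ESCAPES = {
    0x08: b"\\b",
    0x09: b"\\t",
    0x0A: b"\\n",
    0x0C: b"\\f",
    0x0D: b"\\r",
}


def emit_escaped(raw: bytes) -> bytes:
    """Return `raw` as a quoted JSON string with minimal structural escaping.

    Hypothetical model of the behaviour described above: only '"', '\\' and
    bytes 0x00-0x1F are escaped; everything else (including UTF-8 sequences)
    is copied verbatim.
    """
    out = bytearray(b'"')
    for byte in raw:
        if byte == 0x22:            # double quote
            out += b'\\"'
        elif byte == 0x5C:          # backslash
            out += b"\\\\"
        elif byte in _SHORT_ESCAPES:
            out += _SHORT_ESCAPES[byte]
        elif byte < 0x20:           # remaining control bytes
            out += b"\\u%04x" % byte  # lowercase hex is an assumption
        else:
            out.append(byte)
    out += b'"'
    return bytes(out)


if __name__ == "__main__":
    logical = 'say "hi"\\\x01 é'
    wire = emit_escaped(logical.encode("utf-8"))
    print(wire)
    # An independent parser should recover the logical string.
    assert json.loads(wire) == logical
```

Because non-ASCII bytes pass through untouched, the output is only valid JSON when the input is valid UTF-8, which is the same caveat the ensure-ascii discussion in #338 is about.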
page_size() on linux hardcoded 4096, wrong on aarch64 kernels running 16 KiB or 64 KiB pages. Capture AT_PAGESZ from the auxiliary vector once at startup into a shared OS-layer global (_pagesz) and have each arch's page_size() read it, falling back to 4096 only when the auxv genuinely lacks it. The entrypoint publishes the value via capture_pagesz() right after _envp, and (under --pie) after _rt_relocate has applied the relocations, so it is ordinary post-relocation code that may touch globals. std.runtime.linux.reloc keeps its own private pre-relocation read of AT_PAGESZ for the RELRO mprotect, since that runs before any global is safe to reference; the OS layer is the single consumer-facing page-size source of truth. Closes #336
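For readers unfamiliar with the auxiliary vector, the lookup described above amounts to scanning (type, value) word pairs until AT_NULL. The sketch below is a Python illustration under stated assumptions, not the runtime's implementation: find_auxv_entry is a hypothetical helper, the constants AT_NULL = 0 and AT_PAGESZ = 6 come from the Linux ELF headers, and /proc/self/auxv is used only as a convenient way to see the same data from user space.

```python
import struct

AT_NULL = 0    # terminator entry
AT_PAGESZ = 6  # system page size


def find_auxv_entry(auxv: bytes, wanted: int, word_bytes: int = 8):
    """Scan raw auxv bytes for `wanted`; return its value or None.

    Each entry is two native-endian machine words: a_type, a_val.
    """
    fmt = "=QQ" if word_bytes == 8 else "=II"
    step = struct.calcsize(fmt)
    for offset in range(0, len(auxv) - step + 1, step):
        a_type, a_val = struct.unpack_from(fmt, auxv, offset)
        if a_type == AT_NULL:
            break  # end of vector; anything after is not part of it
        if a_type == wanted:
            return a_val
    return None


def page_size_from_auxv(auxv: bytes, word_bytes: int = 8) -> int:
    """Model of this commit's behaviour: 4096 only if AT_PAGESZ is absent."""
    value = find_auxv_entry(auxv, AT_PAGESZ, word_bytes)
    return value if value else 4096


if __name__ == "__main__":
    try:
        with open("/proc/self/auxv", "rb") as handle:
            data = handle.read()
    except OSError:
        print("no /proc/self/auxv on this system")
    else:
        print(page_size_from_auxv(data, struct.calcsize("P")))
```

The real entrypoint reads the vector from the initial stack rather than from /proc; the scan itself is the same.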
exact-byte assertions for quote/backslash escaping and for control bytes (named short escapes plus \u00xx); the emitted document for a tree with quote+backslash+control re-parses cleanly through the module's own parser; object-key escaping covered with exact bytes and a clean re-parse.
docs(changelog): backfill unreleased output/self-reloc/RELRO entries
fix(data/json): escape strings and keys on emit (RFC 8259)
page_size() fell back to 4096 when the captured _pagesz was 0. AT_PAGESZ is mandatory on linux, so a 0 at call time means the auxv lacked it (broken or nonstandard environment) or a caller ran before _rt_init published — and on a large-page kernel the fallback would hand that early caller 4096 while later callers see 16K, an inconsistency worse than either constant. Panic naming the invariant rather than fabricating a value; no in-tree caller runs before the startup capture, so the normal path is unaffected.
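The difference between falling back and failing fast can be shown with a small model. This is a hedged Python sketch, not the project's code: publish_pagesz and PageSizeUnavailable are hypothetical names standing in for the startup capture and the panic, and the module-level _pagesz mirrors the shared global only loosely.

```python
class PageSizeUnavailable(RuntimeError):
    """Stands in for the runtime panic described above."""


_pagesz = 0  # 0 means "not captured yet"


def publish_pagesz(value: int) -> None:
    """Hypothetical stand-in for the startup capture of AT_PAGESZ."""
    global _pagesz
    _pagesz = value


def page_size() -> int:
    """Return the captured page size, refusing to guess when it is missing."""
    if _pagesz == 0:
        # Name the broken invariant instead of fabricating 4096.
        raise PageSizeUnavailable(
            "page_size(): AT_PAGESZ not captured "
            "(auxv lacks it, or called before startup capture)"
        )
    return _pagesz
```

With the old fallback, a caller that ran before the capture on a 16 KiB-page kernel would have received 4096 and a later caller 16384; here the early caller gets an error and no caller ever sees a fabricated value.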
feat(system): source linux page_size from AT_PAGESZ (#336)
Release dev -> main. Three merges since the RELRO release:
- std.data.json emit now escapes strings and object keys per RFC 8259 (quotes, backslashes, control bytes); emitted trees were invalid JSON for any such content. The parse-raw/emit-logical representation asymmetry is documented loudly and tracked in #340 (data/json: parse and emit disagree on string representation, raw wire bytes vs logical bytes).
- os.page_size() returns the runtime AT_PAGESZ captured from the auxv at startup instead of a hardcoded 4096 (correct on 16K/64K-page aarch64 kernels); panics instead of guessing when the invariant is breached. _rt_relocate keeps its private pre-relocation read.

Verified: mach test . 571→575 green through the sweep; page-size probe (independent auxv read == page_size()) passes on x86_64 native + aarch64/riscv64 qemu in default and --pie builds; JSON emit output revalidated through the module's own parser. No other dev work is swept in.