PR #38 introduced eager RFC 8259 validation, which makes the quickdecode.parse + access 3 fields benchmark roughly 10–48x slower than the lazy/main baseline (see the PR description for the full bench table). The dominant cost is string-content validation, which currently makes three independent passes over every string's raw bytes.
Current state
For every string span between two " structurals, the eager pass calls validate_string_span(span), which today runs:
span.iter().any(|&b| b < 0x20): rejects raw control characters
std::str::from_utf8(span): rejects non-UTF-8 byte sequences
A separate byte-by-byte escape-grammar walker: rejects \a, \uZZZZ, dangling \, truncated \uXX, etc.
Each pass walks the full string. For payloads where strings dominate (most real-world JSON), this means traversing string content ~3 times per parse.
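For reference, the three-pass shape described above might look roughly like the sketch below. This is an illustration, not the code from PR #38: the StringError enum and its variant names are hypothetical, and the escape walker is a guess at the simplest form that rejects the listed cases.

```rust
/// Hypothetical error type; the crate's real error codes are not shown in this issue.
#[derive(Debug, PartialEq, Eq)]
enum StringError {
    ControlChar,
    InvalidUtf8,
    BadEscape,
}

/// Three independent passes over `span`, the raw bytes between two quote
/// structurals (quotes excluded).
fn validate_string_span(span: &[u8]) -> Result<(), StringError> {
    // Pass 1: reject raw control characters.
    if span.iter().any(|&b| b < 0x20) {
        return Err(StringError::ControlChar);
    }
    // Pass 2: reject non-UTF-8 byte sequences.
    if std::str::from_utf8(span).is_err() {
        return Err(StringError::InvalidUtf8);
    }
    // Pass 3: walk the escape grammar byte by byte.
    let mut i = 0;
    while i < span.len() {
        if span[i] != b'\\' {
            i += 1;
            continue;
        }
        match span.get(i + 1).copied() {
            // Two-byte escapes allowed by RFC 8259.
            Some(b'"' | b'\\' | b'/' | b'b' | b'f' | b'n' | b'r' | b't') => i += 2,
            // \uXXXX needs exactly four hex digits after the `u`.
            Some(b'u') => {
                let hex = span.get(i + 2..i + 6).ok_or(StringError::BadEscape)?;
                if !hex.iter().all(|b| b.is_ascii_hexdigit()) {
                    return Err(StringError::BadEscape);
                }
                i += 6;
            }
            // Unknown escape such as \a, or a dangling backslash at the end.
            _ => return Err(StringError::BadEscape),
        }
    }
    Ok(())
}
```

Even on a fully valid ASCII string, every byte is touched by all three loops, which is where the ~3x traversal comes from.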
In real traffic these checks almost always pass (invalid UTF-8 from non-UTF-8-aware upstreams is the most common rejection cause; control chars and bad escapes are rare). So the work is wasted on the happy path.
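Proposed optimization
The optimization that actually saves work is to merge the three scans (UTF-8, control characters, escapes) into a single scan, so that one walk over the bytes drives all three state machines, with a SIMD fast path for the common case of ASCII-only strings.
Concretely:
One byte-level state machine that simultaneously tracks: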
UTF-8 continuation state (using DFA tables like the Hoehrmann decoder, or simdjson-style validation)
Whether the current byte is a control char (< 0x20)
Whether the previous byte was \ (so the next byte must be a valid escape introducer; u enters a 4-hex sub-state)
An ASCII-only fast path: if a 32/64-byte chunk has no high bits set, no bytes < 0x20, and no \, advance past it in one SIMD compare. The non-ASCII / has-backslash slow path falls back to the state machine.
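A minimal scalar sketch of such a combined state machine is shown below, without the SIMD fast path. It is an assumption-laden illustration rather than the intended implementation: StringError and State are hypothetical names, and UTF-8 is tracked with explicit per-lead-byte ranges for the next continuation byte instead of Hoehrmann-style DFA tables.

```rust
/// Hypothetical error type; the crate's real error codes are not shown in this issue.
#[derive(Debug, PartialEq, Eq)]
enum StringError {
    ControlChar,
    InvalidUtf8,
    BadEscape,
}

#[derive(Clone, Copy)]
enum State {
    Normal,
    /// Previous byte was a backslash.
    Escape,
    /// Inside \uXXXX with this many hex digits still expected.
    Hex(u8),
    /// Inside a multi-byte UTF-8 sequence: `remaining` continuation bytes
    /// are expected and the next one must lie in `lo..=hi`.
    Utf8 { remaining: u8, lo: u8, hi: u8 },
}

fn validate_single_pass(span: &[u8]) -> Result<(), StringError> {
    let mut state = State::Normal;
    for &b in span {
        // A raw control character is an error in every state.
        if b < 0x20 {
            return Err(StringError::ControlChar);
        }
        state = match state {
            State::Normal => match b {
                b'\\' => State::Escape,
                0x20..=0x7F => State::Normal,
                0xC2..=0xDF => State::Utf8 { remaining: 1, lo: 0x80, hi: 0xBF },
                0xE0 => State::Utf8 { remaining: 2, lo: 0xA0, hi: 0xBF }, // no overlongs
                0xED => State::Utf8 { remaining: 2, lo: 0x80, hi: 0x9F }, // no surrogates
                0xE1..=0xEF => State::Utf8 { remaining: 2, lo: 0x80, hi: 0xBF },
                0xF0 => State::Utf8 { remaining: 3, lo: 0x90, hi: 0xBF }, // no overlongs
                0xF4 => State::Utf8 { remaining: 3, lo: 0x80, hi: 0x8F }, // <= U+10FFFF
                0xF1..=0xF3 => State::Utf8 { remaining: 3, lo: 0x80, hi: 0xBF },
                // Stray continuation bytes, 0xC0, 0xC1, 0xF5..=0xFF.
                _ => return Err(StringError::InvalidUtf8),
            },
            State::Escape => match b {
                b'"' | b'\\' | b'/' | b'b' | b'f' | b'n' | b'r' | b't' => State::Normal,
                b'u' => State::Hex(4),
                _ => return Err(StringError::BadEscape),
            },
            State::Hex(n) => {
                if !b.is_ascii_hexdigit() {
                    return Err(StringError::BadEscape);
                }
                if n == 1 { State::Normal } else { State::Hex(n - 1) }
            }
            State::Utf8 { remaining, lo, hi } => {
                if b < lo || b > hi {
                    return Err(StringError::InvalidUtf8);
                }
                if remaining == 1 {
                    State::Normal
                } else {
                    State::Utf8 { remaining: remaining - 1, lo: 0x80, hi: 0xBF }
                }
            }
        };
    }
    // The span must not end in the middle of an escape or a UTF-8 sequence.
    match state {
        State::Normal => Ok(()),
        State::Escape | State::Hex(_) => Err(StringError::BadEscape),
        State::Utf8 { .. } => Err(StringError::InvalidUtf8),
    }
}
```

One caveat for the identical-error-codes requirement: a single pass naturally reports the first error it encounters by position, whereas sequential passes report by pass order. On a span that contains, say, a bad escape followed later by invalid UTF-8, this sketch returns BadEscape where a control/UTF-8/escape pass ordering would return InvalidUtf8, so inputs with more than one kind of error need explicit handling or test coverage.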
Acceptance criteria
Single-pass string-content validator replacing the current 3-pass validate_string_span.
SIMD fast path for ASCII-only chunks (no \, no control, no high bit).
Performance: close enough to lazy/main throughput that the eager-by-default mode is acceptable for the API-gateway use case — target within 2x of lazy on real-world payloads (open to revision based on measurements).
No regression in correctness: every test in tests/rfc8259_compliance.rs and tests/json_test_suite.rs continues to pass with identical error codes.
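To make the fast-path criterion concrete, here is a portable sketch of the chunk predicate (no \, no control, no high bit). It uses 8-byte SWAR arithmetic on a u64 as a stand-in for the 32/64-byte AVX2/NEON compares, so it illustrates the test itself rather than the real SIMD code; the function names are hypothetical.

```rust
const LO: u64 = 0x0101_0101_0101_0101; // 0x01 in every byte lane
const HI: u64 = 0x8080_8080_8080_8080; // 0x80 in every byte lane

/// True if every byte of the chunk is ASCII, >= 0x20, and not a backslash,
/// i.e. the chunk can be skipped without running the state machine.
fn chunk_is_plain_ascii(chunk: [u8; 8]) -> bool {
    let x = u64::from_le_bytes(chunk);
    // Any byte with the high bit set (non-ASCII).
    let high_bit = x & HI;
    // Any byte < 0x20: the subtraction borrows into the lane's high bit.
    let control = x.wrapping_sub(LO * 0x20) & !x & HI;
    // Any byte == b'\\': XOR turns it into a zero byte, then apply the
    // classic "has zero byte" test.
    let y = x ^ (LO * b'\\' as u64);
    let backslash = y.wrapping_sub(LO) & !y & HI;
    (high_bit | control | backslash) == 0
}

/// Number of leading bytes of `span` that can be skipped in whole 8-byte
/// chunks; the remainder would go to the byte-level slow path.
fn plain_prefix_len(span: &[u8]) -> usize {
    let mut skipped = 0;
    for chunk in span.chunks_exact(8) {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        if !chunk_is_plain_ascii(buf) {
            break;
        }
        skipped += 8;
    }
    skipped
}
```

Borrows between lanes can set extra bits in the intermediate masks, but only when some lower lane is already a true match, so the combined zero/non-zero answer is exact. Handing the rest of the span to the state machine is only safe because the skipped bytes contain no backslash and no multi-byte lead byte, so the machine can resume in its initial state.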
Out of scope
Optimizing scalar gap dispatch (check_gap / validate_number). That's a secondary cost and a separate concern.
Merging the eager pass into the SIMD scanner itself. Doable but requires touching the AVX2/NEON code paths and the crosscheck proptest; tackle only if the in-pass optimization is insufficient.
Bench reference
PR #38 description has the before/after table. Key data points (parse + access 3 fields):