This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Rust JSON decoder (cdylib + rlib) exposed to LuaJIT via FFI. Optimized for parse-once / extract-a-few-fields / discard. The competitive edge over lua-cjson comes from never building a Lua table — Phase 1 records only structural offsets, Phase 2 lazily decodes the fields the caller actually asks for. Crate name in Cargo.toml is qjson; the compiled artifact is libqjson.so.
The Makefile is the canonical entry point; make help lists targets.
make build # cargo build --release → target/release/libqjson.so
make test # cargo test --release + busted Lua tests
make lint # cargo clippy --release --all-targets -- -D warnings
make bench # OpenResty LuaJIT benchmark vs lua-cjson and simdjson
Under the hood / for narrower invocations:
# Single Rust integration test
cargo test --release --test ffi_smoke parse_and_free_roundtrip
# Single Rust unit test (e.g. inside src/doc.rs)
cargo test --release doc::tests::parses_simple_object
# Lua tests bypassing the Makefile
LD_LIBRARY_PATH=./target/release \
busted --lua=$(command -v luajit) tests/lua --lpath='./lua/?.lua'
# Scalar-only test run (no SIMD) — CI runs this gate
cargo test --release --no-default-features
# Force the FFI panic-barrier code path
cargo test --features test-panic --release
ffi.load("qjson") uses dlopen, which respects LD_LIBRARY_PATH, not LuaJIT's package.cpath. The Makefile sets LD_LIBRARY_PATH=target/release for test/bench; if you invoke busted or luajit directly, set it yourself.
make lint runs clippy only (with -D warnings); cargo fmt --check is intentionally not part of the lint gate because the codebase uses manual column alignment in struct definitions and compact single-line literals that default rustfmt would reflow. See the README "Roadmap / Deferred" entry on fmt for context.
Phase 1 (src/scan/, called from Document::parse_with_options): a structural scanner walks the input once and writes the byte offset of every non-string-interior { } [ ] : , " into doc.indices. Then validate_depth is run unconditionally; in EAGER mode, validate_trailing and validate_eager_values (number ABNF + string content + UTF-8) follow. In LAZY mode, value-level checks are skipped and rely on the lazy decode path at field-access time. A u32::MAX sentinel is appended. The scanner is selected at first use via OnceCell in src/scan/mod.rs:
1. Avx2Scanner (gated by the avx2 cargo feature, default-on) when both avx2 and pclmulqdq are detected at runtime.
2. ScalarScanner otherwise.
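The selection rule above can be sketched with the standard library alone. This is a minimal illustration, not the code in src/scan/mod.rs: the Backend enum and both function names are hypothetical, std::sync::OnceLock stands in for the OnceCell the crate uses, and the cargo feature gate is left out.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for the two scanner backends.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Avx2,
    Scalar,
}

/// Probe the CPU once: AVX2 needs both avx2 and pclmulqdq, else scalar.
fn detect_backend() -> Backend {
    #[cfg(target_arch = "x86_64")]
    {
        if std::arch::is_x86_feature_detected!("avx2")
            && std::arch::is_x86_feature_detected!("pclmulqdq")
        {
            return Backend::Avx2;
        }
    }
    Backend::Scalar
}

/// Cache the decision so later calls skip the feature probe.
fn backend() -> Backend {
    static BACKEND: OnceLock<Backend> = OnceLock::new();
    *BACKEND.get_or_init(detect_backend)
}
```

Because the choice is cached on first use, every parse in the process takes the same backend; the --no-default-features run is what forces the scalar path on AVX2-capable hardware.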
Validation level depends on qjson_options.mode. EAGER (default): a post-scan pass walks indices and validates RFC 8259 number ABNF, string content (no unescaped control chars), and UTF-8 — parse fails on any value-level violation. LAZY (opt-in): bracket/quote balance + max-depth only; value-level errors surface when the offending field is accessed (lua-cjson-equivalent behavior). Trailing-content rejection and value-level validation are eager-only; max-depth (default 1024, configurable up to 4096) is enforced in both modes.
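As a rough scalar sketch of Phase 1 and the depth check that runs in both modes: the function names scan_structurals and depth_ok are hypothetical, and the crate's real scanner and validate_depth may differ in signature and error reporting. Note that the token kind is read back from the buffer rather than stored next to the offset.

```rust
/// Sentinel appended after the last real offset.
const SENTINEL: u32 = u32::MAX;

/// Record the offset of every structural byte outside string interiors.
/// Opening and closing quotes are recorded; bytes between them are not.
fn scan_structurals(buf: &[u8]) -> Vec<u32> {
    let mut indices = Vec::new();
    let mut in_string = false;
    let mut i = 0;
    while i < buf.len() {
        let b = buf[i];
        if in_string {
            match b {
                b'\\' => i += 1, // skip the escaped byte
                b'"' => {
                    indices.push(i as u32);
                    in_string = false;
                }
                _ => {}
            }
        } else {
            match b {
                b'"' => {
                    indices.push(i as u32);
                    in_string = true;
                }
                b'{' | b'}' | b'[' | b']' | b':' | b',' => indices.push(i as u32),
                _ => {}
            }
        }
        i += 1;
    }
    indices.push(SENTINEL);
    indices
}

/// True when brackets balance and nesting never exceeds `max_depth`.
fn depth_ok(buf: &[u8], indices: &[u32], max_depth: usize) -> bool {
    let mut depth = 0usize;
    for &off in indices {
        if off == SENTINEL {
            break;
        }
        // Token kind comes from the input byte at the recorded offset.
        match buf[off as usize] {
            b'{' | b'[' => {
                depth += 1;
                if depth > max_depth {
                    return false;
                }
            }
            b'}' | b']' => {
                if depth == 0 {
                    return false;
                }
                depth -= 1;
            }
            _ => {}
        }
    }
    depth == 0
}
```

In this sketch a comma inside a string never reaches the index array, which is the property Phase 2 relies on when it walks offsets instead of bytes.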
Phase 2 (src/cursor.rs, src/path.rs, src/decode/): path strings are parsed by a zero-alloc PathIter into PathSeg::Key | Idx. A Cursor (a (idx_start, idx_end) pair into doc.indices) is walked to the target, optionally caching sibling spans in doc.skip (SkipCache) so repeated lookups on the same container skip brace-counting. Strings are decoded into doc.scratch only when they contain escapes; otherwise the original buffer slice is handed back.
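The borrow-or-scratch rule for strings can be illustrated as follows. This is a simplified sketch, not the code in src/decode/: decode_str is a hypothetical helper, it takes the scratch buffer as a plain &mut Vec instead of going through a RefCell, and it does not handle \uXXXX escapes.

```rust
/// Return the contents of the string whose quotes sit at `open` and `close`.
/// Without escapes the result borrows from `buf`; with escapes the text is
/// unescaped into `scratch` and the result borrows from there instead.
fn decode_str<'a>(
    buf: &'a [u8],
    open: usize,
    close: usize,
    scratch: &'a mut Vec<u8>,
) -> &'a [u8] {
    let raw = &buf[open + 1..close];
    if !raw.contains(&b'\\') {
        return raw; // fast path: zero-copy slice of the input
    }
    scratch.clear(); // reuse: any earlier scratch contents are gone now
    let mut i = 0;
    while i < raw.len() {
        if raw[i] == b'\\' && i + 1 < raw.len() {
            i += 1;
            scratch.push(match raw[i] {
                b'n' => b'\n',
                b't' => b'\t',
                b'r' => b'\r',
                b'b' => 0x08,
                b'f' => 0x0C,
                other => other, // covers \" \\ and \/
            });
        } else {
            scratch.push(raw[i]);
        }
        i += 1;
    }
    scratch.as_slice()
}
```

The scratch.clear() call is the reason a pointer from an earlier escaped string cannot be trusted after the next decode on the same document.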
- get_str pointer lifetime. The (ptr, len) returned by qjson_get_str / qjson_cursor_get_str points into either the original input buffer or doc.scratch. Any subsequent *_get_str call on the same doc may invalidate prior pointers (scratch buffer reuse). The LuaJIT wrapper preserves this contract by calling ffi.string(ptr, len) immediately to copy into a Lua string; do not change that.
- Buffer lifetime. Document<'a> borrows the input slice. qjson_parse transmutes 'a to 'static and trusts the caller to keep the buffer alive for the document's lifetime. The LuaJIT wrapper enforces this by stashing the original string under _hold on the Doc table so Lua GC keeps it pinned.
- indices stores offsets only, not types. Token type is recovered from buf[indices[i]]. Do not add a type tag; the 25% memory win is intentional.
- Single-threaded. qjson_doc is not Sync/Send across threads; RefCell is used for scratch and skip.
- FFI panic barrier. Every pub unsafe extern "C" function in src/ffi.rs wraps its body in catch_unwind and converts a panic into QJSON_OOM. Preserve this pattern on any new export; a panic crossing the FFI boundary is undefined behavior.
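A new export that follows the panic-barrier rule might look like the sketch below. Everything named here is illustrative: ffi_guard and demo_first_byte are hypothetical, the numeric values of QJSON_OK and QJSON_OOM are placeholders (the real ones live in src/error.rs), and the symbol-export attribute is omitted.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

// Placeholder values for illustration only; see src/error.rs for the real codes.
const QJSON_OK: i32 = 0;
const QJSON_OOM: i32 = 3;

/// Run `body` and turn any panic into QJSON_OOM instead of unwinding into C.
fn ffi_guard<F: FnOnce() -> i32>(body: F) -> i32 {
    catch_unwind(AssertUnwindSafe(body)).unwrap_or(QJSON_OOM)
}

/// Hypothetical export: write the first byte of `buf` to `out`.
/// Indexing an empty slice panics, which the guard converts to an error code.
///
/// # Safety
/// `buf` must be valid for `len` bytes and `out` must be a valid pointer.
pub unsafe extern "C" fn demo_first_byte(buf: *const u8, len: usize, out: *mut u8) -> i32 {
    ffi_guard(|| {
        let bytes = unsafe { std::slice::from_raw_parts(buf, len) };
        let first = bytes[0]; // panics when len == 0
        unsafe { *out = first };
        QJSON_OK
    })
}
```

Keeping the whole body inside the closure means no early return or ? operator can bypass the guard. The barrier only works when the crate is built with unwinding panics, not panic = "abort".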
src/
lib.rs crate root
ffi.rs extern "C" surface, qjson_* symbols, panic barrier
doc.rs Document (indices + scratch + skip cache)
cursor.rs Cursor + path resolution + skip-cache walk
path.rs zero-alloc path-string iterator
decode/ lazy string / number decode
scan/ ScalarScanner, Avx2Scanner, runtime dispatch
skip_cache.rs Phase 2 sibling-skip cache
error.rs qjson_err + qjson_type enums (must stay in sync with include/qjson.h and lua/qjson.lua)
lua/qjson.lua LuaJIT wrapper (ffi.cdef + Doc/Cursor metatables)
include/qjson.h public C header
tests/ Rust integration tests + tests/lua/ busted suite
benches/ lua_bench.lua vs lua-cjson/simdjson; fixtures/ has small_api.json + medium_resp.json
The enum values in src/error.rs are duplicated in include/qjson.h and lua/qjson.lua (the latter only encodes the T_* type tags and NOT_FOUND = 2). Keep all three in sync when adding/renumbering codes.
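One way to make renumbering mistakes fail at build time is a compile-time assertion next to the enum. The sketch below is an assumption about shape, not a copy of src/error.rs: the variant names and every value except NOT_FOUND = 2 are placeholders.

```rust
/// Illustrative error enum; only NotFound = 2 is taken from the text above.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QjsonErr {
    Ok = 0,         // placeholder value
    ParseError = 1, // placeholder variant and value
    NotFound = 2,   // mirrored as NOT_FOUND = 2 in lua/qjson.lua
}

// Fails the build if someone renumbers NotFound without updating the
// C header and the Lua wrapper.
const _: () = assert!(QjsonErr::NotFound as i32 == 2);
```

A check like this only guards the Rust side; include/qjson.h and lua/qjson.lua still have to be edited by hand.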
.github/workflows/ci.yml runs three Rust matrix points and one Lua job:
1. cargo test --release (default features → AVX2 on, falls through to scalar on non-AVX2 hardware at runtime)
2. cargo test --release --no-default-features: scalar scanner only; catches AVX2-vs-scalar divergence
3. cargo test --features test-panic --release: exercises the FFI panic barrier
4. Lua busted suite under LuaJIT (depends on the Rust job passing)
If you add a scanner code path, run gate 2 locally; the cross-check test (tests/scanner_crosscheck.rs) is the main defense against backend drift and uses proptest — the .proptest-regressions file is intentionally committed.
- Deferred / "we'll do this later" decisions go in README.md under Roadmap / Deferred, one bullet per item, so each can be picked up individually. Don't park them in code comments or scratch files.