Add Spanned<T> for source locations, plus a serde-saphyr comparison benchmark#63

Open
jskoiz wants to merge 3 commits into main from claude/recursing-chatelet-a2a820

jskoiz (Owner) commented Jun 6, 2026

Summary

Adds Spanned<T> for capturing the source location of deserialized values, plus a reproducible head-to-head benchmark against serde-saphyr (the motivation for the feature work). Three commits:

  1. serde-saphyr comparison benchmark — a new example and a documented BENCHMARKS.md section.
  2. Spanned<T> — capture the source span of a successfully deserialized value.
  3. rustfmt on the benchmark example.

Spanned<T> — source locations for parsed values

Error spans only appear on failure; Spanned<T> exposes the location of a successful read (byte offsets plus line/column), which is what config linters, language servers, and "this setting came from line N" tooling need.

use saneyaml::Spanned;

#[derive(serde::Deserialize)]
struct Config { name: Spanned<String> }

let cfg: Config = saneyaml::from_str("name: api\n")?;
assert_eq!(*cfg.name, "api");        // Deref to the value
assert_eq!(cfg.name.line(), 1);       // ...plus where it came from

Design: Spanned<T> is built on the existing spanful Node tree. Each span-bearing deserializer recognizes a private marker struct in deserialize_struct and answers with the current node's span, which already carries line/column, so there is no second parse and no retained source buffer. For every other type, the five deserialize_struct hooks are inert guards.
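
The marker-struct dispatch described above can be sketched without serde at all. This is only an illustration of the idea, not saneyaml's code: the marker name, `Node`, `Span`, and `Answer` are all hypothetical, and the free function stands in for a real `Deserializer::deserialize_struct` hook.

```rust
/// Hypothetical marker name; the real private name is not shown in this PR.
const SPANNED_MARKER: &str = "$__sketch_private_Spanned";

/// Assumed span layout: byte offsets plus 1-based line and column.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
    line: usize,
    column: usize,
}

/// A parsed scalar node that already carries its span.
struct Node<'a> {
    text: &'a str,
    span: Span,
}

/// What the deserializer hands back for a requested struct.
#[derive(Debug, PartialEq)]
enum Answer<'a> {
    /// Ordinary path: the hook is inert and only the value flows through.
    Plain(&'a str),
    /// Marker path: the value is paired with the node's existing span.
    WithSpan { value: &'a str, span: Span },
}

/// Stand-in for a `deserialize_struct` hook: only the marker name
/// triggers the span answer, so every other type is unaffected.
fn deserialize_struct<'a>(node: &Node<'a>, struct_name: &str) -> Answer<'a> {
    if struct_name == SPANNED_MARKER {
        Answer::WithSpan { value: node.text, span: node.span }
    } else {
        Answer::Plain(node.text)
    }
}
```

Because the span is copied from the node that is already being visited, nothing in this sketch needs the original source text or a second pass over it.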

  • Available on from_str, from_slice, from_node, and nested struct fields.
  • Serializes transparently as the inner value.
  • Equality, ordering, and hashing consider only the value, never the span (so it stays a valid map key).
  • from_value carries no spans (the line is reported as 0), by design.
  • Adds no dependencies — the runtime closure remains ryu + serde.
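
The value-only equality, ordering, and hashing listed above could look roughly like the following. This is a minimal std-only sketch, not the type added in this PR: the `Span` fields, the `new` constructor, and the trait bounds are assumptions; only `line()` and `Deref` appear in the PR's own example.

```rust
use std::cmp::Ordering;
use std::hash::{Hash, Hasher};
use std::ops::Deref;

/// Assumed span layout (hypothetical field names).
#[derive(Debug, Clone, Copy, Default)]
pub struct Span {
    pub start: usize,
    pub end: usize,
    pub line: usize,
    pub column: usize,
}

/// A value paired with the location it was read from.
#[derive(Debug, Clone)]
pub struct Spanned<T> {
    value: T,
    span: Span,
}

impl<T> Spanned<T> {
    /// Hypothetical constructor, used here only to build test values.
    pub fn new(value: T, span: Span) -> Self {
        Self { value, span }
    }

    pub fn line(&self) -> usize {
        self.span.line
    }
}

impl<T> Deref for Spanned<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.value
    }
}

// Every comparison and the hash delegate to the inner value and ignore
// the span, so Eq and Hash stay consistent with each other.
impl<T: PartialEq> PartialEq for Spanned<T> {
    fn eq(&self, other: &Self) -> bool {
        self.value == other.value
    }
}
impl<T: Eq> Eq for Spanned<T> {}
impl<T: PartialOrd> PartialOrd for Spanned<T> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        self.value.partial_cmp(&other.value)
    }
}
impl<T: Ord> Ord for Spanned<T> {
    fn cmp(&self, other: &Self) -> Ordering {
        self.value.cmp(&other.value)
    }
}
impl<T: Hash> Hash for Spanned<T> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.value.hash(state);
    }
}
```

With these impls, two `Spanned<String>` values holding the same text at different lines compare equal and land in the same `HashMap` bucket, which is what keeps the type usable as a map key.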

serde-saphyr comparison benchmark

examples/serde_saphyr_headtohead.rs compares both crates on identical bytes into identical target types (idiomatic from_str), across three axes; results documented in docs/BENCHMARKS.md.

cargo run --release --example serde_saphyr_headtohead

On an Apple M4 Pro, with both libraries on their shipping defaults, saneyaml is ~1.8–2.2× faster across all three axes (dynamic-value corpus, nested typed struct, and flat typed records), with the widest margin on serde-saphyr's own target shape. As elsewhere in BENCHMARKS.md, the trustworthy signal is the same-run ratio, not absolute ns/byte. serde-saphyr is a dev-dependency only.
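
To make "same-run ratio" concrete, here is a minimal sketch of timing two workloads back to back in one process and reporting only their ratio. It is not the code in examples/serde_saphyr_headtohead.rs; `time_it`, `same_run_ratio`, and the placeholder workload `count_colons` are hypothetical names, and a real harness would call each crate's from_str instead.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Times `iters` calls of `f` and returns the total elapsed wall-clock time.
fn time_it<F: FnMut()>(iters: u32, mut f: F) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed()
}

/// Returns how many times faster `candidate` ran than `baseline`,
/// with both measured in the same process on the same machine state.
fn same_run_ratio<A: FnMut(), B: FnMut()>(iters: u32, mut baseline: A, mut candidate: B) -> f64 {
    // Short warm-up so neither side pays first-call costs in the timed loop.
    let _ = time_it(iters / 10 + 1, &mut baseline);
    let _ = time_it(iters / 10 + 1, &mut candidate);

    let base = time_it(iters, &mut baseline);
    let cand = time_it(iters, &mut candidate);
    base.as_secs_f64() / cand.as_secs_f64()
}

/// Placeholder workload standing in for a parse call.
fn count_colons(input: &str) -> usize {
    // black_box keeps the optimizer from deleting the work.
    black_box(input).bytes().filter(|b| *b == b':').count()
}
```

A ratio computed this way cancels out much of the machine-specific noise that makes absolute ns/byte figures hard to compare across runs, which is why it is the number worth quoting.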

Verification

Run locally and green: cargo fmt --check, cargo test (full suite, 38 binaries) + doctests, cargo clippy --all-targets -D warnings, scripts/check-feature-clippy.sh, cargo doc -D missing_docs, cargo deny check, the runtime_dependency_closure trust guard, and the PUBLIC_API.txt snapshot check (regenerated to include Spanned).

Not included: the miette feature

A miette::Diagnostic integration was also prototyped, but it adds an (optional) dependency, which the runtime_dependency_closure trust guard intentionally rejects — [dependencies] is asserted to be exactly {ryu, serde}. That's a dependency-policy decision, so it is deliberately left out of this PR. Options to land it later: a separate saneyaml-miette companion crate (preserves the two-dependency guarantee) or an explicit relaxation of the trust guard for optional deps.
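
For readers unfamiliar with this kind of guard, the check amounts to reading the manifest's [dependencies] table and comparing its keys to a fixed set. The following is a simplified, std-only sketch under that assumption; the real runtime_dependency_closure guard is not shown in this PR and may work differently (for example via cargo metadata). The function name and the line-based scan are hypothetical.

```rust
use std::collections::BTreeSet;

/// Collects dependency names from the `[dependencies]` table of Cargo.toml text.
/// Simplified line-based scan: handles `name = ...` entries and
/// `[dependencies.name]` subtables, but not multi-line values.
fn runtime_dependencies(manifest: &str) -> BTreeSet<String> {
    let mut in_deps = false;
    let mut names = BTreeSet::new();
    for line in manifest.lines() {
        let line = line.trim();
        // `[dependencies.foo]` declares `foo` as a dependency in its own table.
        if let Some(rest) = line.strip_prefix("[dependencies.") {
            names.insert(rest.trim_end_matches(']').to_string());
            in_deps = false;
            continue;
        }
        // Any other table header either opens or closes the section we want.
        if line.starts_with('[') {
            in_deps = line == "[dependencies]";
            continue;
        }
        if !in_deps || line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((key, _)) = line.split_once('=') {
            names.insert(key.trim().to_string());
        }
    }
    names
}
```

A guard test would then assert that the returned set equals exactly {ryu, serde}. An entry such as an optional miette dependency still appears under [dependencies], so it would fail the check, which matches the behavior described above.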

jskoiz added 3 commits June 6, 2026 08:47
Benchmark saneyaml against serde-saphyr across dynamic-value and typed-struct deserialization in a new example, and document the results in BENCHMARKS.md. Add serde-saphyr as a dev-dependency.
Spanned<T> pairs a deserialized value with the source span (byte offsets plus line and column) it was read from, built on the existing node span tree via a private deserializer protocol. It is available on the from_str, from_slice, and from_node read paths and nested struct fields, serializes transparently, and adds no dependencies. The from_value path remains spanless.