Open
Replace all 32 instances of incorrect `#[cfg_attr(std, inline)]` syntax with `#[cfg_attr(feature = "std", inline)]`. The `cfg(std)` predicate was never true, so no inline hints were being applied to reader functions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
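As a minimal sketch of the corrected attribute form: Cargo exposes an enabled feature to the compiler as `feature = "std"`, whereas a bare `std` predicate is only true when something passes `--cfg std` explicitly. The function below is hypothetical and only illustrates where the attribute goes; it is not taken from the crate.

```rust
/// Hypothetical reader helper (not from the crate); only the attribute
/// form mirrors the fix described in the commit message.
///
/// With `feature = "std"` enabled, this expands to `#[inline]`.
/// Without the feature, no attribute is applied at all.
#[cfg_attr(feature = "std", inline)]
pub fn read_byte(bytes: &[u8], pos: usize) -> Option<u8> {
    // `get` returns None instead of panicking when `pos` is out of range.
    bytes.get(pos).copied()
}
```

The function behaves identically either way; the attribute only changes whether the inline hint is emitted.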
- Remove unnecessary unsafe in varint fast paths (use safe slice creation)
- Fix read_len bounds check to prevent overflow on 32-bit targets
- Cap read_packed capacity hint to avoid excessive allocation
- Fix read_varint64 fast path to advance self.start on Error::Varint
- Add truncation comment in varint64 fast path matching slow path docs
- Fix cfg_attr(std, inline) -> cfg_attr(feature = "std", inline) in writer.rs
- Add fast-path tests for varint32 (1-4 byte) and varint64 (1-9 byte)
- Add fast-path overflow error tests for both varint32 and varint64
- Improve error assertions to check specific error variants

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
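The read_len item can be illustrated with a sketch of an overflow-safe bounds check. The idea, as I read it, is to compare the decoded length against the remaining space rather than computing `start + len` first, since that sum can wrap on a 32-bit `usize`. The names `checked_field_end` and `ReadError` are assumptions made for this example, not the crate's API.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ReadError {
    UnexpectedEndOfBuffer,
}

/// Returns the exclusive end offset of a length-delimited field that
/// starts at `start`, or an error if it would run past `end`.
pub fn checked_field_end(start: usize, len: u64, end: usize) -> Result<usize, ReadError> {
    // Space left in the current (sub-)reader; fails if start is already past end.
    let remaining = end
        .checked_sub(start)
        .ok_or(ReadError::UnexpectedEndOfBuffer)?;
    // Compare in u64 so a huge length cannot be truncated on 32-bit targets.
    if len > remaining as u64 {
        return Err(ReadError::UnexpectedEndOfBuffer);
    }
    // len <= remaining, so the cast is lossless and the sum cannot exceed `end`.
    Ok(start + len as usize)
}
```

Checking against the remaining space means no intermediate value ever exceeds `end`, so the check holds regardless of pointer width.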
…tions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All benchmarks pass with no regressions. Fixed-width reads (~7.6-7.8us/10k) are ~3x faster than varint reads (~24us/10k), confirming from_le_bytes optimization effectiveness. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- read_fixed*: use .get().ok_or() for defensive bounds check against bytes.len(), restoring graceful error (vs panic) when bytes slice is shorter than self.end
- read_u8: add self.end boundary check to prevent sub-reader escapes
- read_packed: divide capacity hint by size_of::<M>() to avoid over-allocating by up to 8x for wide element types
- tests: add PackedFixed::Borrowed variant coverage to test_packed_fixed_size_hint
- tests: add varint64 fast-path tests for 3-byte, 4-byte, 6-byte, 7-byte encodings
- docs: add Changelog entry for v0.8.2
- docs: create CLAUDE.md with architecture patterns and conventions
- fix: correct doc comment typo and indentation in reader.rs module header

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
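A minimal sketch of the read_fixed* pattern described above, assuming a reader that tracks `start` and `end` offsets into a byte slice. The free function `read_fixed32_at` and the `ReadError` type are hypothetical stand-ins for illustration; the real methods live on the reader and may differ.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ReadError {
    UnexpectedEndOfBuffer,
}

/// Reads a little-endian u32 at `start`, returning the value and the new offset.
/// `end` is the logical boundary of the current (sub-)message.
pub fn read_fixed32_at(bytes: &[u8], start: usize, end: usize) -> Result<(u32, usize), ReadError> {
    let next = start
        .checked_add(4)
        .ok_or(ReadError::UnexpectedEndOfBuffer)?;
    // Enforce the logical boundary so a sub-reader cannot read past its message.
    if next > end {
        return Err(ReadError::UnexpectedEndOfBuffer);
    }
    // Defensive check against the physical slice: `get` yields None (not a panic)
    // when `bytes` is shorter than `end` claims.
    let chunk = bytes
        .get(start..next)
        .ok_or(ReadError::UnexpectedEndOfBuffer)?;
    let mut buf = [0u8; 4];
    buf.copy_from_slice(chunk);
    Ok((u32::from_le_bytes(buf), next))
}
```

The two checks are deliberately separate: the first guards the sub-message boundary, the second turns a would-be slice-index panic into an ordinary error.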
Add a branchless varint32 fast path on aarch64 that loads 8 bytes as a u64 and uses bit manipulation to find the varint length and extract the value without per-byte branches. The existing scalar fast path is preserved under cfg(not(target_arch = "aarch64")). Includes direct tests for the decode_varint32_branchless helper function covering all varint sizes (1-5 bytes) and the negative i32 case. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
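The technique described above can be sketched in portable Rust. This is an illustration under assumptions, not the PR's `decode_varint32_branchless`: the function name is hypothetical, it handles only 1-5 byte encodings, and it returns `None` for anything longer (including the 10-byte encoding of a negative i32), which a caller would hand to a slow path.

```rust
/// Decodes a 1-5 byte varint from the first bytes of an 8-byte window.
/// Returns the value and the number of bytes consumed, or None if no
/// terminating byte is found within the first five bytes.
pub fn decode_varint32_from_word(window: &[u8; 8]) -> Option<(u32, usize)> {
    let word = u64::from_le_bytes(*window);

    // A byte terminates the varint when its continuation bit (bit 7) is clear.
    // Inverting the word turns those into set bits; the mask keeps bytes 0..=4.
    let stop = !word & 0x0000_0080_8080_8080;
    if stop == 0 {
        return None; // longer than 5 bytes: leave it to a slow path
    }

    // The lowest set bit sits at position 7, 15, 23, 31 or 39.
    let len = (stop.trailing_zeros() / 8 + 1) as usize;

    // Drop any bytes that follow the varint (len <= 5, so the shift is at most 40).
    let w = word & ((1u64 << (len * 8)) - 1);

    // Squeeze out the continuation bits: each byte contributes 7 payload bits.
    let value = (w & 0x7f)
        | ((w >> 1) & (0x7f << 7))
        | ((w >> 2) & (0x7f << 14))
        | ((w >> 3) & (0x7f << 21))
        | ((w >> 4) & (0x7f << 28));

    // Truncate to 32 bits; excess high bits of a 5-byte encoding are discarded.
    Some((value as u32, len))
}
```

The only branch left is the single "no terminator found" check; the length and value are computed with bit operations instead of a per-byte loop.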
Add batch_decode_varint32_neon() function using ARM64 NEON intrinsics (vld1q_u8, vshrq_n_u8, vmulq_u8, vpaddlq_*) to detect varint boundaries in 16-byte chunks in parallel, then decode individual varints using the existing branchless scalar approach. Add read_packed_int32() method to BytesReader that uses the NEON batch path on aarch64 with scalar fallback. Includes 10 new tests covering mixed sizes, edge cases, and scalar equivalence validation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
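To make the boundary-detection step concrete, here is a portable scalar stand-in. It is not NEON code and not the PR's `batch_decode_varint32_neon`; the name `terminator_mask` is hypothetical. It computes, for a 16-byte chunk, a bitmask of the bytes that end a varint, which is the information the vectorized path is described as extracting in parallel.

```rust
/// Returns a 16-bit mask where bit `i` is set when `chunk[i]` terminates a
/// varint, i.e. its continuation bit (bit 7) is clear.
pub fn terminator_mask(chunk: &[u8; 16]) -> u16 {
    let mut mask = 0u16;
    for (i, &b) in chunk.iter().enumerate() {
        // (b >> 7) is the continuation bit (0 or 1); XOR with 1 inverts it.
        mask |= (((b >> 7) ^ 1) as u16) << i;
    }
    mask
}
```

With such a mask, `mask.count_ones()` gives the number of complete varints in the chunk, and the position of each set bit marks where one varint ends and the next begins.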
…tch decode Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add ARM64/NEON optimization section to CLAUDE.md documenting branchless varint decode, NEON batch decode, and read_packed_int32 patterns. Move completed plan to docs/plans/completed/. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add unit tests for all six rewritten fixed-width read methods (read_fixed32, read_fixed64, read_sfixed32, read_sfixed64, read_float, read_double) covering success, insufficient buffer, and sub-message boundary enforcement
- Add unit test for read_u8 sub-message boundary enforcement
- Update Changelog with missing ARM64/NEON optimization entries and new read_packed_int32 public API

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
No description provided.