
Performance improvements #108

Merged
oschwald merged 9 commits into main from greg/perf-improv on Feb 14, 2026

Conversation

@oschwald
Owner

@oschwald oschwald commented Feb 14, 2026

  • Make lookup benchmark deterministic and more stable
  • Speed up lookup and decode hot paths

Summary by CodeRabbit

  • Bug Fixes

    • Truncated or corrupt control-byte headers now return decoding errors; iteration no longer stops on certain internal read issues.
  • Performance

    • Faster lookups and record decoding, including a dedicated ASCII fast path for strings.
    • Reduced release-build overhead by compiling out some debug/trace logging.
  • New Features

    • Added a deterministic benchmarking harness with additional benchmarks and larger sample sizes.
  • Chores

    • Package bumped to 0.27.2, removed an unused dev dependency, and adjusted logging defaults.

@coderabbitai

coderabbitai bot commented Feb 14, 2026

Warning

Rate limit exceeded

@oschwald has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 16 minutes and 53 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Walkthrough

Refactors decoding to return guarded DecodeResult values, adds ASCII fast-paths, changes reader helpers to non-Result returns with early record_size validation, removes a dev-dependency and enables a log feature, and adds/updates Criterion benchmarks (new serde_usage bench + modified lookup bench).

Changes

  • Manifest / Bench target (Cargo.toml): Bumped crate version (0.27.1 → 0.27.2); enabled log feature release_max_level_info; removed dev-dependency fake = "4.0"; added bench target [[bench]] name = "serde_usage" (harness = false).
  • Benchmarks, new and updated (benches/serde_usage.rs, benches/lookup.rs): Added benches/serde_usage.rs with multiple Criterion benchmarks and deterministic IPv4 generation; updated benches/lookup.rs to remove fake, use an LCG for IPv4 generation, preallocate vectors, import std::net::{IpAddr, Ipv4Addr}, and increase the sample size (10 → 20).
  • Decoder / Parsing (src/decoder.rs): Converted many internal helpers to return DecodeResult; added eat_byte, checked_offset, read_u32_be, decode_uint32/16, decode_int; introduced try_read_identifier_bytes; implemented ASCII fast paths for string decoding; tightened pointer following and unified decode error propagation.
  • Reader / Tree traversal (src/reader.rs): Validates metadata.record_size early; changed several internal APIs to non-Result returns (read_node → usize, find_address_in_tree → (usize, usize), find_ipv4_start → (usize, usize)); inlined 24-, 28-, and 32-bit node decoding and unified IPv4/IPv6 traversal and not-found semantics.
  • Iterator / within (src/within.rs): Removed error propagation from push_child (now returns ()); dropped ? on read_node inside the iterator path, so read_node errors are no longer propagated during iteration.
  • Changelog (CHANGELOG.md): Added a release entry for 0.27.2 describing faster lookups, ASCII fast paths, reduced release logging, and decoding error fixes.
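For reference, the record layout behind the inlined 24/28/32-bit node decoding in src/reader.rs can be sketched as follows. This is an illustrative standalone function based on the MaxMind DB binary format, not the crate's actual code:

```rust
/// Decode the left or right record of a search-tree node.
/// Per the MaxMind DB format, each node packs two records of
/// `record_size` bits: 24-bit nodes occupy 6 bytes, 28-bit nodes 7
/// bytes (the middle byte's nibbles are split between the records),
/// and 32-bit nodes 8 bytes.
fn read_record(node_bytes: &[u8], record_size: u16, right: bool) -> usize {
    match record_size {
        24 => {
            let b = if right { &node_bytes[3..6] } else { &node_bytes[0..3] };
            ((b[0] as usize) << 16) | ((b[1] as usize) << 8) | b[2] as usize
        }
        28 => {
            let middle = node_bytes[3] as usize;
            if right {
                // Low nibble of the middle byte supplies the top 4 bits.
                ((middle & 0x0F) << 24)
                    | ((node_bytes[4] as usize) << 16)
                    | ((node_bytes[5] as usize) << 8)
                    | node_bytes[6] as usize
            } else {
                // High nibble of the middle byte supplies the top 4 bits.
                ((middle & 0xF0) << 20)
                    | ((node_bytes[0] as usize) << 16)
                    | ((node_bytes[1] as usize) << 8)
                    | node_bytes[2] as usize
            }
        }
        32 => {
            let b = if right { &node_bytes[4..8] } else { &node_bytes[0..4] };
            u32::from_be_bytes([b[0], b[1], b[2], b[3]]) as usize
        }
        _ => unreachable!("record_size is validated at Reader construction"),
    }
}

fn main() {
    let node = [0u8, 0, 1, 0, 0, 2];
    assert_eq!(read_record(&node, 24, false), 1);
    assert_eq!(read_record(&node, 24, true), 2);
}
```

Validating record_size up front, as this PR does, is what makes the `unreachable!` arm safe to inline without a per-node error path.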

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Reader as Reader
    participant Decoder as Decoder
    participant DB as DataBlob

    Client->>Reader: lookup(ip)
    Reader->>DB: read_node(node_number)
    DB-->>Reader: raw node bytes (24/28/32)
    Reader->>Decoder: request decode (header/size/type)
    Decoder->>DB: eat_byte / read value bytes
    Decoder-->>Reader: DecodeResult(decoded_value) or Err
    Reader-->>Client: return data or propagate Err / not-found

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Poem

🐇 I hopped through bytes with LCG cheer,
I chased the fake away, made lookups clear,
I guard each header, bound every read,
ASCII sings true — no panic or dread,
Benchmarks hum softly as I nibble speed 🥕

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
  • Description Check: ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately reflects the main changes in the pull request, which focus on performance optimizations across lookup, decoding, and logging paths, along with improved benchmark stability.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.
  • Merge Conflict Detection: ✅ Passed. No merge conflicts detected when merging into main.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/decoder.rs`:
- Around line 466-498: In try_read_identifier_bytes, add a short inline comment
around the pointer-restoration logic (the use of after_pointer and saved_ptr
after calling decode_pointer) that explains: after_pointer restores current_ptr
to the position after the pointer bytes (undoing the temporary dereference so
subsequent parsing resumes after the pointer), and the conditional reset to
saved_ptr is needed only when the pointed-to value is not a string to also undo
consuming the original pointer itself; reference the variables/methods
decode_pointer, after_pointer, saved_ptr, current_ptr, size_and_type, and
read_string_bytes so maintainers can easily locate the logic.

In `@src/reader.rs`:
- Around line 388-427: The function find_address_in_tree can finish all bit
iterations leaving `node` as an internal tree node (not >= node_count) which
later underflows in resolve_data_pointer; update find_address_in_tree so that if
the traversal completes the full depth (i.e., prefix_len equals 32 for IPv4 or
128 for IPv6) and `node < self.metadata.node_count` then treat it as "not found"
by setting `node = self.metadata.node_count` (so the existing check returns (0,
prefix_len)). Reference: find_address_in_tree, resolve_data_pointer, lookup,
self.metadata.node_count, and ipv4_start. Ensure both the IpInt::V4 and
IpInt::V6 branches follow this behavior.
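The guard this comment asks for can be sketched as a small post-traversal step. finalize_traversal is a hypothetical name; node, node_count, and prefix_len follow the comment's vocabulary, and this is not the crate's code:

```rust
/// After consuming prefix bits, classify the resulting node number.
/// A node still below `node_count` once the full depth (32 for IPv4,
/// 128 for IPv6) is exhausted is an internal node, so it must be mapped
/// to "not found" instead of reaching data-pointer resolution, where
/// `node - node_count` would underflow.
fn finalize_traversal(
    node: usize,
    node_count: usize,
    prefix_len: usize,
    max_depth: usize,
) -> (usize, usize) {
    if node == node_count || (prefix_len == max_depth && node < node_count) {
        // The format's "no data" marker, or the underflow case above.
        return (0, prefix_len);
    }
    // node > node_count: a data pointer for the caller to resolve.
    (node, prefix_len)
}

fn main() {
    // Full-depth traversal ending on an internal node: not found.
    assert_eq!(finalize_traversal(5, 100, 32, 32), (0, 32));
    // A data pointer found partway down is passed through.
    assert_eq!(finalize_traversal(150, 100, 16, 32), (150, 16));
}
```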


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/reader.rs (1)

462-495: ⚠️ Potential issue | 🟠 Major

Add buffer-length validation to prevent panic on corrupt metadata.

The from_source method computes search_tree_size but does not validate it fits within the buffer. This allows find_ipv4_start, called during Reader initialization, to trigger read_node with indices that exceed buffer bounds, causing an unchecked panic on corrupt databases with inflated node_count values.

Add this validation after computing search_tree_size at line 129:

Proposed fix
```diff
         let search_tree_size = (metadata.node_count as usize) * (metadata.record_size as usize) / 4;

+        if search_tree_size + DATA_SECTION_SEPARATOR_SIZE > buf.as_ref().len() {
+            return Err(MaxMindDbError::invalid_database(
+                "database file is too small for the declared search tree size",
+            ));
+        }
+
         let mut reader = Reader {
```


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/decoder.rs (1)

138-167: ⚠️ Potential issue | 🟡 Minor

eat_byte panics on truncated/corrupt databases instead of returning an error.

eat_byte does no bounds check. In size_from_ctrl_byte, for sizes 29/30/31, it's called 1–3 times in succession. If the buffer is truncated, this panics rather than returning a DecodeResult::Err. This is inconsistent with the new decode_uint32/decode_uint16/decode_int paths (Lines 295–374), which all perform checked_add + bounds validation and return proper errors.

Since this is a hot path and the panic only manifests on corrupt data, it may be an acceptable tradeoff — but worth acknowledging explicitly (e.g., a // PANIC: safety comment) so future maintainers know this is intentional.
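A bounds-checked variant, which a later commit in this PR adopts for these header reads, can be sketched as follows. Cursor and DecodeError are stand-in names, not the crate's types:

```rust
struct Cursor<'a> {
    buf: &'a [u8],
    current_ptr: usize,
}

#[derive(Debug, PartialEq)]
struct DecodeError(&'static str);

impl<'a> Cursor<'a> {
    /// Read one byte, returning an error instead of panicking when the
    /// buffer is truncated mid-header.
    fn eat_byte(&mut self) -> Result<u8, DecodeError> {
        let b = *self
            .buf
            .get(self.current_ptr)
            .ok_or(DecodeError("unexpected end of buffer"))?;
        self.current_ptr += 1;
        Ok(b)
    }
}

fn main() {
    let mut c = Cursor { buf: &[0x42], current_ptr: 0 };
    assert_eq!(c.eat_byte(), Ok(0x42));
    // A truncated buffer now yields a decode error, not an index panic.
    assert!(c.eat_byte().is_err());
}
```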


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/decoder.rs (1)

420-441: ⚠️ Potential issue | 🟡 Minor

ASCII fast-path is sound, but decode_string still uses unchecked arithmetic unlike the new integer decoders.

The is_ascii() + from_utf8_unchecked optimization is correct and well-commented. However, line 428 (self.current_ptr + size) can panic on arithmetic overflow or produce an out-of-bounds index at line 429, unlike the new integer decoders, which use checked_add plus bounds filtering. The same unchecked pattern exists in decode_bytes (line 268) and in decode_float/decode_double. Consider aligning these with the safer pattern introduced elsewhere in this PR.

🤖 Fix all issues with AI agents
In `@src/decoder.rs`:
- Around line 298-327: The decode_uint32 and decode_int functions duplicate the
big-endian assembly and bounds-check logic; extract that shared logic into a
small inline helper (e.g., fn read_u32_be(&mut self, size: usize) ->
DecodeResult<u32>) that performs the bounds check, computes new_offset, reads
the bytes into a u32 (big-endian), advances self.current_ptr, and returns the
u32; then have decode_uint32 call that helper directly and have decode_int call
the helper and cast the returned u32 to i32 (as i32), or alternatively update
the decode_int_like! macro to invoke the helper to remove the duplicate code in
both decode_uint32 and decode_int.
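The helper this prompt describes can be sketched in standalone form. Cursor is a stand-in for the decoder, and the single-argument signature is a simplification; the example fix later in this thread shows the crate's version also threading an error-context string:

```rust
struct Cursor<'a> {
    buf: &'a [u8],
    current_ptr: usize,
}

impl<'a> Cursor<'a> {
    /// Assemble up to four big-endian bytes into a u32, with a checked
    /// bounds test; advances the cursor only on success. A size of 0
    /// encodes the value 0, per MaxMind DB variable-width integers.
    fn read_u32_be(&mut self, size: usize) -> Result<u32, &'static str> {
        if size > 4 {
            return Err("u32 of invalid size");
        }
        let new_offset = self
            .current_ptr
            .checked_add(size)
            .filter(|&off| off <= self.buf.len())
            .ok_or("unexpected end of buffer")?;
        let mut value = 0u32;
        for &b in &self.buf[self.current_ptr..new_offset] {
            value = (value << 8) | b as u32;
        }
        self.current_ptr = new_offset;
        Ok(value)
    }
}

fn main() {
    let mut c = Cursor { buf: &[0x01, 0x02, 0x03], current_ptr: 0 };
    assert_eq!(c.read_u32_be(3), Ok(0x0001_0203));
    // The cursor is now at the end of the buffer.
    assert!(c.read_u32_be(1).is_err());
}
```

With this in place, decode_uint32 becomes a thin wrapper and decode_int can call the helper and cast the result with `as i32`, as the prompt suggests.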

Improvement: in isolated A/B runs of cargo bench --bench lookup, this change alone reduced the maxminddb median from 85.145 µs to 83.384 µs (~2.1%). maxminddb_par remained within noise.

Improvement: in isolated A/B runs of cargo bench --bench lookup, this change alone reduced the maxminddb median from 85.145 µs to 77.400 µs (~9.1%). With the numeric decoding optimizations, the median reached 77.008 µs (~9.6%). maxminddb_par stayed within noise.

Verification (release builds, cargo bench --bench lookup, same source with only this Cargo.toml change): maxminddb median improved from 76.544 µs to 70.559 µs (~7.8% faster). maxminddb_par improved from 18.733 µs to 18.163 µs (~3.0% faster).

Verification (release builds, cargo bench --bench lookup, decoder-only A/B): maxminddb median improved from 70.797 µs to 65.489 µs (~7.5% faster). maxminddb_par remained within the noise threshold.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/decoder.rs (3)

597-621: ⚠️ Potential issue | 🟠 Major

skip_value_inner advances current_ptr without bounds checking for string, bytes, and numeric types.

Lines 599, 607, 615, and 620 all do self.current_ptr += size without verifying the result stays within self.buf.len(). A corrupt size field will push current_ptr past the buffer, causing panics on the next read. This is inconsistent with the bounds-checked approach used elsewhere in this PR (e.g., checked_offset, read_string_bytes).

🐛 Proposed fix — use `checked_offset` for all skip advances
```diff
             TYPE_STRING | TYPE_BYTES => {
                 // String or Bytes - skip size bytes
-                self.current_ptr += size;
+                self.current_ptr = self.checked_offset(size, "skip string/bytes")?;
                 Ok(())
             }
             TYPE_DOUBLE => {
                 // Double - must be exactly 8 bytes
                 if size != 8 {
                     return Err(self.invalid_db_error(&format!("double of size {size}")));
                 }
-                self.current_ptr += size;
+                self.current_ptr = self.checked_offset(size, "skip double")?;
                 Ok(())
             }
             TYPE_FLOAT => {
                 // Float - must be exactly 4 bytes
                 if size != 4 {
                     return Err(self.invalid_db_error(&format!("float of size {size}")));
                 }
-                self.current_ptr += size;
+                self.current_ptr = self.checked_offset(size, "skip float")?;
                 Ok(())
             }
             TYPE_UINT16 | TYPE_UINT32 | TYPE_INT32 | TYPE_UINT64 | TYPE_UINT128 => {
                 // Numeric types - skip size bytes
-                self.current_ptr += size;
+                self.current_ptr = self.checked_offset(size, "skip numeric")?;
                 Ok(())
             }
```

408-429: 🧹 Nitpick | 🔵 Trivial

ASCII fast-path: consider the interaction with simdutf8.

When simdutf8 is enabled (but not unsafe-str-decode), you still call bytes.is_ascii() first then fall through to simdutf8::basic::from_utf8. For strings that are mostly ASCII but contain a few non-ASCII bytes (common for city names with accents), you pay for both the is_ascii() scan and then the simdutf8 validation. This is fine for short strings typical in MaxMind DBs, but worth noting for future reference.
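The fast path itself reduces to a subset check: ASCII is valid single-byte UTF-8, so once `is_ascii()` passes, validation can be skipped entirely. A minimal sketch with a safe-Rust fallback via std (the crate can route the slow path through simdutf8 instead):

```rust
/// Decode string bytes with an ASCII fast path.
fn decode_str(bytes: &[u8]) -> Result<&str, std::str::Utf8Error> {
    if bytes.is_ascii() {
        // SAFETY: every byte is < 0x80, hence valid one-byte UTF-8.
        Ok(unsafe { std::str::from_utf8_unchecked(bytes) })
    } else {
        // Non-ASCII (e.g. accented city names) takes full validation.
        std::str::from_utf8(bytes)
    }
}

fn main() {
    assert_eq!(decode_str(b"London"), Ok("London"));
    assert_eq!(decode_str("Zürich".as_bytes()), Ok("Zürich"));
    assert!(decode_str(&[0xFF, 0xFE]).is_err());
}
```

As the comment notes, a mostly-ASCII string with one accent pays for both the `is_ascii()` scan and the fallback validation; for the short strings typical of MaxMind DBs this double scan is cheap.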


374-389: ⚠️ Potential issue | 🟠 Major

decode_pointer lacks bounds checking — can panic on corrupt data.

Line 377 uses unchecked addition and line 378 slices self.buf without verifying that new_offset <= self.buf.len(). A truncated or corrupt database with a pointer near the end of the buffer will cause an index-out-of-bounds panic, inconsistent with the error-handling pattern used by all other decode_* functions.

Change the return type to DecodeResult<usize> and use checked_offset for bounds checking:

Proposed fix
```diff
-    fn decode_pointer(&mut self, size: usize) -> usize {
+    fn decode_pointer(&mut self, size: usize) -> DecodeResult<usize> {
         let pointer_value_offset = [0, 0, 2048, 526_336, 0];
         let pointer_size = ((size >> 3) & 0x3) + 1;
-        let new_offset = self.current_ptr + pointer_size;
+        let new_offset = self.checked_offset(pointer_size, "pointer")?;
         let pointer_bytes = &self.buf[self.current_ptr..new_offset];
         self.current_ptr = new_offset;

         let base = if pointer_size == 4 {
             0
         } else {
             (size & 0x7) as u8
         };
         let unpacked = to_usize(base, pointer_bytes);

-        unpacked + pointer_value_offset[pointer_size]
+        Ok(unpacked + pointer_value_offset[pointer_size])
     }
```

Update all 6 call sites (lines 234, 456, 480, 509, 538, 588) to use ? operator since they're all within functions returning DecodeResult.

🤖 Fix all issues with AI agents
In `@CHANGELOG.md`:
- Line 3: CHANGELOG indicates release 0.27.2 but the crate manifest still
reports version = "0.27.1"; update the manifest's version field in Cargo.toml to
"0.27.2" so the package version matches the CHANGELOG entry (edit the version =
"..." line in Cargo.toml).

@oschwald
Owner Author

@coderabbitai resume


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/decoder.rs (2)

375-399: 🧹 Nitpick | 🔵 Trivial

Bounds-safe fallback, but consider documenting the behavior.

The defensive fallback when checked_add overflows or exceeds buffer length sets current_ptr = len and returns len. This prevents panics, but subsequent operations will fail at a different location than the actual corrupt pointer.

This is acceptable since changing the signature to DecodeResult<usize> would require broader refactoring, and the error will still surface. Consider adding a brief comment explaining why returning len is safe (the next read will fail with a bounds error).

📝 Suggested documentation
```diff
         let new_offset = match p.checked_add(pointer_size) {
             Some(offset) if offset <= len => offset,
             _ => {
+                // Return buffer length so subsequent reads fail with proper bounds errors
                 self.current_ptr = len;
                 return len;
             }
         };
```

607-631: ⚠️ Potential issue | 🟡 Minor

Minor: skip_value_inner advances current_ptr without bounds validation.

When skipping strings, bytes, doubles, floats, and numeric types, current_ptr += size is performed without verifying current_ptr + size <= buf.len(). A corrupt size value could leave current_ptr past the buffer end.

While subsequent reads would fail with proper errors, the pointer state would be inconsistent. This matches prior behavior, so it's not a regression, but consider using checked_offset for consistency with other decode methods.

🛡️ Optional: add bounds check for consistency
```diff
             TYPE_STRING | TYPE_BYTES => {
                 // String or Bytes - skip size bytes
-                self.current_ptr += size;
+                self.current_ptr = self.checked_offset(size, "skip")?;
                 Ok(())
             }
```

@coderabbitai
Copy link

coderabbitai bot commented Feb 14, 2026

✅ Actions performed

Reviews resumed.

Convert control-byte header reads to checked reads that return decode errors instead of panicking on truncated/corrupt input.

Benchmark: cargo bench --bench lookup --features simdutf8
  • maxminddb median: 67.700 µs -> 67.385 µs
  • Criterion: no statistically significant change (p=0.07)

This keeps malformed-input behavior consistent with other checked decode paths without a measurable release-build regression.
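The extended-size headers those checked reads cover can be sketched as follows. This is a hypothetical standalone form; per the MaxMind DB format, control-byte size values 29, 30, and 31 pull one, two, or three following bytes with base offsets 29, 285, and 65821:

```rust
/// Decode the payload size from a control byte's 5-bit size field plus
/// any extension bytes, returning (size, bytes_consumed) or an error on
/// truncation instead of panicking.
fn size_from_ctrl(size: usize, rest: &[u8]) -> Result<(usize, usize), &'static str> {
    let need = match size {
        0..=28 => return Ok((size, 0)),
        29 => 1,
        30 => 2,
        _ => 3,
    };
    if rest.len() < need {
        return Err("unexpected end of buffer in size header");
    }
    // Big-endian assembly of the extension bytes.
    let mut extra = 0usize;
    for &b in &rest[..need] {
        extra = (extra << 8) | b as usize;
    }
    let base = [29, 285, 65_821][need - 1];
    Ok((base + extra, need))
}

fn main() {
    assert_eq!(size_from_ctrl(29, &[7]), Ok((36, 1)));
    assert_eq!(size_from_ctrl(30, &[0, 1]), Ok((286, 2)));
    // Truncated extension bytes surface as an error.
    assert!(size_from_ctrl(31, &[0]).is_err());
}
```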
Add user-visible performance and robustness changes since 0.27.1 under
the 0.27.2 release entry dated 2026-02-14.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/decoder.rs (2)

375-399: ⚠️ Potential issue | 🟡 Minor

decode_pointer silently returns buf.len() on overflow — error messages will be indirect.

When the pointer offset overflows or exceeds buffer length (lines 383–386), the function returns buf.len() rather than an error. The actual failure surfaces later as "unexpected end of buffer" from eat_byte, which obscures the root cause (a corrupt/invalid pointer). This is fine for performance, but consider at least a debug! log here so that diagnosing corrupt databases is easier in non-release builds.

Proposed improvement
```diff
             _ => {
+                debug!("invalid pointer offset at position {p}");
                 self.current_ptr = len;
                 return len;
             }
```

607-611: 🧹 Nitpick | 🔵 Trivial

skip_value_inner for STRING/BYTES advances current_ptr without bounds checking.

self.current_ptr += size could move current_ptr past buf.len() if size is bogus (corrupt DB). While subsequent operations will likely catch this, it differs from the stricter checked_offset pattern used elsewhere in this PR. Since this code path is also in the hot path for skipping unknown fields during deserialization, consider using checked_offset here for consistency.

♻️ Proposed fix
```diff
             TYPE_STRING | TYPE_BYTES => {
                 // String or Bytes - skip size bytes
-                self.current_ptr += size;
+                self.current_ptr = self.checked_offset(size, "skip string/bytes")?;
                 Ok(())
             }
```

Same pattern applies to the numeric skip paths at lines 628–631:

```diff
             TYPE_UINT16 | TYPE_UINT32 | TYPE_INT32 | TYPE_UINT64 | TYPE_UINT128 => {
                 // Numeric types - skip size bytes
-                self.current_ptr += size;
+                self.current_ptr = self.checked_offset(size, "skip numeric")?;
                 Ok(())
             }
```
🤖 Fix all issues with AI agents
In `@benches/serde_usage.rs`:
- Around line 1-3: Remove legacy extern crate usage: delete the line "extern
crate maxminddb;" and remove the #[macro_use] extern crate criterion
declaration; instead import Criterion macros/exports explicitly (e.g., use
criterion::{criterion_group, criterion_main, Criterion} or the specific macro
names used) in this module so it compiles under Rust 2021 edition and keeps
macro imports explicit.

In `@src/decoder.rs`:
- Around line 344-362: decode_uint16 duplicates the checked-add + size-match
logic already implemented by read_u32_be; replace the manual implementation in
decode_uint16 with a size cap check (if size > 2 ->
Err(self.invalid_db_error(...))) and then call read_u32_be(size) and cast the
returned u32 to u16 (e.g., read_u32_be(size).map(|v| v as u16)), relying on
read_u32_be to advance current_ptr and produce the same errors as before;
reference functions: decode_uint16 and read_u32_be.

Comment on lines +344 to +362:

```rust
#[inline(always)]
fn decode_uint16(&mut self, size: usize) -> DecodeResult<u16> {
    if size > 2 {
        return Err(self.invalid_db_error(&format!("u16 of size {size}")));
    }
    let new_offset = self
        .current_ptr
        .checked_add(size)
        .filter(|&offset| offset <= self.buf.len())
        .ok_or_else(|| self.invalid_db_error(&format!("u16 of size {}", size)))?;
    let p = self.current_ptr;
    let value = match size {
        0 => 0,
        1 => self.buf[p] as u16,
        _ => ((self.buf[p] as u16) << 8) | self.buf[p + 1] as u16,
    };
    self.current_ptr = new_offset;
    Ok(value)
}
```

🧹 Nitpick | 🔵 Trivial

decode_uint16 duplicates the pattern from read_u32_be — consider reusing.

decode_uint16 manually implements the same checked-add + match-on-size pattern that read_u32_be already provides. Since a u16 value fits in a u32, you could reuse read_u32_be with a size cap check and a truncating cast, similar to how decode_int works. This would eliminate ~18 lines of near-duplicate code.

♻️ Example
```diff
     #[inline(always)]
     fn decode_uint16(&mut self, size: usize) -> DecodeResult<u16> {
-        if size > 2 {
-            return Err(self.invalid_db_error(&format!("u16 of size {size}")));
-        }
-        let new_offset = self
-            .current_ptr
-            .checked_add(size)
-            .filter(|&offset| offset <= self.buf.len())
-            .ok_or_else(|| self.invalid_db_error(&format!("u16 of size {}", size)))?;
-        let p = self.current_ptr;
-        let value = match size {
-            0 => 0,
-            1 => self.buf[p] as u16,
-            _ => ((self.buf[p] as u16) << 8) | self.buf[p + 1] as u16,
-        };
-        self.current_ptr = new_offset;
-        Ok(value)
+        if size > 2 {
+            return Err(self.invalid_db_error(&format!("u16 of size {size}")));
+        }
+        self.read_u32_be(size, "u16").map(|v| v as u16)
     }
```


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@benches/serde_usage.rs`:
- Around line 14-30: The generate_ipv4 function is duplicated; extract it into a
single shared benchmark utility module (e.g., a benches::common module) and
remove the duplicate copies in both benches/lookup.rs and
benches/serde_usage.rs, then have those files import and call
common::generate_ipv4; keep the same signature and LCG constants/state
initialization so behavior is unchanged and ensure the module is declared so
Cargo/Rust can find it.

Comment on lines 14 to 30:

```rust
#[must_use]
fn generate_ipv4(count: u64) -> Vec<IpAddr> {
    let mut ips = Vec::with_capacity(count as usize);
    let mut state = 0x4D59_5DF4_D0F3_3173_u64;
    for _ in 0..count {
        state = state
            .wrapping_mul(6_364_136_223_846_793_005)
            .wrapping_add(1_442_695_040_888_963_407);
        ips.push(IpAddr::V4(Ipv4Addr::new(
            (state >> 24) as u8,
            (state >> 32) as u8,
            (state >> 40) as u8,
            (state >> 48) as u8,
        )));
    }
    ips
}
```

🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain: verification scripts (rg -A 20 'fn generate_ipv4' benches/ and a listing of benches/) were run against oschwald/maxminddb-rust to confirm the duplication and the absence of a shared utility module.


Extract duplicated generate_ipv4 to a shared benchmark utility module.

The generate_ipv4 function is duplicated identically in both benches/lookup.rs and benches/serde_usage.rs (same state initialization, LCG constants, and bit-shift logic). Extract it to a shared module like benches/common.rs or a common/mod.rs to maintain a single source of truth.


Add a dedicated Criterion benchmark suite for lookup-only, full
City/Country decode, and decode_path scenarios, with deterministic IPv4
inputs for repeatable runs.
@oschwald oschwald merged commit d2c9fab into main Feb 14, 2026
42 checks passed
@oschwald oschwald deleted the greg/perf-improv branch February 14, 2026 22:09