Skip to content

Minor code simplifications#107

Merged
oschwald merged 7 commits intomainfrom
greg/simplifications
Feb 14, 2026
Merged

Minor code simplifications#107
oschwald merged 7 commits intomainfrom
greg/simplifications

Conversation

@oschwald
Copy link
Owner

@oschwald oschwald commented Feb 11, 2026

  • Use shared separator constant in Reader
  • Deduplicate fixed-width integer decoding
  • Unify array path navigation branches
  • Extract child-node push helper
  • Collapse repeated is_empty implementations
  • Factor reader test setup helpers
  • Simplify pointer traversal and keep fast paths

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced error messages and validation for pointer resolution, type mismatches, and string reads.
    • Improved bounds and pointer-following checks to produce clearer, contextual errors during navigation.
  • Refactor

    • Reduced boilerplate by generating repeated methods via macros and reusing constants.
    • Consolidated navigation and decoding logic (including index handling) for clarity and consistency.
    • Added small helper methods to centralize repeated internal operations.
  • Tests

    • Standardized test helpers and simplified test setup for reading test data.

Replace hard-coded separator sizes with DATA_SECTION_SEPARATOR_SIZE in
pointer offset logic.

This keeps a single source of truth for the format constant and reduces
the chance of drift in future edits.
@coderabbitai
Copy link

coderabbitai bot commented Feb 11, 2026

Warning

Rate limit exceeded

@oschwald has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 58 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Walkthrough

Consolidated repeated decoding and is_empty logic into macros, extracted helper functions for pointer-following and node pushes, tightened pointer-resolution and string-reading error paths, aligned a reader constant, and standardized test setup with shared helpers. No public API changes.

Changes

Cohort / File(s) Summary
Decoder & pointer logic
src/decoder.rs
Added decode_int_like macro to generate fixed-size integer decoders; replaced manual decoders. Introduced consume_typed_header, size_and_type_following_pointers (follows at most one pointer), read_string_bytes, and reworked read_str_as_bytes to follow a single pointer and produce clearer pointer-related errors.
is_empty macro for geoip types
src/geoip2.rs
Added impl_is_empty_via_default macro and replaced many manual is_empty implementations across country, city, and enterprise types with macro-generated methods.
Reader constant alignment
src/reader.rs
Replaced a local data_section_separator_size with the global DATA_SECTION_SEPARATOR_SIZE in from_source and pointer-resolution arithmetic.
Test helpers
src/reader_test.rs
Introduced init_logger() and open_test_data_reader() helpers; replaced per-test logger initialization and many Reader::open_readfile usages to standardize test setup and returned Reader<Vec<u8>> usage.
Path decoding consolidation
src/result.rs
Unified handling of PathElement::Index and IndexFromEnd in decode_path: consume array header once, compute resolved index, reuse a single skip loop, and provide contextual error messages for type mismatches.
Within traversal helper
src/within.rs
Extracted private push_child helper to encapsulate reader.read_node(...) plus stack push; removed duplicated child-read/push logic in next().

Sequence Diagram(s)

(Skipped — changes do not introduce a new multi-component sequential flow requiring visualization.)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Performance improvements #105: modifies string decoding and pointer-following behavior; closely related to the read_str_as_bytes and pointer-resolution changes in this PR.

Poem

🐰 I hopped through macros, tidy and neat,
Followed one pointer — no tangled repeat,
Pushed child nodes with a hop and a shove,
Tests now gather round the helper's cove,
A rabbit cheers — the code feels sweet.

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The PR title 'Minor code simplifications' is vague and generic. While the PR does contain simplifications, the title does not convey meaningful information about the specific changes or main objectives. Consider a more specific title such as 'Refactor: consolidate integer decoding, macro-generate methods, extract test helpers' to better describe the primary changes across decoder, geoip2, reader, and test modules.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into main

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch greg/simplifications

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/reader_test.rs (2)

250-268: 🛠️ Refactor suggestion | 🟠 Major

Incomplete migration to the new helpers.

Starting from test_lookup_annonymous_ip onward, ~20 tests still use let _ = env_logger::try_init() and Reader::open_readfile(...) directly instead of the new init_logger() and open_test_data_reader(...) helpers. This defeats the deduplication goal of the PR.

For example, this test (and many others below) could be:

Proposed diff for this test
 fn test_lookup_annonymous_ip() {
-    let _ = env_logger::try_init();
-
-    let filename = "test-data/test-data/GeoIP2-Anonymous-IP-Test.mmdb";
-
-    let reader = Reader::open_readfile(filename).unwrap();
+    init_logger();
+    let reader = open_test_data_reader("GeoIP2-Anonymous-IP-Test.mmdb");

The same pattern applies to: test_lookup_density_income, test_lookup_domain, test_lookup_isp, test_lookup_asn, test_lookup_network, test_within_city, test_json_serialize, test_networks, test_default_skips_aliases, test_include_aliased_networks, test_include_networks_without_data, test_skip_empty_values, test_skip_empty_values_with_other_options, test_networks_within_scenarios, test_geoip_networks_within, test_verify_good_databases, test_verify_broken_double_format, test_verify_broken_pointers, test_verify_broken_search_tree, test_size_hints, test_ignored_any, test_enum_deserialization, and test_serde_flatten.


146-181: 🧹 Nitpick | 🔵 Trivial

test_reader and test_reader_readfile have identical bodies.

Both tests construct the same filenames and call Reader::open_readfile(filename).ok().unwrap(). The docstring on test_reader_readfile suggests it was meant to test a different code path, but the implementation is a copy of test_reader. Either deduplicate or differentiate the two.

Also, .ok().unwrap() discards the error message on failure — prefer .unwrap() directly (consistent with the new open_test_data_reader helper).

src/decoder.rs (1)

358-399: 🧹 Nitpick | 🔵 Trivial

Pointer-to-pointer enforcement is a good spec-compliance addition.

Both consume_map_header and consume_array_header now correctly reject pointer chains. The logic is clear and matches the MaxMind DB spec.

The two methods are structurally identical except for the expected type constant and label. Consider extracting a shared helper to reduce the duplication:

♻️ Optional: extract a shared consume_typed_header helper
+    fn consume_typed_header(&mut self, expected_type: u8, type_name: &str) -> DecodeResult<usize> {
+        let (size, type_num) = self.size_and_type();
+        if type_num == TYPE_POINTER {
+            let new_ptr = self.decode_pointer(size);
+            self.current_ptr = new_ptr;
+            let (size, type_num) = self.size_and_type();
+            if type_num == TYPE_POINTER {
+                return Err(self.invalid_db_error("pointer points to another pointer"));
+            }
+            if type_num == expected_type {
+                Ok(size)
+            } else {
+                Err(self.decode_error(&format!("expected {type_name}, got type {type_num}")))
+            }
+        } else if type_num == expected_type {
+            Ok(size)
+        } else {
+            Err(self.decode_error(&format!("expected {type_name}, got type {type_num}")))
+        }
+    }
+
     pub(crate) fn consume_map_header(&mut self) -> DecodeResult<usize> {
-        let (size, type_num) = self.size_and_type();
-        if type_num == TYPE_POINTER {
-            ...
-        }
+        self.consume_typed_header(TYPE_MAP, "map")
     }
 
     pub(crate) fn consume_array_header(&mut self) -> DecodeResult<usize> {
-        let (size, type_num) = self.size_and_type();
-        if type_num == TYPE_POINTER {
-            ...
-        }
+        self.consume_typed_header(TYPE_ARRAY, "array")
     }
🤖 Fix all issues with AI agents
In `@src/decoder.rs`:
- Around line 417-426: In read_string_bytes, avoid potential integer overflow
when computing new_offset by using checked_add on self.current_ptr and size; if
checked_add returns None, return the same invalid_db_error("string length
exceeds buffer") (or a new overflow-specific error) instead of proceeding,
otherwise use the checked value for bounds check, slice extraction, and to
update self.current_ptr so the function cannot be bypassed by wraparound on
32-bit targets.

@oschwald oschwald force-pushed the greg/simplifications branch from 20f32a1 to 9aa97f8 Compare February 14, 2026 03:39
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/decoder.rs`:
- Around line 363-390: The function signature for consume_typed_header should be
collapsed to satisfy rustfmt: change the multi-line signature to a single-line
form like fn consume_typed_header(&mut self, expected_type: u8, label: &str) ->
DecodeResult<usize> { and keep the body unchanged; ensure surrounding
whitespace/braces follow rustfmt style so cargo fmt passes (this affects the
consume_typed_header function and calls to TYPE_POINTER, decode_pointer,
size_and_type, invalid_db_error, and decode_error remain the same).

@oschwald oschwald force-pushed the greg/simplifications branch 2 times, most recently from c3837cd to 7c45ba7 Compare February 14, 2026 03:52
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/reader_test.rs (1)

148-163: 🧹 Nitpick | 🔵 Trivial

test_reader_readfile still manually constructs the test-data/test-data/ path.

This test duplicates the path prefix that open_test_data_reader now owns. Consider using the helper for consistency:

Proposed change
         for ip_version in &versions {
-            let filename =
-                format!("test-data/test-data/MaxMind-DB-test-ipv{ip_version}-{record_size}.mmdb");
-            let reader = Reader::open_readfile(filename).ok().unwrap();
+            let reader = open_test_data_reader(
+                &format!("MaxMind-DB-test-ipv{ip_version}-{record_size}.mmdb"),
+            );
🤖 Fix all issues with AI agents
In `@src/decoder.rs`:
- Around line 43-59: The macro decode_int_like! currently slices self.buf using
self.current_ptr + size which can overflow or index out of bounds; update the
generated function (the fn named by $name) to first compute new_offset via
self.current_ptr.checked_add(size) and verify new_offset <= self.buf.len(),
returning Err(self.invalid_db_error(...)) on failure, then perform the byte-fold
on self.buf[self.current_ptr..new_offset] and set self.current_ptr = new_offset;
this mirrors the checked_add/bounds pattern used in read_string_bytes and
ensures no panic on malformed input.
- Around line 43-59: The macro decode_int_like! currently folds bytes into
signed types (e.g., i32) causing debug-build overflow panics for negative
values; update the i32 case inside decode_int_like! (the $name for decode_int)
to accumulate into an unsigned width-matching accumulator (use 0_u32 and
u32::from(b)) and only cast the final accumulated value to i32 before returning,
leaving the other unsigned branches unchanged; ensure the fold initial value and
byte-conversion use the matching unsigned type (u32 for i32) and then cast the
final result as i32 so bit patterns are preserved without intermediate overflow.

In `@src/reader_test.rs`:
- Around line 10-16: Replace the generic panic from unwrap() in
open_test_data_reader with an expect() that includes the database parameter so
test failures show which test data file failed to open; specifically, update the
call to Reader::open_readfile(...) in the open_test_data_reader function to use
.expect() with a message that mentions the database string (and keep init_logger
as-is).

Replace repeated integer decode helpers with one macro-generated
implementation.

The generated code keeps the same bounds checks and byte-folding logic
while reducing maintenance overhead.
Handle Index and IndexFromEnd with shared traversal logic in
decode_path while preserving existing bounds and error behavior.

This removes duplicated loops and keeps the control flow easier to
reason about.
Move repeated read-and-push logic in Within iteration into a
small helper method.

This keeps traversal behavior the same while reducing duplicated error
handling code.
Replace many identical is_empty methods with a shared macro in
geoip2 model types.

This keeps behavior unchanged and significantly reduces boilerplate.
Introduce shared helpers for logger initialization and opening
common test databases in reader tests.

This removes repetitive setup code and keeps test intent clearer.
Enforce the spec rule that pointers may not resolve to pointers
while keeping map, array, and string decoding on fast direct paths.

This consolidates the pointer-related churn into one decoder change
and avoids the regression from routing every header through shared
pointer-following helpers.
@oschwald oschwald force-pushed the greg/simplifications branch from 7c45ba7 to 0659747 Compare February 14, 2026 04:04
@oschwald oschwald merged commit 3a79c76 into main Feb 14, 2026
42 checks passed
@oschwald oschwald deleted the greg/simplifications branch February 14, 2026 04:07
@coderabbitai coderabbitai bot mentioned this pull request Feb 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant