Fix: two ring-buffer allocator defects in pto_ring_buffer.h #431

Open
chenshengxin2026 wants to merge 1 commit into hw-native-sys:main from chenshengxin2026:fix/issue-429-ring-buffer-allocator-defects

@chenshengxin2026
Contributor

Summary

  • Bug 1 (heap wrap-around): try_bump_heap tested tail > alloc_size (strict greater-than), so it rejected the case tail == alloc_size even though exactly alloc_size bytes are free at [0, alloc_size). The allocator then spun until it deadlocked. Fixed to tail >= alloc_size.
  • Bug 2 (DepListPool sentinel collision): alloc() computed idx = top % capacity, which returns 0 (the NULL sentinel slot) when top is a multiple of capacity. Fixed to idx = ((top - 1) % (capacity - 1)) + 1 so the index always stays in [1, capacity-1]. Overflow check tightened from used >= capacity to used >= capacity - 1 to match the reduced usable range.
  • Both fixes applied to all three affected runtimes: a2a3/tensormap_and_ringbuffer, a2a3/aicpu_build_graph, a5/tensormap_and_ringbuffer.
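
The index change in Bug 2 can be illustrated with a minimal sketch. The function names below are hypothetical stand-ins, not the code in pto_ring_buffer.h, and the sketch assumes that `top` is a counter that is at least 1 when the index is computed and that slot 0 is the NULL sentinel.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the DepListPool index math (illustrative names).
// Assumption: slot 0 is the NULL sentinel and top >= 1 when an index is taken.

// Old mapping: yields 0 whenever top is a multiple of capacity.
inline int32_t old_index(int32_t top, int32_t capacity) {
    assert(top >= 1 && capacity >= 2);
    return top % capacity;
}

// New mapping: cycles through 1..capacity-1 and never yields 0.
inline int32_t new_index(int32_t top, int32_t capacity) {
    assert(top >= 1 && capacity >= 2);
    return ((top - 1) % (capacity - 1)) + 1;
}

// Returns true if the mapping yields the sentinel slot for any top in [1, limit].
inline bool hits_sentinel(int32_t (*index_fn)(int32_t, int32_t),
                          int32_t capacity, int32_t limit) {
    for (int32_t top = 1; top <= limit; ++top) {
        if (index_fn(top, capacity) == 0) return true;
    }
    return false;
}
```

With capacity 8, `old_index(8, 8)` lands on slot 0, while `new_index(8, 8)` wraps back to slot 1. The price of the new mapping is one fewer usable slot, which is why the overflow check is tightened to match.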

Testing

  • All 113 Python unit tests pass (pytest tests -m "not requires_hardware")
  • Simulation tests (./ci.sh -p a2a3sim)
  • Hardware tests (ctest -R test_ring_buffer once test infrastructure is available)

Fixes #429

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request modifies the ring buffer and dependency pool logic across several files, adjusting allocation boundary checks and index calculations. The feedback identifies a significant risk regarding the use of signed 32-bit integers for monotonic counters; specifically, signed overflow is undefined behavior and the modulo operator can return negative results for negative operands, potentially leading to out-of-bounds memory access. It is recommended to use unsigned arithmetic or explicit casting to ensure safe index generation.
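
The modulo hazard raised in the review can be shown with a small sketch. The helper names are hypothetical; the negative counter value is passed in directly because signed overflow itself is undefined behavior and cannot be demonstrated portably.

```cpp
#include <cassert>
#include <cstdint>

// Index computed directly on a signed counter. In C++ the remainder takes the
// sign of the dividend, so a negative top gives a negative (out-of-bounds) slot.
inline int32_t signed_slot(int32_t top, int32_t capacity) {
    assert(capacity > 0);
    return top % capacity;
}

// Same computation after converting both operands to unsigned. The conversion
// is well defined (modulo 2^32) and the result is always in [0, capacity-1].
inline int32_t unsigned_slot(int32_t top, int32_t capacity) {
    assert(capacity > 0);
    return static_cast<int32_t>(static_cast<uint32_t>(top) %
                                static_cast<uint32_t>(capacity));
}
```

For example, `signed_slot(-1, 8)` evaluates to -1, whereas `unsigned_slot(-1, 8)` evaluates to 7. For non-negative counters the two functions agree.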

@chenshengxin2026 force-pushed the fix/issue-429-ring-buffer-allocator-defects branch 5 times, most recently from b8ef4e7 to f6f97f3 on April 2, 2026 02:28
Bug 1 — Heap wrap-around: change strict `>` to `>=` in try_bump_heap.
When tail == alloc_size there are exactly alloc_size bytes available at
[0, alloc_size); the old condition incorrectly rejected this, causing
the allocator to spin until deadlock. Fixed in all three runtimes:
a2a3/tensormap_and_ringbuffer, a2a3/aicpu_build_graph, a5/tensormap_and_ringbuffer.
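
The boundary case can be isolated in a minimal sketch. This models only the wrap-around test, not the full try_bump_heap; the function names are hypothetical, and the sketch assumes that the bytes in [0, tail) at the start of the heap are free when the wrap is attempted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the wrap-around test only (not the real allocator).
// Assumption: [0, tail) is free, and a wrapped allocation occupies [0, alloc_size).

// Old test: rejects the exact-fit case tail == alloc_size.
inline bool can_wrap_old(uint64_t tail, uint64_t alloc_size) {
    assert(alloc_size > 0);
    return tail > alloc_size;
}

// Fixed test: accepts the exact-fit case.
inline bool can_wrap_fixed(uint64_t tail, uint64_t alloc_size) {
    assert(alloc_size > 0);
    return tail >= alloc_size;
}
```

The two tests differ only at `tail == alloc_size`: `can_wrap_old(256, 256)` is false and `can_wrap_fixed(256, 256)` is true. For every other pair of inputs they return the same result.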

Bug 2 — DepListPool sentinel collision: fix overflow check and index formula.
`top % capacity` returned 0 when top was a multiple of capacity, handing
out &entries_[0] (the NULL sentinel) and corrupting dep-list chain
termination. Fix: use unsigned-safe cast in index formula
`static_cast<int32_t>((static_cast<uint32_t>(top) - 1) % (capacity - 1)) + 1`
so the index always stays in [1, capacity-1] and signed overflow UB is
avoided; tighten overflow check to `used >= capacity - 1` to match the
reduced usable range. Applied to all three runtimes.
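
How the cast-based index formula and the tightened overflow check fit together can be sketched with a miniature pool. Everything except the index expression is an assumption: the struct, its field names, the definition of `used` as `top - tail`, and returning 0 on exhaustion are illustrative choices, not the behavior of the real DepListPool.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature pool (illustrative names, not from pto_ring_buffer.h).
struct MiniDepPool {
    int32_t capacity;  // total slots, including the sentinel at index 0
    int32_t top = 0;   // number of entries ever allocated
    int32_t tail = 0;  // number of entries ever released

    explicit MiniDepPool(int32_t cap) : capacity(cap) { assert(cap >= 2); }

    // Returns a slot in [1, capacity-1], or 0 when all usable slots are live.
    int32_t alloc() {
        int32_t used = top - tail;
        if (used >= capacity - 1) return 0;  // only capacity-1 slots are usable
        ++top;
        return static_cast<int32_t>(
                   (static_cast<uint32_t>(top) - 1) % (capacity - 1)) + 1;
    }

    // Releases the oldest live entry.
    void release_oldest() {
        assert(tail < top);
        ++tail;
    }
};
```

With capacity 4, three allocations return slots 1, 2, and 3, and a fourth is refused. Under the earlier check `used >= capacity`, the fourth allocation in this sketch would have been accepted and mapped back to slot 1 while that slot was still live.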

Additionally:
- Add copyright headers to the three pto_ring_buffer.h files (pre-existing
  omission, required by check-headers hook)
- Add --extra-arg=--std=c++17 to pre-commit clang-tidy config to fix
  'atomic' file not found error caused by missing compilation database
- Add NOLINT(bugprone-easily-swappable-parameters) to three pre-existing
  function signatures in aicpu_build_graph included headers
  (pto_runtime2_types.h, pto_submit_types.h, tensor.h)
- Apply clang-format to all modified files

Fixes hw-native-sys#429
@chenshengxin2026 force-pushed the fix/issue-429-ring-buffer-allocator-defects branch from f6f97f3 to 2f4e574 on April 2, 2026 02:35
Development

Successfully merging this pull request may close these issues.

[Bug] Two ring-buffer allocator defects in pto_ring_buffer.h: heap wrap-around off-by-one and DepListPool sentinel collision