Skip to content

asort: replace old quicksort fallback with glibc 2.43 mergesort+heapsort#129

Open
nGoline wants to merge 1 commit into
rustyrussell:masterfrom
nGoline:update-asort-fallback
Open

asort: replace old quicksort fallback with glibc 2.43 mergesort+heapsort#129
nGoline wants to merge 1 commit into
rustyrussell:masterfrom
nGoline:update-asort-fallback

Conversation

@nGoline

@nGoline nGoline commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

The bug

The old fallback _asort (a copy of glibc's 2004 median-of-three quicksort) called cmp(a, a), the same pointer on both sides, when sorting arrays where all elements compare equal. In the inner loop, after the left and right scan pointers both stop on the pivot element, two comparisons of the form cmp(pivot, pivot) execute before the left_ptr == right_ptr guard fires and breaks.

This violates the implicit contract that comparators can assume a != b when called by a sort routine, which is a reasonable expectation since the C standard says comparator behaviour for equal-valued distinct elements is unspecified but never requires handling identical pointers.

The fix

Replace the old quicksort with the algorithm from glibc 2.43: a stable top-down mergesort that uses a stack buffer for small arrays and falls back to in-place heapsort if malloc fails. Mergesort always compares across two distinct sub-array halves, so self-comparisons cannot occur.

Portability

The glibc 2.43 source uses several glibc-internal headers and symbols that are not available outside glibc (memswap.h, pthreadP.h, __compar_d_fn_t, pthread_cleanup_combined_*, etc.). This PR replaces each with a portable equivalent; the algorithm itself is unchanged.

Tests

Added four new test cases:

  • all-equal array: sorts 100 identical elements — exercises the path the old quicksort mishandled, verifies the result is still sorted.
  • single-element array: verifies the early-exit path produces no comparator call and leaves the element untouched.
  • direct self-comparison: calls cmp(&val, &val, NULL) explicitly and asserts it returns 0, pinning the contract that comparators passed to asort must handle identical pointers gracefully.
  • self_compared flag: asserts the flag was set by the direct call, confirming the test machinery itself works correctly.

The diag line reports whether a self-comparison occurred anywhere during the test run, serving as a canary if a future implementation reintroduces them.

Signed-off-by: Nickolas Goline @nGoline

The old fallback (used when the native `qsort_r` lacks the glibc calling convention) was a copy of glibc's 2004 median-of-three quicksort.
It had a subtle self-comparison bug: when the left and right scan pointers both converged on the pivot with an all-equal array, the inner loop evaluated `cmp(pivot, pivot)` before detecting `left_ptr == right_ptr` and breaking. Comparators that use pointer identity to detect distinct elements would spuriously fire on this.

Replace it with glibc 2.43's stable mergesort (`heapsort` fallback on `malloc` failure).  Mergesort always compares from two distinct halves and never produces self-comparisons.

Portability adaptations to remove glibc-internal dependencies:
  - `__compar_d_fn_t` -> typedef from `_total_order_cb`
  - `__memswap` / `__mempcpy` -> local static inline helpers
  - `pthread_cleanup_*` -> removed (no thread cancellation needed)
  - `__set_errno` → `errno = …`
  - `libc_hidden_def` / `weak_alias` / `__qsort_r` -> removed / renamed to `_asort`

Add tests for all-equal and single-element arrays.
@ddustin

ddustin commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

This one needs to be a PR on ccan.

I'm surprised we can't just use a glibc built in for this, seems like a lot of effort just to sort.

Would be interesting to see a profiler output for a large node to see if sorting is showing up as a performance issue 🤔

@nGoline

nGoline commented Jun 29, 2026

Copy link
Copy Markdown
Collaborator Author

This one needs to be a PR on ccan.

It is ;)

I'm surprised we can't just use a glibc built in for this, seems like a lot of effort just to sort.

We do when the platform has it. asort is just a #define to qsort_r whenever HAVE_QSORT_R_PRIVATE_LAST is set, i.e. the GNU signature where the context is the private last arg. The bundled code only ever compiles on platforms without that. macOS is the catch: it has a qsort_r, but with the BSD signature (thunk first, comparator (thunk, a, b)), so the probe fails and we drop to the in-tree fallback.

The effort here isn't really "writing a sort", it's that the existing fallback was a copy of glibc's 2004 quicksort, which calls cmp(a, a) (same pointer both sides) on all-equal arrays, violating the comparator contract. I just swapped it for glibc's current mergesort/heapsort, which never self-compares. Algorithm is unchanged from upstream: I only replaced the glibc-internal headers/symbols with portable equivalents.

Would be interesting to see a profiler output for a large node to see if sorting is showing up as a performance issue 🤔

Agreed, would be interesting! Worth flagging though that this path only runs on non-GNU platforms (so not your typical Linux node), and the change is a correctness fix rather than a perf one. mergesort vs. the old quicksort shouldn't move the needle much either way. Happy to grab profiler output for a large node if it's useful for the broader sorting question.

@ddustin

ddustin commented Jun 30, 2026

Copy link
Copy Markdown
Collaborator

This one needs to be a PR on ccan.

It is ;)

Lol indeed it is 😂😂

@ddustin ddustin left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still find it crazy we have to do this but ACK 1628832.

Seems good to have a fallback in general, though. For Mac it seems prudent to figure out how to get the build system to use its BSD qsort_r but it isn't urgent.

I left a tag on some asserts that would help me sleep at night but it appears the code technically shouldn't need them.

Comment thread ccan/asort/asort.c

static inline void
swap_words_64 (void * restrict a, void * restrict b, size_t n)
{

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add an assert here, something like assert(n && n % 8 == 0)

Comment thread ccan/asort/asort.c
static inline void
swap_words_32 (void * restrict a, void * restrict b, size_t n)
{
do

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert(n && n % 4 == 0)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants