Add Neon implementation of `is_sorted_until` #6018

hazzlim · 2026-01-16T13:23:30Z

This PR adds a vectorized implementation of is_sorted_until using Neon intrinsics 🚀

Performance numbers (speedup figure relative to the existing, non-manually vectorized code - higher is better)

| Benchmark | MSVC Speedup | Clang Speedup |
|---|---|---|
| `bm_is_sorted_until<std::int8_t, AlgType::Std>/3000/1800` | 10 | 12.895 |
| `bm_is_sorted_until<std::int8_t, AlgType::Rng>/3000/1800` | 11.795 | 10.204 |
| `bm_is_sorted_until<std::int16_t, AlgType::Std>/3000/1800` | 5.674 | 5.6 |
| `bm_is_sorted_until<std::int16_t, AlgType::Rng>/3000/1800` | 6.551 | 5.442 |
| `bm_is_sorted_until<std::int32_t, AlgType::Std>/3000/1800` | 3.039 | 2.908 |
| `bm_is_sorted_until<std::int32_t, AlgType::Rng>/3000/1800` | 3.566 | 2.908 |
| `bm_is_sorted_until<std::int64_t, AlgType::Std>/3000/1800` | 1.549 | 1.507 |
| `bm_is_sorted_until<std::int64_t, AlgType::Rng>/3000/1800` | 1.899 | 1.581 |
| `bm_is_sorted_until<std::uint8_t, AlgType::Std>/3000/1800` | 9.673 | 12.436 |
| `bm_is_sorted_until<std::uint8_t, AlgType::Rng>/3000/1800` | 11.5 | 10.459 |
| `bm_is_sorted_until<std::uint16_t, AlgType::Std>/3000/1800` | 5.463 | 6.389 |
| `bm_is_sorted_until<std::uint16_t, AlgType::Rng>/3000/1800` | 6.389 | 6.944 |
| `bm_is_sorted_until<std::uint32_t, AlgType::Std>/3000/1800` | 3.017 | 3.172 |
| `bm_is_sorted_until<std::uint32_t, AlgType::Rng>/3000/1800` | 3.636 | 3.178 |
| `bm_is_sorted_until<std::uint64_t, AlgType::Std>/3000/1800` | 1.549 | 1.739 |
| `bm_is_sorted_until<std::uint64_t, AlgType::Rng>/3000/1800` | 1.818 | 1.581 |
| `bm_is_sorted_until<float, AlgType::Std>/3000/1800` | 3.939 | 3.297 |
| `bm_is_sorted_until<float, AlgType::Rng>/3000/1800` | 3.883 | 3.475 |
| `bm_is_sorted_until<double, AlgType::Std>/3000/1800` | 2.026 | 1.663 |
| `bm_is_sorted_until<double, AlgType::Rng>/3000/1800` | 2.016 | 1.7 |
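
For readers unfamiliar with the general idea, here is a minimal sketch of how a Neon `is_sorted_until` might look for `std::int32_t` only. It is not the code in this PR: the name `sketch_is_sorted_until` and the block structure are assumptions made for illustration, and on non-ARM64 targets it simply falls back to the scalar loop.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__aarch64__) || defined(_M_ARM64)
#include <arm_neon.h>
#define SKETCH_HAS_NEON 1
#else
#define SKETCH_HAS_NEON 0
#endif

// Hypothetical helper, not the STL's internal function.
// Returns the first position `next` such that *next < *(next - 1), or `last`.
const std::int32_t* sketch_is_sorted_until(const std::int32_t* first, const std::int32_t* last) {
    assert(first <= last);
    if (first == last) {
        return last;
    }
    const std::int32_t* cur = first;
#if SKETCH_HAS_NEON
    // Each iteration checks 4 adjacent pairs (cur[i], cur[i + 1]) for i in 0..3,
    // so it reads cur[0..4] and needs at least 5 elements remaining.
    while (last - cur >= 5) {
        const int32x4_t left  = vld1q_s32(cur);
        const int32x4_t right = vld1q_s32(cur + 1);
        // A lane is all-ones where left > right, i.e. where right < left.
        const uint32x4_t out_of_order = vcgtq_s32(left, right);
        if (vmaxvq_u32(out_of_order) != 0) {
            break; // a descent is inside this block; the scalar loop below locates it
        }
        cur += 4;
    }
#endif
    // Scalar tail (and the entire scan on non-ARM64 targets).
    for (const std::int32_t* next = cur + 1; next != last; ++cur, ++next) {
        if (*next < *cur) {
            return next;
        }
    }
    return last;
}
```

The sketch compares with "greater than" on swapped operands rather than "less than", which is the same bridging that the review discussion below is about.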

StephanTLavavej · 2026-01-17T06:14:53Z

stl/src/vector_algorithms.cpp

+            if constexpr (_Traits::_Vectorized) {
+                const size_t _Total_size_bytes = _Byte_length(_First, _Last);
+
+                const auto _Cmp_gt_wrap = [](const auto _Right, const auto _Left) noexcept {


No change requested: This parameter order does non-Newtonian things to my brain but I suppose it is consistent with the code below.

AlexGuteniev · 2026-01-17

At the instruction level, both ISAs have a GT mnemonic but no LT mnemonic.
So at the intrinsics level, lt is weird, and SSE4.2/AVX2 don't even have it (SSE2 does, though).

For C++, the default predicate is std::less.

We need to bridge these two somehow. Ideally, this part would stand out.

By putting it into the least comfortable place we ensure it stands out.

(See also Pearl River Necklace bridge)
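
To make the bridging concrete, here is a small scalar sketch of expressing `std::less` through a "greater than" primitive by swapping operands. The names `cmp_gt`, `less_via_gt`, `first_descent`, and `check_bridge` are hypothetical and are not taken from `vector_algorithms.cpp`; the real `_Cmp_gt_wrap` lambda may be shaped differently.

```cpp
#include <cassert>
#include <functional>

// Stand-in for a hardware "greater than" compare (hypothetical, scalar only).
template <class T>
constexpr bool cmp_gt(const T left, const T right) noexcept {
    return left > right;
}

// std::less expressed through cmp_gt by swapping the operands:
// less(left, right) == cmp_gt(right, left).
template <class T>
constexpr bool less_via_gt(const T left, const T right) noexcept {
    return cmp_gt(right, left);
}

// is_sorted_until-style scan: the first `next` for which less(*next, *prev) holds.
template <class T>
const T* first_descent(const T* first, const T* last) noexcept {
    if (first == last) {
        return last;
    }
    for (const T* next = first + 1; next != last; ++next) {
        if (less_via_gt(*next, *(next - 1))) { // same as cmp_gt(*(next - 1), *next)
            return next;
        }
    }
    return last;
}

// Usage: the bridge agrees with std::less, including for equal values.
inline void check_bridge() {
    const int data[] = {1, 3, 5, 4, 6};
    assert(less_via_gt(1, 2) == std::less<int>{}(1, 2));
    assert(less_via_gt(2, 2) == std::less<int>{}(2, 2));
    assert(first_descent(data, data + 5) == data + 3); // 4 < 5
}
```

The operand swap is the only place where the predicate direction flips, which is why it helps if that one spot is easy to find in the code.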

Add Neon implementation of is_sorted_until

95edeaa

hazzlim requested a review from a team as a code owner January 16, 2026 13:23

github-project-automation bot added this to STL Code Reviews Jan 16, 2026

github-project-automation bot moved this to Initial Review in STL Code Reviews Jan 16, 2026

AlexGuteniev approved these changes Jan 16, 2026

StephanTLavavej added the performance (Must go faster) and ARM64 (Related to the ARM64 architecture) labels Jan 16, 2026

StephanTLavavej self-assigned this Jan 16, 2026

Make _Next top level const

90b7177

StephanTLavavej approved these changes Jan 17, 2026

StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews Jan 17, 2026

StephanTLavavej removed their assignment Jan 17, 2026
