Skip to content

Yusheng-Hu/Position-Pure-Algorithm

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

263 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

License: MIT Language: C++ Complexity: O(n) DOI Permutation Generation ORCID GitHub stars


Position Pure Algorithm

Official implementation and interactive visualizations of the Position Pure (PP) linear-time ranking/unranking algorithm.

πŸš€ Evolution of Efficiency: Beyond Heap's Algorithm

The Position-Pure Algorithm and Circle-Permutation-Algorithm represent a paradigm shift in combinatorial generation by optimizing the fundamental cost of state transitions.

  • Heap's Algorithm (The Classical Baseline): Long considered the gold standard due to its swap-based approach. While swaps were once thought to be the minimum-cost operation, they involve multiple memory accesses and conditional logic that limit peak performance.
  • Ives' Iterative Algorithm: Improved upon Heap's by utilizing a single-shift and single-assignment pattern, reducing the overhead of element movement.
  • Position-Pure Algorithm (Our Work): Achieves a new frontier by eliminating conditional branching and implementing a highly refined shift-and-assign strategy.
    • Scalability: Optimized for large $N$.
    • Concurrency: Stateless design allows for massive parallel execution, a feat traditional recursive algorithms cannot achieve.
  • Circle-Permutation-Algorithm : Specifically engineered for circular symmetry. While it introduces controlled spatial overhead, it shatters traditional performance barriers by realizing an O((n-1)!) complexity framework.

Key Insights

  • Significant Speedup: permPure_full consistently outperforms Heap's Algorithm by a factor of approximately 7x.
  • Algorithmic Efficiency: The performance gap highlights the superior memory access patterns and lower computational overhead inherent in the PositionPure algorithm.
  • Scalability: As the permutation space grows factorially, the performance gap remains stable, demonstrating excellent algorithmic efficiency for large-scale generation.

πŸš€ Live Demonstrations

πŸš€ Full Permutation generation: Iterative Permutation Generation

The PositionPure algorithm is a high-performance, iterative approach to generating all permutations of a set. By utilizing an iterative state machine rather than traditional recursion, it significantly reduces function call overhead and optimizes CPU branch prediction efficiency.

The algorithm utilizes an iterative state machine to eliminate recursion overhead and leverages hardware affinity binding via the Windows API to minimize context-switching noise, with performance verified to nanosecond-level accuracy through high-precision benchmarking.

Last Run: Fri Feb 06 07:13:05 2026 UTC / Fri Feb 06 15:13:05 2026 (UTC+8) Environment: AMD EPYC 7763 64-Core Processor (GitHub Actions Runner)

N Heap (s) PP (s) Speedup
9 0.005815 0.000984 5.90x
10 0.058239 0.007166 8.12x
11 0.645273 0.074127 8.70x
12 7.837899 0.837278 9.36x

Last Run: Fri Feb 06 07:20:50 2026 UTC / Fri Feb 06 15:20:50 2026 (UTC+8) Environment: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz (GitHub Actions Runner)

N Heap (s) PP (s) Speedup
9 0.006039 0.000871 6.93x
10 0.059160 0.007174 8.24x
11 0.657323 0.075407 8.71x
12 7.809986 1.004821 7.77x
13 103.446021 11.693761 8.84x
14 1468.660328 162.986940 9.01x

πŸš€ Comparison with Python's Built-in itertools

At the request of Reddit users and other community members, a performance comparison with Python’s built-in itertools was conducted. Results are based on the implementation in Position-Pure-Algorithm/python/pp_iter.py.

  • Runtime Environment: The tests were performed using PyPy3. Since the standard itertools library is C-optimized, using PyPy3 helps bridge the low-level language gap, allowing for a more authentic comparison of algorithmic efficiency.
  • Results: Benchmark tests show that this algorithm improves performance by at least 1.4x, with potential gains reaching over 2x compared to the standard library.
  • Future Plans: For applications requiring even higher performance, a C-compiled version should be considered.

🐍 Position Pure Iterator Performance (PyPy)

Last Run: Fri Feb 06 09:07:10 2026 UTC / Fri Feb 06 17:07:10 2026 (UTC+8) Environment: AMD EPYC 7763 64-Core Processor (GitHub Actions Runner)

N Total Permutations Itertools (s) Position Pure (s) Speed-up
10 3,628,800 0.1073s 0.0667s 1.61x
11 39,916,800 1.2113s 0.5685s 2.13x
12 479,001,600 15.0139s 6.6579s 2.26x
13 6,227,020,800 203.8653s 84.7977s 2.40x
14 87,178,291,200 2916.0625s 1163.6335s 2.51x

Last Run: Fri Feb 06 05:40:44 2026 UTC / Fri Feb 06 13:40:44 2026 (UTC+8) Environment: Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz (GitHub Actions Runner)

N Total Permutations Itertools (s) Position Pure (s) Speed-up
10 3,628,800 0.1513s 0.1334s 1.13x
11 39,916,800 2.0003s 1.0969s 1.82x

πŸš€ Liner rank unrank: Position Pure (PP) vs. Myrvold-Ruskey (MR)

Last Run: Sat Feb 07 11:41:19 2026 UTC / Sat Feb 07 19:41:19 2026 (UTC+8) Environment: AMD EPYC 7763 64-Core Processor (GitHub Actions Runner)

N Dist MR (ns/it) PP (ns/it) Speedup
1000 Random 746.4 629.9 1.18x
1000 Sorted 677.2 628.0 1.08x
1000 Reverse 740.8 628.0 1.18x
100000 Random 114304.8 91588.8 1.25x
100000 Sorted 67190.6 62576.1 1.07x
100000 Reverse 79229.8 67114.2 1.18x
1000000 Random 1510619.8 1208586.6 1.25x
1000000 Sorted 711168.4 625344.8 1.14x
1000000 Reverse 810659.0 663776.2 1.22x

πŸ’» Source Code

The core C++ implementations of the algorithms can be found here:

  • map_perm_algorithms.cpp: Includes Position_unrank, Position_rank, PositionPure_unrank, and PositionPure_rank.
  • permPure_full.cpp: Serves as a high-performance reference implementation for generating all permutations of a set using the PositionPure iterative algorithm.

πŸ›  Compilation Commands

The following commands compile both algorithms using the same aggressive optimization flags to ensure a fair performance comparison:

# Compile options
g++ -O3 -std=c++11 -march=native -ffast-math -fomit-frame-pointer

πŸ“– Citation

If you use this algorithm or implementation in your research, please cite it as follows:

APA Style

HU, Y. (2026). Position Method: A Linear-Time Generation Algorithm for Permutations (Version v1.0.0). GitHub Repository. DOI: 10.5281/zenodo.18170157

BibTeX

@software{hu_yusheng_2026_18170157,
  author       = {Hu, Yusheng},
  title        = {Position Method: A Linear-Time Generation Algorithm for Permutations},
  year         = 2026,
  publisher    = {Zenodo},
  version      = {v1.0.0},
  doi          = {10.5281/zenodo.18170157},
  url          = {[https://doi.org/10.5281/zenodo.18170157](https://doi.org/10.5281/zenodo.18170157)}
}

πŸ’‘ Future Ideas & Extensions

1. Zero-Memory "Lazy" Permutation

Since we can determine any value at position k in O(N) time without generating the full array, we can implement a Lazy-Loaded Permutation Object.

  • Use Case: Accessing elements of a massive permutation (N > 10^6) that exceeds RAM capacity.
  • Concept: Overload operator[] to call PP_get_value_at_position(k) on the fly.

2. SIMD-Accelerated Partial Search

The backward-tracing logic if (C[i] == current_target_pos) is highly vectorizable.

  • Idea: Use AVX-512 to scan 16 elements of C simultaneously.
  • Goal: Achieve near O(1) perceived latency for individual element lookups.

3. Distributed Constrained Search

Leverage the independence of the get_value function to split a single permutation validation task across multiple CPU cores or network nodes without shared state.

πŸ’‘ Why Position Pure?

The Position Pure (PP) algorithm provides a more intuitive $O(n)$ implementation compared to the classic Myrvold-Ruskey (MR) method:

  • Simpler Indexing: Replaces the traditional swap-based unranking with a direct mapping logic.
  • Educational Value: The provided visualizations make the complex mapping process easy to understand.

πŸ“„ Academic Status

  • Preprint: need long time.
  • Status: $O(n)$ complexity achieved, improving upon the classic Myrvold-Ruskey logic.

πŸ“’ Important Update: From "Position Pro" to "Position Pure"

Please note: Starting from version v1.1.1, this algorithm has been officially renamed from Position Pro to Position Pure (PP).

Why the name change?

  1. Trademark Consideration: We found that "Position Pro" is a registered trademark in several industrial and commercial sectors. To ensure the algorithm's academic independence and avoid legal confusion, we have transitioned to a unique name.
  2. Technical Essence (Why "Pure"?): The name "Pure" more accurately describes the algorithm's technical implementation. In computer science, "pure" often implies a clean, branchless execution flow.
  3. Core Feature - Branchless Logic: Unlike the base version, Position Pure (PP) achieves its linear-time performance through direct element-wise shifting and replacement. It eliminates conditional branching (no if statements in the core loop), leading to a "purer" and faster execution path on modern CPUs.