This repository contains the reference implementation and experimental validation code for the paper:
"V-SORT: Vectorized Counting Sort Implementation for High-Performance Bounded Integer Sorting"
Submitted to IEEE Transactions on [Journal Name]
V-SORT is a novel vectorized sorting algorithm designed for bounded integer arrays that achieves O(n + k) time complexity through elimination of comparison operations. By leveraging modern CPU SIMD capabilities and optimized memory access patterns, V-SORT demonstrates superior performance over traditional comparison-based algorithms for datasets with bounded integer ranges.
Experience V-SORT in action with our interactive web demonstration that features:
- Real-time visualization of the V-SORT algorithm steps
- Performance comparison with traditional sorting algorithms (QuickSort, MergeSort)
- Interactive array generation with customizable size and value ranges
- Step-by-step algorithm breakdown showing the three vectorized operations
- Performance metrics including execution time and memory usage
- Educational mode with detailed explanations of the underlying theory
The demo allows you to experiment with different array configurations to understand when V-SORT excels and observe the O(n + k) complexity in action.
V-SORT/
├── README.md # This file
├── .gitignore
├── V-SORT.py # Core V-SORT implementation
├── V-SORT LARGE.py # Large-scale dataset implementation
├── vsort_basic.py # Basic reference implementation
├── vsort_optimized.py # Memory-optimized variant
├── vsort_parallel.py # Multi-threaded implementation
├── vsort_gpu.py # GPU implementation (experimental)
├── benchmarkofv-sort.py # Main benchmark suite
├── vsort-benchmark.py # Comprehensive performance tests
├── v-sort-demo.zip # Interactive demonstration
├── tempCodeRunnerFile.py # Temporary file
├── Vectorized Sorting Algorithm Analysis_.pdf # Technical paper
└── V-SORT/ # All experimental results & plots
├── Benchmark with Quicksort.png
├── Comparision with Quicksort.png
├── v_sort_performance_analysis.png
├── vsort_benchmark_comparison.png
├── vsort_relative_performance.png
├── sorting_dataset_winners.png
├── sorting_results_table.png
├── sorting_summary_rankings.png
├── performance_statistics.png
├── table of numpy sort.png
├── benchmark1.png
├── benchmark2.png
├── direct_index_sort_benchmark.png
├── direct_mapping_sort_benchmark.png
├── download.png
├── download (3).png
├── Figure_1.png
├── img1.png
├── gpt result.png
├── v_sort_comparison.png
└── Screenshot [various].png
- Python 3.7 or higher
- NumPy 1.19+
- SciPy 1.5+
- Matplotlib 3.3+ (for visualization)
# Clone the repository
git clone https://github.com/[your-username]/V-SORT.git
cd V-SORT
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install numpy scipy matplotlib
# Verify installation
python vsort_basic.pyVisit https://v-sort.lovable.app/ to explore V-SORT interactively without any setup required.
import numpy as np
from V-SORT import vsort # or: exec(open('V-SORT.py').read())
# Create test array
arr = np.random.randint(0, 1000, size=1000000)
# Sort using V-SORT
sorted_arr = vsort(arr)
# Verify correctness
assert np.array_equal(sorted_arr, np.sort(arr))# Run main benchmark suite (reproduces paper results)
python benchmarkofv-sort.py
# Run comprehensive benchmark comparison
python vsort-benchmark.py
# For large-scale datasets
python "V-SORT LARGE.py"All figures presented in the paper can be found in the V-SORT/ subdirectory:
Figure 1: Head-to-head comparison with QuickSort
Figure 2: Detailed comparative analysis
Figure 3: Comprehensive performance metrics
Figure 4: Multi-algorithm comparison
Figure 5: Relative speedup analysis
Figure 6: Dataset-specific performance winners
Table I: Detailed numerical results
Table II: NumPy integration comparison
Figure 7: Overall algorithm rankings
Figure 8: Statistical performance analysis
Extended benchmark results - Configuration 1
Extended benchmark results - Configuration 2
Direct indexing performance analysis
Memory mapping efficiency study
GPU implementation performance analysis
Extended GPU performance metrics
Supplementary experimental analysis
V-SORT operates in three vectorized steps:
- Index Mapping: Generate range array using
np.arange(min_val, max_val + 1) - Frequency Counting: Count occurrences using
np.bincount(arr - min_val) - Array Reconstruction: Build sorted output using
np.repeat(indices, counts)
Time Complexity (Theorem 1): O(n + k) where n is array size and k is value range
Space Complexity (Theorem 2): O(k) auxiliary space for frequency counting
Proof: See Vectorized Sorting Algorithm Analysis_.pdf for complete mathematical proofs
V-SORT.py: Reference implementation with detailed commentsvsort_basic.py: Minimal implementation for educational purposesvsort_optimized.py: Memory-efficient variant for large k valuesvsort_parallel.py: Multi-threaded version for multi-core systemsvsort_gpu.py: Experimental GPU implementation (see Section VI for performance analysis)
V-SORT excels when:
- Integer arrays with bounded range (k << n² or k ≤ 10⁶)
- Dense value distributions
- Repeated sorting operations on similar data
- Memory bandwidth is not limiting factor
| Algorithm | Time Complexity | Space | Comparisons |
|---|---|---|---|
| QuickSort | O(n log n) avg | O(log n) | Yes |
| MergeSort | O(n log n) | O(n) | Yes |
| CountingSort | O(n + k) | O(k) | No |
| V-SORT | O(n + k) | O(k) | No |
Figure: V-SORT vs QuickSort performance across different array sizes
Figure: Comprehensive performance analysis showing V-SORT's advantages
The main implementation validates all theoretical claims from the paper:
def vsort(arr):
"""
V-SORT: Vectorized Counting Sort Implementation
Implements Algorithm 1 from the paper with three vectorized operations.
Args:
arr: Input integer array (numpy array)
Returns:
Sorted array (numpy array)
Complexity:
Time: O(n + k) where k = max(arr) - min(arr) + 1
Space: O(k) auxiliary
"""
# Implementation follows paper specification...Reproduces all experimental results from Section IV:
- Performance scaling analysis (Figure 3)
- Comparison with standard library sorts
- Memory efficiency measurements
- Cache behavior analysis
Handles datasets with:
- n > 10⁷ elements
- Various value range distributions
- Memory-constrained environments
# Run all benchmarks (takes ~30 minutes)
python benchmarkofv-sort.py --full
# Generate all figures
python vsort-benchmark.py --generate-figures
# Run specific experiments
python benchmarkofv-sort.py --experiment [complexity|comparison|scaling]Each theorem and lemma in the paper has corresponding validation code:
- Theorem 1 (Time Complexity): Validated in main benchmark suite
- Theorem 2 (Space Complexity): Memory profiling in optimization tests
- Lemma 1-3: Supporting proofs demonstrated through empirical measurements
numpy>=1.19.0
scipy>=1.5.0
matplotlib>=3.3.0
For GPU version:
cupy>=9.0.0 # Optional, only for vsort_gpu.py
If you use this code in your research, please cite:
@article{yourname2025vsort,
title={V-SORT: Vectorized Counting Sort Implementation for High-Performance Bounded Integer Sorting},
author={[Your Name]},
journal={IEEE Transactions on [Journal Name]},
year={2025},
note={Under Review}
}V-SORT achieves high performance through:
- SIMD Utilization: NumPy operations leverage CPU vector instructions
- Cache Efficiency: Sequential memory access patterns
- Branch Elimination: No conditional logic in critical path
- Memory Bandwidth Optimization: Minimized data movement
The repository includes vsort_gpu.py demonstrating GPU implementation challenges:
- Host-Device transfer overhead dominates for typical array sizes
- Memory-bound nature limits GPU advantage
- Practical performance gains only for n > 10⁸ and pre-loaded GPU data
See V-SORT/V-SORT/download.png and V-SORT/V-SORT/download (3).png for detailed GPU performance analysis.
- Value Range Dependency: Performance degrades when k approaches n²
- Integer-Only: Current implementation limited to integer types
- Memory Requirements: Requires O(k) contiguous memory allocation
- Negative Numbers: Requires offset computation for negative integers
- Extended precision floating-point sorting via bucketing
- Distributed memory implementations
- Hardware-specific optimizations (AVX-512, ARM NEON)
- Adaptive algorithm selection based on input characteristics
For questions about the implementation or paper:
- Email: [hanish.kumar9193@gmail.com]
- Issues: GitHub Issues
MIT License - See LICENSE file for details
This work was supported by [Your Institution/Grant Information].
Repository Status: Under active development | Paper: Under Review | Last Updated: [Current Date]