vikyw89/parallize

Parallize

Parallize is a Python package that provides a decorator to convert CPU-bound synchronous functions into asynchronous functions, executing them in separate processes using concurrent.futures.ProcessPoolExecutor. This leverages multiple CPU cores for genuine parallelism, bypassing Python's GIL for compute-intensive workloads.

Features

  • Async Wrapper for Sync Functions: Convert any blocking sync function into an awaitable async function.
  • True Parallelism via Processes: Uses ProcessPoolExecutor to run functions in separate processes, achieving real parallelism for CPU-bound tasks.
  • Configurable Worker Count: Specify max_workers or let the package default to the number of available CPU cores.
  • Simple Decorator API: Just decorate any sync function with @aparallize and await it.

Mini Benchmark

Results comparing serial (ThreadPoolExecutor) vs parallel (aparallize / ProcessPoolExecutor) execution:

| Test Case | Serial (ThreadPool) | Parallel (ProcessPool) | Speedup | Tasks |
|---|---|---|---|---|
| test_aparallize_fn (2 tasks) | ~17.2s | ~8.3s | 2.08x | 2 |
| test_aparallize_10 (10 tasks) | ~85.1s | ~14.0s | 5.94x | 10 |

Note: Speedup is measured against a ThreadPoolExecutor baseline, which still suffers from Python's GIL for CPU-bound work. The aparallize decorator uses ProcessPoolExecutor to bypass the GIL, yielding much higher speedups for CPU-bound tasks.
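To get a feel for this kind of comparison on your own machine, here is a minimal standard-library sketch. It is not the repository's benchmark code: the helper name time_pool and the workload sizes are assumptions chosen for illustration, and it calls the executors directly instead of going through aparallize.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def cpu_bound_task(n):
    return sum(i * i for i in range(n))


def time_pool(executor_cls, n_tasks, n, max_workers=None):
    """Hypothetical helper: run cpu_bound_task(n) n_tasks times in a pool.

    Returns (elapsed_seconds, results).
    """
    start = time.perf_counter()
    with executor_cls(max_workers=max_workers) as pool:
        results = list(pool.map(cpu_bound_task, [n] * n_tasks))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    thread_s, thread_results = time_pool(ThreadPoolExecutor, n_tasks=4, n=10**6)
    process_s, process_results = time_pool(ProcessPoolExecutor, n_tasks=4, n=10**6)
    assert thread_results == process_results  # same work, different executor
    print(f"threads: {thread_s:.2f}s  processes: {process_s:.2f}s  "
          f"speedup: {thread_s / process_s:.2f}x")
```

Absolute timings and the speedup depend on core count, task size, and process start-up cost, so very small tasks may show little or no gain.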

Requirements

  • Python 3.10+

Installation

pip install parallize

Or install from source with Poetry:

poetry install

Usage

Basic Usage

import asyncio
from parallize import aparallize

def cpu_bound_task(n):
    return sum(i * i for i in range(n))

async def main():
    # Run the sync CPU-bound function in a separate process
    result = await aparallize(cpu_bound_task)(10**6)
    print(result)

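# Guard the entry point so spawned worker processes do not re-run main()
if __name__ == "__main__":
    asyncio.run(main())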

Parallelizing Multiple Calls

Use asyncio.gather to run multiple CPU-bound tasks in parallel:

import asyncio
from parallize import aparallize

def cpu_bound_task(n):
    return sum(i * i for i in range(n))

async def main():
    # Run 4 CPU-bound tasks concurrently in separate processes
    tasks = [aparallize(cpu_bound_task)(10**6) for _ in range(4)]
    results = await asyncio.gather(*tasks)
    print(results)

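# Guard the entry point so spawned worker processes do not re-run main()
if __name__ == "__main__":
    asyncio.run(main())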

Customizing Worker Count

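# Reuses the imports and cpu_bound_task from the examples above
async def main():
    # Use only 2 worker processes
    result = await aparallize(cpu_bound_task, max_workers=2)(10**6)
    print(result)

if __name__ == "__main__":
    asyncio.run(main())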
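As a rough illustration of the "default to the number of available CPU cores" behaviour, a worker count could be resolved along these lines. The function resolve_max_workers is hypothetical and is not taken from the package; the package may well delegate this to ProcessPoolExecutor's own default instead.

```python
import os


def resolve_max_workers(max_workers=None):
    """Hypothetical helper: choose how many worker processes to start."""
    if max_workers is None:
        # os.cpu_count() can return None when the count is undetermined
        return os.cpu_count() or 1
    if max_workers < 1:
        raise ValueError("max_workers must be at least 1")
    return max_workers


if __name__ == "__main__":
    print(resolve_max_workers())   # number of CPU cores on this machine
    print(resolve_max_workers(2))  # explicit value is used as-is
```

A smaller count than the core total can be useful when the event loop or other services on the same machine also need CPU time.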

How It Works

The aparallize decorator wraps a synchronous function with an async wrapper that:

  1. Submits the synchronous function call to a ProcessPoolExecutor.
  2. Uses asyncio's event loop (loop.run_in_executor) to await the result without blocking.
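The two steps above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the package's actual source: the helper name to_async and the choice to pass the executor in explicitly are made up for the example.

```python
import asyncio
import functools
from concurrent.futures import Executor, ProcessPoolExecutor


def to_async(fn, executor: Executor):
    """Hypothetical helper: wrap a sync function so each call runs in `executor`."""

    @functools.wraps(fn)
    async def wrapper(*args, **kwargs):
        loop = asyncio.get_running_loop()
        # run_in_executor forwards positional arguments only, so bind
        # everything into a partial before submitting it to the pool.
        call = functools.partial(fn, *args, **kwargs)
        return await loop.run_in_executor(executor, call)

    return wrapper


def sum_of_squares(n):
    return sum(i * i for i in range(n))


async def main():
    # One pool shared by both calls; the worker count uses the executor default.
    with ProcessPoolExecutor() as pool:
        async_sum = to_async(sum_of_squares, pool)
        print(await asyncio.gather(async_sum(10), async_sum(100)))


if __name__ == "__main__":
    asyncio.run(main())
```

Because arguments and return values cross a process boundary, the wrapped function and its inputs must be picklable; in practice that means a function defined at module level, not a lambda or a nested function.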

This is ideal for CPU-bound workloads such as:

  • Mathematical computation / numeric processing
  • Image / video / audio processing
  • Data transformation / parsing
  • Any workload that benefits from true multi-core parallelism

Project Structure

parallize/
├── parallize/
│   └── __init__.py    # Main module with aparallize decorator
├── utils/
│   └── __init__.py    # TimeLogger utility for task timing
├── scripts/
│   └── publish.py     # Poetry publish script
├── tests/
│   └── test_aparallize.py  # Benchmark tests
├── pyproject.toml
└── README.md

License

MIT License. See LICENSE for details.

Contributing

Contributions are welcome! Please open an issue or submit a pull request on the GitHub repository.

Issues

Bug reports and feature requests: GitHub Issues
