easy_multiprocess is a package that makes multiprocessing extremely simple.
```python
# Before:
def func1(x):
    # some heavy computation
    ...

a = [func1(i) for i in range(16)]

# After:
@parallelize
def func1(x):
    # some heavy computation
    ...

a = [func1(i) for i in range(16)]  # all calls run in parallel on 16 cores
```

Below is the same code as above, except using concurrent.futures:
```python
from concurrent.futures import ProcessPoolExecutor

def func1(x):
    # some heavy computation
    ...

# concurrent.futures
with ProcessPoolExecutor() as pool:
    a = list(pool.map(func1, range(16)))
```

On a 16-core machine, let's see how long the following two snippets take:
```python
# Our machine has 16 cores
# func1, func2... each take 10 seconds

# 1: concurrent.futures library:
a = list(pool.map(func1, range(4)))
b = list(pool.map(func2, range(4)))
c = list(pool.map(func3, range(4)))
d = list(pool.map(func4, range(4)))
# elapsed time = 40s

# 2: easy_multiprocess:
a = [func1(i) for i in range(4)]
b = [func2(i) for i in range(4)]
c = [func3(i) for i in range(4)]
d = [func4(i) for i in range(4)]
# elapsed time = 10s
```

You can even use easy_multiprocess for simple code (that needs parallelizing):
```python
# func1, func2... each take 10 seconds
a = func1(0)
b = func2(1)
c = func3(2)
d = func4(3)
print(a, b, c, d)
# elapsed time = 10s
```

It even works for the non-embarrassingly parallel case (but might be suboptimal):
```python
# func1, func2... each take 10 seconds
a = func1(0)
b = func2(a)  # depends on a
c = func3(2)  # c/d must wait, since they appear after b
d = func4(3)
print(a, b, c, d)
# elapsed time = 30s
```

easy_multiprocess implicitly uses a DAG computation graph for this (other libraries have similar mechanisms, such as Ray's DAG). See Limitations for where this doesn't work.
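The ideal 20s schedule for the dependency case can be sketched with the standard library: submit the independent calls immediately, and make the dependent call wait only on its own input. A sketch with threads and scaled-down sleeps (the functions and timings are illustrative, not easy_multiprocess internals):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def func1(x): time.sleep(0.2); return x + 1
def func2(x): time.sleep(0.2); return x * 2
def func3(x): time.sleep(0.2); return x * 3
def func4(x): time.sleep(0.2); return x * 4

start = time.perf_counter()
with ThreadPoolExecutor() as pool:
    fa = pool.submit(func1, 0)            # no dependencies
    fc = pool.submit(func3, 2)            # independent of a/b
    fd = pool.submit(func4, 3)            # independent of a/b
    fb = pool.submit(func2, fa.result())  # waits only on func1's output
    a, b, c, d = fa.result(), fb.result(), fc.result(), fd.result()
elapsed = time.perf_counter() - start

print(a, b, c, d)  # total is ~2 task-lengths, not 3: c/d ran alongside a->b
```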
On Mac/Linux/Unix-like:

```shell
pip install easy_multiprocess
```

(Windows is not currently supported.)

To install from source:

```shell
git clone <this_repo>
cd easymultiprocess
pip install -e .
```

Then, run tests:

```shell
python -m unittest tests.test
```
I built easy_multiprocess simply to learn how to build a Python package.
It's built on top of concurrent.futures, rather than from the ground up using OS-level primitives, since that would have taken me over 10x as much time and code. This means it has MANY limitations.
Limitations:
- `is` comparisons aren't supported for `FutureResult` objects, due to Python identity semantics. For `is` operations involving any output from any `@parallelize`-d function, use `==` instead, or call `.result()` before using `is` (similar to any `future` object from other multiprocessing libraries). Adding support for this would require the user to install an inefficient custom Python interpreter fork, which I would also have to spend time to build, and which would anyway defeat the purpose of user-friendliness.
- Standard IO streams are not guaranteed to work correctly
- The non-embarrassingly parallel case is suboptimally implemented (see the example above, which should take 20s in the ideal case), but can be improved in the future
- Requires copy-on-write, so it only works on Mac/Linux/Unix-like systems (those with the `fork` start method)
General limitations of all common Python multiprocessing libraries:
- Closure variables cannot be created/updated once processes are set up (for the standard library's concurrent.futures, this occurs upon first submission to the executor). You can get around this by calling `ProcessPoolManager.cleanup` and `get_executor` again. (TODO: add code sample)
- Args must be `pickle`-able (some other cases also work, such as when the library uses `dill` or another serialization method)
- If you have lots of state, creating new processes can be expensive (copy-on-write is not guaranteed)
- Program correctness is not guaranteed when external-state race conditions exist (e.g. parallel processes writing to/reading from the same file)
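To illustrate the pickling requirement above: the standard `pickle` module serializes functions by reference to an importable, top-level name, so lambdas (and locally defined functions) cannot cross the process boundary. A minimal, self-contained check:

```python
import pickle

def top_level(x):
    # A module-level function pickles by reference to its qualified name.
    return x * 2

round_tripped = pickle.loads(pickle.dumps(top_level))
print(round_tripped(21))  # 42

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: x * 2)
    lambda_pickled = True
except Exception:
    lambda_pickled = False
print(lambda_pickled)  # False
```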
Other Notes:
- The `@parallelize` decorator will send off the code it wraps to another process
- `parallelize` sounds more intuitive (and cooler), but `concurrent` is technically "correct". If you want, you can use `@concurrent` instead
Future improvements:
- Avoid `pickle`-ing arguments. Instead, wrap the function and turn its args into closure variables (copy-on-write would apply). Even if you pass in a large arg (such as a large ML model), it would not delay, or need to copy the large arg over to, the subprocess.
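The closure idea above can be sketched with the standard library on a system with the `fork` start method: the child reads the large object through copy-on-write memory rather than receiving a pickled copy. (This sketch uses `multiprocessing` directly; the names are illustrative, not easy_multiprocess internals, and it will not run on Windows, which lacks `fork`.)

```python
from multiprocessing import get_context

ctx = get_context("fork")  # fork: child shares parent pages copy-on-write

big = list(range(1_000_000))  # stands in for a large arg (e.g. an ML model)

def task(conn):
    # `big` is read from the forked parent's memory; it was never pickled.
    conn.send(len(big))
    conn.close()

parent_end, child_end = ctx.Pipe()
proc = ctx.Process(target=task, args=(child_end,))
proc.start()
result = parent_end.recv()
proc.join()
print(result)  # 1000000
```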