 

How to handle multiprocessing based on the limit of CPU's

I currently have a process that parses thousands of data files. To limit the number of parallel processes launched, I use the following strategy: if the total number of items is lower than the number of available CPUs, I start one worker per item; otherwise I use one worker per CPU.

But is this the most appropriate way to do it?

from concurrent.futures import ProcessPoolExecutor
from multiprocessing import cpu_count


def pool_executor(function_name, data):
    # Use all CPUs when there are at least 8 items;
    # otherwise start one worker per item.
    if len(data) >= 8:
        workers = cpu_count()
    else:
        workers = len(data)
    with ProcessPoolExecutor(max_workers=workers) as executor:
        executor.map(function_name, data)
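
For context, I call it roughly like this (parse_file, the data directory, and the *.csv pattern are illustrative stand-ins, not my actual code):

from pathlib import Path


def parse_file(path):
    # Illustrative stand-in for the real parsing logic.
    return Path(path).read_text()


if __name__ == "__main__":
    # The __main__ guard matters here: worker processes re-import
    # this module, so the pool must not be started at import time.
    file_paths = [str(p) for p in Path("data").glob("*.csv")]
    pool_executor(parse_file, file_paths)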
asked Nov 08 '25 by thclpr

1 Answer

You could just pass in cpu_count() for the max_workers value. If len(data) is less than that, it won't create more workers than it needs.

from concurrent.futures import ProcessPoolExecutor
from multiprocessing import cpu_count


def pool_executor(function_name, data):
    # Cap the pool at the machine's CPU count; the pool handles
    # any number of items regardless of how many workers it has.
    with ProcessPoolExecutor(max_workers=cpu_count()) as executor:
        executor.map(function_name, data)

However, you might want to experiment to find out whether cpu_count() is actually the best value. If your processes spend a lot of time reading and writing files, starting slightly more workers than cpu_count() may give you an additional boost, but that is something you can only determine by measurement.
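
One rough way to run that measurement is to time the same workload at a few candidate pool sizes. This is a minimal sketch, not part of the original answer; benchmark_workers and the candidate sizes are my own suggestions:

import time
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import cpu_count


def benchmark_workers(function_name, data):
    # Time the full run at a few pool sizes around cpu_count().
    for workers in (max(1, cpu_count() // 2), cpu_count(), cpu_count() * 2):
        start = time.perf_counter()
        with ProcessPoolExecutor(max_workers=workers) as executor:
            # list() drains the iterator so the whole workload is measured.
            list(executor.map(function_name, data))
        elapsed = time.perf_counter() - start
        print(f"{workers} workers: {elapsed:.2f}s")

Run it on a representative sample of your files; if the 2x cpu_count() run wins, your workload is likely I/O-bound rather than CPU-bound.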

answered Nov 10 '25 by Duncan

