Why is the function passed to Pool.map pickled when multiprocessing uses fork as the start method?

On Linux, the multiprocessing module uses fork as the default start method for new processes. Why is it then necessary to pickle the function passed to map? As far as I understand, the whole state of the process is cloned, including the functions. I can see why pickling would be necessary with spawn, but not with fork.

Jorge E. Cardona asked Jan 21 '26 06:01

1 Answer

Job methods like .map() don't start new processes, so exploiting fork at that point is not an option. Pool uses IPC to pass the function and its arguments to already running worker processes, and this always requires serialization (pickling). There seems to be a deeper misunderstanding of what pickling involves here, though.
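A minimal sketch of the first point, assuming a POSIX system where the fork start method is available. It uses pool.apply with os.getpid rather than .map() only because os.getpid takes no arguments; the names worker_pids and job_pids are made up for this illustration.

```python
import multiprocessing as mp
import os

# Assumption: "fork" is available (Linux and other POSIX systems).
# With the spawn context this code would need an
# `if __name__ == "__main__":` guard.
ctx = mp.get_context("fork")

with ctx.Pool(processes=2) as pool:
    # The workers are started here, when the pool is created.
    worker_pids = {child.pid for child in mp.active_children()}

    # Every job is sent over IPC to one of those existing workers
    # and reports the PID of the process that executed it.
    job_pids = {pool.apply(os.getpid) for _ in range(8)}

# All jobs ran in workers that existed before the first job was sent.
print(job_pids <= worker_pids)
# None of the jobs ran in the parent process.
print(os.getpid() not in job_pids)
```

Eight jobs are handled by at most two processes, so submitting a job never forks anything; the function and its arguments have to reach a worker that is already running.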

For job methods like .map(), pickling your function just results in its module name and qualified name being sent as strings. During unpickling, the receiving process simply looks that name up in its global scope to get a reference to the function again.
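This can be checked without any pool at all. The sketch below pickles json.dumps, chosen here only as a convenient plain Python function from the standard library; a function defined at the top level of your own module behaves the same way.

```python
import json
import pickle

payload = pickle.dumps(json.dumps)

# The payload is a few dozen bytes: it carries the module name and the
# qualified name of the function, not its code.
print(len(payload))
print(b"json" in payload, b"dumps" in payload)

# Unpickling imports the module, looks the name up, and hands back
# the very same function object.
restored = pickle.loads(payload)
print(restored is json.dumps)
```

Because only a name travels over the wire, the receiving process must already be able to resolve that name to a function.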

There is a difference between spawn and fork, but it materializes as soon as the worker processes boot up, which happens when the Pool is initialized. With the spawn context, a new worker has to build up all reachable global objects from scratch; with fork, they're already there. So with fork your function is cloned once during boot-up, which saves a little time.

When you send jobs later, unpickling your function in the worker means, in either context, just re-referencing the function from global scope. That's why the function needs to exist before you instantiate the pool and the workers are launched, even with the spawn context.

So the inconvenience you may experience from not being able to pickle local or unnamed functions (lambdas) is rooted in the problem of regaining a reference to your (by then) already existing function in the worker processes. Whether spawn or fork was used to set up the worker processes beforehand makes no difference at this point.
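The lookup problem shows up with plain pickle, before any worker process is involved. A small sketch; try_pickle, make_local and the other names are hypothetical helpers for this illustration:

```python
import pickle

# A lambda's qualified name is "<lambda>", which cannot be looked up
# as a module attribute.
anonymous = lambda x: x * x


def make_local():
    # A nested function's qualified name contains "<locals>", so it is
    # not reachable from the module's global scope either.
    def local_square(x):
        return x * x
    return local_square


def try_pickle(func):
    """Return True if func can be pickled by reference, else False."""
    try:
        pickle.dumps(func)
        return True
    except (pickle.PicklingError, AttributeError):
        # The exception type depends on the case and the Python version.
        return False


results = {
    "lambda": try_pickle(anonymous),
    "local function": try_pickle(make_local()),
    "len (reachable by name)": try_pickle(len),
}
print(results)
```

Only the function that can be found again by name survives pickling, and the start method plays no part in that.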

Darkonaut answered Jan 22 '26 19:01