I have a list comprehension:
thingie=[f(a,x,c) for x in some_list]
which I am parallelising as follows:
from multiprocessing import Pool
pool=Pool(processes=4)
thingie=pool.map(lambda x: f(a,x,c), some_list)
but I get the following error:
_pickle.PicklingError: Can't pickle <function <lambda> at 0x7f60b3b0e9d8>:
attribute lookup <lambda> on __main__ failed
I have tried to install the pathos package, which apparently addresses this issue, but when I try to import it I get the error:
ImportError: No module named 'pathos'
OK, so this answer is just for the record; I figured it out with the author of the question in the comments.
multiprocessing has to move objects between processes, so it uses pickle to serialize them in one process and deserialize them in another. That works for most objects, but pickle cannot serialize a lambda. pickle stores a function as a reference to its module and name rather than as code, and a lambda has no name that can be looked up in its module. That is what "attribute lookup <lambda> on __main__ failed" in the error message refers to.
This is not a problem if you map a one-argument function: you can pass that function directly instead of a lambda. If the function takes more arguments, as in your example, define a wrapper at module level with the def keyword:
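To see the pickling limitation in isolation, without any worker processes, here is a minimal sketch. The names square and can_pickle are made up for illustration and are not part of multiprocessing or pickle.

```python
import pickle


def square(x):
    # A module-level function: pickle stores a reference to it by name.
    return x * x


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # A lambda has no importable name, so the lookup by name fails.
        return False


if __name__ == "__main__":
    print(can_pickle(square))
    print(can_pickle(lambda x: x * x))
```

Run as a script, this should report that square can be pickled and the lambda cannot, which is the same failure pool.map hits when it tries to send the lambda to a worker.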
from multiprocessing import Pool

def f(x, y, z):
    print(x, y, z)

def f_wrapper(y):
    # Fix the first and third arguments; only y varies across the map.
    return f(1, y, "a")

if __name__ == "__main__":
    pool = Pool(processes=4)
    result = pool.map(f_wrapper, [7, 9, 11])
Just before I close this: I found another way to do this in Python 3, using functools.
Say I have a function f with three arguments, f(a,x,c), and I want to map over one of them, say x. I can use the following code to do basically what @FilipMalczak suggests:
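import functools
from multiprocessing import Pool
# Bind a positionally: binding it by keyword would clash with the
# positional argument that map() passes in.
f1=functools.partial(f,10)
f2=functools.partial(f1,c=10)
pool=Pool(processes=4)
final_answer=pool.map(f2,some_list)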
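For completeness, a self-contained sketch of the functools.partial approach that can be run as a script. The body of f and the helper make_mapped are hypothetical stand-ins, since the real f is not shown. Note that the leading parameter a is bound positionally: if a were bound by keyword, the item that map passes positionally would also land on a and raise a TypeError.

```python
import functools
from multiprocessing import Pool


def f(a, x, c):
    # Hypothetical stand-in for the real f; it just echoes its arguments.
    return (a, x, c)


def make_mapped(a, c):
    # Bind a positionally and c by keyword, leaving x as the only free argument.
    return functools.partial(f, a, c=c)


if __name__ == "__main__":
    some_list = [7, 9, 11]
    g = make_mapped(10, 20)
    with Pool(processes=4) as pool:
        result = pool.map(g, some_list)
    # The parallel result should match the original list comprehension.
    assert result == [f(10, x, 20) for x in some_list]
    print(result)
```

A partial object built from a module-level function pickles without trouble, which is why this works where the lambda did not.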