
Why does random.sample used with multiprocessing.Pool deadlock sometimes?

When I run the following snippet, it sometimes deadlocks and never finishes, but other times it completes just fine. Why is that? I'm running Python 3.8 on Ubuntu 16.04 (kernel 4.4.0-173-generic).

from functools import partial
from multiprocessing.pool import Pool
from random import sample

pool = Pool(4)
result = pool.map(partial(sample, range(10)), range(10))

The same happens when I create a fresh random.Random instance for every function call:

import random

def sample(data, k):
    # fresh Random instance per call, independent of the module-level state
    rand = random.Random()
    return rand.sample(data, k)
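
i.e. mapped over the pool the same way as in the first snippet (same call pattern, just with this sample instead of random.sample):

from functools import partial
from multiprocessing.pool import Pool

pool = Pool(4)
# same call pattern as before, now using the custom sample() defined above
result = pool.map(partial(sample, range(10)), range(10))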

When it hangs and I send SIGINT, I get the following traceback, but I can't make sense of it:

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "[...]/multiprocessing/util.py", line 277, in _run_finalizers
Process ForkPoolWorker-2:
Process ForkPoolWorker-1:
Process ForkPoolWorker-3:
    finalizer()
  File "[...]/multiprocessing/util.py", line 201, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "[...]/multiprocessing/pool.py", line 689, in _terminate_pool
    cls._help_stuff_finish(inqueue, task_handler, len(pool))
  File "[...]/multiprocessing/pool.py", line 674, in _help_stuff_finish
    inqueue._rlock.acquire()
KeyboardInterrupt
Traceback (most recent call last):
Traceback (most recent call last):
  File "[...]/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "[...]/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "[...]/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "[...]/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "[...]/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "[...]/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "[...]/multiprocessing/queues.py", line 355, in get
    with self._rlock:
  File "[...]/multiprocessing/queues.py", line 355, in get
    with self._rlock:
  File "[...]/multiprocessing/synchronize.py", line 95, in __enter__
    return self._semlock.__enter__()
  File "[...]/multiprocessing/synchronize.py", line 95, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
KeyboardInterrupt
Traceback (most recent call last):
  File "[...]/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "[...]/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "[...]/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "[...]/multiprocessing/queues.py", line 356, in get
    res = self._reader.recv_bytes()
  File "[...]/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "[...]/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "[...]/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt
Process ForkPoolWorker-4:
Traceback (most recent call last):
  File "[...]/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "[...]/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "[...]/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "[...]/multiprocessing/queues.py", line 355, in get
    with self._rlock:
  File "[...]/multiprocessing/synchronize.py", line 95, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt

1 Answer

This is an open bug in Python 3.8. It is not related to random; the problem appears to be that worker processes are cleaned up incorrectly at interpreter exit. For example, the following deadlocks as well:

from multiprocessing.pool import Pool

def test(x):
    return 'test'

pool = Pool(4)
result = pool.map(test, range(10))

A solution is to either call pool.close() manually after map() returns, or to use the pool object as a context manager:

with Pool(4) as pool:
    result = pool.map(test, range(10))
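
The explicit variant looks like this (same test function as above; close() stops further submissions, and join() afterwards waits for the workers to exit):

pool = Pool(4)
try:
    result = pool.map(test, range(10))
finally:
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for the worker processes to terminate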