 

Python: How to use different logfiles for processes in multiprocessing.Pool?

I am using multiprocessing.Pool to run a number of independent processes in parallel, not much different from the basic example in the Python docs:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))

I would like each process to have a separate log file. I log various info from other modules in my codebase and from some third-party packages (none of which are multiprocessing-aware). So, for example, I would like this:

import logging
from multiprocessing import Pool

def f(x):
    logging.info(f"x*x={x*x}")
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, range(10)))

to write on disk:

log1.log
log2.log
log3.log
log4.log
log5.log

How do I achieve it?

Roberto Arista asked May 04 '26

1 Answer

You'll need to use Pool's initializer to set up and register the separate loggers immediately after the workers start up. Under the hood, the initializer and initargs arguments to Pool() are forwarded into each new worker process, which calls initializer(*initargs) right after it starts.

Pool workers get named in the format {start_method}PoolWorker-{number}, so e.g. SpawnPoolWorker-1 if you use spawn as the start method for new processes. The file number for the logfiles can then be extracted from the assigned worker name with mp.current_process().name.split('-')[1].
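To illustrate the naming convention, here is a quick standalone sketch (the exact prefix depends on the start method your platform defaults to, e.g. ForkPoolWorker on Linux, SpawnPoolWorker on Windows/macOS):

```python
import multiprocessing as mp


def report_name(_):
    # Returns the worker's assigned name, e.g. "ForkPoolWorker-2".
    return mp.current_process().name


if __name__ == '__main__':
    with mp.Pool(3) as pool:
        names = set(pool.map(report_name, range(6)))
    for name in sorted(names):
        # The trailing number is what the answer below uses as the file number.
        print(name, "-> file number", name.split('-')[1])
```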

import logging
import multiprocessing as mp


def f(x):
    logger.info(f"x*x={x*x}")
    return x*x


def _init_logging(level=logging.INFO, mode='a'):
    # Runs once in each worker right after it starts.
    worker_no = mp.current_process().name.split('-')[1]
    filename = f"log{worker_no}.log"
    fh = logging.FileHandler(filename, mode=mode)
    fmt = logging.Formatter(
        '%(asctime)s %(processName)-10s %(name)s %(levelname)-8s --- %(message)s'
    )
    fh.setFormatter(fmt)
    # Configure the root logger so logging from third-party packages
    # and other modules is captured as well.
    logger = logging.getLogger()
    logger.addHandler(fh)
    logger.setLevel(level)
    # Inject `logger` into the worker's module namespace so f() can use it.
    globals()['logger'] = logger


if __name__ == '__main__':

    with mp.Pool(5, initializer=_init_logging, initargs=(logging.DEBUG,)) as pool:
        print(pool.map(f, range(10)))

Note that due to the nature of multiprocessing, there's no guarantee which workers actually handle tasks in your small example. Since multiprocessing.Pool (contrary to concurrent.futures.ProcessPoolExecutor) starts workers as soon as you create the instance, you're bound to get the specified Pool(processes) number of files, so in your case 5. Actual process scheduling by your OS might still leave some of those files without log entries, though.

Darkonaut answered May 06 '26

