
How to have multiple MLflow runs in parallel?

I'm not very familiar with parallelization in Python and I'm getting an error when trying to train a model on multiple training folds in parallel. Here's a simplified version of my code:

import mlflow
from multiprocessing.pool import ThreadPool

def train_test_model(fold):
    # here I train the model etc...

    # now I want to save the parameters and metrics
    with mlflow.start_run():
        mlflow.log_param("run_name", run_name)
        mlflow.log_param("modeltype", modeltype)
        # and so on...

if __name__ == "__main__":
    # one worker thread per trial
    pool = ThreadPool(processes=num_trials)
    # run folds in parallel
    pool.map(train_test_model, folds)

I'm getting the following error:

Exception: Run with UUID 23e9bb6d22674a518e48af9c51252860 is already active. To start a new run, first end the current run with mlflow.end_run(). To start a nested run, call start_run with nested=True

The documentation says that mlflow.start_run() starts a new run and makes it active, which seems to be the root of my problem. Each thread starts an MLflow run for its fold and makes it active, but I need the runs to execute in parallel, i.e. all be active at once(?), with each one saving the parameters and metrics of its own fold. How can I solve this?

jared3412341 asked Oct 25 '25 06:10



1 Answer

I found a solution; maybe it will be useful to someone else. Details and code examples are here: https://github.com/mlflow/mlflow/issues/3592
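For anyone who wants something self-contained, here is a minimal sketch of one common way around the "already active" error. It is not copied from the linked issue. Instead of the fluent API, it uses MlflowClient, whose methods take an explicit run ID, so no thread depends on a shared "active run". The names run_folds and main, the "fold" parameter, the placeholder "accuracy" metric, and the experiment ID "0" (normally the default experiment) are my own assumptions for illustration.

```python
from multiprocessing.pool import ThreadPool


def train_test_model(client, experiment_id, fold):
    """Train on one fold and log to a run that belongs only to this call."""
    # create_run does not touch any global "active run" state
    run = client.create_run(experiment_id)
    run_id = run.info.run_id
    try:
        # here the model would be trained and evaluated (omitted)
        client.log_param(run_id, "fold", fold)
        client.log_metric(run_id, "accuracy", 0.0)  # placeholder value
        client.set_terminated(run_id, status="FINISHED")
    except Exception:
        # mark the run as failed so it does not stay open
        client.set_terminated(run_id, status="FAILED")
        raise
    return run_id


def run_folds(client, experiment_id, folds):
    """Run every fold in its own thread and return the run IDs in fold order."""
    with ThreadPool(processes=len(folds)) as pool:
        return pool.map(
            lambda fold: train_test_model(client, experiment_id, fold), folds
        )


def main():
    # imported here so the helpers above can be exercised without MLflow
    from mlflow.tracking import MlflowClient

    client = MlflowClient()
    # "0" is assumed to be the default experiment ID
    print(run_folds(client, "0", [0, 1, 2]))
```

Calling main() from a script creates one run per fold. Because every logging call names its run ID explicitly, it does not matter which thread issues it. The trade-off is that the context manager no longer closes the run for you, hence the explicit set_terminated calls.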

jared3412341 answered Oct 27 '25 19:10
