 

How to log a table of metrics into mlflow

Tags:

mlflow

I am trying to see if mlflow is the right place to store my metrics for model tracking. According to the docs, log_metric takes a key and a value, and log_metrics takes a dict of key-value pairs. I am wondering how to log something like the report below into mlflow so it can be visualized meaningfully.

              precision    recall  f1-score   support

      class1       0.89      0.98      0.93       174
      class2       0.96      0.90      0.93        30
      class3       0.96      0.90      0.93        30
      class4       1.00      1.00      1.00         7
      class5       0.93      1.00      0.96        13
      class6       1.00      0.73      0.85        15
      class7       0.95      0.97      0.96        39
      class8       0.80      0.67      0.73         6
      class9       0.97      0.86      0.91        37
     class10       0.95      0.81      0.88        26
     class11       0.50      1.00      0.67         5
     class12       0.93      0.89      0.91        28
     class13       0.73      0.84      0.78        19
     class14       1.00      1.00      1.00         6
     class15       0.45      0.83      0.59         6
     class16       0.97      0.98      0.97       245
     class17       0.93      0.86      0.89       206

    accuracy                           0.92       892
   macro avg       0.88      0.90      0.88       892
weighted avg       0.93      0.92      0.92       892
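
For reference, this is roughly how I read those two calls from the docs (a minimal sketch with placeholder values, assuming a default local tracking setup):

import mlflow

with mlflow.start_run():
    # one metric at a time: a key and a value
    mlflow.log_metric("accuracy", 0.92)
    # several metrics at once: a flat dict of key-value pairs
    mlflow.log_metrics({"class1_precision": 0.89, "class1_recall": 0.98})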
asked Oct 22 '25 by Felix Gao


2 Answers

I searched for the same thing a few days ago, and since I still have not found anything more practical and this post was again at the top of my search results, I thought I would share an example of the approach @Martin Zivdar already mentioned in the comments and that I have implemented for now.

Sidenotes

  • for simplicity I skipped preprocessing, rebalancing, etc.
  • it is possible to log multiple metrics (or parameters) at once as a flat dictionary (see the docs; a small sketch follows right after this list)
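
A minimal sketch of that second point, inside an active run (the values here are made up for illustration):

# log several parameters / metrics at once, each as a flat dictionary
mlflow.log_params({"criterion": "gini", "min_samples_split": 2})
mlflow.log_metrics({"macro avg_f1-score": 0.88, "weighted avg_f1-score": 0.92})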

TL;DR

Logging all performance metrics can be done with loops; here is an example for classification_report():

# Logging all metrics in classification_report
mlflow.log_metric("accuracy", cr.pop("accuracy"))
for class_or_avg, metrics_dict in cr.items():
    for metric, value in metrics_dict.items():
        mlflow.log_metric(class_or_avg + '_' + metric, value)

Create Sample Data / Simulate Training

import numpy as np

from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV, StratifiedKFold
from sklearn.metrics import classification_report
import mlflow

# Create example data
N = 5000
n_features = 20
X, y = make_classification(n_samples=N,
                           n_features=n_features,
                           n_clusters_per_class=1,
                           weights=[0.8,0.15,0.05],
                           flip_y=0,
                           random_state=1, n_classes=3)

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y)

# Start logging
mlflow.set_experiment("stackoverflow")
with mlflow.start_run():
    # Simulate Model Training
    grid_params = {
        "criterion" : ["gini","log_loss"],
        "min_samples_split": np.arange(2,6),
        "min_samples_leaf": np.linspace(0.01,0.5, num = 3),
        "ccp_alpha": np.linspace(0,3,5),
    }
    cv = StratifiedKFold(shuffle=True)
    grid_search = GridSearchCV(DecisionTreeClassifier(), grid_params, cv=cv, n_jobs=3, return_train_score=False, scoring='f1_macro', verbose=1)
    grid_search.fit(X_train,y_train)

    best_model = grid_search.best_estimator_
    best_params = grid_search.best_params_
    
    # it is possible to log multiple params (and metrics) in a flat dictionary
    mlflow.log_params(best_params)
    y_pred = best_model.predict(X_test)
    cr = classification_report(y_test, y_pred, output_dict=True)
    cr

Output:

{'0': {'precision': 0.9461312438785504,
  'recall': 0.966,
  'f1-score': 0.9559623948540327,
  'support': 1000},
 '1': {'precision': 0.8083832335329342,
  'recall': 0.7180851063829787,
  'f1-score': 0.7605633802816901,
  'support': 188},
 '2': {'precision': 0.7903225806451613,
  'recall': 0.7903225806451613,
  'f1-score': 0.7903225806451614,
  'support': 62},
 'accuracy': 0.92,
 'macro avg': {'precision': 0.8482790193522153,
  'recall': 0.8248025623427133,
  'f1-score': 0.835616118593628,
  'support': 1250},
 'weighted avg': {'precision': 0.9176858334261937,
  'recall': 0.92,
  'f1-score': 0.9183586482775923,
  'support': 1250}}

Example: logging multiple metrics with MLflow

So far so good. Now, to log all metrics of the classification report, one can just iterate over the nested dictionary. I manually .pop accuracy first because it is the only non-nested entry in the dict:

    # Logging all metrics in classification_report
    mlflow.log_metric("accuracy", cr.pop("accuracy"))
    for class_or_avg, metrics_dict in cr.items():
        for metric, value in metrics_dict.items():
            mlflow.log_metric(class_or_avg + '_' + metric, value)
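
As an alternative to the loop, one could also flatten the nested report into a single dict and log it in one call. A sketch of that, continuing inside the same with mlflow.start_run() block and assuming accuracy has already been popped as above:

    # alternative: flatten the nested report and log everything in a single call
    flat_cr = {
        f"{class_or_avg}_{metric}": value
        for class_or_avg, metrics_dict in cr.items()
        for metric, value in metrics_dict.items()
    }
    mlflow.log_metrics(flat_cr)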
answered Oct 26 '25 by Björn


You can log a table as an artifact, but it is then hard to use this data to compare across runs.

import mlflow

# run_name is expected to hold an existing run id here
if run_name is not None:
    with mlflow.start_run(run_id=run_name) as run:
        mlflow.log_artifact("metrics.csv", artifact_path="post_threshold_metrics")
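
For the classification report from the question, such a metrics.csv could for example be produced from the output_dict=True report shown in the other answer (a sketch; cr is assumed to be that dict):

import pandas as pd

# keep only the nested per-class / average entries; 'accuracy' is a plain scalar
report_df = pd.DataFrame({k: v for k, v in cr.items() if isinstance(v, dict)}).transpose()
report_df.to_csv("metrics.csv")

The resulting file can then be logged exactly as above.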


It looks like version 2.8.0 might have some extended support; you can now use:

from mlflow import MlflowClient

client = MlflowClient(tracking_uri='XXXXXXX')
client.log_table(run_name, data=thresholds_df, artifact_file='post_threshold_metrics.json')

This logs an evaluation table, which can then be shown in the Evaluation section of the UI. There is not much in the way of visualizations though; I can't find a way to create a bar chart to compare across runs.
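
That said, log_table seems to have a load_table counterpart on the client that can at least pull the logged tables from several runs back into a single DataFrame for manual comparison. A sketch of how I would expect that to work, assuming that API is available (the experiment id is a placeholder):

# assumption: MlflowClient.load_table is available alongside log_table
all_thresholds = client.load_table(
    experiment_id="0",                  # placeholder: the experiment containing the runs
    artifact_file='post_threshold_metrics.json',
    run_ids=None,                       # or a list of specific run ids to compare
)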


answered Oct 26 '25 by George Pearse


