 

How to log a table of metrics into mlflow

Tags:

mlflow

I am trying to see if mlflow is the right place to store my metrics for model tracking. According to the docs, log_metric takes a key and a value, and log_metrics takes a dict of key-value pairs. I am wondering how to log something like the report below into mlflow so it can be visualized meaningfully.

              precision    recall  f1-score   support

      class1       0.89      0.98      0.93       174
      class2       0.96      0.90      0.93        30
      class3       0.96      0.90      0.93        30
      class4       1.00      1.00      1.00         7
      class5       0.93      1.00      0.96        13
      class6       1.00      0.73      0.85        15
      class7       0.95      0.97      0.96        39
      class8       0.80      0.67      0.73         6
      class9       0.97      0.86      0.91        37
     class10       0.95      0.81      0.88        26
     class11       0.50      1.00      0.67         5
     class12       0.93      0.89      0.91        28
     class13       0.73      0.84      0.78        19
     class14       1.00      1.00      1.00         6
     class15       0.45      0.83      0.59         6
     class16       0.97      0.98      0.97       245
     class17       0.93      0.86      0.89       206

    accuracy                           0.92       892
   macro avg       0.88      0.90      0.88       892
weighted avg       0.93      0.92      0.92       892
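
For reference, this is roughly how I read those two calls from the docs (a minimal sketch with placeholder values, assuming a default local tracking setup):

import mlflow

with mlflow.start_run():
    # one metric at a time: a key and a value
    mlflow.log_metric("accuracy", 0.92)
    # several metrics at once: a flat dict of key-value pairs
    mlflow.log_metrics({"class1_precision": 0.89, "class1_recall": 0.98})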
asked Oct 22 '25 by Felix Gao


2 Answers

I searched for the same thing a few days ago, and since I still have not found anything more practical and this post was again at the top of my search results, I thought I would share an example of the approach @Martin Zivdar already mentioned in the comments and that I have implemented for now.

Sidenotes

  • for simplicity I skipped preprocessing, rebalancing, etc.
  • it is possible to log multiple metrics (or parameters) at once as a flat dictionary (see the docs; a small sketch follows right after this list)
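
A minimal sketch of that second point, inside an active run (the values here are made up for illustration):

# log several parameters / metrics at once, each as a flat dictionary
mlflow.log_params({"criterion": "gini", "min_samples_split": 2})
mlflow.log_metrics({"macro avg_f1-score": 0.88, "weighted avg_f1-score": 0.92})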

TL;DR

Logging all performance metrics can be done with loops; here is an example for classification_report():

# Logging all metrics in classification_report
mlflow.log_metric("accuracy", cr.pop("accuracy"))
for class_or_avg, metrics_dict in cr.items():
    for metric, value in metrics_dict.items():
        mlflow.log_metric(class_or_avg + '_' + metric, value)

Create Sample Data / Simulate Training

import numpy as np

from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV, StratifiedKFold
from sklearn.metrics import classification_report
import mlflow

# Create example data
N = 5000
n_features = 20
X, y = make_classification(n_samples=N,
                           n_features=n_features,
                           n_clusters_per_class=1,
                           weights=[0.8,0.15,0.05],
                           flip_y=0,
                           random_state=1, n_classes=3)

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y)

# Start logging
mlflow.set_experiment("stackoverflow")
with mlflow.start_run():
    # Simulate Model Training
    grid_params = {
        "criterion" : ["gini","log_loss"],
        "min_samples_split": np.arange(2,6),
        "min_samples_leaf": np.linspace(0.01,0.5, num = 3),
        "ccp_alpha": np.linspace(0,3,5),
    }
    cv = StratifiedKFold(shuffle=True)
    grid_search = GridSearchCV(DecisionTreeClassifier(), grid_params, cv=cv, n_jobs=3, return_train_score=False, scoring='f1_macro', verbose=1)
    grid_search.fit(X_train,y_train)

    best_model = grid_search.best_estimator_
    best_params = grid_search.best_params_
    
    # it is possible to log multiple params (and metrics) in a flat dictionary
    mlflow.log_params(best_params)
    y_pred = best_model.predict(X_test)
    cr = classification_report(y_test, y_pred, output_dict=True)
    cr

Output:

{'0': {'precision': 0.9461312438785504,
  'recall': 0.966,
  'f1-score': 0.9559623948540327,
  'support': 1000},
 '1': {'precision': 0.8083832335329342,
  'recall': 0.7180851063829787,
  'f1-score': 0.7605633802816901,
  'support': 188},
 '2': {'precision': 0.7903225806451613,
  'recall': 0.7903225806451613,
  'f1-score': 0.7903225806451614,
  'support': 62},
 'accuracy': 0.92,
 'macro avg': {'precision': 0.8482790193522153,
  'recall': 0.8248025623427133,
  'f1-score': 0.835616118593628,
  'support': 1250},
 'weighted avg': {'precision': 0.9176858334261937,
  'recall': 0.92,
  'f1-score': 0.9183586482775923,
  'support': 1250}}

Example: logging multiple metrics with MLflow

So far so good. Now, to log all metrics of the classification report, one can just iterate over the nested dictionary. I manually .pop accuracy first because it is the only non-nested entry in the dict:

    # Logging all metrics in classification_report
    mlflow.log_metric("accuracy", cr.pop("accuracy"))
    for class_or_avg, metrics_dict in cr.items():
        for metric, value in metrics_dict.items():
            mlflow.log_metric(class_or_avg + '_' + metric, value)
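
As an alternative to the loop, one could also flatten the nested report into a single dict and log it in one call. A sketch of that, continuing inside the same with mlflow.start_run() block and assuming accuracy has already been popped as above:

    # alternative: flatten the nested report and log everything in a single call
    flat_cr = {
        f"{class_or_avg}_{metric}": value
        for class_or_avg, metrics_dict in cr.items()
        for metric, value in metrics_dict.items()
    }
    mlflow.log_metrics(flat_cr)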
answered Oct 26 '25 by Björn


You can log a table as an artifact, but it is then hard to use this data to compare across runs.

import mlflow

# run_name is expected to hold an existing run id here
if run_name is not None:
    with mlflow.start_run(run_id=run_name) as run:
        mlflow.log_artifact("metrics.csv", artifact_path="post_threshold_metrics")
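
For the classification report from the question, such a metrics.csv could for example be produced from the output_dict=True report shown in the other answer (a sketch; cr is assumed to be that dict):

import pandas as pd

# keep only the nested per-class / average entries; 'accuracy' is a plain scalar
report_df = pd.DataFrame({k: v for k, v in cr.items() if isinstance(v, dict)}).transpose()
report_df.to_csv("metrics.csv")

The resulting file can then be logged exactly as above.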


It looks like version 2.8.0 might have some extended support; you can now use:

from mlflow import MlflowClient

client = MlflowClient(tracking_uri='XXXXXXX')
client.log_table(run_name, data=thresholds_df, artifact_file='post_threshold_metrics.json')

This logs an evaluation table, which can then be shown in the Evaluation section of the UI. There is not much in the way of visualizations though; I can't find a way to create a bar chart to compare across runs.
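
That said, log_table seems to have a load_table counterpart on the client that can at least pull the logged tables from several runs back into a single DataFrame for manual comparison. A sketch of how I would expect that to work, assuming that API is available (the experiment id is a placeholder):

# assumption: MlflowClient.load_table is available alongside log_table
all_thresholds = client.load_table(
    experiment_id="0",                  # placeholder: the experiment containing the runs
    artifact_file='post_threshold_metrics.json',
    run_ids=None,                       # or a list of specific run ids to compare
)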


answered Oct 26 '25 by George Pearse


