
How to add correct labels for Seaborn Confusion Matrix

I plotted a confusion matrix with seaborn, but both axes show only the numbers 0 to 11 instead of the names of my 12 labels.

My code looks as follows:

cf_matrix = confusion_matrix(y_test, y_pred)
fig, ax = plt.subplots(figsize=(15,10)) 
sns.heatmap(cf_matrix, linewidths=1, annot=True, ax=ax, fmt='g')

Here you can see my confusion matrix:

[Image: confusion matrix]

The confusion matrix itself is correct. The only problem is that the label names are not shown. I have searched the Internet for quite a while with no luck. Is there a parameter that attaches the labels, or how else can this be done?

Rasmus Birk Knudsen asked Oct 20 '25 04:10



1 Answer

When you factorized your categories, you should have kept the levels (the original label names). You can use them together with pd.crosstab, instead of confusion_matrix, to build the matrix you plot. Using the iris dataset as an example:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix

# iris.data has no header row; its columns are sepal length, sepal width,
# petal length, petal width, species
df = pd.read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data",
                 header=None,names=["s.len","s.wid","p.len","p.wid","species"])
X = df.iloc[:,:4]
# y holds integer codes, levels holds the original label names
y,levels = pd.factorize(df['species'])

At this point, y contains the integer codes 0, 1, 2, and levels holds the original labels that those codes correspond to:

Index(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], dtype='object')
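To make the relationship between the codes and the levels concrete, here is a minimal sketch on a tiny made-up Series (the `species` values below are hypothetical, not the iris data). It shows that pd.factorize numbers the labels in order of first appearance, and that indexing the levels with the codes gives the original labels back:

```python
import pandas as pd

# Hypothetical label column, for illustration only.
species = pd.Series(["setosa", "virginica", "setosa", "versicolor"])

# codes: one integer per row; levels: the distinct labels,
# in order of first appearance.
codes, levels = pd.factorize(species)

# Indexing the levels with the codes recovers the original labels.
restored = levels[codes]

print(codes)
print(list(levels))
print(list(restored))
```

The same indexing trick works on predictions, because the model predicts the same integer codes it was trained on.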

Next, fit a model and get predictions, as you already do:

clf = RandomForestClassifier(max_depth=2, random_state=0)
clf.fit(X,y)
y_pred = clf.predict(X)
print(classification_report(y,y_pred,target_names=levels))


Plotting the output of confusion_matrix directly gives axes labelled 0, 1, 2:

cf_matrix = confusion_matrix(y, y_pred)
sns.heatmap(cf_matrix, linewidths=1, annot=True, fmt='g')


To get the names back, index the levels with the integer codes and cross-tabulate the result:

cf_matrix = pd.crosstab(levels[y],levels[y_pred])
fig, ax = plt.subplots(figsize=(5,5))
sns.heatmap(cf_matrix, linewidths=1, annot=True, ax=ax, fmt='g')
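One thing to watch with pd.crosstab: a class that never shows up in the predictions gets no column, so the table may not be square. A minimal sketch of one way to handle that, assuming made-up `levels`, `y` and `y_pred` (all hypothetical), is to reindex the table against the full set of levels:

```python
import numpy as np
import pandas as pd

# Hypothetical levels and integer-coded labels, for illustration only.
levels = pd.Index(["a", "b", "c"])
y = np.array([0, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1])  # class "c" is never predicted

ct = pd.crosstab(levels[y], levels[y_pred])

# Force every level to appear on both axes, filling missing cells with 0,
# then name the axes so the heatmap gets readable axis titles.
ct = (ct.reindex(index=levels, columns=levels, fill_value=0)
        .rename_axis(index="Actual", columns="Predicted"))

print(ct)
```

The resulting DataFrame can be passed to sns.heatmap exactly like the crosstab above.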

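If you would rather keep confusion_matrix, another option is to pass the names to sns.heatmap through its xticklabels and yticklabels parameters. A minimal sketch, assuming hypothetical class names and integer-coded labels (none of these values come from the question):

```python
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Hypothetical class names and integer-coded labels, for illustration only.
class_names = ["cat", "dog", "bird"]
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

# Fix the row/column order explicitly so it matches class_names.
cf_matrix = confusion_matrix(y_true, y_pred,
                             labels=np.arange(len(class_names)))

fig, ax = plt.subplots(figsize=(5, 5))
sns.heatmap(cf_matrix, linewidths=1, annot=True, fmt="g", ax=ax,
            xticklabels=class_names, yticklabels=class_names)
# confusion_matrix puts true labels on rows and predictions on columns.
ax.set_xlabel("Predicted")
ax.set_ylabel("Actual")
```

Call plt.show() or fig.savefig(...) afterwards to display or save the figure. With your own data, class_names would be the levels returned by pd.factorize.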

StupidWolf answered Oct 22 '25 18:10
