
How to customize y-labels in seaborn heatmap when I use a multi-index dataframe?

I have a DataFrame with multi-index rows, and I would like to create a heatmap that does not repeat the row labels, just as pandas displays the DataFrame. Here is code to reproduce my problem:

import pandas as pd
from matplotlib import pyplot as plt
import random
import seaborn as sns
%matplotlib inline

df = pd.DataFrame({'Occupation':['Economist','Economist','Economist','Engineer','Engineer','Engineer',
                           'Data Scientist','Data Scientist','Data Scientist'],
             'Sex':['Female','Male','Both']*3, 'UK':random.sample(range(-10,10),9),
             'US':random.sample(range(-10,10),9),'Brazil':random.sample(range(-10,10),9)})

df = df.set_index(['Occupation','Sex'])

df


sns.heatmap(df, annot=True, fmt="",cmap="YlGnBu")


Besides eliminating the repetition, I would like to customize the y-labels a bit, since this raw form doesn't look good to me.

Is it possible?

Lucas asked Aug 31 '25 20:08


1 Answer

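AFAIK there's no quick and easy way to do that within seaborn, but hopefully someone corrects me. You can do it manually by resetting the y-tick labels so that each row shows only the value from level 1 of your index. Then you can loop over level 0 of your index and mark each group, either by adding a text element to your visualization at the correct location or, as in the code below, by prefixing the first label of each group with its level 0 value and drawing a separator line between groups: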

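from collections import OrderedDict

ax = sns.heatmap(df, annot=True, cmap="YlGnBu")

# Group the level 1 values (Sex) under their level 0 value (Occupation),
# preserving the row order of the DataFrame.
ylabel_mapping = OrderedDict()
for occupation, sex in df.index:
    ylabel_mapping.setdefault(occupation, []).append(sex)

hline = []        # y positions of the group boundaries
new_ylabels = []  # one label per heatmap row
for occupation, sex_list in ylabel_mapping.items():
    # Show the occupation only on the first row of its group.
    sex_list[0] = "{} - {}".format(occupation, sex_list[0])
    new_ylabels.extend(sex_list)

    if hline:
        hline.append(len(sex_list) + hline[-1])
    else:
        hline.append(len(sex_list))

# Draw a white separator line at each group boundary, across all columns.
ax.hlines(hline, xmin=0, xmax=df.shape[1], color="white", linewidth=5)
ax.set_yticklabels(new_ylabels)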
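
The text-element variant mentioned above could look roughly like the following sketch. It rebuilds a DataFrame of the same shape as the one in the question, puts only the level 1 values on the ticks, and writes each level 0 value once, centred beside its block of rows. The names `inner_labels`, `outer` and `group_labels`, the x offset of -0.25 and the left margin of 0.35 are illustrative choices rather than anything from the original post, and the offsets will probably need tuning for other label lengths or figure sizes.

```python
import random

import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# Same shape of data as in the question: a two-level (Occupation, Sex) row index.
occupations = ["Economist", "Engineer", "Data Scientist"]
sexes = ["Female", "Male", "Both"]
index = pd.MultiIndex.from_product([occupations, sexes], names=["Occupation", "Sex"])
df = pd.DataFrame(
    {country: random.sample(range(-10, 10), len(index)) for country in ["UK", "US", "Brazil"]},
    index=index,
)

# Tick labels: only the inner level ("Sex"), one per row.
inner_labels = list(df.index.get_level_values(1))

fig, ax = plt.subplots(figsize=(7, 5))
sns.heatmap(df, annot=True, cmap="YlGnBu", yticklabels=inner_labels, ax=ax)
ax.tick_params(axis="y", labelrotation=0)  # keep the tick labels horizontal
ax.set_ylabel("")                          # drop the automatic "Occupation-Sex" label

# Walk the outer level and find the contiguous block of rows for each value.
outer = list(df.index.get_level_values(0))
group_labels = []
start = 0
for row in range(1, len(outer) + 1):
    if row == len(outer) or outer[row] != outer[start]:
        # Rows start..row-1 share one occupation; label the block at its vertical centre.
        centre = (start + row) / 2
        text = ax.text(
            -0.25, centre, outer[start],         # illustrative x offset, tune as needed
            transform=ax.get_yaxis_transform(),  # x in axes fraction, y in data units
            ha="right", va="center", fontweight="bold",
        )
        group_labels.append(text)
        if row < len(outer):
            ax.axhline(row, color="white", linewidth=3)  # separator between groups
        start = row

fig.subplots_adjust(left=0.35)  # illustrative margin so the outer labels fit
```

Because the outer labels are ordinary text artists placed outside the axes, they are not accounted for automatically by the layout, which is why the left margin is widened by hand here.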


An alternative approach is to use DataFrame styling. This leads to very simple syntax, but you do lose the colorbar. It keeps your index and column presentation the same as in a DataFrame. Note that you'll need to be working in a notebook or somewhere else that can render HTML to view the output:

df.style.background_gradient(cmap="YlGnBu", vmin=-10, vmax=10)


Cameron Riddell answered Sep 03 '25 22:09