I have a DataFrame with multi-index rows and I would like to create a heatmap without repeating the row labels, the same way pandas displays the DataFrame. Here is code to reproduce my problem:
import pandas as pd
from matplotlib import pyplot as plt
import random
import seaborn as sns
%matplotlib inline
df = pd.DataFrame({'Occupation': ['Economist', 'Economist', 'Economist',
                                  'Engineer', 'Engineer', 'Engineer',
                                  'Data Scientist', 'Data Scientist', 'Data Scientist'],
                   'Sex': ['Female', 'Male', 'Both'] * 3,
                   'UK': random.sample(range(-10, 10), 9),
                   'US': random.sample(range(-10, 10), 9),
                   'Brazil': random.sample(range(-10, 10), 9)})
df = df.set_index(['Occupation','Sex'])
df
sns.heatmap(df, annot=True, fmt="",cmap="YlGnBu")
Besides eliminating the repetition, I would like to customize the y-labels a bit, since this raw form doesn't look good to me.
Is that possible?
AFAIK there's no quick and easy way to do that within seaborn, but hopefully someone corrects me. You can do it manually by resetting the y tick labels: the code below shows the level 0 value of your index (the occupation) only on the first row of each group, uses the level 1 value (the sex) for the remaining rows, and draws a white line between groups:
from collections import OrderedDict

ax = sns.heatmap(df, annot=True, cmap="YlGnBu")

# Group the level 1 values (sex) under their level 0 value (occupation), keeping row order
ylabel_mapping = OrderedDict()
for occupation, sex in df.index:
    ylabel_mapping.setdefault(occupation, [])
    ylabel_mapping[occupation].append(sex)

hline = []  # y positions of the group boundaries
new_ylabels = []
for occupation, sex_list in ylabel_mapping.items():
    # Show the occupation only on the first row of its group
    sex_list[0] = "{} - {}".format(occupation, sex_list[0])
    new_ylabels.extend(sex_list)
    if hline:
        hline.append(len(sex_list) + hline[-1])
    else:
        hline.append(len(sex_list))

# Draw a white separator line at each group boundary
ax.hlines(hline, xmin=-1, xmax=4, color="white", linewidth=5)
ax.set_yticklabels(new_ylabels)
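Another way to lay this out, closer to how pandas prints the frame, is to use only the level 1 values as tick labels and write each level 0 value once as a text element next to its group. Here is a rough sketch of that idea, not something seaborn provides. The helper names `group_spans` and `label_groups` and the `x_offset` default are my own and would need tuning for your figure. It also assumes the heatmap has one y tick per row, which is the case for a small frame like this one.

```python
from itertools import groupby


def group_spans(level0_labels):
    """Return (label, start, stop) for each run of equal consecutive labels."""
    spans = []
    start = 0
    for label, run in groupby(level0_labels):
        length = sum(1 for _ in run)
        spans.append((label, start, start + length))
        start += length
    return spans


def label_groups(ax, index, x_offset=-0.35):
    """Use level 1 as tick labels and write each level 0 value once per group.

    `ax` is the Axes returned by sns.heatmap and `index` is the two-level
    row index of the plotted DataFrame. `x_offset` is in axes fractions
    (negative values sit to the left of the plot) and is a guess.
    """
    ax.set_yticklabels(index.get_level_values(1), rotation=0)
    ax.set_ylabel("")
    for label, start, stop in group_spans(index.get_level_values(0)):
        # Row i of the heatmap covers y in [i, i + 1], so the group is
        # centred at the midpoint of its first and last edge.
        ax.text(x_offset, (start + stop) / 2, label,
                ha="right", va="center",
                transform=ax.get_yaxis_transform())
        # White separator at the lower edge of the group
        ax.axhline(stop, color="white", linewidth=3)
```

With the `df` from the question you would call it as `ax = sns.heatmap(df, annot=True, cmap="YlGnBu")` followed by `label_groups(ax, df.index)`. The text is positioned with `ax.get_yaxis_transform()`, so its vertical position follows the rows while its horizontal position stays fixed relative to the axes.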
An alternative approach is to use DataFrame styling. This leads to a super simple syntax, but you do lose the colorbar. It keeps the index and column presentation the same as in the regular DataFrame display. Note that you'll need to be working in a notebook or somewhere else that can render HTML to view the output:
df.style.background_gradient(cmap="YlGnBu", vmin=-10, vmax=10)