I have lately been working with Bokeh for plotting. I just found out about HoloViews and wanted to draw a basic box plot.
In my box plot I am trying to color the boxes by one of the categories I group the data by. Here is the code I am using:
hv.extension('bokeh')
%opts BoxWhisker (box_color='blue')
boxwhisker = hv.BoxWhisker(pool_ride_distance_time_year_less_hour, ['total_time', 'customer'], 'amount')
plot_opts = dict(show_legend=False, width=800, height=400)
I am trying to color the boxes differently according to the customer variable (a yes/no dummy variable). Passing a list to box_color does not work, and neither does adding an extra column of colors to the data set. Any ideas on how to make it work? Thanks!
Most Elements in HoloViews have a color_index plot option, which allows coloring by a particular variable. Using your example, here we color by the 'customer' variable and define a HoloViews Cycle for the box_color using the Set1 colormap.
import numpy as np
import holoviews as hv
hv.extension('bokeh')
# Random sample data in column order: total_time, customer, amount
data = (np.random.randint(0, 3, 100), np.random.randint(0, 5, 100), np.random.rand(100))
boxwhisker = hv.BoxWhisker(data, ['total_time', 'customer'], 'amount')
plot_opts = dict(show_legend=False, width=800, height=400, color_index='customer')
style_opts = dict(box_color=hv.Cycle('Set1'))
boxwhisker.opts(plot=plot_opts, style=style_opts)
To use a custom set of colors, you can define an explicit Cycle instead: hv.Cycle(values=['#ffffff', ...]).
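One way to build the list of colors for an explicit Cycle is to derive it from the category values themselves. The sketch below is plain Python and makes several assumptions: the helper name colors_for_categories, the sample customer values, and the two hex colors are all made up for illustration, and it assumes the categories are assigned colors in sorted order.

```python
def colors_for_categories(categories, palette):
    """Map each distinct category to a palette color, in sorted category order."""
    unique = sorted(set(categories))
    # Wrap around the palette if there are more categories than colors.
    return {cat: palette[i % len(palette)] for i, cat in enumerate(unique)}


customer = ['yes', 'no', 'no', 'yes', 'no']  # hypothetical yes/no dummy values
palette = ['#e41a1c', '#377eb8']             # hypothetical hex colors

mapping = colors_for_categories(customer, palette)
cycle_values = list(mapping.values())        # one color per distinct category
print(mapping)
```

The resulting cycle_values list could then be passed as hv.Cycle(values=cycle_values). Which category ends up with which color depends on the order in which HoloViews iterates over the categories, so it is worth checking the rendered plot rather than relying on the sorted-order assumption.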
You can use either HoloViews or hvplot to color your box plots per category.
Three possible solutions follow. The first two use this sample data:
import numpy as np
import pandas as pd
import holoviews as hv
import hvplot
import hvplot.pandas
df = pd.DataFrame({
    'total_time': np.random.randint(0, 3, 100),
    'customer': np.random.randint(0, 5, 100),
    'amount': np.random.rand(100)
})
1) Use .hvplot() on your dataframe as follows:
df.hvplot.box(y='amount', by=['total_time', 'customer'], color='customer')
2) Or use .opts(box_color='your_variable') and only use holoviews:
# you can create your plot like this:
hv.BoxWhisker(df, kdims=['total_time', 'customer'], vdims=['amount']).opts(box_color='customer')
# or you can create your plot like this:
hv.Dataset(df).to.box(['total_time', 'customer'], 'amount').opts(box_color='customer')
This produces a plot in which each customer value gets its own color.

3) If the variable is categorical, besides box_color you also have to specify a color map with the cmap keyword:
df = pd.DataFrame({
    'total_time': np.random.choice(['A', 'B', 'C'], 100),
    'customer': np.random.choice(['a', 'b', 'c'], 100),
    'amount': np.random.rand(100)
})
df.hvplot.box(
    y='amount',
    by=['total_time', 'customer'],
    color='customer',
    cmap='Category20',
    legend=False,
)
hv.BoxWhisker(
    df,
    kdims=['total_time', 'customer'],
    vdims=['amount']
).opts(
    box_color='customer',
    cmap='Category20',
)