
Let axvline end at certain y-value

I am plotting a histogram with pandas and pyplot. For additional information, I added lines at certain percentiles of the distribution. I already found out that ymax lets you make an axvline stop at a given fraction of the chart height:

cycle_df = pd.DataFrame(results)
plot = cycle_df.plot.hist(bins=30, label='Cycle time')

plot.axvline(np.percentile(cycle_df,5), label='5%', color='red', linestyle='dashed', linewidth=2, ymax=0.25)
plot.axvline(np.percentile(cycle_df,95), label='95%', color='blue', linestyle='dashed', linewidth=2, ymax=0.25)

Is it possible to make the red and blue lines end exactly at the top of the histogram bar they fall in, so the plot looks cleaner?


asked Oct 18 '25 21:10 by VeryMary


1 Answer

That's definitely possible, but it isn't straightforward with pandas' DataFrame.plot.hist (or DataFrame.hist) because those return the axes rather than the histogram data. You would have to run a second matplotlib.pyplot.hist (or numpy.histogram) to get the actual bin edges and heights.

However if you use matplotlib directly this would work:

import matplotlib.pyplot as plt
import numpy as np

plt.style.use('ggplot')

data = np.random.normal(550, 20, 100000)

fig, ax = plt.subplots(1, 1)
plot = ax.hist(data, bins=30, label='Cycle time', color='darkgrey')

ps = np.percentile(data, [5, 95])
_, ymax = ax.get_ybound()

# plot is (counts, bin_edges, patches); look up the height of the bin that contains each percentile
heights = plot[0][np.searchsorted(plot[1], ps, side='left')-1]

# axvline's ymax is a fraction of the axes height, so divide the bin height by the upper y-bound (valid only if the lower y-bound is zero)
ax.axvline(ps[0], label='5%', color='red', linestyle='dashed', linewidth=2, ymax=heights[0] / ymax)
ax.axvline(ps[1], label='95%', color='blue', linestyle='dashed', linewidth=2, ymax=heights[1] / ymax)
plt.legend()

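Dividing the bin height by the upper y-bound relies on the lower bound being zero. Below is a minimal sketch of a more general conversion, assuming the y-limits are already final when the line is drawn. The names axes_fraction and axvline_to_height are hypothetical helpers for illustration, not matplotlib functions.

```python
def axes_fraction(y, y_low, y_high):
    """Map a data-space y value to a fraction of the axes height."""
    return (y - y_low) / (y_high - y_low)


def axvline_to_height(ax, x, height, **line_kwargs):
    """Draw a vertical line at x from the bottom of the axes up to `height` (data units).

    Hypothetical helper: `ax` is expected to be a matplotlib Axes.
    """
    # get_ylim returns the current (bottom, top) limits in data coordinates.
    y_low, y_high = ax.get_ylim()
    # axvline interprets ymax as a fraction of the axes height.
    return ax.axvline(x, ymax=axes_fraction(height, y_low, y_high), **line_kwargs)
```

With the variables from the snippet above, axvline_to_height(ax, ps[0], heights[0], label='5%', color='red', linestyle='dashed', linewidth=2) should be able to stand in for the first axvline call. Because ymax is stored as an axes fraction, the line will stop matching the bar if the y-limits change afterwards.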

In case you don't want to bother with calculating the relative height, you could also use Line2D from matplotlib.lines, which takes its endpoints in data coordinates:

import matplotlib.pyplot as plt
import matplotlib.lines as mlines
import numpy as np

plt.style.use('ggplot')

data = np.random.normal(550, 20, 100000)

fig, ax = plt.subplots(1, 1)
plot = ax.hist(data, bins=30, label='Cycle time', color='darkgrey')

ps = np.percentile(data, [5, 95])

# plot is (counts, bin_edges, patches); look up the height of the bin that contains each percentile
heights = plot[0][np.searchsorted(plot[1], ps, side='left')-1]

# Line2D works in data coordinates, so each line runs from y=0 straight up to the bin height
l1 = mlines.Line2D([ps[0], ps[0]], [0, heights[0]], label='5%', color='red', linestyle='dashed', linewidth=2)
l2 = mlines.Line2D([ps[1], ps[1]], [0, heights[1]], label='95%', color='blue', linestyle='dashed', linewidth=2)
ax.add_line(l1)
ax.add_line(l2)
plt.legend()

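For the pandas case from the question, the second histogram pass mentioned at the top of this answer could look roughly like the sketch below. It assumes a single-column DataFrame without NaN values, so that numpy.histogram with the same bins argument reproduces the bars pandas drew. The names bar_heights_at and add_percentile_lines are hypothetical helpers, not part of pandas or matplotlib.

```python
import numpy as np


def bar_heights_at(values, xs, bins=30):
    """Return the height of the histogram bar that contains each x in xs."""
    counts, edges = np.histogram(values, bins=bins)
    # Index of the bin holding each x; clip so the data maximum,
    # which sits on the last edge, still maps to the last bin.
    idx = np.clip(np.searchsorted(edges, xs, side='right') - 1, 0, len(counts) - 1)
    return counts[idx]


def add_percentile_lines(ax, values, percentiles=(5, 95), colors=('red', 'blue'), bins=30):
    """Draw dashed lines at the given percentiles, ending at the bar tops.

    `ax` is expected to be the matplotlib Axes that already holds the histogram.
    """
    values = np.asarray(values).ravel()
    ps = np.percentile(values, percentiles)
    heights = bar_heights_at(values, ps, bins=bins)
    for p, h, q, c in zip(ps, heights, percentiles, colors):
        # vlines takes its start and end in data coordinates.
        ax.vlines(p, 0, h, colors=c, linestyles='dashed', linewidth=2, label=f'{q}%')
    return ps, heights
```

With the question's data this might be used as ax = cycle_df.plot.hist(bins=30), then add_percentile_lines(ax, cycle_df.to_numpy()), then ax.legend(). The bins value passed to both calls has to match, otherwise the computed heights will not line up with the drawn bars.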

answered Oct 21 '25 09:10 by MSeifert