
Defining bin width/x-axis scale in Matplotlib histogram

I am generating histograms with matplotlib.

I need the bins to be of unequal width as I'm mostly interested in the lowest bins. Right now I'm doing this:

plt.hist(hits_array, bins=list(range(0, 50, 10)) + list(range(50, 550, 50)))

This creates what I want (the first 5 bins have a width of 10, the rest of 50), but the first five bins are, of course, narrower than the latter ones, as all bins are displayed on the same axis.

Is there a way to influence the x-axis or histogram itself so I can break the scale after the first 5 bins, so all bins are displayed as equally wide?

(I realize that this will create a distorted view, and I'm fine with that, though I wouldn't mind a bit of space between the two differently scaled parts of the axis.)

Any help will be greatly appreciated. Thanks!

CodingCat asked Oct 14 '25 08:10


2 Answers

I had a similar question ("Matplotlib histogram with collection bin for high values"), and the answer was to use a dirty hack.

So with the following code, you get the ugly histogram you already have.

import matplotlib.pyplot as plt
import numpy as np


def plot_histogram_04():
    limit1, limit2 = 50, 550
    binwidth1, binwidth2 = 10, 50
    data = np.hstack((np.random.rand(1000) * limit1, np.random.rand(100) * limit2))

    # range objects cannot be concatenated with + in Python 3, so build lists first
    bins = list(range(0, limit1, binwidth1)) + list(range(limit1, limit2, binwidth2))

    plt.subplots(1, 1)
    plt.hist(data, bins=bins)
    plt.savefig('my_plot_04.png')
    plt.close()


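To make the bins appear equally wide, you have to actually make them equally wide. This means rescaling your data so that every value falls into a bin of the same width (here, values at or above limit1 are compressed by a factor of binwidth2 / binwidth1), and then relabelling the x-axis ticks with the original bin edges.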

def plot_histogram_05():
    limit1, limit2 = 50, 550
    binwidth1, binwidth2 = 10, 50
    ratio = binwidth2 // binwidth1  # one wide bin covers this many narrow bins

    data = np.hstack((np.random.rand(1000) * limit1, np.random.rand(100) * limit2))

    orig_bins = list(range(0, limit1, binwidth1)) + list(range(limit1, limit2 + binwidth2, binwidth2))
    # Compress everything at or above limit1 so each wide bin becomes binwidth1 wide
    data = [(i - limit1) / ratio + limit1
            if i >= limit1 else i for i in data]
    # Equal-width edges covering the compressed range (0 to 150 for these values)
    upper = (limit2 - limit1) // ratio + limit1
    bins = list(range(0, upper + binwidth1, binwidth1))

    _, ax = plt.subplots(1, 1)
    plt.hist(data, bins=bins)

    # Label the equally spaced ticks with the original bin edges
    xlabels = [str(edge) for edge in orig_bins]
    N_labels = len(xlabels)
    print(xlabels)
    print(bins)
    plt.xlim([0, bins[-1]])
    plt.xticks(binwidth1 * np.arange(N_labels))
    ax.set_xticklabels(xlabels)

    plt.savefig('my_plot_05.png')
    plt.close()

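As a sanity check on the rescaling idea, here is a minimal sketch (not part of the original answer) that does the same compression with np.where and compares the per-bin counts before and after. The helper name remap_to_equal_width and the seeded generator are my own assumptions for illustration. The two sets of counts should agree, except in the unlikely case that a value lands within rounding error of a bin edge.

```python
import numpy as np


def remap_to_equal_width(values, limit1, binwidth1, binwidth2):
    """Compress values at or above limit1 so each wide bin spans binwidth1.

    Hypothetical helper, vectorized version of the list comprehension idea.
    """
    values = np.asarray(values, dtype=float)
    ratio = binwidth2 / binwidth1
    return np.where(values >= limit1, (values - limit1) / ratio + limit1, values)


limit1, limit2 = 50, 550
binwidth1, binwidth2 = 10, 50

# Seeded test data, shaped like the data in the answer (assumption: any data in [0, limit2) works)
rng = np.random.default_rng(0)
data = np.hstack((rng.random(1000) * limit1, rng.random(100) * limit2))

# Original unequal edges: 0, 10, ..., 40, 50, 100, ..., 550
orig_edges = np.concatenate((np.arange(0, limit1, binwidth1),
                             np.arange(limit1, limit2 + binwidth2, binwidth2)))
# Equal-width edges with the same number of bins: 0, 10, ..., 150
equal_edges = binwidth1 * np.arange(len(orig_edges))

counts_orig, _ = np.histogram(data, bins=orig_edges)
counts_equal, _ = np.histogram(
    remap_to_equal_width(data, limit1, binwidth1, binwidth2), bins=equal_edges)

# Compare the counts per bin before and after the remapping
print(np.array_equal(counts_orig, counts_equal))
```

If the counts match, the plot of the rescaled data shows exactly the histogram you asked for, only with a relabelled x-axis.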

physicalattraction answered Oct 18 '25 03:10


You can use bar instead, in which case there is no need to split the axis. Here is an example:

import matplotlib.pyplot as plt
import numpy as np

data = np.hstack((np.random.rand(1000) * 50, np.random.rand(100) * 500))
binwidth1, binwidth2 = 10, 50
# range objects cannot be concatenated with + in Python 3, so build lists first
bins = list(range(0, 50, binwidth1)) + list(range(50, 550, binwidth2))

fig, ax = plt.subplots(1, 1)

y, binEdges = np.histogram(data, bins=bins)
centers = 0.5 * (binEdges[1:] + binEdges[:-1])

# Draw the narrow and the wide bins as two bar groups, all with the same bar width
ax.bar(centers[:5], y[:5], width=.8 * binwidth1, align='center')
ax.bar(centers[5:], y[5:], width=.8 * binwidth1, align='center')
plt.show()

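A possible variation on the bar idea, sketched here under my own assumptions rather than taken from the answer: place the bars at evenly spaced slots instead of at the real bin centers, and write the bin ranges into the tick labels. The function name bar_histogram_equal_slots and its gap_after parameter are hypothetical; gap_after just shifts the later bars a little to leave some space between the two differently scaled groups.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def bar_histogram_equal_slots(data, edges, ax, gap_after=None):
    """Draw one equally wide bar per bin, ignoring the real bin widths.

    Hypothetical helper: gap_after is the index of the first bin of the
    second group; bars from that index on are shifted right by half a slot.
    """
    edges = np.asarray(edges, dtype=float)
    counts, _ = np.histogram(data, bins=edges)
    positions = np.arange(len(counts), dtype=float)
    if gap_after is not None:
        positions[gap_after:] += 0.5
    ax.bar(positions, counts, width=0.8, align='center')
    # Each tick is labelled with the range of the bin it stands for
    labels = [f"{lo:g}-{hi:g}" for lo, hi in zip(edges[:-1], edges[1:])]
    ax.set_xticks(positions)
    ax.set_xticklabels(labels, rotation=45, ha='right')
    return counts, positions


rng = np.random.default_rng(0)
data = np.hstack((rng.random(1000) * 50, rng.random(100) * 500))
edges = list(range(0, 50, 10)) + list(range(50, 550, 50))

fig, ax = plt.subplots()
counts, positions = bar_histogram_equal_slots(data, edges, ax, gap_after=5)
fig.tight_layout()
```

Call fig.savefig(...) afterwards, or drop the Agg line and use plt.show(), to look at the result. The x-axis is then purely categorical, so the distortion you mentioned is visible only through the tick labels.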

In case you really want to split the axis, have a look here.
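For reference, one common way to split an axis in matplotlib is to draw two subplots side by side that share the y-axis. The following is only a rough sketch of that approach under my own assumptions, not necessarily what the linked example does; the function name split_axis_histogram is hypothetical. The panel widths are set proportional to the number of bins in each part, so every bin ends up roughly equally wide on screen.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def split_axis_histogram(data, limit1, limit2, binwidth1, binwidth2):
    """Plot the narrow and the wide bins in two panels sharing the y-axis."""
    edges_low = np.arange(0, limit1 + binwidth1, binwidth1)
    edges_high = np.arange(limit1, limit2 + binwidth2, binwidth2)
    n_low, n_high = len(edges_low) - 1, len(edges_high) - 1

    # Panel widths proportional to the bin counts, with a small gap in between
    fig, (ax_low, ax_high) = plt.subplots(
        1, 2, sharey=True,
        gridspec_kw={'width_ratios': [n_low, n_high], 'wspace': 0.05})

    # Note: a value exactly equal to limit1 would be counted in both panels,
    # because the last bin of a histogram is closed on the right
    ax_low.hist(data, bins=edges_low)
    ax_high.hist(data, bins=edges_high)
    ax_low.set_xlim(edges_low[0], edges_low[-1])
    ax_high.set_xlim(edges_high[0], edges_high[-1])

    # Hide the spines facing the gap so the two panels read as one broken axis
    ax_low.spines['right'].set_visible(False)
    ax_high.spines['left'].set_visible(False)
    ax_high.tick_params(axis='y', left=False)
    return fig, ax_low, ax_high


rng = np.random.default_rng(0)
data = np.hstack((rng.random(1000) * 50, rng.random(100) * 500))
fig, ax_low, ax_high = split_axis_histogram(data, 50, 550, 10, 50)
```

This keeps the real data values on both axes, at the cost of a bit more layout code than the bar approach.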

imsc answered Oct 18 '25 02:10


