Let's say I have a data frame like:
test = pandas.DataFrame([[0,1],[0,1],[0,2],[1,0],[1,0],[1,1],[1,2],[1,2]], columns=["A","B"])
So, for value 1 in the first column, the values are 0,1,2 in the second column, with different frequency.
Say I want to create a histogram of how many times I see 0, 1 and 2, so I do:
ax = test[test["A"]==1]["B"].hist(bins=3)

However, I get a picture that has three bins, the first one going roughly from 0 to 0.7, the second from 0.7 to 1.4, and the third one from 1.4 to 2, while I want each bin centered around 0, 1 and 2. I even tried using ax.set_xlim, but it did not work.
How do I make my histogram be centered around the values I am interested in (so one bin going from -0.5 to 0.5, one from 0.5 to 1.5 and one from 1.5 to 2.5 for example)?
P.S. I understand this answer has a workaround, I would like a solution that uses pandas.hist, if possible.
You can do this by passing a list or other sequence of bin edges as the bins argument.
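import numpy as np
import pandas as pd
test = pd.DataFrame([[0,1],[0,1],[0,2],[1,0],[1,0],[1,1],[1,2],[1,2]], columns=["A","B"])
test
# Column B for the rows where A == 1 (this is a Series, despite the name df)
df = test[test["A"]==1]["B"]
# Bin edges -0.5, 0.5, 1.5, 2.5 center the three bins on 0, 1 and 2
df.hist(bins = np.arange(4)-0.5)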
I use 4 for arange because it produces the bin edges [0, 1, 2, 3], which give three bins: one from 0 to 1, one from 1 to 2, and one from 2 to 3. Subtracting 0.5 from every edge shifts the bins to -0.5 to 0.5, 0.5 to 1.5, and 1.5 to 2.5, so each one is centered on 0, 1 or 2.
The result is a histogram with three bars of width 1, centered on 0, 1 and 2.
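If the values are not always 0, 1 and 2, the same idea can be generalized. The following is only a sketch: it assumes the data are integers, and centered_bins is a hypothetical helper name, not part of pandas or NumPy. It uses np.histogram to check the counts without drawing anything.

```python
import numpy as np
import pandas as pd


def centered_bins(values):
    """Bin edges that center one unit-width bin on each integer
    from min(values) to max(values).

    Hypothetical helper for illustration; assumes integer data.
    """
    lo, hi = int(np.min(values)), int(np.max(values))
    # arange excludes its stop value, and n bins need n + 1 edges,
    # hence hi + 2.
    return np.arange(lo, hi + 2) - 0.5


test = pd.DataFrame(
    [[0, 1], [0, 1], [0, 2], [1, 0], [1, 0], [1, 1], [1, 2], [1, 2]],
    columns=["A", "B"],
)
series = test.loc[test["A"] == 1, "B"]

edges = centered_bins(series)                  # -0.5, 0.5, 1.5, 2.5
counts, _ = np.histogram(series, bins=edges)   # 2, 1, 2 for values 0, 1, 2
```

The same edges can then be passed to the plotting call, as in series.hist(bins=edges), which should draw the bars centered on the integer values.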
