 

Boxplots placed on a time axis

I want to place a series of (matplotlib) boxplots on a time axis. Each boxplot summarizes a series of measurements taken on a different day over the course of a year. The dates are not evenly spaced, and I am interested in the variation over time.


Easy version

I have a pandas DataFrame with an index and one series of numbers per index value, more or less like this (notice the index values):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(12345)
data = np.array( [ np.random.normal( i, 1, 10 ) for i in range(3) ] )
ii = np.array([ 3, 5, 8 ] )
df = pd.DataFrame( data=data, index=ii )

For each index value I need to draw a boxplot, which is no problem:

plt.boxplot( [ df.loc[i] for i in df.index ], vert=True, positions=ii )


Time version

The problem is that I need to place the boxes on a time axis, i.e. each box at a specific date:

np.random.seed(12345)
data = np.array( [ np.random.normal( i, 1, 10 ) for i in range(3) ] )
dates = pd.to_datetime( [ '2015-06-01', '2015-06-15', '2015-08-30' ] )
df = pd.DataFrame( data=data, index=dates )
plt.boxplot( [ df.loc[i] for i in df.index ], vert=True )


However, if I pass the dates as positions:

ax.boxplot( [ df.loc[i] for i in df.index ], vert=True, positions=dates )

I get an error:

TypeError: Cannot compare type 'Timedelta' with type 'float'

A look at the docs shows:

plt.boxplot?

positions : array-like, default = [1, 2, ..., n]

Sets the positions of the boxes. The ticks and limits are automatically set to match the positions.


Desired time version

This code is intended to clarify and narrow down the problem. The boxes should appear where the blue points are placed in the figure below.

np.random.seed(12345)
data = np.array( [ np.random.normal( i, 1, 10 ) for i in range(3) ] )
dates = pd.to_datetime( [ '2015-06-01', '2015-06-15', '2015-08-30' ] )
df = pd.DataFrame( data=data, index=dates )

fig, ax = plt.subplots( figsize=(10,5) )
x1 = pd.to_datetime( '2015-05-01' )
x2 = pd.to_datetime( '2015-09-30' )
ax.set_xlim( [ x1, x2 ] )

# ax.boxplot( [ df.loc[i] for i in df.index ], vert=True ) # Does not throw error, but plots nothing (out of range)
# ax.boxplot( [ df.loc[i] for i in df.index ], vert=True, positions=dates ) # This is what I'd like (throws TypeError)

ax.plot( dates, [ df.loc[i].mean() for i in df.index ], 'o' )  # Added to clarify the positions I aim for

[Figure: time axis with blue points marking the intended box positions]


Is there a way to place boxplots on a time axis?


I am using:

python: 3.4.3 + numpy: 1.11.0 + pandas: 0.18.0 + matplotlib: 1.5.1

Luis asked Sep 12 '25 20:09


1 Answer

So far, my best solution is to convert the x-axis to a suitable integer unit and plot everything in that unit. In my case, the unit is days since the start of the axis.

np.random.seed(12345)
data = np.array( [ np.random.normal( i, 1, 10 ) for i in range(3) ] )
dates = pd.to_datetime( [ '2015-06-01', '2015-06-15', '2015-08-30' ] )
df = pd.DataFrame( data=data, index=dates )

fig, ax = plt.subplots( figsize=(10,5) )
x1 = pd.to_datetime( '2015-05-01' )
x2 = pd.to_datetime( '2015-09-30' )
pos = ( dates - x1 ).days

ax.boxplot( [ df.loc[i] for i in df.index ], vert=True, positions=pos )
ax.plot( pos, [ df.loc[i].mean() for i in df.index ], 'o' )

ax.set_xlim( [ 0, (x2-x1).days ] )
ax.set_xticklabels( dates.date, rotation=45 )

The boxplots are placed at their correct positions, but the code seems a bit cumbersome to me.

More importantly, the units of the x-axis are not "time" anymore: they are day offsets that have merely been relabeled with dates.
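One possible way around that limitation, offered only as a sketch: matplotlib stores dates internally as floating-point day numbers, so the dates can be converted with matplotlib.dates.date2num and passed as numeric positions, and a date-aware locator and formatter can then be attached to the axis. The box width of 5 days, the '%Y-%m-%d' label format and the Agg backend are my own arbitrary choices, and I have not checked this against the exact versions listed in the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(12345)
data = np.array([np.random.normal(i, 1, 10) for i in range(3)])
dates = pd.to_datetime(['2015-06-01', '2015-06-15', '2015-08-30'])
df = pd.DataFrame(data=data, index=dates)

# Convert the dates to matplotlib's float day numbers; these are plain
# numbers, so boxplot accepts them as positions.
pos = mdates.date2num(dates.to_pydatetime())

fig, ax = plt.subplots(figsize=(10, 5))
# widths is expressed in the same unit as the positions, i.e. days
ax.boxplot([df.loc[d].values for d in df.index], positions=pos, widths=5)
ax.plot(pos, df.mean(axis=1).values, 'o')  # row means, as in the question

# boxplot pins the ticks to the box positions; replace them with a
# date-aware locator and formatter so the axis reads as time again.
ax.xaxis.set_major_locator(mdates.AutoDateLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))

# Axis limits, converted to the same float day numbers
x1, x2 = mdates.date2num(
    pd.to_datetime(['2015-05-01', '2015-09-30']).to_pydatetime())
ax.set_xlim(x1, x2)
fig.autofmt_xdate(rotation=45)
```

With this approach the x-axis values are still floats under the hood, but they are the same floats matplotlib uses for any date axis, so ticks land on calendar dates and the spacing between the boxes reflects the real time gaps.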

Luis answered Sep 14 '25 10:09