
Pandas dataframe as input for matplotlib.pyplot.boxplot

I have a pandas dataframe which looks like this:

[('1975801_m', 1      0.203244
10    -0.159756
16    -0.172756
19    -0.089756
20    -0.033756
23    -0.011756
24     0.177244
32     0.138244
35    -0.104756
36     0.157244
40     0.108244
41     0.032244
42     0.063244
45     0.362244
59    -0.093756
62    -0.070756
65    -0.030756
66    -0.100756
73    -0.140756
77    -0.110756
81    -0.100756
84    -0.090756
86    -0.180756
87     0.119244
88     0.709244
102   -0.030756
105   -0.000756
107   -0.010756
109    0.039244
111    0.059244
Name: RTdiff), ('3878418_m', 1637    0.13811
1638   -0.21489
1644   -0.15989
1657   -0.11189
1662   -0.03289
1666   -0.09489
1669    0.03411
1675   -0.00489
1676    0.03511
1677    0.39711
1678   -0.02289
1679   -0.05489
1681   -0.01989
1691    0.14411
1697   -0.10589
1699    0.09411
1705    0.01411
1711   -0.12589
1713    0.04411
1715    0.04411
1716    0.01411
1731    0.06411
1738   -0.25589
1741   -0.21589
1745    0.39411
1746   -0.13589
1747   -0.10589
1748    0.08411
Name: RTdiff)

I would like to use it as input for the matplotlib.pyplot.boxplot function.

The error I get from matplotlib.pyplot.boxplot(mydataframe) is ValueError: cannot set an array element with a sequence.

I tried to use list(mydataframe) instead of mydataframe. That fails with the same error.

I also tried matplotlib.pyplot.boxplot(np.fromiter(mydataframe, np.float)) - that fails with ValueError: setting an array element with a sequence.

TheChymera asked Oct 31 '25 13:10



1 Answer

It's not clear that your data are in a DataFrame. It appears to be a list of Series objects.
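If the data really is a list of (name, Series) pairs, as the printout in the question suggests, one way to get a DataFrame is sketched below. The `pairs` list is a small hypothetical stand-in built from the first few values shown in the question, and resetting the index is an assumption that the original row labels do not matter for a boxplot.

```python
import pandas as pd

# Hypothetical stand-in for the list in the question: (name, Series) pairs
pairs = [
    ("1975801_m", pd.Series([0.203244, -0.159756, -0.172756],
                            index=[1, 10, 16], name="RTdiff")),
    ("3878418_m", pd.Series([0.13811, -0.21489],
                            index=[1637, 1638], name="RTdiff")),
]

# Drop the original row labels so every column starts at row 0,
# then place the Series side by side; shorter columns are padded with NaN
df = pd.concat(
    {name: s.reset_index(drop=True) for name, s in pairs},
    axis=1,
)

print(df)
```

Each name becomes a column, so the result can be handed to df.boxplot. As far as I know, pandas drops the NaN padding per column before drawing, but that is worth checking on your version.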

Once it's really in a DataFrame, the trick here is to create your figure and axes ahead of time and pass DataFrame.boxplot the keyword arguments that you would normally use with matplotlib.axes.Axes.boxplot. You also need to make sure that your data is a DataFrame and not a Series.

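import numpy as np
import matplotlib.pyplot as plt
import pandas

# Create the figure and axes first so the boxplot can be drawn on them
fig, ax = plt.subplots()
df = pandas.DataFrame(np.random.normal(size=(37,5)), columns=list('ABCDE'))
# Extra keyword arguments are passed through to matplotlib's boxplot
df.boxplot(ax=ax, positions=[2,3,4,6,8], notch=True, bootstrap=5000)
# Relabel the x axis with the positions 0-9 instead of the column names
ax.set_xticks(range(10))
ax.set_xticklabels(range(10))
plt.show()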

Which gives me: [image: boxplots]

Failing that, you can take a similar approach, looping through the columns you would like to plot using your ax object directly.

import numpy as np
import matplotlib.pyplot as plt
import pandas

df = pandas.DataFrame(np.random.normal(size=(37,5)), columns=list('ABCDE'))
fig, ax = plt.subplots()
# Draw one box per column, placing it at x = n+1
for n, col in enumerate(df.columns):
    ax.boxplot(df[col], positions=[n+1], notch=True)

# Relabel the x axis with the positions 0-9
ax.set_xticks(range(10))
ax.set_xticklabels(range(10))
plt.show()

Which gives: [image: more boxplots]
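The same ax-based approach should also work without building a DataFrame at all, since ax.boxplot accepts a list of arrays of different lengths. The sketch below assumes the data is a list of (name, Series) pairs like the one in the question; `pairs` is a hypothetical stand-in with only a few values, and the Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for an on-screen window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the list in the question: (name, Series) pairs
pairs = [
    ("1975801_m", pd.Series([0.203244, -0.159756, -0.172756, -0.089756],
                            name="RTdiff")),
    ("3878418_m", pd.Series([0.13811, -0.21489, -0.15989], name="RTdiff")),
]

names = [name for name, _ in pairs]
# One plain NumPy array per group; the lengths may differ
data = [s.dropna().to_numpy() for _, s in pairs]

fig, ax = plt.subplots()
positions = list(range(1, len(data) + 1))
result = ax.boxplot(data, positions=positions)

# Label each box with the name from its pair
ax.set_xticks(positions)
ax.set_xticklabels(names)
fig.savefig("boxplots.png")
```

Passing a list of arrays avoids the "setting an array element with a sequence" error, because matplotlib no longer has to coerce the tuples into a single numeric array.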

Paul H answered Nov 03 '25 05:11



