
How to perform .describe() method on variables that have boolean data type in pandas

Tags: python, pandas

I am trying to get summary statistics for the columns of a data frame that have the Boolean data type.

When I run df.describe(), it only gives me summary statistics for the numerical (in this case float) columns. When I change it to df.describe(include=['O']), it only gives me the object columns.

In either case, the summary statistics for Boolean data types are not provided.

Any suggestion is highly appreciated.

Thanks

asked Oct 15 '25 17:10 by aghd


1 Answer

Not sure if this is what you want, but you can get them by passing the include="all" argument.

import pandas as pd

df = pd.DataFrame([[True, 1], [False, 2]])
df.describe(include="all")

           0         1
count      2  2.000000
unique     2       NaN
top     True       NaN
freq       1       NaN
mean     NaN  1.500000
std      NaN  0.707107
min      NaN  1.000000
25%      NaN  1.250000
50%      NaN  1.500000
75%      NaN  1.750000
max      NaN  2.000000

df.describe(include=[bool])  # will also work

           0
count      2
unique     2
top     True
freq       1
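
If you would rather have numeric-style statistics for the Boolean columns (for example, the share of True values), one possible approach is to select them with select_dtypes and cast them to int before calling describe(). This is only a sketch: the frame and its column names (price, label, in_stock) are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical example frame: one float, one object and one Boolean column.
df = pd.DataFrame(
    {
        "price": [1.5, 2.5, 3.0, 4.0],
        "label": ["a", "b", "a", "c"],
        "in_stock": [True, False, True, True],
    }
)

# Categorical-style summary of the Boolean columns: count, unique, top, freq.
bool_summary = df.describe(include=[bool])

# Numeric-style summary: cast True/False to 1/0 first, so that
# "mean" becomes the fraction of True values in each column.
bool_cols = df.select_dtypes(include="bool")
numeric_summary = bool_cols.astype(int).describe()

print(bool_summary)
print(numeric_summary)
```

Here the mean row of numeric_summary is the proportion of True values, which the count/unique/top/freq summary does not show directly.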

Reference

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.describe.html

answered Oct 18 '25 05:10 by Tai


