
Is there a nullable boolean type I can use in a Pandas dataframe?

Tags:

pandas

In a program I am working on, I have to explicitly set the type of a column that contains boolean data. Sometimes all of the values in this column are None. Unless I provide explicit type information, pandas infers the wrong type for that column.

Is there a pandas-compatible type that represents a nullable-bool? I want to do something like this, but preserve the Nones:

import pandas

s = pandas.Series([True, False, None]).astype(bool)
print([v for v in s])

gives:

[True, False, False]

Python's built-in bool class cannot hold a null value; it can only be True or False. In this case, because bool(None) == False, the final None is lost.

But what if I want to preserve my nulls? Is there a type I can give the column which allows for True, False and None?

I have solved a similar issue with numeric columns. For these I can use Int64 (capital "I"), which is the pandas nullable integer extension type:

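import numpy
import pandas

# numpy.nan is used here because the numpy.NaN alias was removed in NumPy 2.0
s = pandas.Series([1, 2, None, numpy.nan]).astype("Int64")
print([v for v in s])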

gives:

[1, 2, <NA>, <NA>]

That is exactly the right behaviour for nullable integers. I just need an equivalent type I can use for my nullable bools.

Salim Fadhley asked Nov 06 '25 17:11


1 Answer

The "boolean" dtype (the pandas nullable boolean extension type) should work:

>>> import pandas as pd
>>> pd.Series([True, False, None])
0     True
1    False
2     None
dtype: object

>>> pd.Series([True, False, None]).astype("boolean")
0     True
1    False
2     <NA>
dtype: boolean
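
For the case in the question where every value in the column is None, the same dtype can probably be passed when the Series is built, so pandas never has to infer a type. The following is a minimal sketch rather than a definitive recipe; the variable names and the "flag" column name are made up for illustration.

```python
import pandas as pd

# A column whose values are all missing. With no dtype given, pandas has
# nothing to infer a boolean type from (it typically falls back to object).
inferred = pd.Series([None, None, None])

# Passing dtype="boolean" at construction keeps it a nullable boolean column.
explicit = pd.Series([None, None, None], dtype="boolean")

# The same dtype can be applied to an existing DataFrame column.
# "flag" is a hypothetical column name.
df = pd.DataFrame({"flag": [True, None, False]})
df["flag"] = df["flag"].astype("boolean")

print(inferred.dtype)
print(explicit.dtype)
print(df["flag"].isna().tolist())
```

Note that missing entries come back as pd.NA rather than None, so checks such as `value is None` should become `pd.isna(value)`.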
Corralien answered Nov 09 '25 08:11


