In a program I am working on, I have to explicitly set the type of a column that contains boolean data. Sometimes all of the values in this column are None. Unless I provide explicit type information, pandas infers the wrong dtype for that column.
Is there a pandas-compatible type that represents a nullable-bool? I want to do something like this, but preserve the Nones:
import pandas
s = pandas.Series([True, False, None]).astype(bool)
print([v for v in s])
gives:
[True, False, False]
Python's built-in bool type has no null value; it can only be True or False. In this case, because bool(None) == False, the final None is converted to False and lost.
But what if I want to preserve my nulls? Is there a type I can give the column which allows for True, False and None?
I have solved a similar issue with numeric columns. For these I can use "Int64" (capital I), which is pandas' own nullable integer extension dtype rather than a NumPy type:
import numpy
import pandas
s = pandas.Series([1, 2, None, numpy.nan]).astype("Int64")
print([v for v in s])
gives:
[1, 2, <NA>, <NA>]
That is exactly the right behaviour for nullable integers. I just need an equivalent type I can use for my nullable bools.
The nullable "boolean" extension dtype (available since pandas 1.0) should work:
>>> import pandas as pd
>>> pd.Series([True, False, None])
0     True
1    False
2     None
dtype: object
>>> pd.Series([True, False, None]).astype("boolean")
0     True
1    False
2     <NA>
dtype: boolean
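For the case raised in the question, where every value in the column may be None, here is a minimal sketch of how the same dtype could be applied up front. It assumes pandas 1.0 or later; the column name "flag" is hypothetical and used only for illustration.

```python
import pandas as pd

# A column whose values are all missing: with no explicit dtype,
# pandas falls back to object and the boolean intent is lost.
inferred = pd.Series([None, None, None])
explicit = pd.Series([None, None, None], dtype="boolean")

print(inferred.dtype)  # object
print(explicit.dtype)  # boolean

# The same dtype string can be applied per column on a DataFrame.
# "flag" is a hypothetical column name.
df = pd.DataFrame({"flag": [True, None, False]})
df["flag"] = df["flag"].astype("boolean")

# Missing values stay missing, so they can be counted or filled later.
print(df["flag"].isna().sum())            # 1
print(df["flag"].fillna(False).tolist())  # [True, False, False]
```

Passing dtype="boolean" at construction time avoids relying on inference at all, which is probably the safer choice when a column can arrive entirely empty.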