Use NaN for values that can't be cast using astype

I have a very large Pandas DataFrame that looks like this:

>>> d = pd.DataFrame({"a": ["1", "U", "3.4"]})
>>> d
     a
0    1
1    U
2  3.4

Currently the column is set as an object:

>>> d.dtypes
a    object
dtype: object

I'd like to convert this column to float so that I can use groupby() and compute the mean. When I try astype, I get an error, as expected, because one of the strings can't be cast to float:

>>> d.a.astype(float)
ValueError: could not convert string to float: 'U'

What I'd like to do is cast all the elements to float and replace the ones that can't be cast with NaN.

How can I do this?

I tried setting raise_on_error=False, but it doesn't work: no error is raised, yet the dtype is still object.

>>> d.a.astype(float, raise_on_error=False)
0      1
1      U
2    3.4
Name: a, dtype: object
user1496984 asked Nov 02 '25 14:11

1 Answer

Use pd.to_numeric with errors='coerce' so that strings that can't be parsed as numbers become NaN:

>>> pd.to_numeric(d['a'], errors='coerce')
0    1.0
1    NaN
2    3.4
Name: a, dtype: float64
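
To tie this back to the groupby() and mean use case from the question, here is a minimal sketch. The "key" grouping column and the extra "2.6" row are made up for illustration, since the question doesn't show what the data is grouped by. It also shows one possible way to list the values that were coerced, in case you want to inspect them before discarding them.

```python
import pandas as pd

# Hypothetical data: "key" is an assumed grouping column, not from the question.
df = pd.DataFrame({
    "key": ["x", "x", "y", "y"],
    "a": ["1", "U", "3.4", "2.6"],
})

# Strings that can't be parsed become NaN; the result has dtype float64.
converted = pd.to_numeric(df["a"], errors="coerce")

# Values that were present before the conversion but are NaN afterwards
# are the ones that failed to parse.
failed = df.loc[converted.isna() & df["a"].notna(), "a"].tolist()

df["a"] = converted

# mean() skips NaN by default, so the coerced value does not affect
# the average of its group.
means = df.groupby("key")["a"].mean()

print(failed)
print(means)
```

Because mean() ignores NaN by default, the "U" row simply drops out of its group's average rather than turning the result into NaN.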
Alex Riley answered Nov 04 '25 03:11