pandas.DataFrame.astype(float) raises "ValueError: could not convert string to float".
What's the best way to find which cell(s) caused this to happen?
I think you can first call fillna with some number, e.g. 1, so that values which are already missing are not flagged later. Then apply to_numeric with the parameter errors='coerce' to every column; any value that cannot be converted is replaced by NaN. Next, check the result with isnull and any. Finally, use boolean indexing to find the columns and index labels that contain NaN values. Those are the places holding strings or other values that cannot be converted to numeric.
import pandas as pd
import numpy as np
df = pd.DataFrame({'A':['a','b','',5],
                   'B':[4,5,6,5],
                   'C':[np.nan,8,9,7]})
print (df)
   A  B    C
0  a  4  NaN
1  b  5  8.0
2     6  9.0
3  5  5  7.0
# fill existing NaN first, then coerce every column; unconvertible values become NaN
a = (df.fillna(1).apply(lambda x: pd.to_numeric(x, errors='coerce')))
print (a)
     A  B    C
0  NaN  4  1.0
1  NaN  5  8.0
2  NaN  6  9.0
3  5.0  5  7.0
b = (pd.isnull(a))
print (b)
       A      B      C
0   True  False  False
1   True  False  False
2   True  False  False
3  False  False  False
print (b.any())
A     True
B    False
C    False
dtype: bool
# columns that contain at least one value that could not be converted
print (b.any()[b.any()].index)
Index(['A'], dtype='object')
print (b.any(axis=1))
0     True
1     True
2     True
3    False
dtype: bool
# rows that contain at least one value that could not be converted
print (b.any(axis=1)[b.any(axis=1)].index)
Int64Index([0, 1, 2], dtype='int64')
# df is not modified
print (df)
   A  B    C
0  a  4  NaN
1  b  5  8.0
2     6  9.0
3  5  5  7.0
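The steps above give the affected columns and rows separately. Since the question asks for the individual cells, here is a minimal sketch that returns (row label, column label, value) triples instead. It is a variation, not the exact method above: rather than calling fillna(1), it compares the coerced result against the original missing-value mask, so values that were already NaN are not reported. The name find_bad_cells is hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', '', 5],
                   'B': [4, 5, 6, 5],
                   'C': [np.nan, 8, 9, 7]})


def find_bad_cells(frame):
    """Return (row label, column label, value) for cells to_numeric cannot convert.

    Hypothetical helper for illustration only.
    """
    # Coerce every column; values that cannot be converted become NaN.
    coerced = frame.apply(lambda col: pd.to_numeric(col, errors='coerce'))
    # A cell is a culprit if it is NaN after coercion but was not missing before.
    mask = coerced.isna() & frame.notna()
    # Positions of the True entries, in row-major order.
    rows, cols = np.nonzero(mask.to_numpy())
    return [(frame.index[r], frame.columns[c], frame.iat[r, c])
            for r, c in zip(rows, cols)]


bad_cells = find_bad_cells(df)
print(bad_cells)
```

For the sample frame this reports the three cells in column A at rows 0, 1 and 2 (the values 'a', 'b' and the empty string). The NaN in column C is skipped because it was already missing in the original data.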