Example of data that I want to replace

Data has the following attributes
Here is what I did:
#Buying price generalization
df["Buying_Price"]=df["Buying_Price"].replace({"vhigh":4})
df["Buying_Price"]=df["Buying_Price"].replace({"high":3})
df["Buying_Price"]=df["Buying_Price"].replace({"med":2})
df["Buying_Price"]=df["Buying_Price"].replace({"low":1})
#Maintenance price generalization
df["Maintanance_price"]=df["Maintanance_price"].replace({"vhigh":4})
df["Maintanance_price"]=df["Maintanance_price"].replace({"high":3})
df["Maintanance_price"]=df["Maintanance_price"].replace({"med":2})
df["Maintanance_price"]=df["Maintanance_price"].replace({"low":1})
#lug_boot generalization
df["Lug_boot"]=df["Lug_boot"].replace({"small":1})
df["Lug_boot"]=df["Lug_boot"].replace({"med":2})
df["Lug_boot"]=df["Lug_boot"].replace({"big":3})
#Safety Generalization
df["Safety"]=df["Safety"].replace({"low":1})
df["Safety"]=df["Safety"].replace({"med":2})
df["Safety"]=df["Safety"].replace({"high":3})
print(df.head())
While printing, it showed the following error:
Cannot compare types 'ndarray(dtype=int64)' and 'str'
The column you are calling replace() on no longer contains the strings you are trying to replace.
It contains only int64 values (reported here as 'ndarray(dtype=int64)').
See the documentation for pandas.DataFrame.replace().
replace() searches the column for the str keys you passed and compares them with the values stored there.
df["Buying_Price"]=df["Buying_Price"].replace({"vhigh":4})
This looks for every "vhigh" value by comparing it with what the column currently contains, then replaces each match with 4.
The comparison step fails because it tries to compare str data with int64 data ('ndarray(dtype=int64)').
A brief example that reproduces this:
import pandas as pd
import numpy as np
a = np.array([1])
df = pd.DataFrame({"Maintanance_price": a})
df["Maintanance_price"] = df["Maintanance_price"].replace({"a":1})
print(df)
Out:
TypeError: Cannot compare types 'ndarray(dtype=int64)' and 'str'
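To confirm that this is the cause, it can help to inspect the dtype of each column before calling replace(). The following is only a minimal sketch: the sample DataFrame and the helper name already_numeric are made up for illustration and are not part of the original question.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical stand-in for the real data: one column still holds the
# original labels, the other has already been converted to integers.
df = pd.DataFrame({
    "Buying_Price": ["vhigh", "high", "med", "low"],
    "Safety": [1, 2, 3, 1],
})

def already_numeric(frame, column):
    """Return True if the column holds numbers rather than string labels."""
    return bool(is_numeric_dtype(frame[column]))

# Print the dtype of every column so a converted column is easy to spot.
for col in df.columns:
    print(col, df[col].dtype, "already numeric:", already_numeric(df, col))
```

A column that reports as numeric has probably been converted already, so passing string keys to replace() on it is what triggers the comparison error on affected pandas versions.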
I was facing the same issue and what worked for me was converting the datatype of the feature to an object type.
train['Some_feature']=train.Some_feature.astype(object)
Hope it helps.
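As a rough sketch of that workaround, assuming a column that already holds integers (the sample values and the "low" key are made up for illustration), the cast would come right before the replace() call:

```python
import numpy as np
import pandas as pd

# Hypothetical column that has already been converted to integers.
train = pd.DataFrame({"Some_feature": np.array([1, 2, 3])})

# Cast to object first, as suggested above, then call replace() with a
# string key. No label matches, so the stored values stay as they were.
train["Some_feature"] = train["Some_feature"].astype(object)
train["Some_feature"] = train["Some_feature"].replace({"low": 1})

print(train["Some_feature"].tolist())
```

Note that this only avoids the error. If the column was already numeric, nothing is actually replaced.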
You could try the following code:
df['Maintanance_price'] = df['Maintanance_price'].replace(to_replace=['low', 'med', 'high', 'vhigh'], value=[1, 2, 3, 4])
df.head()
Also, as suggested by @ouiemboughrra, check if the values have already been converted to numeric, in case you have rerun the cell.
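One way to make the cell safe to rerun is to skip the conversion when the column is already numeric. This is just a sketch under a few assumptions: the helper name encode_once and the sample rows are hypothetical, and it uses Series.map instead of replace(), which is a substitution on my part rather than what the answers above used.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Ordinal codes taken from the question; the sample rows are made up.
PRICE_LEVELS = {"low": 1, "med": 2, "high": 3, "vhigh": 4}

df = pd.DataFrame({"Maintanance_price": ["low", "vhigh", "med", "high"]})

def encode_once(frame, column, mapping):
    """Map labels to integers unless the column is already numeric."""
    if is_numeric_dtype(frame[column]):
        return frame  # already converted, e.g. the cell was run before
    frame[column] = frame[column].map(mapping)
    return frame

df = encode_once(df, "Maintanance_price", PRICE_LEVELS)
df = encode_once(df, "Maintanance_price", PRICE_LEVELS)  # rerun does nothing
print(df)
```

Keep in mind that map() turns any label missing from the dictionary into NaN, so it is worth checking for missing values afterwards.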