I am reading data from multiple CSV files, applying some filters, and merging them into a DataFrame. The original data in the CSV files is purely numeric (whole numbers and fractions). Pandas converts all of it to float. That's OK, but I need one column to remain as it is. To convert that column back to integer, I tried:
df['PRICE']=df['PRICE'].astype(int)
This works wonderfully for whole numbers. However, it also truncates every decimal value to a whole number. For example:
1162.50 --> 1162
What I am looking for is something like:
1152.0 --> 1152
1216.50 --> 1216.5
1226.65 --> 1226.65
Thanks in advance
You can re-initialise the dataframe using the pd.DataFrame constructor with dtype=object:
print(df)
Col1
0 1152.00
1 1216.50
2 1226.65
df = pd.DataFrame(df, dtype=object)
print(df)
Col1
0 1152
1 1216.5
2 1226.65
Or, if it's just one column you want to convert, you can use the pd.Series constructor the same way:
df.Col1 = pd.Series(df.Col1, dtype=object)
print(df)
Col1
0 1152
1 1216.5
2 1226.65
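To check what the conversion above actually does to a single column, here is a minimal self-contained sketch. The sample frame and its second column QTY are made up for illustration; only Col1 comes from the example above. It shows that just the converted column becomes object dtype, while the stored values are still floats, so arithmetic on them keeps working.

```python
import pandas as pd

# Hypothetical sample data; QTY is an invented extra column for illustration.
df = pd.DataFrame({"Col1": [1152.0, 1216.5, 1226.65], "QTY": [1.0, 2.0, 3.0]})

# Convert only Col1 to object dtype; QTY is left as float64.
df["Col1"] = pd.Series(df["Col1"], dtype=object)

print(df.dtypes)                  # per-column dtypes after the conversion
print(type(df["Col1"].iloc[0]))   # the element is still a float, not a string
```

How an object column is rendered by print(df) is a display detail, so if the exact text matters (for example in a file), converting to string explicitly is the safer route.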
Statutory Warning: Having mixed types in a dataframe kills all the optimisation and speedup benefits that pandas/numpy offers for pure numeric types.
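If the object column has to go back into heavy numeric work later, one option is to restore a numeric dtype first. This is a sketch under the assumption that the column still holds only numbers; pd.to_numeric is not part of the approach above, and the names obj_col and num_col are made up for illustration.

```python
import pandas as pd

# An object-dtype column holding plain floats, as produced by dtype=object.
obj_col = pd.Series([1152.0, 1216.5, 1226.65], dtype=object)

# Convert back to a native numeric dtype before doing vectorised maths.
num_col = pd.to_numeric(obj_col)

print(obj_col.dtype, num_col.dtype)
```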
The method above keeps the values numeric, but if you want to save to CSV, you must convert them to strings and strip the trailing ".0" yourself; otherwise they are written out as floats. This is how you'd do that (note the raw string, so that \. is passed to the regex engine unchanged):
out = df.astype(str).replace(r'\.0+$', '', regex=True)
print(out)
Col1
0 1152
1 1216.5
2 1226.65
out.to_csv('out.csv')
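Since the question is about a single column, the same string-and-strip idea can be limited to that column. The following is a rough sketch, not the only way to do it: the frame, the QTY column, and the use of an in-memory buffer instead of a real file are assumptions made for illustration, and index=False is a choice made here rather than something the snippet above does.

```python
import io
import pandas as pd

# Hypothetical frame: PRICE should lose a trailing ".0", QTY is left alone.
df = pd.DataFrame({"PRICE": [1152.0, 1216.5, 1226.65], "QTY": [1.0, 2.0, 3.0]})

out = df.copy()
# Stringify PRICE only, then strip a trailing ".0" with a raw-string regex.
out["PRICE"] = out["PRICE"].astype(str).str.replace(r"\.0+$", "", regex=True)

buf = io.StringIO()             # stands in for a file such as 'out.csv'
out.to_csv(buf, index=False)    # index=False drops the row labels from the file
csv_lines = buf.getvalue().splitlines()
print(csv_lines)
```

Keep the original df for calculations and treat out purely as the export copy, because PRICE in out holds strings.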