I want to be able to save the dtypes of my df so that the next time I read the csv I can provide them to read_csv.
I tried the following:
types_dic = df.dtypes.to_dict()
np.save("dtypes.npy", types_dic, allow_pickle=True)
dtyp = np.load("dtypes.npy", allow_pickle=True)
df2 = pd.read_csv(join(folder_no_extension, file), dtype=dtyp)
But it does not work: the datetime columns are not restored.
It also does not work if I create the dictionary explicitly:
types_dic = {}
# Map each column name to its dtype, stored as a string
for col, dtype in df.dtypes.items():
    types_dic[col] = str(dtype)
df.dtypes
BN object
School_Year datetime64[ns]
Start_Date datetime64[ns]
Overall_Rating object
Indicator_1.1 object
Indicator_1.2 object
Indicator_1.3 object
Indicator_1.4 object
and
df2.dtypes
BN object
School_Year object
Start_Date object
Overall_Rating object
Indicator_1.1 object
Indicator_1.2 object
Indicator_1.3 object
Indicator_1.4 object
First of all, if you don't have to save your results as a csv file, you can instead use pandas methods like to_pickle or to_parquet, which will preserve the column data types.
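As a minimal sketch of that first option, the round trip below writes a small frame with to_pickle and reads it back with read_pickle. The sample frame and the file name frame.pkl are made up for illustration; to_parquet works along the same lines but needs a parquet engine such as pyarrow installed.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample frame with one text column and one datetime column
df = pd.DataFrame({
    "BN": ["A1", "B2"],
    "Start_Date": pd.to_datetime(["2019-09-01", "2020-09-01"]),
})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "frame.pkl")  # illustrative file name
    df.to_pickle(path)                     # dtypes are stored with the data
    restored = pd.read_pickle(path)

# The datetime column comes back as a datetime column, not as object
print(restored.dtypes)
```

Pickle files should only be loaded from sources you trust, which is one reason to prefer parquet when the file is shared.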
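Secondly, if you do want to save your results in csv format and preserve their data types, you can use the parse_dates argument of read_csv: a datetime dtype cannot be passed through dtype, so the date columns have to be listed separately. To do that, you could update your code to: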
import numpy as np
import pandas as pd

# Save non-date dtypes (np.save pickles the dict inside a 0-d object array)
non_date_dict = df.dtypes[df.dtypes != '<M8[ns]'].to_dict()
np.save("non_date_dict.npy", non_date_dict, allow_pickle=True)
# .item() unwraps the 0-d array back into a plain dict
non_date_dict2 = np.load("non_date_dict.npy", allow_pickle=True).item()
# Save the names of the date columns
date_col_list = list(df.dtypes[df.dtypes == '<M8[ns]'].index)
np.save("date_col_list.npy", date_col_list, allow_pickle=True)
date_col_list2 = np.load("date_col_list.npy", allow_pickle=True)
# Load: non-date columns via dtype, date columns via parse_dates
df2 = pd.read_csv('pandas_dtypes.csv',
                  dtype=non_date_dict2,
                  parse_dates=list(date_col_list2))
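A variation on the same idea, not part of the answer above, is to keep the schema as JSON text instead of pickled .npy files, which avoids allow_pickle and leaves the schema human-readable. The sketch below is self-contained and uses in-memory strings in place of files; the sample frame and the schema keys "dtype" and "parse_dates" are assumptions chosen for illustration.

```python
import io
import json

import pandas as pd

# Hypothetical sample frame
df = pd.DataFrame({
    "BN": ["A1", "B2"],
    "Score": [1.5, 2.0],
    "Start_Date": pd.to_datetime(["2019-09-01", "2020-09-01"]),
})

# Split the columns into date columns and everything else
date_cols = [col for col, dt in df.dtypes.items()
             if pd.api.types.is_datetime64_any_dtype(dt)]
schema = {
    "dtype": {col: str(dt) for col, dt in df.dtypes.items()
              if col not in date_cols},
    "parse_dates": date_cols,
}

# In practice these two strings would be written to a .json and a .csv file
schema_text = json.dumps(schema)
csv_text = df.to_csv(index=False)

# Reload the schema and hand each part to the matching read_csv argument
loaded = json.loads(schema_text)
df2 = pd.read_csv(io.StringIO(csv_text),
                  dtype=loaded["dtype"],
                  parse_dates=loaded["parse_dates"])
print(df2.dtypes)
```

Storing dtype names as strings works for the built-in dtypes used here; extension or categorical dtypes may need extra handling.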