
Can't convert a DataFrame to Parquet because of a data type error

I'm trying to convert a pandas DataFrame to Parquet, but I get the error ("Expected bytes, got a 'int' object", 'Conversion failed for column xxxxxxxx with type object'). In the source Excel table this column contains both numbers and strings, so pandas reads it with dtype 'object', yet the conversion still fails. I've tried df['xxxxxxxx'].astype(str) and df['xxxxxxxx'].astype('data_type'), but neither works. I tried converting to Parquet with both AWS Wrangler and PyArrow.

wanuke asked Oct 21 '25 12:10


1 Answer

As mentioned in this other question, converting the column to a single general type could work. So try:

df['xxxxxxxx'] = df['xxxxxxxx'].astype(str)
df.to_parquet(path)
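
One possible reason the attempt in the question had no effect is that astype returns a new Series and leaves the DataFrame untouched unless the result is assigned back. A minimal sketch, assuming a small made-up column with the same placeholder name xxxxxxxx (the sample values are hypothetical):

```python
import pandas as pd

# Hypothetical data: an object column mixing ints and strings.
df = pd.DataFrame({"xxxxxxxx": [1, "abc", 2, "def"]})

# astype returns a new Series; without assignment the column is unchanged.
df["xxxxxxxx"].astype(str)
types_before = set(df["xxxxxxxx"].map(type))

# Assigning the result back replaces the column with string values.
df["xxxxxxxx"] = df["xxxxxxxx"].astype(str)
types_after = set(df["xxxxxxxx"].map(type))

print(types_before)
print(types_after)
```

After the assignment every value in the column is a str, which is the state the to_parquet call above relies on.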

However, this is not good practice, because it hides the underlying type problem. Consider fixing the column by separating the data so that each column holds a single type, or at least be aware that this column has different types. Pandas includes a warning for this kind of problem:

   Columns (# of column) have mixed types. Specify dtype option on import or set low_memory=False.
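
As a rough illustration of "separating data", the sketch below first counts the Python types found in the column and then splits it into a numeric column and a text column. The sample values and the new column names xxxxxxxx_num and xxxxxxxx_text are made up for the example. Note that pd.to_numeric also treats numeric-looking strings such as "12" as numbers, which may or may not be what you want.

```python
import pandas as pd

# Hypothetical data: an object column mixing numbers and strings.
df = pd.DataFrame({"xxxxxxxx": [1, "abc", 2.5, "def"]})

# Inspect which Python types are actually stored in the column.
type_counts = df["xxxxxxxx"].map(lambda v: type(v).__name__).value_counts()
print(type_counts)

# Values that cannot be parsed as numbers become NaN.
numeric = pd.to_numeric(df["xxxxxxxx"], errors="coerce")
is_number = numeric.notna()

# Keep numbers and text in separate, single-typed columns.
df["xxxxxxxx_num"] = numeric
df["xxxxxxxx_text"] = df["xxxxxxxx"].where(~is_number).astype("string")
df = df.drop(columns="xxxxxxxx")

print(df.dtypes)
```

With each column holding a single type, the frame should no longer trigger the mixed-type conversion error when written with df.to_parquet(path), and no information about the original types is lost.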
Alejandro Henao answered Oct 23 '25 00:10


