I am trying to save a data frame as a Parquet file on Databricks, but I am getting an ArrowTypeError.
Databricks Runtime Version: 7.6 ML (includes Apache Spark 3.0.1, Scala 2.12)
ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column inv_yr with type int32')
The issue comes from using an old pyarrow wheel together with the numpy 1.20 release. You are running into the bug "PyArray_DescrCheck doesn't work anymore if the consumer library was compiled with an older NumPy version". Either update your pyarrow version or downgrade to numpy<1.20.
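To confirm that this mismatch applies to your cluster, you can check the installed versions before changing anything. The snippet below is only a rough sketch: the helper names (`installed_version`, `major_minor`, `numpy_needs_attention`) are made up for illustration, and the 1.20 threshold is taken from the explanation above rather than from any official compatibility table.

```python
import importlib
import re


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def major_minor(version):
    """Extract the leading 'major.minor' of a version string as a tuple of ints."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: " + repr(version))
    return int(match.group(1)), int(match.group(2))


def numpy_needs_attention(numpy_version):
    """True for NumPy 1.20 or newer, the range described above as breaking old pyarrow wheels."""
    return major_minor(numpy_version) >= (1, 20)


# Print what is actually installed in the current environment.
for name in ("numpy", "pyarrow"):
    print(name, installed_version(name))

np_version = installed_version("numpy")
if np_version is not None and numpy_needs_attention(np_version):
    print("NumPy >= 1.20 detected: upgrade pyarrow or pin numpy<1.20")
```

If the check flags your environment, running `pip install --upgrade pyarrow` or `pip install "numpy<1.20"` on the cluster should apply one of the two fixes. How you run pip depends on your Databricks setup, so treat those commands as a starting point.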