
Convert spark dataframe to Delta table on azure databricks - warning

I am saving my Spark DataFrame on Azure Databricks and creating a Delta Lake table from it.

It works fine, but I get the warning message below during execution.

Question: Why am I still getting this message even though my table is a Delta table? What is wrong with my approach? Any input is greatly appreciated.

Warning Message

This query contains a highly selective filter. To improve the performance of queries, convert the table to Delta and run the OPTIMIZE ZORDER BY command on the table

Code

dfMerged.write\
    .partitionBy("Date")\
    .mode("append")\
    .format("delta")\
    .option("overwriteSchema", "true")\
    .save("/mnt/path..")

spark.sql("CREATE TABLE DeltaUDTable USING DELTA LOCATION '/mnt/path..'")

Some more details

  1. I've mounted Azure storage Gen2 at the mount location above.
  2. Databricks Runtime 6.4 (includes Apache Spark 2.4.5, Scala 2.11)
Idleguys asked Oct 21 '25 09:10


2 Answers

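We can save the DataFrame as a Delta table directly using the code block below. On Databricks Runtime 6.4 the default table format is Parquet, not Delta, so the format is set explicitly:

df.write.format("delta").mode("overwrite").saveAsTable("table_loc")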
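
To confirm which format a table actually ended up in, one option is to read the `format` column that `DESCRIBE DETAIL` reports. The following is a minimal sketch, assuming an active SparkSession is passed in as `spark`; the helper name `table_format` is hypothetical and not part of any library.

```python
def table_format(spark, table_name):
    """Return the storage format reported by DESCRIBE DETAIL for a table.

    `spark` is assumed to be an active SparkSession (as provided in a
    Databricks notebook). `table_format` is a hypothetical helper name.
    """
    # DESCRIBE DETAIL returns a single-row DataFrame describing the table.
    detail = spark.sql("DESCRIBE DETAIL {}".format(table_name)).collect()[0]
    return detail["format"]


# Example usage in a notebook (not executed here):
# table_format(spark, "DeltaUDTable")
```

For a table created with `USING DELTA`, the reported format should be `delta`.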
Pavan elisetty answered Oct 24 '25 00:10


The warning message is misleading here, since you are already writing in Delta format and the table is a Delta table. You can ignore it.
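
If you would rather act on the second half of the suggestion than ignore it, the command the warning refers to can be issued through `spark.sql`. This is only a sketch: the helper `optimize_sql` and the column `CustomerId` are hypothetical, so substitute a column you actually filter on. Note that Z-ordering applies to data columns, not partition columns, so `Date` from the question would not be a valid choice.

```python
def optimize_sql(table_name, zorder_columns=None):
    """Build an OPTIMIZE statement, optionally with a ZORDER BY clause.

    `optimize_sql` is a hypothetical helper; it only assembles the SQL text.
    """
    if not zorder_columns:
        return "OPTIMIZE {}".format(table_name)
    return "OPTIMIZE {} ZORDER BY ({})".format(
        table_name, ", ".join(zorder_columns)
    )


# Example usage in a Databricks notebook (not executed here).
# "CustomerId" is an assumed, illustrative column name.
# spark.sql(optimize_sql("DeltaUDTable", ["CustomerId"]))
```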

thebluephantom answered Oct 24 '25 00:10


