I have a DataFrame and I want to pass column names dynamically through widgets into a select statement in my Databricks notebook. How can I do it?
I am using the code below:
df1 = spark.sql("select * from tableraw")
where df1 has columns "tablename" and "layer"
df = df1.select("tablename", "layer")
Now, our requirement is to use the values of the widgets to select those columns, something like:
df = df1.select(dbutils.widgets.get("tablename"), dbutils.widgets.get("datalayer"))
%python
dbutils.widgets.text(name = "pythonTextWidget", defaultValue = "columnName")
dbutils.widgets.dropdown(name = "pythonDropdownWidget", defaultValue = "col1", choices = ["col1", "col2", "col3"])
%scala
dbutils.widgets.text("scalaTextWidget", "columnName")
dbutils.widgets.dropdown("scalaDropdownWidget", "col1", Seq("col1", "col2", "col3"))
%python
textColumn = dbutils.widgets.get("pythonTextWidget")
dropdownColumn = dbutils.widgets.get("pythonDropdownWidget")
%scala
val textColumn = dbutils.widgets.get("scalaTextWidget")
val dropdownColumn = dbutils.widgets.get("scalaDropdownWidget")
%python
from pyspark.sql.functions import col
df.select(col(textColumn), col(dropdownColumn))
%scala
import org.apache.spark.sql.functions.col
df.select(col(textColumn), col(dropdownColumn))
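A text widget accepts any string, so a mistyped column name only shows up as an error when the select runs. One way to catch it earlier is to compare the widget values against the DataFrame's column list first. The following is a minimal sketch, not part of Databricks or PySpark: `resolve_columns` is a hypothetical helper name, and in a notebook its inputs would be the widget values and `df.columns`.

```python
def resolve_columns(requested, available):
    """Return the requested column names, or raise if any is missing.

    Hypothetical helper: `requested` would hold the widget values and
    `available` the DataFrame's column list (df.columns).
    """
    # Strip stray whitespace that is easy to type into a text widget.
    cleaned = [name.strip() for name in requested]
    missing = [name for name in cleaned if name not in available]
    if missing:
        raise ValueError(
            f"Unknown column(s): {missing}; available: {list(available)}"
        )
    return cleaned


# Possible notebook usage (assumes the widgets and df from the cells above):
# cols = resolve_columns([textColumn, dropdownColumn], df.columns)
# df.select(*cols)
```

Note that this check is an exact, case-sensitive match, which is stricter than Spark's default case-insensitive column resolution.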
Widgets in SQL work slightly differently from those in Python/Scala: you cannot use them to select a column. However, they can be used to dynamically adjust filters.
%sql CREATE WIDGET TEXT sqlTextWidget DEFAULT "ACTIVE"
%sql CREATE WIDGET DROPDOWN sqlDropdownWidget DEFAULT "ACTIVE" CHOICES SELECT DISTINCT Status FROM <databaseName>.<tableName> WHERE Status IS NOT NULL
%sql SELECT * FROM <databaseName>.<tableName> WHERE Status = getArgument("sqlTextWidget")
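If the column list itself has to vary in a SQL query, one possible workaround is to build the query string in a Python cell and hand it to `spark.sql`. The sketch below is an assumption, not something from the answer above: `quote_identifier` and `build_select` are hypothetical helpers, and the table name is assumed to come from trusted code rather than from a widget.

```python
def quote_identifier(name):
    """Backtick-quote a Spark SQL identifier, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_select(table, columns):
    """Build a SELECT statement for a trusted table name and a list of columns.

    Hypothetical helper: only the column names are quoted; `table` is
    inserted as-is, so it must not come from user input.
    """
    column_list = ", ".join(quote_identifier(c) for c in columns)
    return f"SELECT {column_list} FROM {table}"


# Possible notebook usage (assumes textColumn and dropdownColumn from above):
# df = spark.sql(build_select("tableraw", [textColumn, dropdownColumn]))
```

Quoting the names keeps a widget value containing spaces or punctuation from being parsed as anything other than a single column name.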
More background can be found in the Databricks documentation on widgets.