I'm trying to connect from a Databricks notebook to an Azure SQL Datawarehouse using the pyodbc python library. When I execute the code I get this error:
Error: ('01000', "[01000] [unixODBC][Driver Manager]Can't open lib 'ODBC Driver 17 for SQL Server' : file not found (0) (SQLDriverConnect)")
I understand that I need to install this driver but I have no idea how to do it. I have a Databricks cluster runing with Runetime 6.4, Standard_DS3_v2.
By default, Azure Databricks does not have ODBC Driver installed.
Run the following commands in a single cell to install MS SQL ODBC Driver on Azure Databricks cluster.
%sh
curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
curl https://packages.microsoft.com/config/ubuntu/16.04/prod.list > /etc/apt/sources.list.d/mssql-release.list
sudo apt-get update
sudo ACCEPT_EULA=Y apt-get -q -y install msodbcsql17

Instead of using the ODBC driver why don't you use the spark driver of Azure Synapse (aka SQL Data warehouse), databricks clusters have this driver installed by default ( com.databricks.spark.sqldw" ) .
Documentation : https://docs.databricks.com/data/data-sources/azure/synapse-analytics.html#language-python
Example of use :
df = spark.read \
.format("com.databricks.spark.sqldw") \
.option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>") \
.option("tempDir", "wasbs://<your-container-name>@<your-storage-account-
name>.blob.core.windows.net/<your-directory-name>") \
.option("forwardSparkAzureStorageCredentials", "true") \
.option("dbTable", "my_table_in_dw") \
.load()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With