I have a few JAR files/packages in DBFS, and I want an init script (so that I can attach it to the automated cluster) that installs the JAR packages every time the cluster starts.
I also want to install Maven packages from Maven using an init script.
I can do all of this using the Databricks UI, but the requirement is to install the libraries using an init script.
To install JAR files, put them on DBFS in some location and copy them from the init script:
cp /dbfs/<some-location>/*.jar /databricks/jars/
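As a slightly more defensive variant of the copy step above, here is a minimal sketch of a complete init script. The function name `copy_jars` and the variables `JAR_SRC` and `JAR_DEST` are made up for illustration, and `<some-location>` is still a placeholder you need to replace with your own DBFS path.

```shell
#!/bin/bash
# Sketch of an init script that copies JARs from DBFS into /databricks/jars/.
# JAR_SRC, JAR_DEST and copy_jars are hypothetical names, not Databricks settings.
JAR_SRC="${JAR_SRC:-/dbfs/<some-location>}"
JAR_DEST="${JAR_DEST:-/databricks/jars}"

# Copy every *.jar file from the source directory into the destination directory.
# Returns non-zero if a copy fails or if no JAR file was found.
copy_jars() {
  local src="$1"
  local dest="$2"
  local found=0
  local jar
  for jar in "$src"/*.jar; do
    # When nothing matches, the pattern stays literal, so skip non-existent paths.
    [ -e "$jar" ] || continue
    cp "$jar" "$dest"/ || return 1
    found=1
  done
  if [ "$found" -eq 0 ]; then
    echo "No JAR files found in $src" >&2
    return 1
  fi
}

# Run the copy only when the source directory actually exists.
if [ -d "$JAR_SRC" ]; then
  copy_jars "$JAR_SRC" "$JAR_DEST"
fi
```

Compared with the one-line `cp`, this reports an empty source directory instead of failing on an unmatched `*.jar` pattern. Whether an empty directory should abort the cluster start is a design choice; adjust the return codes to taste.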
Installing Maven dependencies is trickier, because you also need to fetch their transitive dependencies. It is doable from the init script, though:
mvn dependency:get -Dartifact=<maven_coordinates>
find ~/.m2/repository/ -name \*.jar -print0|xargs -0 mv -t /databricks/jars/
rm -rf ~/.m2/
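The three commands above can be wrapped into a reusable function. The sketch below makes one deliberate change: instead of downloading into `~/.m2/` and deleting it afterwards, it points Maven at a temporary local repository via `-Dmaven.repo.local`, so an existing `~/.m2/` is left alone. It assumes `mvn` is available on the cluster node. The names `collect_jars`, `install_maven_artifact`, `MAVEN_COORDS` and `JAR_DEST` are hypothetical.

```shell
#!/bin/bash
# Sketch: fetch a Maven artifact with its dependencies and move the JARs
# into /databricks/jars/. All function and variable names are hypothetical.
JAR_DEST="${JAR_DEST:-/databricks/jars}"

# Move every JAR found under a local Maven repository into the destination.
collect_jars() {
  local repo="$1"
  local dest="$2"
  find "$repo" -name '*.jar' -exec mv {} "$dest"/ \;
}

# Download <group>:<artifact>:<version> into a throwaway repository,
# move the resulting JARs, then remove the repository again.
install_maven_artifact() {
  local coords="$1"
  local dest="$2"
  local tmp_repo
  local status
  command -v mvn >/dev/null 2>&1 || { echo "mvn not found on PATH" >&2; return 1; }
  tmp_repo=$(mktemp -d) || return 1
  if mvn dependency:get -Dartifact="$coords" -Dmaven.repo.local="$tmp_repo"; then
    collect_jars "$tmp_repo" "$dest"
    status=$?
  else
    status=1
  fi
  rm -rf "$tmp_repo"
  return "$status"
}

# Runs only when MAVEN_COORDS is set, e.g. as a cluster environment variable.
if [ -n "${MAVEN_COORDS:-}" ]; then
  install_maven_artifact "$MAVEN_COORDS" "$JAR_DEST"
fi
```

As with the original commands, everything in the local repository is moved, which includes the JARs of the Maven plugins used for the download itself.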
P.S. That said, I really recommend automating this kind of thing with the Databricks Terraform Provider.