
Install JAR files from DBFS and Maven packages using an init script

I have a few JAR files/packages in DBFS, and I want an init script (so that I can attach it to the automated cluster) that installs the JAR packages every time the cluster starts.

I also want to install packages from Maven using an init script.

I can do all of this using the Databricks UI, but the requirement is to install the libraries using an init script.

user1860447 asked Sep 12 '25 16:09


1 Answer

To install the JAR files, put them in some location on DBFS, and in the init script do:

cp /dbfs/<some-location>/*.jar /databricks/jars/
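As a rough sketch, the copy step could be wrapped in an init script like the one below. The function name install_jars and the variables JAR_SOURCE_DIR and JAR_TARGET_DIR are hypothetical names chosen for illustration; only the source placeholder and /databricks/jars/ come from the answer, and the DBFS location still has to be filled in.

```shell
#!/bin/bash
# Sketch of an init script that copies JARs from DBFS into /databricks/jars.
set -eu

# Copy every *.jar from a source directory into a target directory.
install_jars() {
  local src_dir="$1"
  local dest_dir="$2"
  local jar
  local found=0
  for jar in "$src_dir"/*.jar; do
    # If the glob matched nothing, the literal pattern comes back; skip it.
    [ -e "$jar" ] || continue
    cp "$jar" "$dest_dir"/
    found=$((found + 1))
  done
  echo "copied $found jar(s) from $src_dir to $dest_dir"
}

# Hypothetical variables: replace the placeholder with your own DBFS location.
JAR_SOURCE_DIR="${JAR_SOURCE_DIR:-/dbfs/<some-location>}"
JAR_TARGET_DIR="${JAR_TARGET_DIR:-/databricks/jars}"

# Only copy when both directories exist, i.e. on a cluster node.
if [ -d "$JAR_SOURCE_DIR" ] && [ -d "$JAR_TARGET_DIR" ]; then
  install_jars "$JAR_SOURCE_DIR" "$JAR_TARGET_DIR"
else
  echo "skipping: $JAR_SOURCE_DIR or $JAR_TARGET_DIR not found" >&2
fi
```

Unlike the bare cp, the loop does not fail when the directory contains no JARs, which may or may not be what you want for a cluster that must not start without its libraries.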

Installing Maven dependencies is trickier, because you also need to fetch the dependencies of each package. But it's doable from the init script:

  • Download and unpack Maven
  • Execute:
mvn dependency:get -Dartifact=<maven_coordinates>
  • Move the downloaded JARs:
find ~/.m2/repository/ -name '*.jar' -print0 | xargs -0 mv -t /databricks/jars/
  • (Optional) Remove the directory that is no longer needed:
rm -rf ~/.m2/
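Put together, the steps above might look like the following sketch. The function names and the MAVEN_VERSION, MAVEN_COORDINATES and JAR_TARGET_DIR variables are hypothetical, and the Maven version and the Apache archive download URL are assumptions to adjust for your environment. It also assumes that curl, tar and a Java runtime are available on the node. The move step uses find -exec instead of xargs, which has the same effect but does not fail when no JAR is found.

```shell
#!/bin/bash
# Sketch of an init script that fetches Maven artifacts into /databricks/jars.
set -eu

# Run `mvn dependency:get` for one set of coordinates (group:artifact:version).
fetch_artifact() {
  local mvn_bin="$1"
  local coordinates="$2"
  "$mvn_bin" dependency:get -Dartifact="$coordinates"
}

# Move every JAR found under the local Maven repository into the target directory.
collect_jars() {
  local repo_dir="$1"
  local dest_dir="$2"
  find "$repo_dir" -name '*.jar' -exec mv {} "$dest_dir"/ \;
}

# Hypothetical settings; the version and download URL are assumptions.
MAVEN_VERSION="${MAVEN_VERSION:-3.9.6}"
MAVEN_COORDINATES="${MAVEN_COORDINATES:-}"
JAR_TARGET_DIR="${JAR_TARGET_DIR:-/databricks/jars}"

if [ -n "$MAVEN_COORDINATES" ] && [ -d "$JAR_TARGET_DIR" ]; then
  work_dir="$(mktemp -d)"
  # Download and unpack Maven.
  curl -fsSL -o "$work_dir/maven.tar.gz" \
    "https://archive.apache.org/dist/maven/maven-3/${MAVEN_VERSION}/binaries/apache-maven-${MAVEN_VERSION}-bin.tar.gz"
  tar -xzf "$work_dir/maven.tar.gz" -C "$work_dir"
  # Fetch the artifact and its dependencies into ~/.m2/repository.
  fetch_artifact "$work_dir/apache-maven-${MAVEN_VERSION}/bin/mvn" "$MAVEN_COORDINATES"
  # Move the downloaded JARs, then clean up.
  collect_jars ~/.m2/repository "$JAR_TARGET_DIR"
  rm -rf ~/.m2 "$work_dir"
fi
```

Keep in mind that the local repository also holds the JARs Maven downloads for its own plugins, so more files than the requested artifact and its dependencies may end up in the target directory.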

P.S. That said, I really recommend automating this kind of thing with the Databricks Terraform Provider.

Alex Ott answered Sep 14 '25 10:09