Installing Remote R Package to Databricks Cluster Rather than Notebook

I am trying to install the prophet package on Databricks. I want to install it directly on my cluster rather than in my notebook. The following code installs it in the notebook:

Sys.setenv(DOWNLOAD_STATIC_LIBV8 = 1)
remotes::install_github("jeroen/V8")
devtools::install_version("rstantools", version = "2.0.0")
install.packages('prophet')

However, I want to install it directly on my cluster. How would I adapt this snippet to install the prophet package on my Databricks cluster?

Here are the options I see when attempting to install a package to a cluster:

[image: cluster library installation options]

My attempt at installing directly on the cluster:

Command 1

%python
dbutils.fs.mkdirs("dbfs:/databricks/scripts/")

Command 2

%python
dbutils.fs.put("/databricks/scripts/prophet_install_script.R","""
Sys.setenv(DOWNLOAD_STATIC_LIBV8 = 1)
remotes::install_github(\"jeroen/V8\")
devtools::install_version(\"rstantools\", version = \"2.0.0\")
install.packages('prophet')
""", True)

Command 3

%python
dbutils.fs.put("/databricks/scripts/stock_cluster_init_script_v1.sh","""
#!/bin/bash
R CMD BATCH /dbfs/databricks/scripts/prophet_install_script.R
""", True)

Then I started my new cluster with this init script:

[images: cluster init script configuration]

The cluster then failed with the following error:

{
  "reason": {
    "code": "INIT_SCRIPT_FAILURE",
    "type": "CLIENT_ERROR",
    "parameters": {
      "instance_id": "i-0c71b23287fb81530",
      "databricks_error_message": "Cluster scoped init script dbfs:/databricks/scripts/stock_cluster_init_script_v1.sh failed: Script exit status is non-zero"
    }
  }
}
Nick Knauer asked Jan 22 '26 16:01


1 Answer

If you aren't on the Community Edition, you can use a cluster init script to perform this installation (you can install other libraries there as well).

Put the R commands into a file on DBFS (see the linked docs for how to use dbutils.fs.put for that). You also need to set the CRAN mirror explicitly:

# Set the CRAN mirror explicitly; a non-interactive R session cannot prompt for one
local({r <- getOption("repos")
       r["CRAN"] <- "http://cran.r-project.org"
       options(repos=r)
})
# Tell the V8 build to download a static libv8
Sys.setenv(DOWNLOAD_STATIC_LIBV8 = 1)
remotes::install_github("jeroen/V8")
devtools::install_version("rstantools", version = "2.0.0")
install.packages('prophet')

Then create an init script with the following content:

#!/bin/bash

Rscript --verbose  /dbfs/<path-to-file>

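Note that <path-to-file> must not include the dbfs: prefix. For example, a file written to dbfs:/databricks/scripts/prophet_install_script.R is referenced in the init script as /dbfs/databricks/scripts/prophet_install_script.R.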
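The two steps above can be put together in one notebook cell. The following is only a rough sketch, not a tested recipe: the R script location is the one used in the question, while the init script name prophet_init.sh and the helper names to_local_path, build_init_script and write_scripts are made up for illustration. The R code is stored as plain file content, so its quotes are not backslash-escaped.

```python
# Sketch only. The R script path comes from the question; the init script
# name and all helper names below are hypothetical.
R_SCRIPT_URI = "dbfs:/databricks/scripts/prophet_install_script.R"
INIT_SCRIPT_URI = "dbfs:/databricks/scripts/prophet_init.sh"  # hypothetical name

# R commands from the answer, written exactly as they should appear in the file.
R_SCRIPT = """\
local({r <- getOption("repos")
       r["CRAN"] <- "http://cran.r-project.org"
       options(repos=r)
})
Sys.setenv(DOWNLOAD_STATIC_LIBV8 = 1)
remotes::install_github("jeroen/V8")
devtools::install_version("rstantools", version = "2.0.0")
install.packages('prophet')
"""


def to_local_path(dbfs_uri):
    """Map a dbfs:/ URI to the /dbfs/... path that shell commands use on the cluster."""
    prefix = "dbfs:/"
    if not dbfs_uri.startswith(prefix):
        raise ValueError("expected a dbfs:/ URI, got: " + dbfs_uri)
    return "/dbfs/" + dbfs_uri[len(prefix):]


def build_init_script(r_script_uri):
    """Return the bash init script text that runs the R install script with Rscript."""
    return "#!/bin/bash\n\nRscript --verbose " + to_local_path(r_script_uri) + "\n"


def write_scripts(put):
    """Write both files; `put` is expected to behave like dbutils.fs.put(path, contents, overwrite)."""
    put(R_SCRIPT_URI, R_SCRIPT, True)
    put(INIT_SCRIPT_URI, build_init_script(R_SCRIPT_URI), True)
```

In a Databricks notebook this would be called as write_scripts(dbutils.fs.put), after which the path in INIT_SCRIPT_URI is the one to enter as the cluster's init script. Passing the writer in as an argument keeps the path handling checkable outside Databricks.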

Alex Ott answered Jan 24 '26 07:01
