I am trying to deploy a Spark example jar on a Kubernetes cluster running on IBM Cloud.
When I follow these instructions to deploy Spark on a Kubernetes cluster, I cannot launch Spark Pi, because I always get the error message:
The system cannot find the file specified
after running this command:
bin/spark-submit \
    --master k8s://<url of my kubernetes cluster> \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=5 \
    --conf spark.kubernetes.container.image=<spark-image> \
    local:///examples/jars/spark-examples_2.11-2.3.0.jar
I am in the right directory with the spark-examples_2.11-2.3.0.jar file in the examples/jars directory.
This post details how to deploy Spark on a Kubernetes cluster. Minikube is a tool used to run a single-node Kubernetes cluster locally. Follow the official Install Minikube guide to install it along with a hypervisor (such as VirtualBox or HyperKit) to manage virtual machines, and kubectl to deploy and manage apps on Kubernetes.
Run this command in the directory where you created the code. For spark-submit to reach the cluster, we will start the Kubernetes proxy so that the cluster listens for connections on its IP. Finally, we will submit the application using spark-submit from the master node.
Apache Spark - Deployment. spark-submit is a shell command used to deploy a Spark application on a cluster. It works with all supported cluster managers through a uniform interface, so you do not have to configure your application for each one.
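As a rough sketch of the proxy step: kubectl proxy listens on 127.0.0.1:8001 by default, and spark-submit can then be pointed at that address with the k8s:// prefix. The helper name proxy_master_url is made up for illustration, and the commented usage assumes kubectl is already configured for your cluster.

```shell
# Hypothetical helper: build the --master value for a cluster reached
# through `kubectl proxy`.
# $1: local port the proxy listens on (falls back to kubectl's default, 8001).
proxy_master_url() {
    printf 'k8s://http://127.0.0.1:%s\n' "${1:-8001}"
}

# Usage (needs kubectl and a reachable cluster, so it is left commented out):
#   kubectl proxy --port=8001 &
#   bin/spark-submit --master "$(proxy_master_url 8001)" <remaining options>
```

Going through the proxy means spark-submit talks plain HTTP to localhost, while kubectl handles authentication to the real API server.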
Yes, the somewhat weird k8s:// prefix is needed. The remaining options define a name for the Spark driver (your pod's name will also start with it), run the Spark executor with 2 replicas on Kubernetes, spawned by your Spark driver, and set the Kubernetes image pull policy to Never so that the local image with that name can be used.
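The options described above might look roughly like the following. This is only a sketch: the function name submit_args is invented here, and the API server URL, image name, and jar URI are placeholders supplied by the caller. The function only prints the arguments so they can be reviewed before being handed to bin/spark-submit.

```shell
# Hypothetical helper: print the spark-submit arguments, one per line.
# $1: API server URL without the k8s:// prefix
# $2: container image name
# $3: jar URI as seen from inside the image (local:// scheme)
submit_args() {
    printf '%s\n' \
        --master "k8s://$1" \
        --deploy-mode cluster \
        --name spark-pi \
        --class org.apache.spark.examples.SparkPi \
        --conf spark.executor.instances=2 \
        --conf "spark.kubernetes.container.image=$2" \
        --conf spark.kubernetes.container.image.pullPolicy=Never \
        "$3"
}
```

The pull policy Never only makes sense when the image already exists on the node, as with a locally built image on Minikube; on a hosted cluster the image normally has to be pulled from a registry.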
Ensure your .jar file is present inside the container image.
The instructions say that it should be there:
Finally, notice that in the above example we specify a jar with a specific URI with a scheme of local://. This URI is the location of the example jar that is already in the Docker image.
In other words, the local:// scheme is stripped from local:///examples/jars/spark-examples_2.11-2.3.0.jar, and the resulting path /examples/jars/spark-examples_2.11-2.3.0.jar is expected to be available inside the container image.
Please make sure that this absolute path, /examples/jars/spark-examples_2.11-2.3.0.jar, exists in the image.
Or, if you are trying to load a jar file from the current directory, it should be a relative path such as local://./examples/jars/spark-examples_2.11-2.3.0.jar.
I'm not sure whether spark-submit accepts a relative path.
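One way to check whether the jar is really inside the image is sketched below. The first function strips the local:// scheme as described above. The second lists the resulting path inside the image and assumes Docker is installed and the image can be run locally. Both function names are made up for illustration.

```shell
# Strip the local:// scheme to get the path expected inside the image.
local_uri_to_path() {
    printf '%s\n' "${1#local://}"
}

# List that path inside the image (assumes Docker is available locally).
# $1: image name, $2: local:// URI passed to spark-submit
# The entrypoint is overridden so that ls runs instead of the image's
# own startup script.
check_jar_in_image() {
    docker run --rm --entrypoint ls "$1" -l "$(local_uri_to_path "$2")"
}

# Usage (left commented out because it needs Docker and the image):
#   check_jar_in_image <spark-image> local:///examples/jars/spark-examples_2.11-2.3.0.jar
```

If ls reports that the file does not exist, the jar is at a different location in the image, and the local:// URI has to be changed to match. Where it actually lives depends on how the image was built.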