 

How to allocate more executors per worker in Standalone cluster mode?

Tags:

apache-spark

I use Spark 1.3.0 in a cluster of 5 worker nodes with 36 cores and 58GB of memory each. I'd like to configure Spark's Standalone cluster with many executors per worker.

I have seen that SPARK-1706 has been merged, but it is not immediately clear how to actually configure multiple executors.

Here is the latest configuration of the cluster:

spark.executor.cores = "15"
spark.executor.instances = "10"
spark.executor.memory = "10g"

These settings are set on a SparkContext when the Spark application is submitted to the cluster.
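
As a rough illustration, the same properties can be set on a SparkConf before creating the SparkContext. This is a minimal sketch: the master URL and app name are placeholders, and spark.cores.max is used in place of spark.executor.instances, which only applies on YARN.

import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: placeholder master URL and app name.
val conf = new SparkConf()
  .setMaster("spark://master-host:7077")   // standalone master URL (placeholder)
  .setAppName("multi-executor-example")
  .set("spark.executor.cores", "15")       // cores per executor
  .set("spark.executor.memory", "10g")     // memory per executor
  // In standalone mode the executor count is not set directly; it follows
  // from spark.cores.max / spark.executor.cores (here 150 / 15 = 10 executors).
  .set("spark.cores.max", "150")

val sc = new SparkContext(conf)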

asked Apr 29 '15 by Rich


People also ask

Which option sets the number of cores per executor process in a Spark standalone cluster?

The --executor-cores <numCores> option sets the number of cores per executor; you can also pass --total-executor-cores <numCores> to control the total number of cores the application uses on the cluster.

Can a worker have multiple executors in Spark?

Yes. The number of cores assigned to each executor is configurable. When spark.executor.cores is explicitly set, multiple executors from the same application may be launched on the same worker, provided the worker has enough cores and memory.

How many executors does each worker node have?

With 30 executors spread across 10 nodes of 64 GB each: number of executors per node = 30/10 = 3, and memory per executor = 64 GB / 3 ≈ 21 GB.

How do you determine the number of executors and memory in Spark?

For example, on a 10-node cluster with 15 usable cores and 64 GB of memory per node, using 5 cores per executor: number of available executors = total cores / cores per executor = 150/5 = 30. Leaving one executor for the ApplicationMaster gives --num-executors = 29. Number of executors per node = 30/10 = 3, and memory per executor = 64 GB / 3 ≈ 21 GB.
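
A small sketch that reproduces this arithmetic, assuming the hypothetical 10-node cluster (15 usable cores and 64 GB per node) behind those numbers:

object ExecutorSizing extends App {
  // Hypothetical cluster behind the example above.
  val nodes              = 10
  val usableCoresPerNode = 15
  val memoryPerNodeGb    = 64
  val coresPerExecutor   = 5

  val totalCores          = nodes * usableCoresPerNode          // 150
  val availableExecutors  = totalCores / coresPerExecutor       // 30
  val numExecutors        = availableExecutors - 1              // 29, one slot left for the ApplicationMaster
  val executorsPerNode    = availableExecutors / nodes          // 3
  val memoryPerExecutorGb = memoryPerNodeGb / executorsPerNode  // 21 (integer division)

  println(s"--num-executors $numExecutors --executor-cores $coresPerExecutor --executor-memory ${memoryPerExecutorGb}g")
}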


1 Answer

In standalone mode, by default, an application acquires all the available cores on the cluster when it launches. You control how many executors it gets using the --executor-cores and --total-executor-cores options; the number of executors launched is roughly total-executor-cores divided by executor-cores.

For example, suppose your cluster has 1 worker (1 worker == 1 machine in your cluster; it's good practice to have only 1 worker per machine) with 3 cores and 3 GB available in its pool (this is specified in spark-env.sh). If you submit an application with --executor-cores 1 --total-executor-cores 2 --executor-memory 1g, two executors are launched for the application, each with 1 core and 1 GB. Hope this helps!
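
A rough programmatic equivalent of that submission, as a sketch only (the standalone master URL is a placeholder; --total-executor-cores corresponds to spark.cores.max and --executor-cores to spark.executor.cores), with a quick check of how many executors actually registered:

import org.apache.spark.{SparkConf, SparkContext}

// Equivalent of: spark-submit --executor-cores 1 --total-executor-cores 2 --executor-memory 1g ...
val conf = new SparkConf()
  .setMaster("spark://master-host:7077")   // placeholder standalone master URL
  .setAppName("two-small-executors")
  .set("spark.executor.cores", "1")        // --executor-cores 1
  .set("spark.cores.max", "2")             // --total-executor-cores 2
  .set("spark.executor.memory", "1g")      // --executor-memory 1g

val sc = new SparkContext(conf)

// getExecutorMemoryStatus lists one block manager per executor plus the driver,
// so subtract 1 to count executors; expect 2 here.
println(s"Executors registered: ${sc.getExecutorMemoryStatus.size - 1}")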

answered Oct 04 '22 by void