Just wondering, how does Spark schedule jobs? In simple terms, please. I have read many descriptions of how it does it, but they were too complicated to understand.
At a high level, when any action is called on an RDD, Spark creates the DAG and submits it to the DAG scheduler.
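For instance, here is a minimal sketch (a made-up local-mode example, not from any particular application): the `map` and `filter` calls are lazy transformations, and only the `count()` action makes Spark build the DAG and submit a job.

```scala
import org.apache.spark.sql.SparkSession

object LazyActionExample {
  def main(args: Array[String]): Unit = {
    // Local-mode session purely for illustration.
    val spark = SparkSession.builder()
      .appName("scheduling-demo")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    val numbers = sc.parallelize(1 to 1000000, 8) // 8 partitions => 8 tasks per stage

    // Transformations are lazy: nothing is scheduled yet.
    val doubled = numbers.map(_ * 2)
    val evens   = doubled.filter(_ % 4 == 0)

    // The action triggers job submission: Spark builds the DAG of these
    // operators and hands it to the DAG scheduler.
    val total = evens.count()
    println(s"count = $total")

    spark.stop()
  }
}
```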
The DAG scheduler divides the operators into stages of tasks. A stage consists of tasks based on partitions of the input data. The DAG scheduler pipelines operators together; for example, many map operators can be scheduled in a single stage. The final result of the DAG scheduler is a set of stages.
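As an illustration (again a sketch, assuming the same local setup): the narrow `flatMap`/`map` operators below get pipelined into one stage, while the shuffle required by `reduceByKey` starts a second stage.

```scala
import org.apache.spark.sql.SparkSession

object StageExample {
  def main(args: Array[String]): Unit = {
    val sc = SparkSession.builder()
      .appName("stage-demo")
      .master("local[*]")
      .getOrCreate()
      .sparkContext

    // Narrow operators (flatMap, map) are pipelined into a single stage.
    val pairs = sc.parallelize(Seq("a b", "a c", "b c"))
      .flatMap(_.split(" "))
      .map(word => (word, 1))

    // reduceByKey requires a shuffle, so the DAG scheduler starts a new stage here.
    val counts = pairs.reduceByKey(_ + _)

    // The action runs both stages; the stage split is visible in the Spark UI.
    counts.collect().foreach(println)
  }
}
```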
The stages are passed on to the task scheduler. The task scheduler launches the tasks via the cluster manager (Spark Standalone/YARN/Mesos). The task scheduler doesn't know about the dependencies between stages.
The executors run the tasks on the worker (slave) nodes.
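Here is a rough sketch of how these pieces come together when configuring an application; the master URLs, executor counts, and app name below are illustrative placeholders, not recommendations.

```scala
import org.apache.spark.sql.SparkSession

object ClusterManagerExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cluster-manager-demo")
      // The master URL selects the cluster manager; alternatives include
      // "spark://host:7077" (Standalone), "yarn", or "mesos://host:5050".
      .master("local[*]")
      .config("spark.executor.instances", "4") // executors live on the worker (slave) nodes
      .config("spark.executor.cores", "2")     // each executor runs tasks on its cores
      .getOrCreate()

    // Any action now flows: DAG scheduler -> task scheduler -> cluster manager -> executors.
    println(spark.sparkContext.parallelize(1 to 100).sum())

    spark.stop()
  }
}
```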
Look at this answer for more information.