What is the naming convention for YARN containers used by Spark?

Question

When running Spark jobs on top of YARN (yarn-cluster mode), YARN runs the workers in containers whose names look something like this: container_e116_1495951495692_11203_01_000105

What is the naming convention for the containers?

Here is my educated guess:

container - Just a constant string, obviously
e116 - No idea what this is. Maybe something to do with the YARN version.
1495951495692_11203 - The application ID
01 - An attempt counter?
000105 - This is probably just an incrementing integer.

If there is any concrete information about this (or even a reference to the right place in the code), I'd be glad to hear about it.

In light of the above, when running a Spark job on YARN, how can I tell which containers belong to which executor?

Nadav Gruner · Accepted Answer

The format is documented in the ContainerId Javadoc: https://hadoop.apache.org/docs/current/api/org/apache/hadoop/yarn/api/records/ContainerId.html

A string representation of containerId. The format is container_e{epoch}_{clusterTimestamp}_{appId}_{attemptId}_{containerId} when epoch is larger than 0 (e.g. container_e17_1410901177871_0001_01_000005). epoch is increased when RM restarts or fails over. When epoch is 0, epoch is omitted (e.g. container_1410901177871_0001_01_000005).
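To make the format concrete, here is a minimal Python sketch that splits a container name into the fields described in the Javadoc. The names ContainerId, parse_container_id and application_id are hypothetical helpers written for this example and are not part of Hadoop or Spark. The application_id helper assumes the usual application_{clusterTimestamp}_{appId} form with the ID zero-padded to at least four digits.

```python
import re
from typing import NamedTuple


class ContainerId(NamedTuple):
    """Fields of a YARN container name (hypothetical helper type)."""
    epoch: int              # 0 when the "e<epoch>_" part is omitted
    cluster_timestamp: int  # ResourceManager start time in milliseconds
    app_id: int             # application sequence number
    attempt_id: int         # application attempt number
    container_id: int       # container sequence number within the attempt


# The "e<epoch>_" group is optional because it is omitted when epoch is 0.
_PATTERN = re.compile(r"container_(?:e(\d+)_)?(\d+)_(\d+)_(\d+)_(\d+)")


def parse_container_id(name: str) -> ContainerId:
    """Split a container name into its numeric fields."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not a YARN container name: {name!r}")
    epoch, timestamp, app, attempt, container = match.groups()
    return ContainerId(
        epoch=int(epoch) if epoch is not None else 0,
        cluster_timestamp=int(timestamp),
        app_id=int(app),
        attempt_id=int(attempt),
        container_id=int(container),
    )


def application_id(cid: ContainerId) -> str:
    """Rebuild the application ID string (assumed format, see lead-in)."""
    return f"application_{cid.cluster_timestamp}_{cid.app_id:04d}"


if __name__ == "__main__":
    cid = parse_container_id("container_e116_1495951495692_11203_01_000105")
    print(cid)
    print(application_id(cid))
```

Applied to the name from the question, this gives epoch 116, cluster timestamp 1495951495692, application number 11203, attempt 1 and container number 105. So "e116" is the epoch, not anything related to the YARN version.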

Tags:

naming-conventions

apache-spark

hadoop-yarn

summerbulb

1 Answers

Nadav Gruner
