 

Set applicationTags property in YARN for jobs submitted by CLI

I'd like to track some related applications in YARN. They're submitted via command line, e.g.

yarn jar hadoop-mapreduce-examples.jar pi 10 100

Python has a really easy-to-use YARN client that returns the following:

finalStatus = SUCCEEDED
id = application_1458083392566_0929
state = FINISHED
name = QuasiMonteCarlo
applicationType = MAPREDUCE
user = awoolford
applicationTags = 
[...etc...]

I notice there's an applicationTags property. This would be an ideal way to track groups of related applications. I tried setting it via HADOOP_CLIENT_OPTS, e.g.

HADOOP_CLIENT_OPTS="-DapplicationTags=batch123,chunk62" hadoop jar [...etc...]

... but the applicationTags string didn't show up in YARN when I tried to retrieve it via the Python client.

Q) How can I submit a YARN job and populate the applicationTags property from the command line?

Alex Woolford asked Dec 08 '25 08:12



1 Answer

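The property that needs to be set is called mapreduce.job.tags (see Jira). The -D option goes after the program name and before the program's own arguments, so that Hadoop's generic option parsing applies it to the job configuration. So, for the Pi-calculation MapReduce example, you'd tag the job like this: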

yarn jar hadoop-mapreduce-examples.jar pi -Dmapreduce.job.tags=myJobTag 10 100

Credit to Neerja Khattar from Cloudera for figuring out how to do this.
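
To check that the tag was applied, one option is to ask the ResourceManager REST API for applications with that tag, using its applicationTags query parameter. The following is only a minimal sketch: the host name, the port (8088 is the usual ResourceManager web port, but yours may differ) and the helper names apps_url, parse_apps and find_tagged_apps are assumptions for illustration, not part of any YARN client library.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def apps_url(rm_base, tags):
    """Build the ResourceManager URL that filters applications by tag.

    rm_base is a hypothetical base URL such as "http://rm.example.com:8088".
    """
    query = urlencode({"applicationTags": ",".join(tags)})
    return f"{rm_base.rstrip('/')}/ws/v1/cluster/apps?{query}"


def parse_apps(payload):
    """Return (id, name, state, tags) tuples from a /ws/v1/cluster/apps response.

    The ResourceManager returns {"apps": null} when nothing matches,
    so both levels are treated as optional.
    """
    apps = (payload.get("apps") or {}).get("app") or []
    return [
        (a["id"], a.get("name"), a.get("state"), a.get("applicationTags", ""))
        for a in apps
    ]


def find_tagged_apps(rm_base, tags):
    """Query the ResourceManager and list the applications carrying any of the tags."""
    with urlopen(apps_url(rm_base, tags)) as resp:
        return parse_apps(json.load(resp))
```

Calling find_tagged_apps("http://rm.example.com:8088", ["myJobTag"]) against a reachable ResourceManager should then list the tagged job, assuming the REST endpoint is not behind authentication.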

Alex Woolford answered Dec 11 '25 02:12



