
Get the memory, CPU and disk usage for a YARN application

After running a YARN application, how can I get its total memory and CPU usage?

So far I have used the ResourceManager UI to get this information. Apart from the UI, are there any commands I can use to retrieve it?

Fihop asked Oct 26 '25 16:10


2 Answers

The yarn application -status command reports the Aggregate Resource Allocation for an application.

For example, running yarn application -status application_1452267331813_0009 (one of my completed applications) returns, among other rows:

Aggregate Resource Allocation : 46641 MB-seconds, 37 vcore-seconds

These figures are the application's aggregate memory and CPU allocations over time: the memory (in MB) and vcores allocated to its containers, multiplied by the number of seconds they were held. See the answer "Aggregate Resource Allocation for a job in YARN" for the meaning of this output.

Apart from this, as of now, no other memory- or CPU-related metrics are exposed through the CLI.
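If you want these two numbers in a script rather than reading them off the terminal, the row can be parsed with a regular expression. The following is a minimal sketch, assuming the row has exactly the format quoted in this answer; the function names parse_aggregate_allocation and fetch_aggregate_allocation are made up for illustration, and the second one needs the yarn CLI on the PATH.

```python
import re
import subprocess

# Matches a row such as:
#   Aggregate Resource Allocation : 46641 MB-seconds, 37 vcore-seconds
# Assumption: the row format is the one quoted in the answer above.
_AGGREGATE_RE = re.compile(
    r"Aggregate Resource Allocation\s*:\s*(\d+)\s*MB-seconds,\s*(\d+)\s*vcore-seconds"
)


def parse_aggregate_allocation(status_output):
    """Return (mb_seconds, vcore_seconds) from status output, or None if the row is absent."""
    match = _AGGREGATE_RE.search(status_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def fetch_aggregate_allocation(app_id):
    """Run `yarn application -status <app_id>` and parse its output (hypothetical helper)."""
    result = subprocess.run(
        ["yarn", "application", "-status", app_id],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_aggregate_allocation(result.stdout)


if __name__ == "__main__":
    sample = "Aggregate Resource Allocation : 46641 MB-seconds, 37 vcore-seconds"
    print(parse_aggregate_allocation(sample))  # (46641, 37)
```

On a machine with a configured Hadoop client, fetch_aggregate_allocation("application_1452267331813_0009") would run the same command as above and return the parsed pair.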

Manjunath Ballur answered Oct 29 '25 15:10


The yarn top command shows application-level resource utilization and elapsed run time.

See this thread for a way to pipe the output to a file: https://stackoverflow.com/a/53782200/12693167

--help info:

usage: yarn top
-cols Number of columns on the terminal
-delay The refresh delay(in seconds), default is 3 seconds
-help Print usage; for help while the tool is running press 'h' + Enter
-queues Comma separated list of queues to restrict applications
-rows Number of rows on the terminal
-types Comma separated list of types to restrict applications, case sensitive(though the display is lower case)
-users Comma separated list of users to restrict applications

'yarn top' is a tool to help cluster administrators understand cluster usage better. Some notes about the implementation:

  1. Fetching information for all the apps is an expensive call for the RM. To prevent a performance degradation, the results are cached for 5 seconds, irrespective of the delay value. Information about the NodeManager(s) and queue utilization stats are fetched at the specified delay interval. Once we have a better understanding of the performance impact, this might change.

  2. Since the tool is implemented in Java, you must hit Enter for key presses to be processed.
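The options in the help text above can be combined to narrow down what yarn top displays. As a rough sketch, the helper below assembles such a command line from the -delay, -queues, -users and -types options listed there; the function name build_yarn_top_command and the queue and user names in the example are hypothetical. It only builds the argument list and does not launch the tool, since yarn top is interactive.

```python
def build_yarn_top_command(delay=None, queues=None, users=None, types=None):
    """Build an argument list for `yarn top` from the options in its help text.

    delay  -- refresh delay in seconds (the tool's own default is 3)
    queues -- iterable of queue names to restrict applications to
    users  -- iterable of user names to restrict applications to
    types  -- iterable of application types (case sensitive per the help text)
    """
    cmd = ["yarn", "top"]
    if delay is not None:
        cmd += ["-delay", str(delay)]
    # Each of these options takes a single comma separated list.
    for flag, values in (("-queues", queues), ("-users", users), ("-types", types)):
        if values:
            cmd += [flag, ",".join(values)]
    return cmd


if __name__ == "__main__":
    # "default", "alice" and "bob" are placeholder names, not taken from the answer.
    print(" ".join(build_yarn_top_command(delay=5, queues=["default"], users=["alice", "bob"])))
```

Keep in mind the note above: application data is cached for 5 seconds regardless of the delay value, so a delay below 5 will not make the application rows refresh any faster.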

jljohn00 answered Oct 29 '25 16:10


