I have never used Valgrind, but I think this tool can help me with my question. I would be grateful for any help.
In my R code, I use the MixedModels Julia package. I integrate Julia into R using the JuliaCall package.
I work with very large datasets (~1 GB, ~4x10^6 observations), and at the modeling step (mixed models) a lot of RAM is allocated (~130 GB), most of which is not returned to the system after the calculations finish.
I would like to analyze the code and see the whole stack of R and Julia functions. It is very important for me to understand which functions are called during the mixed-model computation in Julia (especially low-level functions, most likely written in C/C++), and how much memory each of these functions uses.
It is also important to understand what exactly the memory is spent on and what happens in RAM while the functions from the MixedModels package are running. Perhaps understanding this will help me improve the performance of the code and reduce memory allocation.
Perhaps some tool other than Valgrind would be more useful for my tasks; I would be very grateful for any relevant recommendations!
Valgrind contains several tools, two or three of which may be of use to you.
However, the first thing you need to do is reduce the size of your model. Valgrind has large time and memory overheads, and running it on an application that allocates 130 GB of memory is likely to be exceedingly slow. If you scale down the size of your data, the insights you get should still be valid.
The first tool to consider is memcheck. This is the most commonly used Valgrind tool. In addition to other types of error, it can detect memory leaks. Run
valgrind --leak-check=full --show-reachable=yes {your app} {your app arguments}
You need to examine the output to determine whether there are any leaks or whether memory is being held (possibly for later reuse).
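Since your application is an R session rather than a standalone binary, one practical way to attach memcheck is R's -d/--debugger flag, which runs the underlying R executable under the given debugger command. A minimal sketch, where model_script.R is just a placeholder for your own (scaled-down) script:

R -d "valgrind --leak-check=full --show-reachable=yes" --vanilla -f model_script.R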
The next tool to consider is massif. This is a heap memory profiler. It will generate a graph of how memory use evolves over the duration of your application's execution. Run
valgrind --tool=massif {your app} {your app arguments}
This will generate a text file that you can view either with Massif-Visualizer (a KDE graphical application) or with ms_print (part of the Valgrind distribution), a command-line tool that generates an ASCII-art graph. Additionally, if you are using a relatively recent version of Valgrind, you can use the xtree options with massif, which will generate a file that you can load in kcachegrind (another KDE graphical application). This gives you a "tree" view of which calls allocate how much memory.
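As a sketch, assuming the same R invocation and placeholder script as above and a Valgrind recent enough to support the xtree options, a massif run and its inspection might look like:

R -d "valgrind --tool=massif --xtree-memory=full" --vanilla -f model_script.R
ms_print massif.out.<pid>

Here massif.out.<pid> is the file massif writes by default, and the xtree output (xtmemory.kcg.<pid> by default) is the file to load into kcachegrind.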
Lastly, there is DHAT (exp-dhat if you are using an older version of Valgrind). This profiles the usage of heap memory. It will generate a file that you can load into an HTML viewer that is part of the Valgrind distribution (or just a plain text file with older versions). Use --tool=dhat to run it. This tool can help track down memory that is either not really used, or is used rarely and could possibly be released earlier in the program's execution.
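A minimal sketch of a DHAT run on the same placeholder script, assuming Valgrind 3.15 or later (where DHAT is a standard tool rather than exp-dhat):

R -d "valgrind --tool=dhat" --vanilla -f model_script.R

The resulting dhat.out.<pid> file can then be opened with the dh_view.html viewer that ships with the Valgrind distribution.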
You may also want to look at other tools. For instance, Google perftools (gperftools) has a heap profiler component.
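As a rough sketch of that approach (the tcmalloc library path, the output prefix, and the R binary path below are assumptions that depend on your system), you preload tcmalloc with heap profiling enabled, run your script, and then inspect the dumps with pprof:

LD_PRELOAD=/usr/lib/libtcmalloc.so HEAPPROFILE=/tmp/r_heap R --vanilla -f model_script.R
pprof --text /path/to/R/bin/exec/R /tmp/r_heap.0001.heap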