perf is a performance analysis tool which can report hardware and software events. I am trying to run it with an MPI application in order to learn how much time the application spends within each core on data transfers and compute operations.
Normally, I would run my application with
mpirun -np $NUMBER_OF_CORES app_name
This spawns the application on several cores, or possibly on several nodes. Is it possible to add perf on top? I've tried
perf stat mpirun -np $NUMBER_OF_CORES app_name
But the output looks like some sort of aggregate over the whole mpirun invocation. Is there a way to collect perf data for each core separately?
Something like:
mpirun -np $NUMBER_OF_CORES ./myscript.sh
might work with myscript.sh containing:
#!/bin/bash
# "$@" forwards any arguments given to this script on to app_name
perf stat app_name "$@"
You should also add an output option to the perf call (for example, perf stat -o with a file name that differs per process) so that each process writes to its own result file.
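One way to get differently named result files is to build the file name from the rank that the MPI launcher exports into each process's environment. The following is only a sketch under some assumptions: the script name perf_wrap.sh and the helper functions rank_tag and perf_outfile are made up for illustration, OMPI_COMM_WORLD_RANK is the variable set by Open MPI, PMI_RANK is the one commonly set by MPICH-based launchers, and the process ID is used as a fallback when neither is present. Check which variable your MPI implementation actually provides.

```shell
#!/bin/bash
# perf_wrap.sh (hypothetical name): run a command under "perf stat",
# writing the counters to a separate file per MPI process.

# Per-process tag: the MPI rank if the launcher exports one
# (Open MPI: OMPI_COMM_WORLD_RANK, MPICH-style launchers: PMI_RANK),
# otherwise the PID of this shell.
rank_tag() {
    echo "${OMPI_COMM_WORLD_RANK:-${PMI_RANK:-$$}}"
}

# Output file name; the host name keeps files from different nodes
# apart when they land on a shared file system.
perf_outfile() {
    echo "perf.$(hostname).$(rank_tag).txt"
}

# Only launch perf when a command was passed on the command line.
if [ "$#" -gt 0 ]; then
    exec perf stat -o "$(perf_outfile)" "$@"
fi
```

It would then be launched as

mpirun -np $NUMBER_OF_CORES ./perf_wrap.sh app_name

which should leave one perf.<host>.<rank>.txt file per process in the working directory. Note that this reports counters per MPI process rather than per physical core; the two only line up if the launcher pins each process to one core.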