
Measure the number of lines loaded into the L1/L2 cache for reads (including prefetch)?

I am trying to determine the number of cache lines loaded into the L1 cache (processor: Intel Broadwell). My kernel code is:

a[i] = 2*b[i] + 2.3 // i from 0 to pow(10,8)

I am using the perf event L1-dcache-load-misses. The measured number is twice what I expected: I expect 6M loads and 6M stores, but L1-dcache-load-misses is around 12M. However, LLC-stores is as expected (6M).

i) Does L1-dcache-load-misses count both load and store misses?

In the Intel Software Developer's Manual (Table 19.5), I found two events for the L2 cache:

  • i) L2_TRANS.L2_FILL (r20f0)
  • ii) L2_TRANS.L2_WB (r40f0)

ii) What is the exact meaning of L2_TRANS.L2_FILL? Is it the total number of L2 transactions?

iii) What is the exact meaning of L2_TRANS.L2_WB? Is it the total number of L2 write transactions?

knightrider asked Sep 14 '25 13:09



1 Answer

Perf event names such as L1-dcache-load-misses are aliases that map to predefined hardware counter events and unit masks. Because the mapping can differ from one CPU model to another, an alias may end up counting something other than what its name suggests.

This discussion on an Intel forum suggests that on at least some systems (Haswell, but Broadwell should be quite similar) L1-dcache-load-misses was incorrectly mapped to L1 replacements. That would explain the doubled value, since the stores also fetch lines into the L1 cache.
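The doubling is consistent with a simple back-of-the-envelope count. The sketch below assumes 4-byte float elements, 64-byte cache lines, line-aligned arrays, and a write-allocate L1; none of these are stated in the question, and the helper name `cache_lines` is made up for illustration.

```python
LINE_BYTES = 64  # assumption: 64-byte cache lines
ELEM_BYTES = 4   # assumption: 4-byte float elements (the question does not give the type)
N = 10**8        # loop trip count from the question


def cache_lines(n_elements, elem_bytes=ELEM_BYTES, line_bytes=LINE_BYTES):
    """Distinct cache lines covered by a contiguous, line-aligned array."""
    total_bytes = n_elements * elem_bytes
    return (total_bytes + line_bytes - 1) // line_bytes  # integer ceiling division


load_lines = cache_lines(N)   # lines of b[] brought in by the loads
store_lines = cache_lines(N)  # lines of a[] allocated in L1 by the stores (write-allocate)
replacements = load_lines + store_lines

print(load_lines, store_lines, replacements)
```

Under these assumptions each array spans about 6.25M lines, so a counter that tracks L1 replacements rather than load misses would report roughly 12.5M, in line with the 12M observed.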

As for the L2_TRANS events, assuming they're correctly mapped, they should indeed count the total fills into and evictions from the L2. Note that this may include more than your loads + stores, since the L2 also holds code (probably negligible in such a small kernel) and receives prefetches (probably significant, since your data is laid out contiguously and easy to prefetch).
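To check what the raw codes from the question select, they can be split into their event-select and unit-mask bytes. On x86, perf's raw `rNNNN` format puts the event select in the low byte and the unit mask in the next byte. The following is only a sketch: the helper names and the `./kernel` binary are hypothetical, and the command is built but not executed.

```python
def decode_raw_event(raw):
    """Split a perf raw event string such as 'r20f0' into (event_select, umask)."""
    if not raw.startswith("r"):
        raise ValueError("expected a raw event of the form rNNNN")
    config = int(raw[1:], 16)
    event_select = config & 0xFF   # bits 0-7: event select
    umask = (config >> 8) & 0xFF   # bits 8-15: unit mask
    return event_select, umask


def perf_stat_command(raw_events, program):
    """Build (without running) a perf stat command line for the given raw events."""
    return ["perf", "stat", "-e", ",".join(raw_events), "--", program]


for raw in ("r20f0", "r40f0"):
    event, umask = decode_raw_event(raw)
    print(f"{raw}: event=0x{event:02X} umask=0x{umask:02X}")

# './kernel' is a placeholder for the benchmark binary.
print(" ".join(perf_stat_command(["r20f0", "r40f0"], "./kernel")))
```

Both codes decode to event 0xF0 (L2_TRANS), with unit mask 0x20 for L2_FILL and 0x40 for L2_WB. Counting them by raw code sidesteps the alias mapping problem described above, provided the codes match the manual for the CPU in use.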

Leeor answered Sep 16 '25 03:09
