How to catch L3-cache hits and misses with the perf tool in Linux

Tag : linux
Date : November 25 2020, 01:01 AM

Detecting Cache Misses and Hits Programmatically in Linux

How can I detect a cache miss programmatically [without cache simulation]?

Linux perf reporting cache misses for unexpected instruction

About your example:
Several instructions sit just before, and at, the line with the high counter:
        │       movsd  (%rcx,%rsi,8),%xmm0
   0.13 │       ucomis (%rcx,%rdx,8),%xmm0
  57.99 │     ↑ jbe    ff
 u64 nehalem_hw_cache_event_ids ...
[ C(LL  ) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x01b7,
        [ C(RESULT_MISS)   ] = 0x01b7,
 * Nehalem/Westmere MSR_OFFCORE_RESPONSE bits;
 * See IA32 SDM Vol 3B
#define NHM_DMND_DATA_RD    (1 << 0)
 u64 nehalem_hw_cache_extra_regs
 [ C(LL  ) ] = {
    [ C(OP_READ) ] = {

How does Linux perf calculate the cache-references and cache-misses events

The built-in perf events that you are interested in map to the following hardware performance monitoring events on your processor:
  523,288,816      cache-references        (architectural event: LLC Reference)                             
  205,331,370      cache-misses            (architectural event: LLC Misses) 
  237,794,728      L1-dcache-load-misses   L1D.REPLACEMENT
3,495,080,007      L1-dcache-loads         MEM_INST_RETIRED.ALL_LOADS
2,039,344,725      L1-dcache-stores        MEM_INST_RETIRED.ALL_STORES                     
  531,452,853      L1-icache-load-misses   ICACHE_64B.IFTAG_MISS
   77,062,627      LLC-loads               OFFCORE_RESPONSE (MSR bits 0, 16, 30-37)
   27,462,249      LLC-load-misses         OFFCORE_RESPONSE (MSR bits 0, 17, 26-29, 30-37)
   15,039,473      LLC-stores              OFFCORE_RESPONSE (MSR bits 1, 16, 30-37)
    3,829,429      LLC-store-misses        OFFCORE_RESPONSE (MSR bits 1, 17, 26-29, 30-37)

definition of linux perf cache-misses event?

The cache-misses event corresponds to misses in the last-level cache (LLC). Note that this is an architectural performance monitoring event, which is supposed to behave consistently across microarchitectures.
This can be verified from the source code - cache-misses

Using Intel's PIN tool to count the number of cache hits/misses in a program

