Available native events and hardware information.
--------------------------------------------------------------------------------
PAPI version             : 7.0.1.0
Operating system         : Linux 6.6.2-arch1-1
Vendor string and code   : GenuineIntel (1, 0x1)
Model string and code    : 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz (141, 0x8d)
CPU revision             : 1.000000
CPUID                    : Family/Model/Stepping 6/141/1, 0x06/0x8d/0x01
CPU Max MHz              : 4800
CPU Min MHz              : 800
Total cores              : 16
SMT threads per core     : 2
Cores per socket         : 8
Sockets                  : 1
Cores per NUMA region    : 16
NUMA regions             : 1
Running in a VM          : no
Number Hardware Counters : 20
Max Multiplex Counters   : 384
Fast counter read (rdpmc): yes
--------------------------------------------------------------------------------

===============================================================================
 Native Events in Component: perf_event
===============================================================================
| ix86arch::UNHALTED_CORE_CYCLES |
|            count core clock cycles whenever the clock signal on the specific |
|            core is running (not halted) |
|     :e=0 |
|            edge level (may require counter-mask >= 1) |
|     :i=0 |
|            invert |
|     :c=0 |
|            counter-mask in range [0-255] |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| ix86arch::INSTRUCTION_RETIRED |
|            count the number of instructions at retirement. For instructions |
|            that consists of multiple micro-ops, this event counts the |
|            retirement of the last micro-op of the instruction |
|     :e=0 |
|            edge level (may require counter-mask >= 1) |
|     :i=0 |
|            invert |
|     :c=0 |
|            counter-mask in range [0-255] |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| ix86arch::UNHALTED_REFERENCE_CYCLES |
|            count reference clock cycles while the clock signal on the |
|            specific core is running. The reference clock operates at a fixed |
|            frequency, irrespective of core frequency changes due to |
|            performance state transitions |
|     :e=0 |
|            edge level (may require counter-mask >= 1) |
|     :i=0 |
|            invert |
|     :c=0 |
|            counter-mask in range [0-255] |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| ix86arch::LLC_REFERENCES |
|            count each request originating from the core to reference a cache |
|            line in the last level cache. The count may include speculation, |
|            but excludes cache line fills due to hardware prefetch |
|     :e=0 |
|            edge level (may require counter-mask >= 1) |
|     :i=0 |
|            invert |
|     :c=0 |
|            counter-mask in range [0-255] |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| ix86arch::LLC_MISSES |
|            count each cache miss condition for references to the last level |
|            cache. The event count may include speculation, but excludes cache |
|            line fills due to hardware prefetch |
|     :e=0 |
|            edge level (may require counter-mask >= 1) |
|     :i=0 |
|            invert |
|     :c=0 |
|            counter-mask in range [0-255] |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| ix86arch::BRANCH_INSTRUCTIONS_RETIRED |
|            count branch instructions at retirement. Specifically, this event |
|            counts the retirement of the last micro-op of a branch instruction |
|     :e=0 |
|            edge level (may require counter-mask >= 1) |
|     :i=0 |
|            invert |
|     :c=0 |
|            counter-mask in range [0-255] |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| ix86arch::MISPREDICTED_BRANCH_RETIRED |
|            count mispredicted branch instructions at retirement. |
|            Specifically, this event counts at retirement of the last micro-op |
|            of a branch instruction in the architectural path of the execution |
|            and experienced misprediction in the branch prediction hardware |
|     :e=0 |
|            edge level (may require counter-mask >= 1) |
|     :i=0 |
|            invert |
|     :c=0 |
|            counter-mask in range [0-255] |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_CPU_CYCLES |
|            PERF_COUNT_HW_CPU_CYCLES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::CYCLES |
|            PERF_COUNT_HW_CPU_CYCLES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::CPU-CYCLES |
|            PERF_COUNT_HW_CPU_CYCLES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_INSTRUCTIONS |
|            PERF_COUNT_HW_INSTRUCTIONS |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::INSTRUCTIONS |
|            PERF_COUNT_HW_INSTRUCTIONS |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_CACHE_REFERENCES |
|            PERF_COUNT_HW_CACHE_REFERENCES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::CACHE-REFERENCES |
|            PERF_COUNT_HW_CACHE_REFERENCES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_CACHE_MISSES |
|            PERF_COUNT_HW_CACHE_MISSES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::CACHE-MISSES |
|            PERF_COUNT_HW_CACHE_MISSES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_BRANCH_INSTRUCTIONS |
|            PERF_COUNT_HW_BRANCH_INSTRUCTIONS |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::BRANCH-INSTRUCTIONS |
|            PERF_COUNT_HW_BRANCH_INSTRUCTIONS |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::BRANCHES |
|            PERF_COUNT_HW_BRANCH_INSTRUCTIONS |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_BRANCH_MISSES |
|            PERF_COUNT_HW_BRANCH_MISSES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::BRANCH-MISSES |
|            PERF_COUNT_HW_BRANCH_MISSES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_BUS_CYCLES |
|            PERF_COUNT_HW_BUS_CYCLES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::BUS-CYCLES |
|            PERF_COUNT_HW_BUS_CYCLES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_STALLED_CYCLES_FRONTEND |
|            PERF_COUNT_HW_STALLED_CYCLES_FRONTEND |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::STALLED-CYCLES-FRONTEND |
|            PERF_COUNT_HW_STALLED_CYCLES_FRONTEND |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::IDLE-CYCLES-FRONTEND |
|            PERF_COUNT_HW_STALLED_CYCLES_FRONTEND |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_STALLED_CYCLES_BACKEND |
|            PERF_COUNT_HW_STALLED_CYCLES_BACKEND |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::STALLED-CYCLES-BACKEND |
|            PERF_COUNT_HW_STALLED_CYCLES_BACKEND |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::IDLE-CYCLES-BACKEND |
|            PERF_COUNT_HW_STALLED_CYCLES_BACKEND |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_REF_CPU_CYCLES |
|            PERF_COUNT_HW_REF_CPU_CYCLES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::REF-CYCLES |
|            PERF_COUNT_HW_REF_CPU_CYCLES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_SW_CPU_CLOCK |
|            PERF_COUNT_SW_CPU_CLOCK |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::CPU-CLOCK |
|            PERF_COUNT_SW_CPU_CLOCK |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_SW_TASK_CLOCK |
|            PERF_COUNT_SW_TASK_CLOCK |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::TASK-CLOCK |
|            PERF_COUNT_SW_TASK_CLOCK |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_SW_PAGE_FAULTS |
|            PERF_COUNT_SW_PAGE_FAULTS |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PAGE-FAULTS |
|            PERF_COUNT_SW_PAGE_FAULTS |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::FAULTS |
|            PERF_COUNT_SW_PAGE_FAULTS |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_SW_CONTEXT_SWITCHES |
|            PERF_COUNT_SW_CONTEXT_SWITCHES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::CONTEXT-SWITCHES |
|            PERF_COUNT_SW_CONTEXT_SWITCHES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::CS |
|            PERF_COUNT_SW_CONTEXT_SWITCHES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_SW_CPU_MIGRATIONS |
|            PERF_COUNT_SW_CPU_MIGRATIONS |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::CPU-MIGRATIONS |
|            PERF_COUNT_SW_CPU_MIGRATIONS |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::MIGRATIONS |
|            PERF_COUNT_SW_CPU_MIGRATIONS |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_SW_PAGE_FAULTS_MIN |
|            PERF_COUNT_SW_PAGE_FAULTS_MIN |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::MINOR-FAULTS |
|            PERF_COUNT_SW_PAGE_FAULTS_MIN |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_SW_PAGE_FAULTS_MAJ |
|            PERF_COUNT_SW_PAGE_FAULTS_MAJ |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::MAJOR-FAULTS |
|            PERF_COUNT_SW_PAGE_FAULTS_MAJ |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_SW_CGROUP_SWITCHES |
|            PERF_COUNT_SW_CGROUP_SWITCHES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::CGROUP-SWITCHES |
|            PERF_COUNT_SW_CGROUP_SWITCHES |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_CACHE_L1D |
|            L1 data cache |
|     :READ |
|            read access |
|     :WRITE |
|            write access |
|     :PREFETCH |
|            prefetch access |
|     :ACCESS |
|            hit access |
|     :MISS |
|            miss access |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::L1-DCACHE-LOADS |
|            L1 cache load accesses |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::L1-DCACHE-LOAD-MISSES |
|            L1 cache load misses |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::L1-DCACHE-STORES |
|            L1 cache store accesses |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::L1-DCACHE-STORE-MISSES |
|            L1 cache store misses |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::L1-DCACHE-PREFETCHES |
|            L1 cache prefetch accesses |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::L1-DCACHE-PREFETCH-MISSES |
|            L1 cache prefetch misses |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_CACHE_L1I |
|            L1 instruction cache |
|     :READ |
|            read access |
|     :PREFETCH |
|            prefetch access |
|     :ACCESS |
|            hit access |
|     :MISS |
|            miss access |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::L1-ICACHE-LOADS |
|            L1I cache load accesses |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::L1-ICACHE-LOAD-MISSES |
|            L1I cache load misses |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::L1-ICACHE-PREFETCHES |
|            L1I cache prefetch accesses |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::L1-ICACHE-PREFETCH-MISSES |
|            L1I cache prefetch misses |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::PERF_COUNT_HW_CACHE_LL |
|            Last level cache |
|     :READ |
|            read access |
|     :WRITE |
|            write access |
|     :PREFETCH |
|            prefetch access |
|     :ACCESS |
|            hit access |
|     :MISS |
|            miss access |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
--------------------------------------------------------------------------------
| perf::LLC-LOADS |
|            Last level cache load accesses |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :precise=0 |
|            precise event sampling |
| :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::LLC-LOAD-MISSES | | Last level cache load misses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::LLC-STORES | | Last level cache store accesses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::LLC-STORE-MISSES | | Last level cache store misses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::LLC-PREFETCHES | | Last level cache prefetch accesses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 
| | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::LLC-PREFETCH-MISSES | | Last level cache prefetch misses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::PERF_COUNT_HW_CACHE_DTLB | | Data Translation Lookaside Buffer | | :READ | | read access | | :WRITE | | write access | | :PREFETCH | | prefetch access | | :ACCESS | | hit access | | :MISS | | miss access | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::DTLB-LOADS | | Data TLB load accesses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::DTLB-LOAD-MISSES | | Data TLB 
load misses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::DTLB-STORES | | Data TLB store accesses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::DTLB-STORE-MISSES | | Data TLB store misses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::DTLB-PREFETCHES | | Data TLB prefetch accesses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::DTLB-PREFETCH-MISSES | | 
Data TLB prefetch misses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::PERF_COUNT_HW_CACHE_ITLB | | Instruction Translation Lookaside Buffer | | :READ | | read access | | :ACCESS | | hit access | | :MISS | | miss access | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::ITLB-LOADS | | Instruction TLB load accesses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::ITLB-LOAD-MISSES | | Instruction TLB load misses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to 
counters | -------------------------------------------------------------------------------- | perf::PERF_COUNT_HW_CACHE_BPU | | Branch Prediction Unit | | :READ | | read access | | :ACCESS | | hit access | | :MISS | | miss access | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::BRANCH-LOADS | | Branch load accesses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::BRANCH-LOAD-MISSES | | Branch load misses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::PERF_COUNT_HW_CACHE_NODE | | Node memory access | | :READ | | read access | | :WRITE | | write access | | :PREFETCH | | prefetch access | | :ACCESS | | hit access | | :MISS | | miss access | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | 
sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::NODE-LOADS | | Node load accesses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::NODE-LOAD-MISSES | | Node load misses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::NODE-STORES | | Node store accesses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::NODE-STORE-MISSES | | Node store misses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) 
| | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::NODE-PREFETCHES | | Node prefetch accesses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::NODE-PREFETCH-MISSES | | Node prefetch misses | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::slots | | issue slots per logical CPU (used for topdown toplevel computation| | , must be first event in the group) | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::topdown-retiring | | topdown useful slots retiring uops (must be used in a group with t| | he other topdown- events with slots as leader) | | :u=0 | | 
monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::topdown-bad-spec | | topdown wasted slots due to bad speculation (must be used in a gro| | up with the other topdown- events with slots as leader) | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::topdown-fe-bound | | topdown wasted slots due to frontend (must be used in a group with| | the other topdown- events with slots as leader) | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | perf::topdown-be-bound | | topdown wasted slots due to backend (must be used in a group with | | the other topdown- events with slots as leader) | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | 
-------------------------------------------------------------------------------- | perf_raw::r0000 | | perf_events raw event syntax: r[0-9a-fA-F]+ | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :h=0 | | monitor at hypervisor level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | UNHALTED_CORE_CYCLES | | Count core clock cycles whenever the clock signal on the specific | | core is running (not halted) | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | UNHALTED_REFERENCE_CYCLES | | Unhalted reference cycles | | :t=0 | | measure any thread | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | INSTRUCTION_RETIRED | | Number of instructions at retirement | | :e=0 | | edge 
level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | INSTRUCTIONS_RETIRED | | Number of instructions at retirement | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | SQ_MISC | | SuperQueue miscellaneous. | | :SQ_FULL | | Cycles the thread is active and superQ cannot take any more entrie| | s. | | :BUS_LOCK | | Counts bus locks, accounts for cache line split locks and UC locks| | . 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | L2_LINES_OUT | | L2 lines evicted. | | :USELESS_HWPF | | Cache lines that have been L2 hardware prefetched but not used by | | demand accesses | | :NON_SILENT | | Modified cache lines that are evicted by L2 cache when triggered b| | y an L2 cache fill. | | :SILENT | | Non-modified cache lines that are silently dropped by L2 cache whe| | n triggered by an L2 cache fill. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | L2_LINES_IN | | L2 lines allocated. 
| | :ALL | | L2 cache lines filling L2 | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | L2_TRANS | | L2 transactions. | | :L2_WB | | L2 writebacks that access L2 cache | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | BACLEARS | | Branch re-steers. | | :ANY | | Counts the total number when the front end is resteered, mainly wh| | en the BPU cannot provide a correct prediction and this is correct| | ed by other branch handling mechanisms at the front end. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | MEM_LOAD_L3_HIT_RETIRED | | L3 hit load uops retired. | | :XSNP_NONE | | Retired load instructions whose data sources were hits in L3 witho| | ut snoops required | | :XSNP_HITM | | Retired load instructions whose data sources were HitM responses f| | rom shared L3 | | :XSNP_HIT | | Retired load instructions whose data sources were L3 and cross-cor| | e snoop hits in on-pkg core cache | | :XSNP_MISS | | Retired load instructions whose data sources were L3 hit and cross| | -core snoop missed in on-pkg core cache. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | MEM_LOAD_RETIRED | | Retired load uops. | | :FB_HIT | | Number of completed demand load requests that missed the L1, but h| | it the FB(fill buffer), because a preceding miss to the same cache| | line initiated the line to be brought into L1, but data is not yet| | ready in L1. 
| | :L3_MISS | | Retired load instructions missed L3 cache as data sources | | :L2_MISS | | Retired load instructions missed L2 cache as data sources | | :L1_MISS | | Retired load instructions missed L1 cache as data sources | | :L3_HIT | | Retired load instructions with L3 cache hits as data sources | | :L2_HIT | | Retired load instructions with L2 cache hits as data sources | | :L1_HIT | | Retired load instructions with L1 cache hits as data sources | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | MEM_INST_RETIRED | | Memory instructions retired. | | :ALL_STORES | | All retired store instructions. | | :ALL_LOADS | | All retired load instructions. | | :SPLIT_STORES | | Retired store instructions that split across a cacheline boundary.| | | | :SPLIT_LOADS | | Retired load instructions that split across a cacheline boundary. | | :LOCK_LOADS | | Retired load instructions with locked access. | | :STLB_MISS_STORES | | Retired store instructions that miss the STLB. | | :STLB_MISS_LOADS | | Retired load instructions that miss the STLB. | | :ANY | | All retired memory instructions. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | MEM_LOAD_L3_MISS_RETIRED | | Retired load instructions whose data sources missed L3 but service| | d from local DRAM | | :LOCAL_DRAM | | Retired load instructions whose data sources missed L3 but service| | d from local DRAM | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | MEM_TRANS_RETIRED | | Memory transactions retired. | | :LOAD_LATENCY | | Memory load instructions retired above programmed clocks, minimum | | threshold value is 3 (Precise Event and ldlat required) | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | 
:ldlat=0 | | load latency threshold (cycles, [3-65535]) | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | MISC_RETIRED | | Miscellaneous retired events. | | :PAUSE_INST | | Number of retired PAUSE instructions. | | :LBR_INSERTS | | Increments whenever there is an update to the LBR array. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | RTM_RETIRED | | RTM (Restricted Transactional Memory) execution. | | :ABORTED_EVENTS | | Number of times an RTM execution aborted due to none of the previo| | us 4 categories (e.g. 
interrupt) | | :ABORTED_MEMTYPE | | Number of times an RTM execution aborted due to incompatible memor| | y type | | :ABORTED_UNFRIENDLY | | Number of times an RTM execution aborted due to HLE-unfriendly ins| | tructions | | :ABORTED_MEM | | Number of times an RTM execution aborted due to various memory eve| | nts (e.g. read/write capacity and conflicts) | | :ABORTED | | Number of times an RTM execution aborted. | | :COMMIT | | Number of times an RTM execution successfully committed | | :START | | Number of times an RTM execution started. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | HLE_RETIRED | | HLE (Hardware Lock Elision) execution. | | :ABORTED_EVENTS | | Number of times an HLE execution aborted due to unfriendly events | | (such as interrupts). | | :ABORTED_UNFRIENDLY | | Number of times an HLE execution aborted due to HLE-unfriendly ins| | tructions and certain unfriendly events (such as AD assists etc.).| | | | :ABORTED_MEM | | Number of times an HLE execution aborted due to various memory eve| | nts (e.g., read/write capacity and conflicts). | | :ABORTED | | Number of times an HLE execution aborted due to any reasons (multi| | ple categories may count as one). | | :COMMIT | | Number of times an HLE execution successfully committed | | :START | | Number of times an HLE execution started. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | FP_ARITH_INST_RETIRED | | Floating-point instructions retired. | | :512B_PACKED_SINGLE | | Counts number of SSE/AVX computational 512-bit packed double preci| | sion floating-point instructions retired; some instructions will c| | ount twice as noted below. Each count represents 16 computation o| | perations, one for each element. Applies to SSE* and AVX* packed | | double precision floating-point instructions: ADD SUB MUL DIV MIN | | MAX SQRT RSQRT14 RCP14 FM(N)ADD/SUB. FM(N)ADD/SUB instructions cou| | nt twice as they perform 2 calculations per element. The DAZ and F| | TZ flags in the MXCSR register need to be set when using this even| | t. | | :512B_PACKED_DOUBLE | | Counts number of SSE/AVX computational 512-bit packed double preci| | sion floating-point instructions retired; some instructions will c| | ount twice as noted below. Each count represents 8 computation op| | erations, one for each element. Applies to SSE* and AVX* packed d| | ouble precision floating-point instructions: ADD SUB MUL DIV MIN M| | AX SQRT RSQRT14 RCP14 FM(N)ADD/SUB. FM(N)ADD/SUB instructions coun| | t twice as they perform 2 calculations per element. 
The DAZ and FT| | Z flags in the MXCSR register need to be set when using this event| | | | :256B_PACKED_SINGLE | | Counts number of SSE/AVX computational 256-bit packed single preci| | sion floating-point instructions retired; some instructions will c| | ount twice as noted below. Each count represents 8 computation op| | erations, one for each element. Applies to SSE* and AVX* packed s| | ingle precision floating-point instructions: ADD SUB HADD HSUB SUB| | ADD MUL DIV MIN MAX SQRT RSQRT RCP DPP FM(N)ADD/SUB. DPP and FM(N| | )ADD/SUB instructions count twice as they perform 2 calculations p| | er element. The DAZ and FTZ flags in the MXCSR register need to be| | set when using this event. | | :256B_PACKED_DOUBLE | | Counts number of SSE/AVX computational 256-bit packed double preci| | sion floating-point instructions retired; some instructions will c| | ount twice as noted below. Each count represents 4 computation op| | erations, one for each element. Applies to SSE* and AVX* packed d| | ouble precision floating-point instructions: ADD SUB HADD HSUB SUB| | ADD MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions | | count twice as they perform 2 calculations per element. The DAZ an| | d FTZ flags in the MXCSR register need to be set when using this e| | vent. | | :128B_PACKED_SINGLE | | Number of SSE/AVX computational 128-bit packed single precision fl| | oating-point instructions retired; some instructions will count tw| | ice as noted below. Each count represents 4 computation operation| | s, one for each element. Applies to SSE* and AVX* packed single p| | recision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP1| | 4 RSQRT14 SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instruction| | s count twice as they perform 2 calculations per element. The DAZ | | and FTZ flags in the MXCSR register need to be set when using this| | event. 
| | :128B_PACKED_DOUBLE | | Counts number of SSE/AVX computational 128-bit packed double preci| | sion floating-point instructions retired; some instructions will c| | ount twice as noted below. Each count represents 2 computation op| | erations, one for each element. Applies to SSE* and AVX* packed d| | ouble precision floating-point instructions: ADD SUB HADD HSUB SUB| | ADD MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB i| | nstructions count twice as they perform 2 calculations per element| | . The DAZ and FTZ flags in the MXCSR register need to be set when | | using this event. | | :SCALAR_SINGLE | | Counts number of SSE/AVX computational scalar single precision flo| | ating-point instructions retired; some instructions will count twi| | ce as noted below. Each count represents 1 computational operatio| | n. Applies to SSE* and AVX* scalar single precision floating-point| | instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT RCP FM(N)ADD/SUB| | . FM(N)ADD/SUB instructions count twice as they perform 2 calcula| | tions per element. The DAZ and FTZ flags in the MXCSR register nee| | d to be set when using this event. | | :SCALAR_DOUBLE | | Counts number of SSE/AVX computational scalar double precision flo| | ating-point instructions retired; some instructions will count twi| | ce as noted below. Each count represents 1 computational operatio| | n. Applies to SSE* and AVX* scalar double precision floating-point| | instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)AD| | D/SUB instructions count twice as they perform 2 calculations per | | element. The DAZ and FTZ flags in the MXCSR register need to be se| | t when using this event. | | :SCALAR | | Number of SSE/AVX computational scalar floating-point instructions| | retired; some instructions will count twice as noted below. Appl| | ies to SSE* and AVX* scalar, double and single precision floating-| | point: ADD SUB MUL DIV MIN MAX RCP14 RSQRT14 SQRT DPP FM(N)ADD/SUB| | . 
DPP and FM(N)ADD/SUB instructions count twice as they perform m| | ultiple calculations per element. | | :4_FLOPS | | Number of SSE/AVX computational 128-bit packed single and 256-bit | | packed double precision FP instructions retired; some instructions| | will count twice as noted below. Each count represents 2 or/and | | 4 computation operations, 1 for each element. Applies to SSE* and| | AVX* packed single precision and packed double precision FP instr| | uctions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX RCP14 RSQRT14 SQ| | RT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB count twice as they per| | form 2 calculations per element. | | :8_FLOPS | | Number of SSE/AVX computational 256-bit packed single precision an| | d 512-bit packed double precision FP instructions retired; some i| | nstructions will count twice as noted below. Each count represent| | s 8 computation operations, 1 for each element. Applies to SSE* a| | nd AVX* packed single precision and double precision FP instructio| | ns: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT RSQRT RSQRT14 RC| | P RCP14 DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB count twice as the| | y perform 2 calculations per element. 
| | :VECTOR | | Number of any Vector retired FP arithmetic instructions | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | FP_ARITH | | Floating-point instructions retired. | | :512B_PACKED_SINGLE | | Counts number of SSE/AVX computational 512-bit packed double preci| | sion floating-point instructions retired; some instructions will c| | ount twice as noted below. Each count represents 16 computation o| | perations, one for each element. Applies to SSE* and AVX* packed | | double precision floating-point instructions: ADD SUB MUL DIV MIN | | MAX SQRT RSQRT14 RCP14 FM(N)ADD/SUB. FM(N)ADD/SUB instructions cou| | nt twice as they perform 2 calculations per element. The DAZ and F| | TZ flags in the MXCSR register need to be set when using this even| | t. | | :512B_PACKED_DOUBLE | | Counts number of SSE/AVX computational 512-bit packed double preci| | sion floating-point instructions retired; some instructions will c| | ount twice as noted below. Each count represents 8 computation op| | erations, one for each element. Applies to SSE* and AVX* packed d| | ouble precision floating-point instructions: ADD SUB MUL DIV MIN M| | AX SQRT RSQRT14 RCP14 FM(N)ADD/SUB. FM(N)ADD/SUB instructions coun| | t twice as they perform 2 calculations per element. 
The DAZ and FT| | Z flags in the MXCSR register need to be set when using this event| | | | :256B_PACKED_SINGLE | | Counts number of SSE/AVX computational 256-bit packed single preci| | sion floating-point instructions retired; some instructions will c| | ount twice as noted below. Each count represents 8 computation op| | erations, one for each element. Applies to SSE* and AVX* packed s| | ingle precision floating-point instructions: ADD SUB HADD HSUB SUB| | ADD MUL DIV MIN MAX SQRT RSQRT RCP DPP FM(N)ADD/SUB. DPP and FM(N| | )ADD/SUB instructions count twice as they perform 2 calculations p| | er element. The DAZ and FTZ flags in the MXCSR register need to be| | set when using this event. | | :256B_PACKED_DOUBLE | | Counts number of SSE/AVX computational 256-bit packed double preci| | sion floating-point instructions retired; some instructions will c| | ount twice as noted below. Each count represents 4 computation op| | erations, one for each element. Applies to SSE* and AVX* packed d| | ouble precision floating-point instructions: ADD SUB HADD HSUB SUB| | ADD MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions | | count twice as they perform 2 calculations per element. The DAZ an| | d FTZ flags in the MXCSR register need to be set when using this e| | vent. | | :128B_PACKED_SINGLE | | Number of SSE/AVX computational 128-bit packed single precision fl| | oating-point instructions retired; some instructions will count tw| | ice as noted below. Each count represents 4 computation operation| | s, one for each element. Applies to SSE* and AVX* packed single p| | recision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP1| | 4 RSQRT14 SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB instruction| | s count twice as they perform 2 calculations per element. The DAZ | | and FTZ flags in the MXCSR register need to be set when using this| | event. 
| | :128B_PACKED_DOUBLE | | Counts number of SSE/AVX computational 128-bit packed double preci| | sion floating-point instructions retired; some instructions will c| | ount twice as noted below. Each count represents 2 computation op| | erations, one for each element. Applies to SSE* and AVX* packed d| | ouble precision floating-point instructions: ADD SUB HADD HSUB SUB| | ADD MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB i| | nstructions count twice as they perform 2 calculations per element| | . The DAZ and FTZ flags in the MXCSR register need to be set when | | using this event. | | :SCALAR_SINGLE | | Counts number of SSE/AVX computational scalar single precision flo| | ating-point instructions retired; some instructions will count twi| | ce as noted below. Each count represents 1 computational operatio| | n. Applies to SSE* and AVX* scalar single precision floating-point| | instructions: ADD SUB MUL DIV MIN MAX SQRT RSQRT RCP FM(N)ADD/SUB| | . FM(N)ADD/SUB instructions count twice as they perform 2 calcula| | tions per element. The DAZ and FTZ flags in the MXCSR register nee| | d to be set when using this event. | | :SCALAR_DOUBLE | | Counts number of SSE/AVX computational scalar double precision flo| | ating-point instructions retired; some instructions will count twi| | ce as noted below. Each count represents 1 computational operatio| | n. Applies to SSE* and AVX* scalar double precision floating-point| | instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB. FM(N)AD| | D/SUB instructions count twice as they perform 2 calculations per | | element. The DAZ and FTZ flags in the MXCSR register need to be se| | t when using this event. | | :SCALAR | | Number of SSE/AVX computational scalar floating-point instructions| | retired; some instructions will count twice as noted below. Appl| | ies to SSE* and AVX* scalar, double and single precision floating-| | point: ADD SUB MUL DIV MIN MAX RCP14 RSQRT14 SQRT DPP FM(N)ADD/SUB| | . 
DPP and FM(N)ADD/SUB instructions count twice as they perform m| | ultiple calculations per element. | | :4_FLOPS | | Number of SSE/AVX computational 128-bit packed single and 256-bit | | packed double precision FP instructions retired; some instructions| | will count twice as noted below. Each count represents 2 or/and | | 4 computation operations, 1 for each element. Applies to SSE* and| | AVX* packed single precision and packed double precision FP instr| | uctions: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX RCP14 RSQRT14 SQ| | RT DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB count twice as they per| | form 2 calculations per element. | | :8_FLOPS | | Number of SSE/AVX computational 256-bit packed single precision an| | d 512-bit packed double precision FP instructions retired; some i| | nstructions will count twice as noted below. Each count represent| | s 8 computation operations, 1 for each element. Applies to SSE* a| | nd AVX* packed single precision and double precision FP instructio| | ns: ADD SUB HADD HSUB SUBADD MUL DIV MIN MAX SQRT RSQRT RSQRT14 RC| | P RCP14 DPP FM(N)ADD/SUB. DPP and FM(N)ADD/SUB count twice as the| | y perform 2 calculations per element. 
| | :VECTOR | | Number of any Vector retired FP arithmetic instructions | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | FRONTEND_RETIRED | | Precise frontend retired events. | | :LATENCY_GE_1 | | Retired instructions after front-end starvation of at least 1 cycl| | e | | :LATENCY_GE_2_BUBBLES_GE_1 | | Retired instructions that are fetched after an interval where the | | front-end had at least 1 bubble-slot for a period of 2 cycles whic| | h was not interrupted by a back-end stall. | | :LATENCY_GE_512 | | Retired instructions that are fetched after an interval where the | | front-end delivered no uops for a period of 512 cycles which was n| | ot interrupted by a back-end stall. | | :LATENCY_GE_256 | | Retired instructions that are fetched after an interval where the | | front-end delivered no uops for a period of 256 cycles which was n| | ot interrupted by a back-end stall. | | :LATENCY_GE_128 | | Retired instructions that are fetched after an interval where the | | front-end delivered no uops for a period of 128 cycles which was n| | ot interrupted by a back-end stall. | | :LATENCY_GE_64 | | Retired instructions that are fetched after an interval where the | | front-end delivered no uops for a period of 64 cycles which was no| | t interrupted by a back-end stall. 
| | :LATENCY_GE_32 | | Retired instructions that are fetched after an interval where the | | front-end delivered no uops for a period of 32 cycles which was no| | t interrupted by a back-end stall. | | :LATENCY_GE_16 | | Retired instructions that are fetched after an interval where the | | front-end delivered no uops for a period of 16 cycles which was no| | t interrupted by a back-end stall. | | :LATENCY_GE_8 | | Retired instructions that are fetched after an interval where the | | front-end delivered no uops for a period of 8 cycles which was not| | interrupted by a back-end stall. | | :LATENCY_GE_4 | | Retired instructions that are fetched after an interval where the | | front-end delivered no uops for a period of 4 cycles which was not| | interrupted by a back-end stall. | | :LATENCY_GE_2 | | Retired instructions after front-end starvation of at least 2 cycl| | es | | :STLB_MISS | | Retired Instructions who experienced STLB (2nd level TLB) true mis| | s. | | :ITLB_MISS | | Retired Instructions who experienced iTLB true miss. | | :L2_MISS | | Retired Instructions who experienced Instruction L2 Cache true mis| | s. | | :L1I_MISS | | Retired Instructions who experienced Instruction L1 Cache true mis| | s. | | :DSB_MISS | | Retired Instructions experiencing a critical DSB miss. | | :ANY_DSB_MISS | | Retired Instructions experiencing a DSB miss. 
| | :IDQ_4_BUBBLES | | Retired instructions after an interval where the front-end did not| | deliver any uops (4 bubbles) for a period determined by the fe_th| | res modifier (set to 1 cycle by default) and which was not interru| | pted by a back-end stall | | :IDQ_3_BUBBLES | | Counts instructions retired after an interval where the front-end | | did not deliver more than 1 uop (3 bubbles) for a period determine| | d by the fe_thres modifier (set to 1 cycle by default) and which w| | as not interrupted by a back-end stall | | :IDQ_2_BUBBLES | | Counts instructions retired after an interval where the front-end | | did not deliver more than 2 uops (2 bubbles) for a period determin| | ed by the fe_thres modifier (set to 1 cycle by default) and which | | was not interrupted by a back-end stall | | :IDQ_1_BUBBLE | | Counts instructions retired after an interval where the front-end | | did not deliver more than 3 uops (1 bubble) for a period determine| | d by the fe_thres modifier (set to 1 cycle by default) and which w| | as not interrupted by a back-end stall | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :fe_thres=0 | | frontend bubble latency threshold in cycles ([1-4095] | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | BR_MISP_RETIRED | | Mispredicted branch instructions retired. 
| | :INDIRECT | | All miss-predicted indirect branch instructions retired (excluding| | RETs. TSX aborts is considered indirect branch). | | :NEAR_TAKEN | | Number of near branch instructions retired that were mispredicted | | and taken. | | :COND | | Mispredicted conditional branch instructions retired. | | :COND_NTAKEN | | Mispredicted non-taken conditional branch instructions retired. | | :INDIRECT_CALL | | Mispredicted indirect CALL instructions retired. | | :COND_TAKEN | | number of branch instructions retired that were mispredicted and t| | aken. | | :ALL_BRANCHES | | All mispredicted branch instructions retired. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | BR_INST_RETIRED | | Branch instructions retired. | | :INDIRECT | | Indirect near branch instructions retired (excluding returns) | | :FAR_BRANCH | | Far branch instructions retired. | | :NEAR_TAKEN | | Taken branch instructions retired. | | :COND | | Conditional branch instructions retired. | | :COND_NTAKEN | | Not taken branch instructions retired. | | :NEAR_RETURN | | Return instructions retired. | | :NEAR_CALL | | Direct and indirect near call instructions retired. | | :COND_TAKEN | | Taken conditional branch instructions retired. | | :ALL_BRANCHES | | All branch instructions retired. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | MACHINE_CLEARS | | Machine clear asserted. | | :SMC | | Self-modifying code (SMC) detected. | | :MEMORY_ORDERING | | Number of machine clears due to memory ordering conflicts. | | :COUNT | | Number of machine clears (nukes) of any type. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | UOPS_RETIRED | | Retired uops. | | :SLOTS | | Retirement slots used. | | :TOTAL_CYCLES | | Cycles with less than 10 actually retired uops. | | :STALL_CYCLES | | Cycles without actually retired uops. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | ASSISTS | | Software assist. | | :ANY | | Number of occurrences where a microcode assist is invoked by hardw| | are. | | :FP | | Counts all microcode FP assists. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | TLB_FLUSH | | Data TLB flushes. 
| | :STLB_ANY | | STLB flush attempts | | :DTLB_THREAD | | DTLB flush attempts of the thread-specific entries | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | UOPS_EXECUTED | | Uops executed. | | :X87 | | Counts the number of x87 uops dispatched. | | :CORE_CYCLES_GE_4 | | Cycles at least 4 micro-op is executed from any thread on physical| | core. | | :CORE_CYCLES_GE_3 | | Cycles at least 3 micro-op is executed from any thread on physical| | core. | | :CORE_CYCLES_GE_2 | | Cycles at least 2 micro-op is executed from any thread on physical| | core. | | :CORE_CYCLES_GE_1 | | Cycles at least 1 micro-op is executed from any thread on physical| | core. | | :CORE | | Number of uops executed on the core. | | :CYCLES_GE_4 | | Cycles where at least 4 uops were executed per-thread | | :CYCLES_GE_3 | | Cycles where at least 3 uops were executed per-thread | | :CYCLES_GE_2 | | Cycles where at least 2 uops were executed per-thread | | :CYCLES_GE_1 | | Cycles where at least 1 uop was executed per-thread | | :STALL_CYCLES | | Counts number of cycles no uops were dispatched to be executed on | | this thread. | | :THREAD | | Counts the number of uops to be executed per-thread each cycle. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | OFFCORE_REQUESTS | | Requests sent to uncore. | | :ALL_REQUESTS | | Any memory transaction that reached the SQ. | | :L3_MISS_DEMAND_DATA_RD | | Demand Data Read requests who miss L3 cache | | :ALL_DATA_RD | | Demand and prefetch data reads | | :DEMAND_RFO | | Demand RFO requests including regular RFOs, locks, ItoM | | :DEMAND_DATA_RD | | Demand Data Read requests sent to uncore | | :DEMAND_CODE_RD | | Counts cacheable and non-cacheable code reads to the core. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | DSB2MITE_SWITCHES | | Number of DSB to MITE switches. 
| | :COUNT | | DSB-to-MITE transitions count. | | :PENALTY_CYCLES | | DSB-to-MITE switch true penalty cycles. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | LSD | | LSD (Loop stream detector) operations. | | :CYCLES_OK | | Cycles optimal number of Uops delivered by the LSD, but did not co| | me from the decoder. | | :CYCLES_ACTIVE | | Cycles Uops delivered by the LSD, but didn't come from the decoder| | . | | :UOPS | | Number of Uops delivered by the LSD. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | EXE_ACTIVITY | | Execution activity, | | :EXE_BOUND_0_PORTS | | Cycles where no uops were executed, the Reservation Station was no| | t empty, the Store Buffer was full and there was no outstanding lo| | ad. | | :BOUND_ON_STORES | | Cycles where the Store Buffer was full and no loads caused an exec| | ution stall. | | :4_PORTS_UTIL | | Cycles total of 4 uops are executed on all ports and Reservation S| | tation was not empty. | | :3_PORTS_UTIL | | Cycles total of 3 uops are executed on all ports and Reservation S| | tation was not empty. | | :2_PORTS_UTIL | | Cycles total of 2 uops are executed on all ports and Reservation S| | tation was not empty. | | :1_PORTS_UTIL | | Cycles total of 1 uop is executed on all ports and Reservation Sta| | tion was not empty. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | CYCLE_ACTIVITY | | Stalled cycles. | | :STALLS_MEM_ANY | | Execution stalls while memory subsystem has an outstanding load. | | :CYCLES_MEM_ANY | | Cycles while memory subsystem has an outstanding load. | | :STALLS_L1D_MISS | | Execution stalls while L1 cache miss demand load is outstanding. | | :CYCLES_L1D_MISS | | Cycles while L1 cache miss demand load is outstanding. | | :STALLS_L3_MISS | | Execution stalls while L3 cache miss demand load is outstanding. | | :STALLS_L2_MISS | | Execution stalls while L2 cache miss demand load is outstanding. | | :STALLS_TOTAL | | Total execution stalls. | | :CYCLES_L3_MISS | | Cycles while L3 cache miss demand load is outstanding. | | :CYCLES_L2_MISS | | Cycles while L2 cache miss demand load is outstanding. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | RESOURCE_STALLS | | Cycles where Allocation is stalled due to Resource Related reasons| | . | | :SB | | Cycles stalled due to no store buffers available. (not including d| | raining form sync). | | :SCOREBOARD | | Counts cycles where the pipeline is stalled due to serializing ope| | rations. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | UOPS_DISPATCHED | | Uops dispatched to specific ports | | :PORT_7_8 | | Number of uops executed on port 7 and 8 | | :PORT_6 | | Number of uops executed on port 6 | | :PORT_5 | | Number of uops executed on port 5 | | :PORT_4_9 | | Number of uops executed on port 4 and 9 | | :PORT_2_3 | | Number of uops executed on port 2 and 3 | | :PORT_1 | | Number of uops executed on port 1 | | :PORT_0 | | Number of uops executed on port 0 | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | UOPS_DISPATCHED_PORT | | Uops dispatched to specific ports | | :PORT_7_8 | | Number of uops executed on port 7 and 8 | 
| :PORT_6 | | Number of uops executed on port 6 | | :PORT_5 | | Number of uops executed on port 5 | | :PORT_4_9 | | Number of uops executed on port 4 and 9 | | :PORT_2_3 | | Number of uops executed on port 2 and 3 | | :PORT_1 | | Number of uops executed on port 1 | | :PORT_0 | | Number of uops executed on port 0 | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | IDQ_UOPS_NOT_DELIVERED | | Uops not delivered. 
| | :CYCLES_FE_WAS_OK | | Cycles when optimal number of uops was delivered to the back-end w| | hen the back-end is not stalled | | :CYCLES_0_UOPS_DELIV_CORE | | Cycles when no uops are not delivered by the IDQ when backend of t| | he machine is not stalled | | :CORE | | Uops not delivered by IDQ when backend of the machine is not stall| | ed | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | ILD_STALL | | ILD (Instruction Length Decoder) stalls. | | :LCP | | Stalls caused by changing prefix length of the instruction. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | ITLB_MISSES | | Instruction TLB misses. 
| | :STLB_HIT | | Instruction fetch requests that miss the ITLB and hit the STLB. | | :WALK_ACTIVE | | Cycles when at least one PMH is busy with a page walk for code (in| | struction fetch) request. | | :WALK_PENDING | | Number of page walks outstanding for an outstanding code request i| | n the PMH each cycle. | | :WALK_COMPLETED | | Code miss in all TLB levels causes a page walk that completes. (Al| | l page sizes) | | :WALK_COMPLETED_2M_4M | | Code miss in all TLB levels causes a page walk that completes. (2M| | /4M) | | :WALK_COMPLETED_4K | | Code miss in all TLB levels causes a page walk that completes. (4K| | ) | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | ICACHE_64B | | Instruction Cache. | | :IFTAG_STALL | | Cycles where a code fetch is stalled due to L1 instruction cache t| | ag miss. | | :IFTAG_MISS | | Instruction fetch tag lookups that miss in the instruction cache (| | L1I). Counts at 64-byte cache-line granularity. | | :IFTAG_HIT | | Instruction fetch tag lookups that hit in the instruction cache (L| | 1I). Counts at 64-byte cache-line granularity. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | ICACHE_16B | | Instruction Cache. | | :IFDATA_STALL | | Cycles where a code fetch is stalled due to L1 instruction cache m| | iss. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | IDQ | | IDQ (Instruction Decoded Queue) operations | | :MS_CYCLES_ANY | | Cycles when uops are being delivered to IDQ while MS is busy | | :MS_UOPS | | Uops delivered to IDQ while MS is busy | | :MS_SWITCHES | | Number of switches from DSB or MITE to the MS | | :DSB_CYCLES_ANY | | Cycles Decode Stream Buffer (DSB) is delivering any Uop | | :DSB_CYCLES_OK | | Cycles DSB is delivering optimal number of Uops | | 
:DSB_UOPS | | Uops delivered to Instruction Decode Queue (IDQ) from the Decode S| | tream Buffer (DSB) path | | :MITE_CYCLES_ANY | | Cycles MITE is delivering any Uop | | :MITE_CYCLES_OK | | Cycles MITE is delivering optimal number of Uops | | :MITE_UOPS | | Uops delivered to Instruction Decode Queue (IDQ) from MITE path | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | OFFCORE_REQUESTS_OUTSTANDING | | Outstanding offcore requests. | | :DEMAND_DATA_RD | | For every cycle, increments by the number of outstanding demand da| | ta read requests pending. | | :DEMAND_CODE_RD | | For every cycle, increments by the number of outstanding code read| | requests pending. | | :CYCLES_WITH_DEMAND_CODE_RD | | Cycles with outstanding code read requests pending. | | :CYCLES_WITH_DEMAND_RFO | | Cycles where at least 1 outstanding Demand RFO request is pending.| | | | :ALL_DATA_RD | | For every cycle, increments by the number of outstanding data read| | requests pending. | | :CYCLES_WITH_DATA_RD | | Cycles where at least 1 outstanding data read request is pending. | | :L3_MISS_DEMAND_DATA_RD | | For every cycle, increments by the number of demand data read requ| | ests pending that are known to have missed the L3 cache. 
| | :CYCLES_WITH_L3_MISS_DEMAND_DATA_RD | | Cycles where at least one demand data read request known to have m| | issed the L3 cache is pending. | | :L3_MISS_DEMAND_DATA_RD_GE_6 | | Cycles where the core is waiting on at least 6 outstanding demand | | data read requests known to have missed the L3 cache. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | RS_EVENTS | | Reservation Station. | | :EMPTY_END | | Counts end of periods where the Reservation Station (RS) was empty| | . 
| | :EMPTY_CYCLES | | Cycles when Reservation Station (RS) is empty for the thread | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | TX_EXEC | | Transactional execution. | | :MISC3 | | Number of times an instruction execution caused the transactional | | nest count supported to be exceeded | | :MISC2 | | Counts the number of times a class of instructions that may cause | | a transactional abort was executed inside a transactional region | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | TX_MEM | | Transactional memory. 
| | :ABORT_CAPACITY_READ | | Speculatively counts the number of TSX aborts due to a data capaci| | ty limitation for transactional reads | | :HLE_ELISION_BUFFER_FULL | | Number of times HLE lock could not be elided due to ElisionBufferA| | vailable being zero. | | :ABORT_HLE_ELISION_BUFFER_UNSUPPORTED_ALIGNMENT | | Number of times an HLE transactional execution aborted due to an u| | nsupported read alignment from the elision buffer. | | :ABORT_HLE_ELISION_BUFFER_MISMATCH | | Number of times an HLE transactional execution aborted due to XREL| | EASE lock not satisfying the address and value requirements in the| | elision buffer | | :ABORT_HLE_ELISION_BUFFER_NOT_EMPTY | | Number of times an HLE transactional execution aborted due to NoAl| | locatedElisionBuffer being non-zero. | | :ABORT_HLE_STORE_TO_ELIDED_LOCK | | Number of times a HLE transactional region aborted due to a non XR| | ELEASE prefixed instruction writing to an elided lock in the elisi| | on buffer | | :ABORT_CAPACITY_WRITE | | Speculatively counts the number of TSX aborts due to a data capaci| | ty limitation for transactional writes. 
| | :ABORT_CONFLICT | | Number of times a transactional abort was signaled due to a data c| | onflict on a transactionally accessed address | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | L1D | | L1D cache. | | :REPLACEMENT | | Counts the number of cache lines replaced in L1 data cache. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | LOAD_HIT_PREFETCH | | Load dispatches. | | :SWPF | | Counts the number of demand load dispatches that hit L1D fill buff| | er (FB) allocated for software prefetch. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | LOAD_HIT_PRE | | Load dispatches. | | :SWPF | | Counts the number of demand load dispatches that hit L1D fill buff| | er (FB) allocated for software prefetch. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | DTLB_STORE_MISSES | | Data TLB store misses. | | :STLB_HIT | | Stores that miss the DTLB and hit the STLB. | | :WALK_ACTIVE | | Cycles when at least one PMH is busy with a page walk for a store.| | | | :WALK_PENDING | | Number of page walks outstanding for a store in the PMH each cycle| | . | | :WALK_COMPLETED | | Store misses in all TLB levels causes a page walk that completes. 
| | (All page sizes) | | :WALK_COMPLETED_2M_4M | | Page walks completed due to a demand data store to a 2M/4M page. | | :WALK_COMPLETED_4K | | Page walks completed due to a demand data store to a 4K page. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | L1D_PEND_MISS | | L1D pending misses. | | :L2_STALL | | Number of cycles a demand request has waited due to L1D due to lac| | k of L2 resources. | | :FB_FULL_PERIODS | | Number of phases a demand request has waited due to L1D Fill Buffe| | r (FB) unavailability. | | :FB_FULL | | Number of cycles a demand request has waited due to L1D Fill Buffe| | r (FB) unavailability. | | :PENDING_CYCLES | | Cycles with L1D load Misses outstanding. 
| | :PENDING | | Number of L1D misses that are outstanding | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | SW_PREFETCH_ACCESS | | Software prefetches. | | :PREFETCHW | | Number of PREFETCHW instructions executed. | | :T1_T2 | | Number of PREFETCHT1 or PREFETCHT2 instructions executed. | | :T0 | | Number of PREFETCHT0 instructions executed. | | :NTA | | Number of PREFETCHNTA instructions executed. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | SW_PREFETCH | | Software prefetches. | | :PREFETCHW | | Number of PREFETCHW instructions executed. | | :T1_T2 | | Number of PREFETCHT1 or PREFETCHT2 instructions executed. 
| | :T0 | | Number of PREFETCHT0 instructions executed. | | :NTA | | Number of PREFETCHNTA instructions executed. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | LONGEST_LAT_CACHE | | L3 cache. | | :MISS | | Core-originated cacheable demand requests missed L3 (except hardwa| | re prefetches to L3). | | :REFERENCES | | Core-originated cacheable requests that refer to L3 (Except hardwa| | re prefetches to the L3). | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | CORE_POWER | | Power power cycles. | | :LVL2_TURBO_LICENSE | | Core cycles where the core was running in a manner where Turbo may| | be clipped to the AVX512 turbo schedule. 
| | :LVL1_TURBO_LICENSE | | Core cycles where the core was running in a manner where Turbo may| | be clipped to the AVX2 turbo schedule. | | :LVL0_TURBO_LICENSE | | Core cycles where the core was running in a manner where Turbo may| | be clipped to the Non-AVX turbo schedule. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | L2_RQSTS | | L2 requests. | | :ALL_DEMAND_REFERENCES | | Demand requests to L2 cache | | :ALL_CODE_RD | | L2 code requests | | :ALL_RFO | | RFO requests to L2 cache | | :ALL_DEMAND_DATA_RD | | Demand Data Read requests | | :SWPF_HIT | | SW prefetch requests that hit L2 cache. Accounts for PREFETCHNTA a| | nd PREFETCH0/1/2 instructions when FB is not full. | | :CODE_RD_HIT | | L2 cache hits when fetching instructions, code reads. | | :RFO_HIT | | RFO requests that hit L2 cache | | :DEMAND_DATA_RD_HIT | | Demand Data Read requests that hit L2 cache | | :SWPF_MISS | | SW prefetch requests that miss L2 cache. Accounts for PREFETCHNTA | | and PREFETCH0/1/2 instructions when FB is not full. 
| | :ALL_DEMAND_MISS | | Demand requests that miss L2 cache | | :CODE_RD_MISS | | L2 cache misses when fetching instructions | | :RFO_MISS | | RFO requests that miss L2 cache | | :DEMAND_DATA_RD_MISS | | Demand Data Read miss L2, no rejects | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | ARITH | | Arithmetic uops. | | :DIVIDER_ACTIVE | | Cycles when divide unit is busy executing divide or square root op| | erations. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | UOPS_ISSUED | | Uops issued. 
| | :STALL_CYCLES | | Cycles when RAT does not issue Uops to RS for the thread | | :VECTOR_WIDTH_MISMATCH | | Uops inserted at issue-stage in order to preserve upper bits of ve| | ctor registers. | | :ANY | | Uops that RAT issues to RS | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | INT_MISC | | Miscellaneous interruptions. | | :CLEAR_RESTEER_CYCLES | | Counts cycles after recovery from a branch misprediction or machin| | e clear till the first uop is issued from the resteered path. | | :UOP_DROPPING | | TMA slots where uops got dropped | | :ALL_RECOVERY_CYCLES | | Cycles the Backend cluster is recovering after a miss-speculation | | or a Store Buffer or Load Buffer drain stall. 
| | :RECOVERY_CYCLES | | Core cycles the allocator was stalled due to recovery from earlier| | clear event for this thread | | :CLEARS_COUNT | | Clears speculative count | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | DTLB_LOAD_MISSES | | Data TLB load misses. | | :STLB_HIT | | Loads that miss the DTLB and hit the STLB. | | :WALK_ACTIVE | | Cycles when at least one PMH is busy with a page walk for a demand| | load. | | :WALK_PENDING | | Number of page walks outstanding for a demand load in the PMH each| | cycle. | | :WALK_COMPLETED | | Load miss in all TLB levels causes a page walk that completes (All| | page sizes). | | :WALK_COMPLETED_2M_4M | | Page walks completed due to a demand data load to a 2M/4M page. | | :WALK_COMPLETED_4K | | Page walks completed due to a demand data load to a 4K page. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | LD_BLOCKS_PARTIAL | | Partial load blocks. | | :ADDRESS_ALIAS | | False dependencies in MOB due to partial compare on address. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | LD_BLOCKS | | Blocking loads. | | :NO_SR | | The number of times that split load operations are temporarily blo| | cked because all resources for handling the split accesses are in | | use. | | :STORE_FORWARD | | Loads blocked due to overlapping with a preceding store that canno| | t be forwarded. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | TOPDOWN | | TMA slots available for an unhalted logical processor. | | :BR_MISPREDICT_SLOTS | | TMA slots wasted due to incorrect speculation by branch mispredict| | ions | | :BACKEND_BOUND_SLOTS | | TMA slots where no uops were being issued due to lack of back-end | | resources. | | :SLOTS_P | | TMA slots available for an unhalted logical processor. General cou| | nter - architectural event | | :SLOTS | | TMA slots available for an unhalted logical processor. 
Fixed count| | er - architectural event | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | CPU_CLK_UNHALTED | | Count core clock cycles whenever the clock signal on the specific | | core is running (not halted). | | :DISTRIBUTED | | Cycle counts are evenly distributed between active threads in the | | Core. | | :REF_DISTRIBUTED | | Core crystal clock cycles. Cycle counts are evenly distributed bet| | ween active threads in the Core. | | :ONE_THREAD_ACTIVE | | Core crystal clock cycles when this thread is unhalted and the oth| | er thread is halted. | | :REF_XCLK | | Core crystal clock cycles when the thread is unhalted. | | :THREAD_P | | Thread cycles when thread is not in halt state | | :REF_TSC | | Reference cycles when the core is not in halt state. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | INST_RETIRED | | Number of instructions retired | | :STALL_CYCLES | | Cycles without actually retired instructions. | | :ANY_P | | Number of instructions retired. General Counter - architectural ev| | ent | | :PREC_DIST | | Precise instruction retired event with a reduced effect of PEBS sh| | adow in IP distribution (Fixed counter 0 only. c, e, i, intx, intx| | cp modifiers not available) | | :ANY | | Number of instructions retired. Fixed Counter - architectural even| | t (c, e, i, intx, intxcp modifiers not available) | | :NOP | | Number of retired NOP instructions. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | UOPS_DECODED | | Number of instructions decoded | | :DEC0 | | Number of uops decoded out of instructions exclusively fetched by | | decoder 0 | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | MEM_LOAD_MISC_RETIRED | | Miscellaneous loads retired | | :UC | | Retired instructions with at least 1 uncacheable load or Bus Lock.| | | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | 
n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :precise=0 | | precise event sampling | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | -------------------------------------------------------------------------------- | INST_DECODED | | Instructions decoders | | :DECODERS | | Number of decoders utilized in a cycle when the MITE (legacy decod| | e pipeline) fetches instructions. | | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | OFFCORE_RESPONSE_0 | | Offcore response event | | :OTHER_LOCAL_DRAM | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that DRAM supplied the request. | | :STREAMING_WR_LOCAL_DRAM | | Counts streaming stores that DRAM supplied the request. | | :HWPF_L1D_AND_SWPF_LOCAL_DRAM | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that DRAM supplied the request. | | :HWPF_L2_RFO_LOCAL_DRAM | | Counts hardware prefetch RFOs (which bring data to L2) that DRAM s| | upplied the request. 
| | :HWPF_L2_DATA_RD_LOCAL_DRAM | | Counts hardware prefetch data reads (which bring data to L2) that| | DRAM supplied the request. | | :DEMAND_CODE_RD_LOCAL_DRAM | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that DRAM supplied the request. | | :DEMAND_RFO_LOCAL_DRAM | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that DRAM supplied the | | request. | | :DEMAND_DATA_RD_LOCAL_DRAM | | Counts demand data reads that DRAM supplied the request. | | :OTHER_L3_MISS | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that was not supplied by the L3 cache. | | :STREAMING_WR_L3_MISS | | Counts streaming stores that was not supplied by the L3 cache. | | :HWPF_L1D_AND_SWPF_L3_MISS | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that was not supplied by the L3 cache. | | :HWPF_L2_RFO_L3_MISS | | Counts hardware prefetch RFOs (which bring data to L2) that was no| | t supplied by the L3 cache. | | :HWPF_L2_DATA_RD_L3_MISS | | Counts hardware prefetch data reads (which bring data to L2) that| | was not supplied by the L3 cache. | | :DEMAND_CODE_RD_L3_MISS | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that was not supplied by the L3 cache. | | :DEMAND_RFO_L3_MISS | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that was not supplied b| | y the L3 cache. | | :DEMAND_DATA_RD_L3_MISS | | Counts demand data reads that was not supplied by the L3 cache. | | :OTHER_DRAM | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that DRAM supplied the request. | | :STREAMING_WR_DRAM | | Counts streaming stores that DRAM supplied the request. | | :HWPF_L1D_AND_SWPF_DRAM | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that DRAM supplied the request. 
| | :HWPF_L2_RFO_DRAM | | Counts hardware prefetch RFOs (which bring data to L2) that DRAM s| | upplied the request. | | :HWPF_L2_DATA_RD_DRAM | | Counts hardware prefetch data reads (which bring data to L2) that| | DRAM supplied the request. | | :DEMAND_CODE_RD_DRAM | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that DRAM supplied the request. | | :DEMAND_RFO_DRAM | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that DRAM supplied the | | request. | | :DEMAND_DATA_RD_DRAM | | Counts demand data reads that DRAM supplied the request. | | :OTHER_L3_HIT_SNOOP_SENT | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that hit a cacheline in the L3 where a snoop was sent. | | :HWPF_L2_RFO_L3_HIT_SNOOP_SENT | | Counts hardware prefetch RFOs (which bring data to L2) that hit a | | cacheline in the L3 where a snoop was sent. | | :HWPF_L2_DATA_RD_L3_HIT_SNOOP_SENT | | Counts hardware prefetch data reads (which bring data to L2) that| | hit a cacheline in the L3 where a snoop was sent. | | :DEMAND_CODE_RD_L3_HIT_SNOOP_SENT | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that hit a cacheline in the L3 where a snoop was sent. | | :DEMAND_RFO_L3_HIT_SNOOP_SENT | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that hit a cacheline in| | the L3 where a snoop was sent. | | :DEMAND_DATA_RD_L3_HIT_SNOOP_SENT | | Counts demand data reads that hit a cacheline in the L3 where a sn| | oop was sent. | | :OTHER_ANY_RESPONSE | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that have any type of response. | | :STREAMING_WR_ANY_RESPONSE | | Counts streaming stores that have any type of response. 
| | :HWPF_L1D_AND_SWPF_ANY_RESPONSE | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that have any type of response. | | :HWPF_L2_RFO_ANY_RESPONSE | | Counts hardware prefetch RFOs (which bring data to L2) that have a| | ny type of response. | | :HWPF_L2_DATA_RD_ANY_RESPONSE | | Counts hardware prefetch data reads (which bring data to L2) that| | have any type of response. | | :DEMAND_CODE_RD_ANY_RESPONSE | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that have any type of response. | | :DEMAND_RFO_ANY_RESPONSE | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that have any type of r| | esponse. | | :DEMAND_DATA_RD_ANY_RESPONSE | | Counts demand data reads that have any type of response. | | :HWPF_L3_L3_HIT_ANY | | Counts hardware prefetches to the L3 only that hit a cacheline in | | the L3 where a snoop was sent or not. | | :OTHER_L3_HIT_SNOOP_HIT_NO_FWD | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that hit a cacheline in the L3 where a snoop hit in another cor| | e, data forwarding is not required. | | :OTHER_L3_HIT_SNOOP_MISS | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that hit a cacheline in the L3 where a snoop was sent but no ot| | her cores had the data. | | :OTHER_L3_HIT_SNOOP_NOT_NEEDED | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that hit a cacheline in the L3 where a snoop was not needed to | | satisfy the request. | | :STREAMING_WR_L3_HIT_ANY | | Counts streaming stores that hit a cacheline in the L3 where a sno| | op was sent or not. | | :HWPF_L1D_AND_SWPF_L3_HIT_ANY | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that hit a cacheline in the L3 where a snoop was s| | ent or not. 
| | :HWPF_L1D_AND_SWPF_L3_HIT_SNOOP_MISS | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that hit a cacheline in the L3 where a snoop was s| | ent but no other cores had the data. | | :HWPF_L1D_AND_SWPF_L3_HIT_SNOOP_NOT_NEEDED | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that hit a cacheline in the L3 where a snoop was n| | ot needed to satisfy the request. | | :HWPF_L2_RFO_L3_HIT_ANY | | Counts hardware prefetch RFOs (which bring data to L2) that hit a | | cacheline in the L3 where a snoop was sent or not. | | :HWPF_L2_RFO_L3_HIT_SNOOP_HITM | | Counts hardware prefetch RFOs (which bring data to L2) that hit a | | cacheline in the L3 where a snoop hit in another cores caches, dat| | a forwarding is required as the data is modified. | | :HWPF_L2_RFO_L3_HIT_SNOOP_HIT_NO_FWD | | Counts hardware prefetch RFOs (which bring data to L2) that hit a | | cacheline in the L3 where a snoop hit in another core, data forwar| | ding is not required. | | :HWPF_L2_RFO_L3_HIT_SNOOP_MISS | | Counts hardware prefetch RFOs (which bring data to L2) that hit a | | cacheline in the L3 where a snoop was sent but no other cores had | | the data. | | :HWPF_L2_RFO_L3_HIT_SNOOP_NOT_NEEDED | | Counts hardware prefetch RFOs (which bring data to L2) that hit a | | cacheline in the L3 where a snoop was not needed to satisfy the re| | quest. | | :HWPF_L2_DATA_RD_L3_HIT_ANY | | Counts hardware prefetch data reads (which bring data to L2) that| | hit a cacheline in the L3 where a snoop was sent or not. | | :HWPF_L2_DATA_RD_L3_HIT_SNOOP_HITM | | Counts hardware prefetch data reads (which bring data to L2) that| | hit a cacheline in the L3 where a snoop hit in another cores cach| | es, data forwarding is required as the data is modified. 
| | :HWPF_L2_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD | | Counts hardware prefetch data reads (which bring data to L2) that| | hit a cacheline in the L3 where a snoop hit in another core, data| | forwarding is not required. | | :HWPF_L2_DATA_RD_L3_HIT_SNOOP_MISS | | Counts hardware prefetch data reads (which bring data to L2) that| | hit a cacheline in the L3 where a snoop was sent but no other cor| | es had the data. | | :HWPF_L2_DATA_RD_L3_HIT_SNOOP_NOT_NEEDED | | Counts hardware prefetch data reads (which bring data to L2) that| | hit a cacheline in the L3 where a snoop was not needed to satisfy| | the request. | | :DEMAND_CODE_RD_L3_HIT_ANY | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that hit a cacheline in the L3 where a snoop was sent or not. | | :DEMAND_CODE_RD_L3_HIT_SNOOP_HITM | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that hit a cacheline in the L3 where a snoop hit in another co| | res caches, data forwarding is required as the data is modified. | | :DEMAND_CODE_RD_L3_HIT_SNOOP_HIT_NO_FWD | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that hit a cacheline in the L3 where a snoop hit in another co| | re, data forwarding is not required. | | :DEMAND_CODE_RD_L3_HIT_SNOOP_MISS | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that hit a cacheline in the L3 where a snoop was sent but no o| | ther cores had the data. | | :DEMAND_CODE_RD_L3_HIT_SNOOP_NOT_NEEDED | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that hit a cacheline in the L3 where a snoop was not needed to| | satisfy the request. | | :DEMAND_RFO_L3_HIT_ANY | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that hit a cacheline in| | the L3 where a snoop was sent or not. 
| | :DEMAND_RFO_L3_HIT_SNOOP_HITM | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that hit a cacheline in| | the L3 where a snoop hit in another cores caches, data forwarding| | is required as the data is modified. | | :DEMAND_RFO_L3_HIT_SNOOP_HIT_NO_FWD | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that hit a cacheline in| | the L3 where a snoop hit in another core, data forwarding is not | | required. | | :DEMAND_RFO_L3_HIT_SNOOP_MISS | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that hit a cacheline in| | the L3 where a snoop was sent but no other cores had the data. | | :DEMAND_RFO_L3_HIT_SNOOP_NOT_NEEDED | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that hit a cacheline in| | the L3 where a snoop was not needed to satisfy the request. | | :DEMAND_DATA_RD_L3_HIT_ANY | | Counts demand data reads that hit a cacheline in the L3 where a sn| | oop was sent or not. | | :DEMAND_DATA_RD_L3_HIT_SNOOP_HITM | | Counts demand data reads that hit a cacheline in the L3 where a sn| | oop hit in another cores caches, data forwarding is required as th| | e data is modified. | | :DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD | | Counts demand data reads that hit a cacheline in the L3 where a sn| | oop hit in another core, data forwarding is not required. | | :DEMAND_DATA_RD_L3_HIT_SNOOP_MISS | | Counts demand data reads that hit a cacheline in the L3 where a sn| | oop was sent but no other cores had the data. | | :DEMAND_DATA_RD_L3_HIT_SNOOP_NOT_NEEDED | | Counts demand data reads that hit a cacheline in the L3 where a sn| | oop was not needed to satisfy the request. 
| | :e=0 | | edge level (may require counter-mask >= 1) | | :i=0 | | invert | | :c=0 | | counter-mask in range [0-255] | | :intx=0 | | monitor only inside transactional memory region | | :intxcp=0 | | do not count occurrences inside aborted transactional memory regio| | n | | :u=0 | | monitor at user level | | :k=0 | | monitor at kernel level | | :period=0 | | sampling period | | :freq=0 | | sampling frequency (Hz) | | :excl=0 | | exclusive access | | :mg=0 | | monitor guest execution | | :mh=0 | | monitor host execution | | :cpu=0 | | CPU to program | | :pinned=0 | | pin event to counters | | :hw_smpl=0 | | enable hardware sampling | -------------------------------------------------------------------------------- | OFFCORE_RESPONSE_1 | | Offcore response event | | :OTHER_LOCAL_DRAM | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that DRAM supplied the request. | | :STREAMING_WR_LOCAL_DRAM | | Counts streaming stores that DRAM supplied the request. | | :HWPF_L1D_AND_SWPF_LOCAL_DRAM | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that DRAM supplied the request. | | :HWPF_L2_RFO_LOCAL_DRAM | | Counts hardware prefetch RFOs (which bring data to L2) that DRAM s| | upplied the request. | | :HWPF_L2_DATA_RD_LOCAL_DRAM | | Counts hardware prefetch data reads (which bring data to L2) that| | DRAM supplied the request. | | :DEMAND_CODE_RD_LOCAL_DRAM | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that DRAM supplied the request. | | :DEMAND_RFO_LOCAL_DRAM | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that DRAM supplied the | | request. | | :DEMAND_DATA_RD_LOCAL_DRAM | | Counts demand data reads that DRAM supplied the request. | | :OTHER_L3_MISS | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that was not supplied by the L3 cache. 
| | :STREAMING_WR_L3_MISS | | Counts streaming stores that was not supplied by the L3 cache. | | :HWPF_L1D_AND_SWPF_L3_MISS | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that was not supplied by the L3 cache. | | :HWPF_L2_RFO_L3_MISS | | Counts hardware prefetch RFOs (which bring data to L2) that was no| | t supplied by the L3 cache. | | :HWPF_L2_DATA_RD_L3_MISS | | Counts hardware prefetch data reads (which bring data to L2) that| | was not supplied by the L3 cache. | | :DEMAND_CODE_RD_L3_MISS | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that was not supplied by the L3 cache. | | :DEMAND_RFO_L3_MISS | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that was not supplied b| | y the L3 cache. | | :DEMAND_DATA_RD_L3_MISS | | Counts demand data reads that was not supplied by the L3 cache. | | :OTHER_DRAM | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that DRAM supplied the request. | | :STREAMING_WR_DRAM | | Counts streaming stores that DRAM supplied the request. | | :HWPF_L1D_AND_SWPF_DRAM | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that DRAM supplied the request. | | :HWPF_L2_RFO_DRAM | | Counts hardware prefetch RFOs (which bring data to L2) that DRAM s| | upplied the request. | | :HWPF_L2_DATA_RD_DRAM | | Counts hardware prefetch data reads (which bring data to L2) that| | DRAM supplied the request. | | :DEMAND_CODE_RD_DRAM | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that DRAM supplied the request. | | :DEMAND_RFO_DRAM | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that DRAM supplied the | | request. | | :DEMAND_DATA_RD_DRAM | | Counts demand data reads that DRAM supplied the request. 
| | :OTHER_L3_HIT_SNOOP_SENT | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that hit a cacheline in the L3 where a snoop was sent. | | :HWPF_L2_RFO_L3_HIT_SNOOP_SENT | | Counts hardware prefetch RFOs (which bring data to L2) that hit a | | cacheline in the L3 where a snoop was sent. | | :HWPF_L2_DATA_RD_L3_HIT_SNOOP_SENT | | Counts hardware prefetch data reads (which bring data to L2) that| | hit a cacheline in the L3 where a snoop was sent. | | :DEMAND_CODE_RD_L3_HIT_SNOOP_SENT | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that hit a cacheline in the L3 where a snoop was sent. | | :DEMAND_RFO_L3_HIT_SNOOP_SENT | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that hit a cacheline in| | the L3 where a snoop was sent. | | :DEMAND_DATA_RD_L3_HIT_SNOOP_SENT | | Counts demand data reads that hit a cacheline in the L3 where a sn| | oop was sent. | | :OTHER_ANY_RESPONSE | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that have any type of response. | | :STREAMING_WR_ANY_RESPONSE | | Counts streaming stores that have any type of response. | | :HWPF_L1D_AND_SWPF_ANY_RESPONSE | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that have any type of response. | | :HWPF_L2_RFO_ANY_RESPONSE | | Counts hardware prefetch RFOs (which bring data to L2) that have a| | ny type of response. | | :HWPF_L2_DATA_RD_ANY_RESPONSE | | Counts hardware prefetch data reads (which bring data to L2) that| | have any type of response. | | :DEMAND_CODE_RD_ANY_RESPONSE | | Counts demand instruction fetches and L1 instruction cache prefetc| | hes that have any type of response. | | :DEMAND_RFO_ANY_RESPONSE | | Counts demand reads for ownership (RFO) requests and software pref| | etches for exclusive ownership (PREFETCHW) that have any type of r| | esponse. 
| | :DEMAND_DATA_RD_ANY_RESPONSE | | Counts demand data reads that have any type of response. | | :HWPF_L3_L3_HIT_ANY | | Counts hardware prefetches to the L3 only that hit a cacheline in | | the L3 where a snoop was sent or not. | | :OTHER_L3_HIT_SNOOP_HIT_NO_FWD | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that hit a cacheline in the L3 where a snoop hit in another cor| | e, data forwarding is not required. | | :OTHER_L3_HIT_SNOOP_MISS | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that hit a cacheline in the L3 where a snoop was sent but no ot| | her cores had the data. | | :OTHER_L3_HIT_SNOOP_NOT_NEEDED | | Counts miscellaneous requests, such as I/O and un-cacheable access| | es that hit a cacheline in the L3 where a snoop was not needed to | | satisfy the request. | | :STREAMING_WR_L3_HIT_ANY | | Counts streaming stores that hit a cacheline in the L3 where a sno| | op was sent or not. | | :HWPF_L1D_AND_SWPF_L3_HIT_ANY | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that hit a cacheline in the L3 where a snoop was s| | ent or not. | | :HWPF_L1D_AND_SWPF_L3_HIT_SNOOP_MISS | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that hit a cacheline in the L3 where a snoop was s| | ent but no other cores had the data. | | :HWPF_L1D_AND_SWPF_L3_HIT_SNOOP_NOT_NEEDED | | Counts L1 data cache prefetch requests and software prefetches (ex| | cept PREFETCHW) that hit a cacheline in the L3 where a snoop was n| | ot needed to satisfy the request. | | :HWPF_L2_RFO_L3_HIT_ANY | | Counts hardware prefetch RFOs (which bring data to L2) that hit a | | cacheline in the L3 where a snoop was sent or not. | | :HWPF_L2_RFO_L3_HIT_SNOOP_HITM | | Counts hardware prefetch RFOs (which bring data to L2) that hit a | | cacheline in the L3 where a snoop hit in another cores caches, dat| | a forwarding is required as the data is modified. 
|
|     :HWPF_L2_RFO_L3_HIT_SNOOP_HIT_NO_FWD |
|            Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required. |
|     :HWPF_L2_RFO_L3_HIT_SNOOP_MISS |
|            Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :HWPF_L2_RFO_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :HWPF_L2_DATA_RD_L3_HIT_ANY |
|            Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent or not. |
|     :HWPF_L2_DATA_RD_L3_HIT_SNOOP_HITM |
|            Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified. |
|     :HWPF_L2_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD |
|            Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required. |
|     :HWPF_L2_DATA_RD_L3_HIT_SNOOP_MISS |
|            Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :HWPF_L2_DATA_RD_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :DEMAND_CODE_RD_L3_HIT_ANY |
|            Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop was sent or not. |
|     :DEMAND_CODE_RD_L3_HIT_SNOOP_HITM |
|            Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified. |
|     :DEMAND_CODE_RD_L3_HIT_SNOOP_HIT_NO_FWD |
|            Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required. |
|     :DEMAND_CODE_RD_L3_HIT_SNOOP_MISS |
|            Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :DEMAND_CODE_RD_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :DEMAND_RFO_L3_HIT_ANY |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop was sent or not. |
|     :DEMAND_RFO_L3_HIT_SNOOP_HITM |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified. |
|     :DEMAND_RFO_L3_HIT_SNOOP_HIT_NO_FWD |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required. |
|     :DEMAND_RFO_L3_HIT_SNOOP_MISS |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :DEMAND_RFO_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :DEMAND_DATA_RD_L3_HIT_ANY |
|            Counts demand data reads that hit a cacheline in the L3 where a snoop was sent or not. |
|     :DEMAND_DATA_RD_L3_HIT_SNOOP_HITM |
|            Counts demand data reads that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified. |
|     :DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD |
|            Counts demand data reads that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required. |
|     :DEMAND_DATA_RD_L3_HIT_SNOOP_MISS |
|            Counts demand data reads that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :DEMAND_DATA_RD_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts demand data reads that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :e=0 |
|            edge level (may require counter-mask >= 1) |
|     :i=0 |
|            invert |
|     :c=0 |
|            counter-mask in range [0-255] |
|     :intx=0 |
|            monitor only inside transactional memory region |
|     :intxcp=0 |
|            do not count occurrences inside aborted transactional memory region |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
|     :hw_smpl=0 |
|            enable hardware sampling |
--------------------------------------------------------------------------------
| OCR |
|            Offcore response event |
|     :OTHER_LOCAL_DRAM |
|            Counts miscellaneous requests, such as I/O and un-cacheable accesses that DRAM supplied the request. |
|     :STREAMING_WR_LOCAL_DRAM |
|            Counts streaming stores that DRAM supplied the request. |
|     :HWPF_L1D_AND_SWPF_LOCAL_DRAM |
|            Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that DRAM supplied the request. |
|     :HWPF_L2_RFO_LOCAL_DRAM |
|            Counts hardware prefetch RFOs (which bring data to L2) that DRAM supplied the request. |
|     :HWPF_L2_DATA_RD_LOCAL_DRAM |
|            Counts hardware prefetch data reads (which bring data to L2) that DRAM supplied the request. |
|     :DEMAND_CODE_RD_LOCAL_DRAM |
|            Counts demand instruction fetches and L1 instruction cache prefetches that DRAM supplied the request. |
|     :DEMAND_RFO_LOCAL_DRAM |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that DRAM supplied the request. |
|     :DEMAND_DATA_RD_LOCAL_DRAM |
|            Counts demand data reads that DRAM supplied the request. |
|     :OTHER_L3_MISS |
|            Counts miscellaneous requests, such as I/O and un-cacheable accesses that was not supplied by the L3 cache. |
|     :STREAMING_WR_L3_MISS |
|            Counts streaming stores that was not supplied by the L3 cache. |
|     :HWPF_L1D_AND_SWPF_L3_MISS |
|            Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that was not supplied by the L3 cache. |
|     :HWPF_L2_RFO_L3_MISS |
|            Counts hardware prefetch RFOs (which bring data to L2) that was not supplied by the L3 cache. |
|     :HWPF_L2_DATA_RD_L3_MISS |
|            Counts hardware prefetch data reads (which bring data to L2) that was not supplied by the L3 cache. |
|     :DEMAND_CODE_RD_L3_MISS |
|            Counts demand instruction fetches and L1 instruction cache prefetches that was not supplied by the L3 cache. |
|     :DEMAND_RFO_L3_MISS |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that was not supplied by the L3 cache. |
|     :DEMAND_DATA_RD_L3_MISS |
|            Counts demand data reads that was not supplied by the L3 cache. |
|     :OTHER_DRAM |
|            Counts miscellaneous requests, such as I/O and un-cacheable accesses that DRAM supplied the request. |
|     :STREAMING_WR_DRAM |
|            Counts streaming stores that DRAM supplied the request. |
|     :HWPF_L1D_AND_SWPF_DRAM |
|            Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that DRAM supplied the request. |
|     :HWPF_L2_RFO_DRAM |
|            Counts hardware prefetch RFOs (which bring data to L2) that DRAM supplied the request. |
|     :HWPF_L2_DATA_RD_DRAM |
|            Counts hardware prefetch data reads (which bring data to L2) that DRAM supplied the request. |
|     :DEMAND_CODE_RD_DRAM |
|            Counts demand instruction fetches and L1 instruction cache prefetches that DRAM supplied the request. |
|     :DEMAND_RFO_DRAM |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that DRAM supplied the request. |
|     :DEMAND_DATA_RD_DRAM |
|            Counts demand data reads that DRAM supplied the request. |
|     :OTHER_L3_HIT_SNOOP_SENT |
|            Counts miscellaneous requests, such as I/O and un-cacheable accesses that hit a cacheline in the L3 where a snoop was sent. |
|     :HWPF_L2_RFO_L3_HIT_SNOOP_SENT |
|            Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent. |
|     :HWPF_L2_DATA_RD_L3_HIT_SNOOP_SENT |
|            Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent. |
|     :DEMAND_CODE_RD_L3_HIT_SNOOP_SENT |
|            Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop was sent. |
|     :DEMAND_RFO_L3_HIT_SNOOP_SENT |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop was sent. |
|     :DEMAND_DATA_RD_L3_HIT_SNOOP_SENT |
|            Counts demand data reads that hit a cacheline in the L3 where a snoop was sent. |
|     :OTHER_ANY_RESPONSE |
|            Counts miscellaneous requests, such as I/O and un-cacheable accesses that have any type of response. |
|     :STREAMING_WR_ANY_RESPONSE |
|            Counts streaming stores that have any type of response. |
|     :HWPF_L1D_AND_SWPF_ANY_RESPONSE |
|            Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that have any type of response. |
|     :HWPF_L2_RFO_ANY_RESPONSE |
|            Counts hardware prefetch RFOs (which bring data to L2) that have any type of response. |
|     :HWPF_L2_DATA_RD_ANY_RESPONSE |
|            Counts hardware prefetch data reads (which bring data to L2) that have any type of response. |
|     :DEMAND_CODE_RD_ANY_RESPONSE |
|            Counts demand instruction fetches and L1 instruction cache prefetches that have any type of response. |
|     :DEMAND_RFO_ANY_RESPONSE |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that have any type of response. |
|     :DEMAND_DATA_RD_ANY_RESPONSE |
|            Counts demand data reads that have any type of response. |
|     :HWPF_L3_L3_HIT_ANY |
|            Counts hardware prefetches to the L3 only that hit a cacheline in the L3 where a snoop was sent or not. |
|     :OTHER_L3_HIT_SNOOP_HIT_NO_FWD |
|            Counts miscellaneous requests, such as I/O and un-cacheable accesses that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required. |
|     :OTHER_L3_HIT_SNOOP_MISS |
|            Counts miscellaneous requests, such as I/O and un-cacheable accesses that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :OTHER_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts miscellaneous requests, such as I/O and un-cacheable accesses that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :STREAMING_WR_L3_HIT_ANY |
|            Counts streaming stores that hit a cacheline in the L3 where a snoop was sent or not. |
|     :HWPF_L1D_AND_SWPF_L3_HIT_ANY |
|            Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that hit a cacheline in the L3 where a snoop was sent or not. |
|     :HWPF_L1D_AND_SWPF_L3_HIT_SNOOP_MISS |
|            Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :HWPF_L1D_AND_SWPF_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts L1 data cache prefetch requests and software prefetches (except PREFETCHW) that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :HWPF_L2_RFO_L3_HIT_ANY |
|            Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent or not. |
|     :HWPF_L2_RFO_L3_HIT_SNOOP_HITM |
|            Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified. |
|     :HWPF_L2_RFO_L3_HIT_SNOOP_HIT_NO_FWD |
|            Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required. |
|     :HWPF_L2_RFO_L3_HIT_SNOOP_MISS |
|            Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :HWPF_L2_RFO_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts hardware prefetch RFOs (which bring data to L2) that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :HWPF_L2_DATA_RD_L3_HIT_ANY |
|            Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent or not. |
|     :HWPF_L2_DATA_RD_L3_HIT_SNOOP_HITM |
|            Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified. |
|     :HWPF_L2_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD |
|            Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required. |
|     :HWPF_L2_DATA_RD_L3_HIT_SNOOP_MISS |
|            Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :HWPF_L2_DATA_RD_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts hardware prefetch data reads (which bring data to L2) that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :DEMAND_CODE_RD_L3_HIT_ANY |
|            Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop was sent or not. |
|     :DEMAND_CODE_RD_L3_HIT_SNOOP_HITM |
|            Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified. |
|     :DEMAND_CODE_RD_L3_HIT_SNOOP_HIT_NO_FWD |
|            Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required. |
|     :DEMAND_CODE_RD_L3_HIT_SNOOP_MISS |
|            Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :DEMAND_CODE_RD_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts demand instruction fetches and L1 instruction cache prefetches that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :DEMAND_RFO_L3_HIT_ANY |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop was sent or not. |
|     :DEMAND_RFO_L3_HIT_SNOOP_HITM |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified. |
|     :DEMAND_RFO_L3_HIT_SNOOP_HIT_NO_FWD |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required. |
|     :DEMAND_RFO_L3_HIT_SNOOP_MISS |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :DEMAND_RFO_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts demand reads for ownership (RFO) requests and software prefetches for exclusive ownership (PREFETCHW) that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :DEMAND_DATA_RD_L3_HIT_ANY |
|            Counts demand data reads that hit a cacheline in the L3 where a snoop was sent or not. |
|     :DEMAND_DATA_RD_L3_HIT_SNOOP_HITM |
|            Counts demand data reads that hit a cacheline in the L3 where a snoop hit in another cores caches, data forwarding is required as the data is modified. |
|     :DEMAND_DATA_RD_L3_HIT_SNOOP_HIT_NO_FWD |
|            Counts demand data reads that hit a cacheline in the L3 where a snoop hit in another core, data forwarding is not required. |
|     :DEMAND_DATA_RD_L3_HIT_SNOOP_MISS |
|            Counts demand data reads that hit a cacheline in the L3 where a snoop was sent but no other cores had the data. |
|     :DEMAND_DATA_RD_L3_HIT_SNOOP_NOT_NEEDED |
|            Counts demand data reads that hit a cacheline in the L3 where a snoop was not needed to satisfy the request. |
|     :e=0 |
|            edge level (may require counter-mask >= 1) |
|     :i=0 |
|            invert |
|     :c=0 |
|            counter-mask in range [0-255] |
|     :intx=0 |
|            monitor only inside transactional memory region |
|     :intxcp=0 |
|            do not count occurrences inside aborted transactional memory region |
|     :u=0 |
|            monitor at user level |
|     :k=0 |
|            monitor at kernel level |
|     :period=0 |
|            sampling period |
|     :freq=0 |
|            sampling frequency (Hz) |
|     :excl=0 |
|            exclusive access |
|     :mg=0 |
|            monitor guest execution |
|     :mh=0 |
|            monitor host execution |
|     :cpu=0 |
|            CPU to program |
|     :pinned=0 |
|            pin event to counters |
|     :hw_smpl=0 |
|            enable hardware sampling |
===============================================================================
 Native Events in Component: sysdetect
===============================================================================
Total events reported: 165