NehalemEX
Intel Nehalem EX Performance groups
The input file for the events on Intel Nehalem EX can be found here.
Since the Core2 microarchitecture, Intel® provides a set of fixed-purpose counters. Each can measure only one specific event.
Counter name | Event name |
---|---|
FIXC0 | INSTR_RETIRED_ANY |
FIXC1 | CPU_CLK_UNHALTED_CORE |
FIXC2 | CPU_CLK_UNHALTED_REF |
Option | Argument | Description | Comment |
---|---|---|---|
anythread | N | Set bit 2+(index*4) in config register | |
kernel | N | Set bit (index*4) in config register |
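As a minimal sketch of the bit arithmetic in the table above: the `index*4` terms mean that each fixed counter owns a 4-bit field in the shared config register, so the option bits depend on the counter index. The function name is hypothetical and not part of LIKWID; it only computes the option bits and does not access any register.

```python
def fixed_counter_option_bits(index, kernel=False, anythread=False):
    """Return the config-register bits set by the options for FIXC<index>."""
    if index not in (0, 1, 2):
        raise ValueError("only FIXC0, FIXC1 and FIXC2 are listed for Nehalem EX")
    bits = 0
    if kernel:
        bits |= 1 << (index * 4)        # kernel: bit (index*4)
    if anythread:
        bits |= 1 << (2 + index * 4)    # anythread: bit 2+(index*4)
    return bits


# Example: both options for FIXC1 touch bits 4 and 6.
print(hex(fixed_counter_option_bits(1, kernel=True, anythread=True)))
```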
The Intel® Nehalem EX microarchitecture provides 4 general-purpose counters, each consisting of a config and a counter register.
Counter name | Event name |
---|---|
PMC0 | * |
PMC1 | * |
PMC2 | * |
PMC3 | * |
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
kernel | N | Set bit 17 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register |
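The option bits listed above can be combined into a single value as sketched below. This is an illustration only: the function name is hypothetical, and the event selection bits of the config register are left out because they are not described here.

```python
def pmc_option_bits(edgedetect=False, kernel=False, invert=False, threshold=0):
    """Return the config-register bits set by the PMC options in the table."""
    if not 0 <= threshold <= 0xFF:
        raise ValueError("threshold must fit in 8 bits")
    bits = threshold << 24      # threshold: bits 24-31
    if kernel:
        bits |= 1 << 17         # kernel: bit 17
    if edgedetect:
        bits |= 1 << 18         # edgedetect: bit 18
    if invert:
        bits |= 1 << 23         # invert: bit 23
    return bits


# Example: edge detection combined with a threshold of 1.
print(hex(pmc_option_bits(edgedetect=True, threshold=0x1)))
```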
The Intel® Nehalem EX microarchitecture supports measuring offcore events with the PMC counters. The stream of offcore events must be filtered using the OFFCORE_RESPONSE register; the Intel® Nehalem EX microarchitecture has a single such register. Custom filtering can be applied with the OFFCORE_RESPONSE_0_OPTIONS event. Only for this event, two more counter options are available:
Option | Argument | Description | Comment |
---|---|---|---|
match0 | 8 bit hex value | Input value masked with 0xFF and written to bits 0-7 in the OFFCORE_RESPONSE register | Check the Intel® Software Developer System Programming Manual, Vol. 3, Chapter Performance Monitoring and https://download.01.org/perfmon/NHM-EX. |
match1 | 8 bit hex value | Input value masked with 0xF7 and written to bits 8-15 in the OFFCORE_RESPONSE register | Check the Intel® Software Developer System Programming Manual, Vol. 3, Chapter Performance Monitoring and https://download.01.org/perfmon/NHM-EX. |
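The masking described in the table can be sketched as follows. The function and parameter names are hypothetical: `low_filter` stands for the option written to bits 0-7 and `high_filter` for the option written to bits 8-15. The sketch only computes the combined value and assumes nothing about the remaining register bits.

```python
def offcore_response_value(low_filter, high_filter):
    """Combine both filter options into the low 16 bits of OFFCORE_RESPONSE."""
    low = low_filter & 0xFF             # masked with 0xFF, bits 0-7
    high = (high_filter & 0xF7) << 8    # masked with 0xF7, bits 8-15
    return high | low


# Bit 3 of the second option is cleared by the 0xF7 mask.
print(hex(offcore_response_value(0x01, 0xFF)))
```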
The Intel® Nehalem EX microarchitecture provides measurements of the memory controllers in the Uncore. The description from Intel®:
The memory controller interfaces to the Intel® 7500 Scalable Memory Buffers and translates read and write commands into specific Intel® Scalable Memory Interconnect (Intel® SMI) operations. Intel SMI is based on the FB-DIMM architecture, but the Intel 7500 Scalable Memory Buffer is not an AMB2 device and has significant exceptions to the FB-DIMM2 architecture. The memory controller also provides a variety of RAS features, such as ECC, memory scrubbing, thermal throttling, mirroring, and DIMM sparing. Each socket has two independent memory controllers, and each memory controller has two Intel SMI channels that operate in lockstep.
The Intel® Nehalem EX microarchitecture has 2 memory controllers, each with 6 general-purpose counters. They are exposed through the MSR interface to the operating system kernel. The MBOX and RBOX setup routines are taken from LIKWID 3. They are not as flexible as the newer setup routines, but programming the MBOXes and RBOXes is tedious on Nehalem EX. It is not possible to specify an FVID (Fill Victim Index) for the MBOX or the IPERF option for RBOXes.
Counter name | Event name |
---|---|
MBOX<0,1>C0 | * |
MBOX<0,1>C1 | * |
MBOX<0,1>C2 | * |
MBOX<0,1>C3 | * |
MBOX<0,1>C4 | * |
MBOX<0,1>C5 | * |
Option | Argument | Description | Comment |
---|---|---|---|
match0 | 34 bit address | Set bits 0-33 in MSR_M<0,1>_PMON_ADDR_MATCH register | |
mask0 | 60 bit hex value | Extract bits 6-33 from address and set bits 0-27 in MSR_M<0,1>_PMON_ADDR_MASK register |
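The two transformations in the table differ: the match value keeps the low 34 address bits, while the mask value drops the low 6 bits first. A minimal sketch, with hypothetical function names, that computes the values described above:

```python
def mbox_addr_match(address):
    """Keep bits 0-33 of the address for MSR_M<0,1>_PMON_ADDR_MATCH."""
    return address & ((1 << 34) - 1)


def mbox_addr_mask(address):
    """Extract address bits 6-33 and place them at bits 0-27 for MSR_M<0,1>_PMON_ADDR_MASK."""
    return (address >> 6) & ((1 << 28) - 1)


# Address bit 6 ends up at bit 0 of the mask value.
print(hex(mbox_addr_match(0x40)), hex(mbox_addr_mask(0x40)))
```

Because the low 6 bits are discarded, the mask value computed this way cannot distinguish addresses within the same 64-byte block.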
For the events THERM_TRP_DN and THERM_TRP_UP, you cannot measure all DIMMs and one specific DIMM simultaneously, because both variants program the same filter register MSR_M<0,1>_PMON_MSC_THR with conflicting configurations.
Although the events FVC_EV<0-3> are available to measure multiple memory events, some of them overlap and cannot be measured simultaneously, because they program the same filter register MSR_M<0,1>_PMON_ZDP with conflicting configurations. One example is the pair FVC_EV<0-3>_BBOX_CMDS_READS and FVC_EV<0-3>_BBOX_CMDS_WRITES, which measure memory reads and writes, respectively, but cannot be measured at the same time.
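A pre-check for the read/write conflict described above could look like the following sketch. It is not how LIKWID detects the conflict; the helper name is hypothetical, only the one conflicting pair named in the text is listed, and all events are assumed to target the same MBOX.

```python
import re

# Matches FVC_EV0_... to FVC_EV3_... and captures the part after the prefix.
FVC_EVENT = re.compile(r"^FVC_EV[0-3]_(?P<kind>.+)$")

# Pairs that program MSR_M<0,1>_PMON_ZDP in conflicting ways (only the pair from the text).
CONFLICTING_KINDS = [frozenset({"BBOX_CMDS_READS", "BBOX_CMDS_WRITES"})]


def find_fvc_conflicts(event_names):
    """Return the conflicting FVC event kinds present in event_names."""
    kinds = set()
    for name in event_names:
        match = FVC_EVENT.match(name)
        if match:
            kinds.add(match.group("kind"))
    return [sorted(pair) for pair in CONFLICTING_KINDS if pair <= kinds]


print(find_fvc_conflicts(["FVC_EV0_BBOX_CMDS_READS", "FVC_EV1_BBOX_CMDS_WRITES"]))
```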
The Intel® Nehalem EX microarchitecture provides measurements of the Home Agent in the Uncore. The description from Intel®:
The B-Box is responsible for the protocol side of memory interactions, including coherent and non-coherent home agent protocols (as defined in the Intel® QuickPath Interconnect Specification). Additionally, the B-Box is responsible for ordering memory reads/writes to a given address such that the M-Box does not have to perform this conflict checking. All requests for memory attached to the coupled M-Box must first be ordered through the B-Box.
The memory traffic in an Intel® Nehalem EX system is controlled by the Home Agents. Each MBOX has a corresponding BBOX. Each BBOX offers 4 general-purpose counters. They are exposed through the MSR interface to the operating system kernel.
Counter name | Event name |
---|---|
BBOX<0,1>C0 | * |
BBOX<0,1>C1 | * |
BBOX<0,1>C2 | * |
BBOX<0,1>C3 | * |
Option | Argument | Description | Comment |
---|---|---|---|
match0 | 60 bit hex value | Set bits 0-59 in MSR_B<0,1>_PMON_MATCH register | For register layout and valid settings see Intel® Xeon® Processor 7500 Series Uncore Programming Guide |
mask0 | 60 bit hex value | Set bits 0-59 in MSR_B<0,1>_PMON_MASK register | For register layout and valid settings see Intel® Xeon® Processor 7500 Series Uncore Programming Guide |
The Intel® Nehalem EX microarchitecture provides measurements of the crossbar router in the Uncore. The description from Intel®:
The Crossbar Router (R-Box) is a 8 port switch/router implementing the Intel® QuickPath Interconnect Link and Routing layers. The R-Box is responsible for routing and transmitting all intra- and inter-processor communication.
The Intel® Nehalem EX microarchitecture has two interfaces to the RBOX although each socket contains only one crossbar router. Each RBOX offers 8 general-purpose counters. They are exposed through the MSR interface to the operating system kernel. The RBOX setup routine is taken from LIKWID 3.
Counter name | Event name |
---|---|
RBOX<0,1>C0 | * |
RBOX<0,1>C1 | * |
RBOX<0,1>C2 | * |
RBOX<0,1>C3 | * |
RBOX<0,1>C4 | * |
RBOX<0,1>C5 | * |
RBOX<0,1>C6 | * |
RBOX<0,1>C7 | * |
The Intel® Nehalem EX microarchitecture provides measurements of the LLC coherency engine in the Uncore. The description from Intel®:
For the Intel Xeon Processor 7500 Series, the LLC coherence engine (C-Box) manages the interface between the core and the last level cache (LLC). All core transactions that access the LLC are directed from the core to a C-Box via the ring interconnect. The C-Box is responsible for managing data delivery from the LLC to the requesting core. It is also responsible for maintaining coherence between the cores within the socket that share the LLC; generating snoops and collecting snoop responses to the local cores when the MESI protocol requires it.
The C-Box is also the gate keeper for all Intel® QuickPath Interconnect (Intel® QPI) messages that originate in the core and is responsible for ensuring that all Intel QuickPath Interconnect messages that pass through the socket’s LLC remain coherent.
The Intel® Nehalem EX microarchitecture has 8 CBOX instances. Each CBOX offers 6 general-purpose counters. They are exposed through the MSR interface to the operating system kernel.
Counter name | Event name |
---|---|
CBOX<0-7>C0 | * |
CBOX<0-7>C1 | * |
CBOX<0-7>C2 | * |
CBOX<0-7>C3 | * |
CBOX<0-7>C4 | * |
CBOX<0-7>C5 | * |
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
threshold | 5 bit hex value | Set bits 24-28 in config register | |
invert | N | Set bit 23 in config register |
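The `<0-7>` and `<0,1>` notation used in the counter tables is shorthand for several box instances. A small sketch, with a hypothetical helper name, that expands such a pattern into the individual counter names:

```python
import re
from itertools import product


def expand_counter_names(pattern):
    """Expand '<0,1>' lists and '<0-7>' ranges in a counter name pattern."""
    # re.split with a capturing group alternates literal text (even indices)
    # and the text between angle brackets (odd indices).
    parts = re.split(r"<([^>]+)>", pattern)
    choices = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            choices.append([part])
        elif "-" in part:
            first, last = part.split("-")
            choices.append([str(n) for n in range(int(first), int(last) + 1)])
        else:
            choices.append([item.strip() for item in part.split(",")])
    return ["".join(combo) for combo in product(*choices)]


# All CBOX counters: 8 instances with counters C0 to C5 each.
all_cbox = [name for c in range(6) for name in expand_counter_names(f"CBOX<0-7>C{c}")]
print(len(all_cbox), all_cbox[:3])
```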
The Intel® Nehalem EX microarchitecture provides measurements of the LLC-to-QPI interface in the Uncore. The description from Intel®:
The S-Box represents the interface between the last level cache and the system interface. It manages flow control between the C and R & B-Boxes. The S-Box is broken into system bound (ring to Intel® QPI) and ring bound (Intel® QPI to ring) connections.
As such, it shares responsibility with the C-Box(es) as the Intel® QPI caching agent(s). It is responsible for converting C-box requests to Intel® QPI messages (i.e. snoop generation and data response messages from the snoop response) as well as converting/forwarding ring messages to Intel® QPI packets and vice versa.
The Intel® Nehalem EX microarchitecture has 2 SBOX instances. Each SBOX offers 4 general-purpose counters. They are exposed through the MSR interface to the operating system kernel.
Counter name | Event name |
---|---|
SBOX<0,1>C0 | * |
SBOX<0,1>C1 | * |
SBOX<0,1>C2 | * |
SBOX<0,1>C3 | * |
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register |
Option | Argument | Description | Comment |
---|---|---|---|
match0 | 64 bit hex value | Set bits 0-63 in MSR_S<0,1>_PMON_MATCH register | For register layout and valid settings see Intel® Xeon® Processor 7500 Series Uncore Programming Guide |
mask0 | 39 bit hex value | Set bits 0-38 in MSR_S<0,1>_PMON_MASK register | For register layout and valid settings see Intel® Xeon® Processor 7500 Series Uncore Programming Guide |
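Since the two SBOX filter options accept different widths, a value can be checked before it is passed on. The following is a sketch with hypothetical names that only encodes the widths from the table; the valid register settings themselves are described in the Intel guide referenced above.

```python
# Widths in bits, taken from the table above.
SBOX_FILTER_WIDTHS = {"match0": 64, "mask0": 39}


def fits_sbox_filter(option, value):
    """Return True if value fits into the bits available for the given option."""
    width = SBOX_FILTER_WIDTHS[option]
    return 0 <= value < (1 << width)


# A value with bit 39 set is too wide for mask0 but fine for match0.
print(fits_sbox_filter("match0", 1 << 39), fits_sbox_filter("mask0", 1 << 39))
```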
The Intel® Nehalem EX microarchitecture provides measurements of the power controller in the Uncore. The description from Intel®:
The W-Box is the primary Power Controller for the Intel® Xeon® Processor 7500 Series.
It provides one fixed-purpose counter to measure the clock frequency of the Uncore.
Counter name | Event name |
---|---|
WBOXFIX | UNCORE_CLOCKTICKS |
The Intel® Nehalem EX microarchitecture provides measurements of the power controller in the Uncore. The description from Intel®:
The W-Box is the primary Power Controller for the Intel® Xeon® Processor 7500 Series.
The Intel® Nehalem EX microarchitecture has one WBOX and it offers 4 general-purpose counters. They are exposed through the MSR interface to the operating system kernel.
Counter name | Event name |
---|---|
WBOX0 | * |
WBOX1 | * |
WBOX2 | * |
WBOX3 | * |
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register | |
threshold | 8 bit hex value | Set bits 24-31 in config register | |
invert | N | Set bit 23 in config register |
The U-Box serves as the system configuration controller for the Intel® Xeon® Processor 7500 Series.
The Intel® Nehalem EX microarchitecture has one UBOX and it offers a single general-purpose counter. It is exposed through the MSR interface to the operating system kernel.
Counter name | Event name |
---|---|
UBOX0 | * |
Option | Argument | Description | Comment |
---|---|---|---|
edgedetect | N | Set bit 18 in config register |