Commit

add fixes
Signed-off-by: Peter Jun Park <peter.park@amd.com>
peterjunpark committed Jul 26, 2024
1 parent bcb858e commit b68ef41
Showing 7 changed files with 116 additions and 111 deletions.
42 changes: 21 additions & 21 deletions docs/conceptual/l2-cache.rst
@@ -132,14 +132,14 @@ This section details the incoming requests to the L2 cache from the
if only a single value is requested in a cache line, the data movement
will still be counted as a full cache line.

- Bytes per normalization unit
- Bytes per :ref:`normalization unit <normalization-units>`.

* - Requests

- The total number of incoming requests to the L2 from all clients for all
request types, per :ref:`normalization unit <normalization-units>`.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - Read Requests

@@ -221,22 +221,22 @@ This section details the incoming requests to the L2 cache from the
- The total number of L2 cache lines written back to memory for internal
hardware reasons, per :ref:`normalization unit <normalization-units>`.

- Cache lines per normalization unit
- Cache lines per :ref:`normalization unit <normalization-units>`.

* - Writebacks (vL1D Req)

- The total number of L2 cache lines written back to memory due to requests
initiated by the :doc:`vL1D cache <vector-l1-cache>`, per
:ref:`normalization unit <normalization-units>`.

- Cache lines per normalization unit
- Cache lines per :ref:`normalization unit <normalization-units>`.

* - Evictions (Normal)

- The total number of L2 cache lines evicted from the cache due to capacity
limits, per :ref:`normalization unit <normalization-units>`.

- Cache lines per normalization unit
- Cache lines per :ref:`normalization unit <normalization-units>`.

* - Evictions (vL1D Req)

@@ -245,15 +245,15 @@ This section details the incoming requests to the L2 cache from the
:doc:`vL1D cache <vector-l1-cache>`, per
:ref:`normalization unit <normalization-units>`.

- Cache lines per normalization unit
- Cache lines per :ref:`normalization unit <normalization-units>`.

* - Non-hardware-Coherent Requests

- The total number of requests to the L2 to Not-hardware-Coherent (NC)
memory allocations, per :ref:`normalization unit <normalization-units>`.
See the :ref:`memory-type` for more information.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - Uncached Requests

@@ -396,7 +396,7 @@ Metrics
- The total number of bytes read by the L2 cache from Infinity Fabric per
:ref:`normalization unit <normalization-units>`.

- Bytes per normalization unit
- Bytes per :ref:`normalization unit <normalization-units>`.

* - HBM Read Traffic

@@ -446,7 +446,7 @@ Metrics
:ref:`uncached memory <memory-type>` allocations on the
MI2XX.

- Bytes per normalization unit
- Bytes per :ref:`normalization unit <normalization-units>`.

* - HBM Write and Atomic Traffic

@@ -529,7 +529,7 @@ Metrics
* - Read Stall

- The ratio of the total number of cycles the L2-Fabric interface was
stalled on a read request to any destination (local HBM, remote PCIe
stalled on a read request to any destination (local HBM, remote PCIe®
connected accelerator or CPU, or remote Infinity Fabric connected
accelerator [#inf]_ or CPU) over the
:ref:`total active L2 cycles <total-active-l2-cycles>`.
@@ -571,7 +571,7 @@ transaction breakdown table:
:ref:`l2-request-flow` for more detail. Typically unused on CDNA
accelerators.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - Uncached Read Requests

@@ -581,7 +581,7 @@ transaction breakdown table:
uncached data are counted as two 32B uncached data requests. See
:ref:`l2-request-flow` for more detail.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - 64B Read Requests

@@ -590,7 +590,7 @@ transaction breakdown table:
:ref:`normalization unit <normalization-units>`. See
:ref:`l2-request-flow` for more detail.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - HBM Read Requests

@@ -599,7 +599,7 @@ transaction breakdown table:
:ref:`normalization unit <normalization-units>`. See
:ref:`l2-request-flow` for more detail.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - Remote Read Requests

@@ -608,7 +608,7 @@ transaction breakdown table:
:ref:`normalization unit <normalization-units>`. See
:ref:`l2-request-flow` for more detail.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - 32B Write and Atomic Requests

@@ -617,7 +617,7 @@ transaction breakdown table:
:ref:`normalization unit <normalization-units>`. See
:ref:`l2-request-flow` for more detail.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - Uncached Write and Atomic Requests

@@ -626,7 +626,7 @@ transaction breakdown table:
:ref:`normalization unit <normalization-units>`. See
:ref:`l2-request-flow` for more detail.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - 64B Write and Atomic Requests

@@ -635,7 +635,7 @@ transaction breakdown table:
:ref:`normalization unit <normalization-units>`. See
:ref:`l2-request-flow` for more detail.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - HBM Write and Atomic Requests

@@ -644,7 +644,7 @@ transaction breakdown table:
:ref:`normalization unit <normalization-units>`. See
:ref:`l2-request-flow` for more detail.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - Remote Write and Atomic Requests

@@ -654,7 +654,7 @@ transaction breakdown table:
:ref:`normalization unit <normalization-units>`. See
:ref:`l2-request-flow` for more detail.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

* - Atomic Requests

@@ -668,7 +668,7 @@ transaction breakdown table:
:ref:`fine-grained memory <memory-type>` allocations or
:ref:`uncached memory <memory-type>` allocations on the MI2XX.

- Requests per normalization unit
- Requests per :ref:`normalization unit <normalization-units>`.

.. _l2-fabric-stalls:

16 changes: 8 additions & 8 deletions docs/conceptual/local-data-share.rst
@@ -101,7 +101,7 @@ The LDS statistics panel gives a more detailed view of the hardware:
read/write/atomics and HIP's ``__shfl`` instructions) executed per
:ref:`normalization unit <normalization-units>`.

- Instructions per normalization unit
- Instructions per :ref:`normalization unit <normalization-units>`.

* - Theoretical Bandwidth

@@ -112,7 +112,7 @@ The LDS statistics panel gives a more detailed view of the hardware:
executed. See the
:ref:`LDS bandwidth example <lds-bandwidth>` for more detail.

- Bytes per normalization unit
- Bytes per :ref:`normalization unit <normalization-units>`.

* - LDS Latency

@@ -136,44 +136,44 @@ The LDS statistics panel gives a more detailed view of the hardware:
- The total number of cycles spent in the :ref:`LDS scheduler <desc-lds>`
over all operations per :ref:`normalization unit <normalization-units>`.

- Cycles per normalization unit
- Cycles per :ref:`normalization unit <normalization-units>`.

* - Atomic Return Cycles

- The total number of cycles spent on LDS atomics with return per
:ref:`normalization unit <normalization-units>`.

- Cycles per normalization unit
- Cycles per :ref:`normalization unit <normalization-units>`.

* - Bank Conflicts

- The total number of cycles spent in the :ref:`LDS scheduler <desc-lds>`
due to bank conflicts (as determined by the conflict resolution hardware)
per :ref:`normalization unit <normalization-units>`.

- Cycles per normalization unit
- Cycles per :ref:`normalization unit <normalization-units>`.

* - Address Conflicts

- The total number of cycles spent in the :ref:`LDS scheduler <desc-lds>`
due to address conflicts (as determined by the conflict resolution
hardware) per :ref:`normalization unit <normalization-units>`.

- Cycles per normalization unit
- Cycles per :ref:`normalization unit <normalization-units>`.

* - Unaligned Stall

- The total number of cycles spent in the :ref:`LDS scheduler <desc-lds>`
due to stalls from non-dword aligned addresses per
:ref:`normalization unit <normalization-units>`.

- Cycles per normalization unit
- Cycles per :ref:`normalization unit <normalization-units>`.

* - Memory Violations

- The total number of out-of-bounds accesses made to the LDS, per
:ref:`normalization unit <normalization-units>`. This is unused and
expected to be zero in most configurations for modern CDNA accelerators.

- Accesses per normalization unit
- Accesses per :ref:`normalization unit <normalization-units>`.
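
Nearly every hunk in these two files makes the same substitution: a unit cell reading "<Unit> per normalization unit" becomes the same text with a :ref: link and a trailing period. The commit does not say how the change was produced. As an illustration only, the pattern could be sketched with a small script like the one below; the names `UNIT_CELL`, `LINKED`, and `link_unit_cell` are hypothetical and not part of the repository.

```python
import re

# Hypothetical helper, not part of the commit: rewrites a plain
# "<Unit> per normalization unit" list-table cell into the linked form
# that appears as the second line of each changed pair in this diff.
UNIT_CELL = re.compile(
    r"^(?P<indent>\s*- )(?P<unit>[A-Z][A-Za-z ]*?) per normalization unit\s*$"
)
LINKED = ":ref:`normalization unit <normalization-units>`"


def link_unit_cell(line: str) -> str:
    """Return the line with its unit cell linked, or unchanged if it does not match."""
    match = UNIT_CELL.match(line)
    if match is None:
        # Description text and already-linked cells fall through untouched.
        return line
    return f"{match['indent']}{match['unit']} per {LINKED}."


if __name__ == "__main__":
    for sample in (
        "       - Bytes per normalization unit",
        "       - Cache lines per normalization unit",
        "       - Cycles per :ref:`normalization unit <normalization-units>`.",
    ):
        print(link_unit_cell(sample))
```

The pattern is anchored to a whole line that starts with a list bullet, so multi-line metric descriptions that already contain the :ref: role are left alone, and running the helper twice on the same line has no further effect.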
